In our previous post, we walked through creating a basic Dockerfile. However, we noticed a significant issue: the resulting image for our simple "Hello World" app was a hefty 1.62GB (uncompressed)! That’s not ideal for efficient deployment and resource utilization.
Today, we're diving into how to dramatically reduce your Docker image size by carefully choosing the right base image. You might be surprised by how much impact this single decision can have.
The Impact of Your Base Image
When you build a Docker image, you're essentially layering instructions on top of a foundational image – the base image. All your commands, dependencies, and application code get added on to this base. Consequently, the size and contents of your base image have a direct impact on the final size of your container image.
As an example, in our previous attempt, using FROM node:latest resulted in a 1.62GB (uncompressed) image. That's a lot of bloat for a tiny Node.js application!
Just by switching our base image to FROM node:alpine, we saw the size drop to 244MB (uncompressed). That's a huge improvement, but can we do better? Absolutely!
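To put that drop in perspective, the saving can be worked out with a few lines of shell. This is only an illustrative sketch: reduction_pct is a made-up helper name, and the inputs are the uncompressed sizes quoted above, with 1.62GB treated as 1620 MB.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): percentage saved when an image
# shrinks from $1 MB to $2 MB, rounded to one decimal place.
reduction_pct() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (1 - after / before) * 100 }'
}

# Sizes from this post: node:latest (1620 MB) versus node:alpine (244 MB).
reduction_pct 1620 244
```

Run as written, this prints 84.9, so the base image swap alone removes roughly 85% of the uncompressed size.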
Understanding Different Base Image Types
Let's explore the common types of base images and when to consider using them:
Standard Images: These are the full-fledged OS-based images like Ubuntu, Debian, or others. They come packed with a wide range of libraries and tools. While convenient, these images tend to be large due to all the extra baggage they carry. Unless you're unsure about your application's OS dependencies or are in a real rush, it's best to avoid them for production containers due to their size and resource consumption.
Alpine Images: These images are based on the super lightweight Alpine Linux distribution. They are much smaller than standard images because they contain only the bare minimum packages needed to run your application. They are ideal for most use cases and are one of the best choices when starting out with Docker optimization. However, be sure to test your application thoroughly when first switching to Alpine images, as they might lack OS-level dependencies that your application unexpectedly relies on.
Slim Images: Despite the name, slim images are often larger than Alpine images (364 MB versus 244 MB in our example), though still much smaller than standard ones. They strip a full distribution down to the packages needed to run a specific runtime, so they are worth exploring if that package set fits your use case. The underlying distribution depends on who publishes the image; the official node:slim image, for example, is built on Debian.
Distroless Images: These are specially designed for multi-stage Docker builds. They contain only your application and its runtime dependencies, excluding package managers, shells, and other common Linux utilities. This makes them incredibly small and helps improve the security posture of your containers. Distroless images are perfect for production deployments after you understand the entire application stack and dependencies.
As Google, the creator of distroless images, puts it, "Distroless images contain only your application and its runtime dependencies." You can read more about them here.
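One way to get a feel for these image types is to pull a representative of each and compare sizes locally. The following is a rough sketch: compare_base_images is a hypothetical helper name, and the list of images is just the set used in this post. Note that it reports the size of the bare base images, not of an application built on top of them, and that pulling node:latest downloads a large image.

```shell
#!/bin/sh
# Base images discussed in this post (adjust the list to your own stack).
BASE_IMAGES="node:latest node:slim node:alpine gcr.io/distroless/nodejs20-debian12"

# Hypothetical helper: pull each base image, then print its uncompressed size.
compare_base_images() {
  for image in $BASE_IMAGES; do
    docker pull --quiet "$image" >/dev/null || return 1
    docker image ls "$image" --format '{{.Repository}}:{{.Tag}}  {{.Size}}'
  done
}
```

After sourcing the snippet, call compare_base_images on a machine with Docker installed.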
Choosing the Right Image: A Practical Approach
Deciding which image to use might seem daunting, but a simple approach helps here:
- Start with Alpine: Begin with an Alpine-based image like node:alpine for Node.js applications.
- Inspect and Verify: Examine the image's layers and included packages on Docker Hub. This can give you insights into the image's composition. For example, you can check the layers of a specific image tag like node:current-alpine3.20 here.
- Add Dependencies Manually: If an Alpine image lacks required dependencies, you can install them manually within your Dockerfile. You can even create your own custom base image from scratch if needed.
- Advance to Distroless: For production, after thorough testing, and when the full application dependencies are well-known, consider using distroless images for maximum size reduction and security.
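The "Add Dependencies Manually" step could look something like the sketch below. It rests on assumptions: the packages (python3, make, g++) are placeholders for whatever your application actually needs, and Dockerfile.alpine is an arbitrary file name. The Dockerfile is written from the shell so the whole example can be pasted in one piece.

```shell
#!/bin/sh
# Write an example Dockerfile that extends node:alpine with extra OS packages.
# ASSUMPTION: python3, make and g++ are illustrative; install only what your app needs.
cat > Dockerfile.alpine <<'EOF'
FROM node:alpine
# --no-cache keeps the apk package index out of the image layer
RUN apk add --no-cache python3 make g++
WORKDIR /app
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --production
COPY index.js ./
CMD ["node", "index.js"]
EOF

echo "Wrote Dockerfile.alpine"
```

From there, docker build -f Dockerfile.alpine -t mycontainer:alpine . builds the image, and docker run --rm node:alpine apk info lists the packages the base image already ships with, which helps with the "Inspect and Verify" step.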
Let's Compare Image Sizes
To illustrate the point, let's take a look at the image sizes we saw in our example, using different base images. We'll show both uncompressed and compressed sizes for comparison:
Base Image & Build Setup | Image Tag | Uncompressed Docker Image | Compressed Docker Image |
---|---|---|---|
node:20-alpine (build) & gcr.io/distroless/nodejs20-debian12 | mycontainer:distroless | 191 MB | 49.61 MB |
node:slim | mycontainer:slim | 364 MB | 78.52 MB |
node:alpine | mycontainer:alpine | 244 MB | 56.68 MB |
node:latest | mycontainer:default | 1.62 GB | 381.73 MB |
Note: The uncompressed size can be checked by using the command docker image ls | grep mycontainer. The compressed size can be seen when the image is pushed to a container registry or by using docker save mycontainer:distroless | gzip -c | wc -c to see the compressed file size in bytes.
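A raw byte count is hard to read at a glance, so it can be converted to megabytes. Here is a small sketch, assuming decimal megabytes (1 MB = 1,000,000 bytes); bytes_to_mb and compressed_size_mb are hypothetical helper names, not Docker commands.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to decimal megabytes (two decimals).
bytes_to_mb() {
  awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / 1000000 }'
}

# Hypothetical helper: gzip-compressed size of a local image, in MB.
# Uses the same docker save | gzip | wc -c pipeline as the note above.
compressed_size_mb() {
  bytes_to_mb "$(docker save "$1" | gzip -c | wc -c)"
}

# Conversion only, no Docker needed:
bytes_to_mb 49610000
```

The last line prints 49.61. With Docker available, compressed_size_mb mycontainer:distroless measures a real image. Treat the result as an approximation: a registry compresses each layer separately, so the figure it reports may differ slightly.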
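What's the difference between compressed and uncompressed size? The uncompressed size is the space the image's layers occupy on disk once they are extracted on a host, while the compressed size is what gets stored in a registry and transferred over the network during a push or pull.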
Optimization through Distroless (Multi-stage Build)
Now, let's see how to optimize it even further using a Distroless image. Here's an updated Dockerfile:
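# Stage 1: install production dependencies in a full Node.js environment
FROM node:20-alpine AS build-env
WORKDIR /app
COPY package.json yarn.lock ./
ENV NODE_ENV=production
RUN yarn install --frozen-lockfile --production
# Optional: print the size of the installed dependencies in the build log
RUN du -sh /app/node_modules
COPY index.js ./

# Stage 2: copy the prepared app into the distroless runtime image
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=build-env /app /app
# The image's entrypoint is already node, so CMD only names the script
CMD ["index.js"]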
In this approach:
- We use node:20-alpine as a builder image to install our dependencies, copy the source, and prepare the application for production.
- We then copy all necessary artifacts (/app) into a distroless image, gcr.io/distroless/nodejs20-debian12, which only contains the absolute runtime requirements.
This method creates a final image with a minimal footprint, as demonstrated in the table.
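To try this out, the image can be built, run once, and compared against the other tags. This is a sketch that assumes the multi-stage Dockerfile is saved as Dockerfile in the current directory next to index.js, package.json, and yarn.lock; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: build the multi-stage Dockerfile in the current directory.
build_distroless() {
  docker build -t mycontainer:distroless .
}

# Hypothetical wrapper: run the image once and remove the container afterwards.
run_distroless() {
  docker run --rm mycontainer:distroless
}

# Hypothetical wrapper: list every local mycontainer tag with its uncompressed size.
list_sizes() {
  docker image ls mycontainer --format '{{.Tag}}: {{.Size}}'
}

# Run the steps only when Docker and a Dockerfile are actually present.
if command -v docker >/dev/null 2>&1 && [ -f Dockerfile ]; then
  build_distroless && run_distroless && list_sizes
else
  echo "Docker or Dockerfile not found; skipping build"
fi
```

Keep in mind that the final image has no shell, so you cannot open an interactive shell inside the running container to debug it; do that kind of investigation against the builder stage instead.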
Conclusion
Choosing the right base image is a critical step in optimizing your Docker images. Start with Alpine images, manually add needed dependencies, and ultimately aim for distroless images in production. By understanding the trade-offs of each image type, you can greatly reduce your image size, making deployments more efficient and cutting resource consumption.
The table clearly highlights the dramatic size difference when using different base images and build strategies. The multi-stage distroless approach yields the smallest final image, making it ideal for production deployments. Stay tuned for more Docker tips and tricks in future blog posts!