DEV Community

Atharva Unde

Posted on • Originally published at blog.atharvaunde.com

Base Images: The Secret to Smaller Docker Images

In our previous post, we walked through creating a basic Dockerfile. However, we noticed a significant issue: the resulting image for our simple "Hello World" app was a hefty 1.62GB (uncompressed)! That’s not ideal for efficient deployment and resource utilization.

Today, we're diving into how to dramatically reduce your Docker image size by carefully choosing the right base image. You might be surprised by how much impact this single decision can have.

The Impact of Your Base Image

When you build a Docker image, you're essentially layering instructions on top of a foundational image – the base image. All your commands, dependencies, and application code get added on to this base. Consequently, the size and contents of your base image have a direct impact on the final size of your container image.

As an example, in our previous attempt, using FROM node:latest resulted in a 1.62GB (uncompressed) image. That's a lot of bloat for a tiny Node.js application!

Just by switching our base image to FROM node:alpine, we saw the size drop to 244MB (uncompressed). That's a huge improvement, but can we do better? Absolutely!
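
To put those two numbers side by side, here is a small shell sketch of the arithmetic. The percent_saved helper is a name made up for this post, and it assumes 1.62GB is read as 1620MB (decimal units).

```shell
#!/bin/sh
# percent_saved is a hypothetical helper for this post, not a Docker command.
# Usage: percent_saved <size_before_mb> <size_after_mb>
percent_saved() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# node:latest (1.62GB, taken as 1620MB) vs node:alpine (244MB)
percent_saved 1620 244
```

This prints 84.9, so the switch alone removes roughly 85% of the uncompressed size.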

Understanding Different Base Image Types

Let's explore the common types of base images and when to consider using them:

  1. Standard Images: These are the full-fledged OS-based images like Ubuntu, Debian, or others. They come packed with a wide range of libraries and tools. While convenient, these images tend to be large due to all the extra baggage they carry. Unless you're unsure about your application's OS dependencies or are in a real rush, it's best to avoid them for production containers due to their size and resource consumption.

  2. Alpine Images: These images are based on the lightweight Alpine Linux distribution. They are much smaller than standard images because they ship only a minimal set of packages. They suit most use cases and are one of the best choices when starting out with Docker optimization. However, test your application thoroughly when first switching to Alpine: it uses musl libc rather than glibc and may lack OS-level dependencies that your application relies on, which most often shows up with native (compiled) modules.

  3. Slim Images: Slim images are usually larger than Alpine images, but still much smaller than standard ones. They strip a full distribution down to the packages needed to run a specific runtime, so they are worth exploring if their package set fits your use case. The slim tags of official images such as node:slim are typically Debian-based, which makes them a useful middle ground when an application needs glibc.

  4. Distroless Images: These are specially designed for multi-stage Docker builds. They contain only your application and its runtime dependencies, excluding package managers, shells, and other common Linux utilities. This makes them incredibly small and helps improve the security posture of your containers. Distroless images are perfect for production deployments after you understand the entire application stack and dependencies.

    As Google, the creator of distroless images, puts it, "Distroless images contain only your application and its runtime dependencies." You can read more about them in the GoogleContainerTools/distroless repository on GitHub.

Choosing the Right Image: A Practical Approach

Deciding which image to use might seem daunting, but a simple approach helps here:

  • Start with Alpine: Begin with an Alpine-based image like node:alpine for Node.js applications.
  • Inspect and Verify: Examine the image's layers and included packages on Docker Hub. This can give you insights into the image's composition. For example, you can check the layers of a specific image tag such as node:current-alpine3.20 on its Docker Hub tag page.
  • Add Dependencies Manually: If an Alpine image lacks required dependencies, you can install them manually within your Dockerfile. You can even create your own custom base image from scratch if needed.
  • Advance to Distroless: For production, after thorough testing, and when the full application dependencies are well-known, consider using distroless images for maximum size reduction and security.
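
The "Add Dependencies Manually" step could look something like the sketch below. It writes a small Dockerfile that extends node:20-alpine using apk, Alpine's package manager, then builds it and lists its layers. The file name Dockerfile.alpine-deps, the function names, the mycontainer:alpine-deps tag, and the package list (python3, make, g++, a common set for compiling native Node.js modules) are all assumptions for illustration; your application may need different packages or none at all.

```shell
#!/bin/sh
# Hypothetical helper: writes an example Dockerfile into the directory given as $1.
write_alpine_dockerfile() {
  cat > "$1/Dockerfile.alpine-deps" <<'EOF'
FROM node:20-alpine
WORKDIR /app

# Install OS-level packages the Alpine base does not ship (assumed package list).
# --no-cache keeps the apk package index out of the image layer.
RUN apk add --no-cache python3 make g++

COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --production
COPY index.js ./
CMD ["node", "index.js"]
EOF
}

# Hypothetical helper: builds the image and lists its layers (requires Docker).
build_and_inspect() {
  docker build -f "$1/Dockerfile.alpine-deps" -t mycontainer:alpine-deps "$1" &&
    docker history mycontainer:alpine-deps
}
```

Running write_alpine_dockerfile . followed by build_and_inspect . in the project directory shows, layer by layer, how much each instruction adds. That makes it easy to see what the extra packages cost in size.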

Let's Compare Image Sizes

To illustrate the point, let's take a look at the image sizes we saw in our example, using different base images. We'll show both uncompressed and compressed sizes for comparison:

| Base Image & Build Setup | Image Tag | Uncompressed Docker Image | Compressed Docker Image |
| --- | --- | --- | --- |
| node:20-alpine (build) & gcr.io/distroless/nodejs20-debian12 | mycontainer:distroless | 191 MB | 49.61 MB |
| node:slim | mycontainer:slim | 364 MB | 78.52 MB |
| node:alpine | mycontainer:alpine | 244 MB | 56.68 MB |
| node:latest | mycontainer:default | 1.62 GB | 381.73 MB |

Note: The uncompressed size can be checked by using the command docker image ls | grep mycontainer. The compressed size can be seen when the image is pushed to a container registry or by using docker save mycontainer:distroless | gzip -c | wc -c to see the compressed file size in bytes.
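
The two measurements from the note can be wrapped in a small script. This is only a sketch: the function names are invented here, it assumes the image already exists locally, and it treats 1 MB as 1,000,000 bytes, which may not match the rounding a registry UI uses.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to megabytes (1 MB = 1,000,000 bytes).
bytes_to_mb() {
  awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / 1000000 }'
}

# Uncompressed size in bytes, as recorded in the local image metadata.
uncompressed_bytes() {
  docker image inspect --format '{{.Size}}' "$1"
}

# Approximate compressed size in bytes: export the image and gzip the stream.
compressed_bytes() {
  docker save "$1" | gzip -c | wc -c | tr -d ' '
}

# Print both sizes for one image tag (requires Docker).
report_sizes() {
  echo "uncompressed: $(bytes_to_mb "$(uncompressed_bytes "$1")") MB"
  echo "compressed:   $(bytes_to_mb "$(compressed_bytes "$1")") MB"
}
```

For example, report_sizes mycontainer:distroless prints both figures for the distroless build. Gzipping the whole docker save archive only approximates what a registry reports, since registries compress each layer separately.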

What's the difference between compressed and uncompressed size?
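
In short, the uncompressed size is what the image occupies on disk once its layers are unpacked, while the compressed size is what actually travels over the network, because registries typically store and transfer layers as gzip-compressed archives. The sketch below shows the same effect on a throwaway file instead of a real image. The compression_demo function and the generated file contents are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical demo: prints "<raw_bytes> <gzipped_bytes>" for a throwaway file.
compression_demo() {
  tmp=$(mktemp)
  # 10,000 identical lines stand in for highly repetitive layer contents.
  awk 'BEGIN { for (i = 0; i < 10000; i++) print "console.log(\"Hello World\");" }' > "$tmp"
  raw=$(wc -c < "$tmp" | tr -d ' ')
  packed=$(gzip -c "$tmp" | wc -c | tr -d ' ')
  rm -f "$tmp"
  echo "$raw $packed"
}

compression_demo
```

The second number is far smaller than the first because the input is so repetitive. Real image layers compress less dramatically, but the direction is the same, which is why the compressed column in the table is always the smaller one.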

Optimization through Distroless (Multi-stage Build)

Now, let's see how to optimize it even further using a Distroless image. Here's an updated Dockerfile:

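FROM node:20-alpine AS build-env
WORKDIR /app

# Copy the manifests first so the dependency layer is cached between builds
COPY package.json yarn.lock ./
ENV NODE_ENV=production
RUN yarn install --frozen-lockfile --production
# Print the size of the installed dependencies in the build log
RUN du -sh /app/node_modules
COPY index.js ./

# Final stage: distroless runtime without a shell or package manager
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app

COPY --from=build-env /app /app
# The image's entrypoint is already node, so CMD only names the script
CMD ["index.js"]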

In this approach:

  1. We use node:20-alpine as a builder image to install our dependencies, copy the source, and prepare the application for production.
  2. We then copy all necessary artifacts (/app) into the distroless image gcr.io/distroless/nodejs20-debian12, which contains only what is needed at runtime.

This method creates a final image with a minimal footprint, as demonstrated in the table.
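
To try the multi-stage build locally, something like the following sketch should work. The check_build_context and build_and_run functions are names invented here, and the script assumes the Dockerfile sits next to package.json, yarn.lock, and index.js in the directory you pass in.

```shell
#!/bin/sh
# Hypothetical preflight check: the Dockerfile copies these files, so it is
# cheaper to notice a missing one before starting the build.
check_build_context() {
  for f in Dockerfile package.json yarn.lock index.js; do
    if [ ! -f "$1/$f" ]; then
      echo "missing: $f" >&2
      return 1
    fi
  done
}

# Build the multi-stage image and run it once (requires Docker).
build_and_run() {
  check_build_context "$1" &&
    docker build -t mycontainer:distroless "$1" &&
    docker run --rm mycontainer:distroless
}
```

Calling build_and_run . from the project directory builds the image and starts index.js. Because the final image ships no shell, you cannot open an interactive shell inside the running container, so any debugging is easier to do in the build-env stage.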

Conclusion

Choosing the right base image is a critical step in optimizing your Docker images. The table above shows how much the size varies across base images and build strategies, with the multi-stage distroless approach yielding the smallest final image. Start with Alpine images, manually add needed dependencies, and ultimately aim for distroless images in production. By understanding the trade-offs of each image type, you can greatly reduce your image size and, with it, storage, transfer time, and resource consumption. Stay tuned for more Docker tips and tricks in future blog posts!
