In this Docker sequel, we will embark on a more advanced journey, covering topics that will take your Docker understanding and usage to the next level. We'll be discussing application logging, BuildKit, multi-stage builds, container networking, and more, along with several practical Dockerfile and Docker Compose examples.

Deep Dive into Docker Logging

Docker's built-in logging mechanisms provide us with a standardized way of tracking our applications' behavior inside Docker containers. The importance of effective logging cannot be overstated, especially when debugging issues in a containerized environment. Here are some crucial points to remember:

  • Standard Output (stdout) and Standard Error (stderr): Docker captures anything written to stdout or stderr and makes it accessible via the docker logs <container_id> command. This means you can direct your application logs to these standard streams to make them easily accessible. For example, if you are running a .NET application, you could use Console.WriteLine() or Console.Error.WriteLine() to send logs to stdout and stderr, respectively. In a real application, though, you would usually configure your logger to write to the console output.
  • Choosing the Right Logging Driver: Docker supports multiple logging drivers such as json-file, syslog, journald, gelf, fluentd, awslogs, splunk, etwlogs, gcplogs, logentries, etc. Each has its advantages and disadvantages, and the choice depends on your specific use case. The default logging driver is json-file, which writes logs in a JSON format. You can configure the logging driver at the daemon level, or you can configure it per container when you run a container.
  • Log Rotation: Docker can rotate logs to keep them from filling up storage space, but the default json-file driver does not rotate anything unless you configure size limits (see the example after this list). Log rotation is particularly crucial for long-running containers, where logs can grow excessively large over time.
  • Docker Log Commands: Apart from the basic docker logs <container_id>, Docker provides a rich set of log commands. For example, you can use the -f option for real-time log tailing (docker logs -f <container_id>). Similarly, the --since option allows you to view logs since a particular timestamp (docker logs --since=2023-08-01T13:23:37 <container_id>).
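
For the logging driver and log rotation points above, here is what the configuration can look like (the size limits are just example values). At the daemon level, you typically set it in /etc/docker/daemon.json:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}

The same options can be applied per container at run time:

docker run --log-driver json-file --log-opt max-size=10m --log-opt max-file=3 <image>

With these limits, the json-file driver rotates a container's log once it reaches 10 MB and keeps at most three files.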

Building Images with Docker's BuildKit

Docker's next-generation image builder, BuildKit, offers several improvements over the legacy build process, such as better performance, smarter cache management, and additional features. Here are a few points worth a closer look:

  • Enabling/Disabling BuildKit: BuildKit has shipped with Docker since version 18.09 and is the default builder in current releases (Docker Desktop, and Docker Engine 23.0 and later); on older versions you can opt in with DOCKER_BUILDKIT=1. If you ever need to revert to the legacy build process, you can disable BuildKit by setting the DOCKER_BUILDKIT environment variable to 0 when you run your build command (DOCKER_BUILDKIT=0 docker build .).
  • Build Secrets: One of the exciting features of BuildKit is support for build-time secrets. It allows you to securely use secrets such as SSH private keys or other credentials without leaking them into the final image. The secrets are mounted temporarily for the build process and are not included in any image layer, which prevents secret leakage (a minimal sketch follows after this list).
  • Concurrent Builds: BuildKit supports concurrent, parallelized build processes. It enables faster image building, especially when you have multiple independent build stages in your Dockerfile.
  • Caching Improvements: BuildKit introduces a new layer caching mechanism that helps reuse build cache from previous builds. It can detect changes in the context and the Dockerfile more accurately and use cached layers whenever possible.
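
As a minimal sketch of the build secrets feature mentioned above (the secret id and the nuget.config file are assumptions for illustration): the secret is mounted under /run/secrets/<id> only for the duration of a single RUN instruction and never ends up in an image layer.

# syntax=docker/dockerfile:1
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /app
COPY . .
# The mounted file exists only while this RUN instruction executes
RUN --mount=type=secret,id=nugetconfig \
    dotnet restore --configfile /run/secrets/nugetconfig

And the corresponding build command:

DOCKER_BUILDKIT=1 docker build --secret id=nugetconfig,src=nuget.config .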

Remember that while BuildKit introduces many new features, it might not be compatible with all scenarios or build environments, so test your build process thoroughly before switching.

Embracing Multi-stage Builds

With Docker, you can define multiple build stages and utilize the same Dockerfile to produce different images with varying targets. This becomes particularly useful when you have different environments or build scenarios for your application.

For instance, you might need one image for deploying your service, and another that runs in the CI/CD pipeline to pack and version a .NET project containing your IntegrationEvents definitions into a NuGet package and publish it to your private artifactory so other teams can consume it.

Let's illustrate this with an example:

# Base build image
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /app
COPY . .
RUN dotnet publish ./src/Company.Project/Company.Project.csproj -r linux-x64 --self-contained false --configuration Release -o /app/published-app

# NuGet packages build image
FROM build AS build-nupkgs
ARG PACKAGE_VERSION
RUN dotnet pack src/IntegrationEvents/Company.Project.IntegrationEvents --configuration Release --output ./nupkgs /p:PackageVersion=${PACKAGE_VERSION:-1.0.0}

# NuGet packages publish image
FROM build-nupkgs AS publish-nupkgs
# NUGET_API_KEY and NUGET_SOURCE are supplied as environment variables at run time (see the docker run example below)
CMD dotnet nuget push ./nupkgs/* -k $NUGET_API_KEY -s $NUGET_SOURCE

# Base runtime image
FROM mcr.microsoft.com/dotnet/aspnet:7.0
WORKDIR /app
COPY --from=build /app/published-app /app
ENTRYPOINT ["dotnet", "Company.Project.dll"]

In this Dockerfile, we have defined multiple build stages:

  • The build stage uses the .NET SDK base image to compile the application.
  • The build-nupkgs stage builds the NuGet packages using the artifacts from the build stage.
  • The publish-nupkgs stage pushes the NuGet packages to a NuGet server.
  • The final stage creates a runtime image that only contains the published application and the .NET runtime.

To build the Docker images, you would use commands like these:

# Building service image
docker build . -t ${dockerImageTag}

# Building packages image
docker build . -t ${packagesImageTag} \
      --build-arg PACKAGE_VERSION=${version} \
      --target publish-nupkgs

# Run the packages image to execute NuGet package publishing process
docker run --rm \
      -e NUGET_API_KEY=${apiKey} \
      -e NUGET_SOURCE=${artifactoryUrl} \
      ${packagesImageTag}

This approach allows you to have a single Dockerfile for multiple purposes. You can build and push your NuGet packages using Docker, and also create your runtime Docker image using the same Dockerfile.

This strategy offers flexibility and efficiency: it reduces duplication, keeps the Dockerfile easier to manage, and lets different teams use the same Dockerfile for different tasks.

Minimizing Docker Image Size

Keeping Docker images as lightweight as possible is vital for efficient storage, faster image pulls, and optimized runtime performance. Here are some tips to help you minimize the size of your Docker images:

  • Combine RUN Instructions: Docker creates a new layer for each RUN command in your Dockerfile. By using the && operator, you can chain commands together in a single RUN command, thereby reducing the number of layers.

For example, instead of:

RUN apt-get update
RUN apt-get install -y <package>
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*

You can do:

RUN apt-get update && \
    apt-get install -y <package> && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

  • Remove Redundant Files: If certain files or packages are only needed during the build phase and are not required in the final image, remove them in the same RUN command that created them. That way they never get baked into an image layer.
  • Remove Unnecessary Files: Any files that are not needed to run your application should not be included in the final image. This includes build artifacts, caches, test data, and documentation. You can use a .dockerignore file to exclude unnecessary files and directories from the Docker build context (an example follows below).
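
A minimal .dockerignore for a typical .NET repository might look like this (purely illustrative; adjust it to your project layout):

# .dockerignore
**/bin/
**/obj/
**/.git
**/.vs
**/*.md
tests/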

Mastering Docker Networking

Docker's networking capabilities enable efficient communication between containers. When a standalone container needs to talk to containers in a Docker Compose network, you can attach it to that network with the --network flag on docker run (or with docker network connect if the container is already running). Containers on the same network can reach each other using their names as hostnames, making it easy for them to connect to one another.
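
As a quick illustration (the project and service names are assumptions): Docker Compose names its default network after the project, so for a project called myproject a standalone container can be attached to it like this:

docker run --rm --network myproject_default alpine ping -c 1 api

Here api is a hypothetical service name from the Compose file; the name resolves because containers attached to the same user-defined network can reach each other through Docker's built-in DNS.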

If you need to interact with your containers from your local machine, for example with Postman or a browser, you connect via localhost:<exposed_port>. On Windows or Mac (Docker Desktop), a container that needs to reach another container that is not on the same network can instead go through the host using host.docker.internal:<exposed_port>.
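
For example, assuming your service publishes port 8080 on the host, another container on Docker Desktop could reach it like this (on Linux you would add --add-host=host.docker.internal:host-gateway to get the same hostname):

docker run --rm alpine wget -qO- http://host.docker.internal:8080/health

The /health endpoint is just a placeholder for whatever your service exposes.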

To attach an existing standalone container to an existing Docker network, follow these steps:

  • Find the Network ID: You need the Network ID of the network you want to connect your container to. You can list all networks using the command docker network ls. This command will display a list of all networks along with their IDs.
$ docker network ls

NETWORK ID     NAME      DRIVER    SCOPE
7fca4eb8c647   bridge    bridge    local
a106329dffcf   host      host      local
930b2143a7c6   mynet     bridge    local

In the example above, the Network ID of 'mynet' network is 930b2143a7c6.

  • Find the Container ID: Next, you need the ID of the container that you want to add to the network. You can list all running containers using the docker ps command.
$ docker ps

CONTAINER ID   IMAGE     COMMAND    CREATED              STATUS              PORTS      NAMES
c3f279d17e0a   myapp     "run.sh"   About a minute ago   Up About a minute   8080/tcp   my_container

In the example above, the Container ID is c3f279d17e0a.

  • Connect the Container to the Network: Once you have the Network ID and the Container ID, you can connect them using the docker network connect command.
docker network connect 930b2143a7c6 c3f279d17e0a

After these steps, your standalone container c3f279d17e0a should now be part of the 'mynet' network and should be able to communicate with other containers in the same network.

Remember, each Docker network creates a separate subnet for its connected containers, isolating them from other networks. This behavior enhances the security and manageability of multi-container applications.
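
If you are curious which subnet a network was given, docker network inspect will show it. For the 'mynet' network above:

docker network inspect mynet --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'

The output is something like 172.18.0.0/16 (the exact range will differ on your machine).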

Docker ENTRYPOINT and CMD

Docker ENTRYPOINT and CMD are instructions in the Dockerfile that define the starting point of a container. They dictate what command gets executed when a container is started from the image.

The ENTRYPOINT instruction in a Dockerfile specifies the executable that will be run when a Docker container is started. It has two forms:

  • The exec form is ENTRYPOINT ["executable", "param1", "param2"]. This does not invoke a command shell, which means that you can't use shell processing features like variable substitution.
  • The shell form is ENTRYPOINT command param1 param2. It runs the command inside a shell (/bin/sh -c), so shell processing features such as variable substitution are available.

For instance, in a .NET Core Dockerfile, you may have an ENTRYPOINT instruction that looks like this:

ENTRYPOINT ["dotnet", "Company.Project.dll"]

On the other hand, CMD provides defaults for an executing container and can include an executable. If CMD is used to provide default arguments for the ENTRYPOINT instruction, both can be specified in the Dockerfile like this:

ENTRYPOINT ["dotnet", "Company.Project.dll"]
CMD ["--arg1", "value1"]

In this scenario, if you run the container with no command-line arguments, Docker will start the application with dotnet Company.Project.dll --arg1 value1. However, if you pass command-line arguments to docker run, they will replace --arg1 value1. So, if you run docker run <image> --arg2 value2, Docker will start the application with dotnet Company.Project.dll --arg2 value2.

An essential distinction between ENTRYPOINT and CMD is that CMD instructions can be easily overridden by supplying arguments to the docker run command. In contrast, ENTRYPOINT defines a container's main command, allowing that container to be used as an executable.

Also, keep in mind that ENTRYPOINT can be overridden at runtime by using the --entrypoint flag with the docker run command. This is useful when you need to run different executables or scripts within the same Docker image. For example, you could have an image for a .NET Core application that has an ENTRYPOINT for running the application with dotnet Company.Project.dll, but then use --entrypoint to run a shell in the container for debugging purposes.
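
For example, assuming an image named my-image built with the ENTRYPOINT/CMD pair above:

# Default: runs dotnet Company.Project.dll --arg1 value1
docker run my-image

# Overrides CMD only: runs dotnet Company.Project.dll --arg2 value2
docker run my-image --arg2 value2

# Overrides ENTRYPOINT: drops you into a shell for debugging
docker run -it --entrypoint /bin/sh my-image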

The docker-entrypoint.sh script is often used to perform routine bootstrapping tasks before the primary process is started. This is especially useful in container environments where you want to set up certain aspects at runtime, such as environment-specific configurations or linking containers together.

Here's an example docker-entrypoint.sh script for a .NET application:

#!/bin/sh
set -e 

# You can add a custom set of commands to execute here

# Execute command
exec "$@"

And the Dockerfile would use it like this:

FROM mcr.microsoft.com/dotnet/runtime:7.0
WORKDIR /app
COPY . .
COPY docker-entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
ENTRYPOINT ["docker-entrypoint.sh", "dotnet", "Company.Project.dll"]

In this case, the docker-entrypoint.sh script performs any necessary bootstrapping tasks, then uses exec "$@" to hand control over to whatever follows it: here, the remaining ENTRYPOINT arguments (dotnet Company.Project.dll), plus any CMD or command-line arguments supplied at runtime.

A crucial aspect to remember is that a Docker container will stop running once its main process, defined by the ENTRYPOINT and/or CMD, has finished. If you need a Docker container to keep running, you should make sure the main process doesn't finish, or you can use a workaround such as tailing the Docker logs indefinitely.

A tailing example would look like this:

FROM mcr.microsoft.com/dotnet/runtime:7.0
WORKDIR /app
COPY . .
# Run the application, then keep the container alive after it exits
CMD ["/bin/sh", "-c", "dotnet Company.Project.dll; tail -f /dev/null"]

In this case, once the dotnet Company.Project.dll command finishes, tail -f /dev/null keeps the container running, preventing Docker Compose from treating the service as crashed. This allows the rest of the services in the Docker Compose file to start.

This is what Visual Studio does by default when you're starting a debug session using Docker Compose startup. I wrote more about it and some issues I had with it in one of the previous newsletters. You can read about it here.


Well, I've got to wrap this up at some point. We've navigated through several advanced topics and practices in Docker. It's time to put these tips, tricks, and deep dives into practice to take full advantage of Docker's potential.

If you're interested in something I didn't cover in this topic, feel free to leave a comment or reach out.