dockerfile – Docker

Highlights from the BuildKit v0.11 Release

Justin Chadwell — Thu, 19 Jan 2023 15:00:00 +0000

BuildKit v0.11 is now available, along with Buildx v0.10 and v1.5 of the Dockerfile syntax. We’ve released new features, bug fixes, performance improvements, and improved documentation for all of the Docker Build tools.
Let’s dive into what’s new! We’ll cover the highlights, but you can get the whole story in the full changelogs .

In this post:

1. SLSA Provenance

BuildKit can now create SLSA Provenance attestation to trace the build back to source and make it easier to understand how a build was created. Images built with new versions of Buildx and BuildKit include metadata like links to source code, build timestamps, and the materials used during the build. To attach the new provenance, BuildKit now defaults to creating OCI-compliant images.

Although docker buildx will add a provenance attestation to all new images by default, you can also opt into more detail. These additional details include your Dockerfile source, source maps, and the intermediate representations used by BuildKit. You can enable all of these new provenance records using the new --provenance flag in Buildx:

$ docker buildx build --provenance=true -t / --push .

Or manually set the provenance generation mode to either min or max (read more about the different modes):

$ docker buildx build --provenance=mode=max -t / --push .

You can inspect the provenance of an image using the imagetools subcommand. For example, here’s what it looks like on the moby/buildkit image itself:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ json .Provenance }}'
{
  "linux/amd64": {
    "SLSA": {
      "buildConfig": {

You can use this provenance to find key information about the build environment, such as the git repository it was built from:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ json (index .Provenance "linux/amd64").SLSA.invocation.configSource }}'
{
  "digest": {
	"sha1": "830288a71f447b46ad44ad5f7bd45148ec450d44"
  },
  "entryPoint": "Dockerfile",
  "uri": "https://github.com/moby/buildkit.git#refs/tags/v0.11.0"
}

Or even the CI job that built it in GitHub actions:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ (index .Provenance "linux/amd64").SLSA.builder.id }}'
https://github.com/moby/buildkit/actions/runs/3878249653

Read the documentation to learn more about SLSA Provenance attestations or to explore BuildKit’s SLSA fields.

2. Software Bill of Materials

While provenance attestations help to record how a build was completed, Software Bill of Materials (SBOMs) record what components are used. This is similar to tools like docker sbom, but, instead of requiring you to perform your own scans, the author of the image can build the results into the image.

You can enable built-in SBOMs with the new --sbom flag in Buildx:

$ docker buildx build --sbom=true -t / --push .

By default, BuildKit uses docker/buildkit-syft-scanner (powered by Anchore’s Syft project) to build an SBOM from the resulting image. But any scanner that follows the BuildKit SBOM scanning protocol can be used here:

$ docker buildx build --sbom=generator= -t / --push .

Similar to SLSA provenance, you can use imagetools to query SBOMs attached to images. For example, if you list all of the discovered dependencies used in moby/buildkit, you get this:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ range (index .SBOM "linux/amd64").SPDX.packages }}{{ println .name }}{{ end }}'
github.com/Azure/azure-sdk-for-go/sdk/azcore
github.com/Azure/azure-sdk-for-go/sdk/azidentity
github.com/Azure/azure-sdk-for-go/sdk/internal
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob
...

Read the SBOM attestations documentation to learn more.

3. SOURCE_DATE_EPOCH

Getting reproducible builds out of Dockerfiles has historically been quite tricky — a full reproducible build requires bit-for-bit accuracy that produces the exact same result each time. Even builds that are fully deterministic would get different timestamps between runs.

The new SOURCE_DATE_EPOCH build argument helps resolve this, following the standardized environment variable from the Reproducible Builds project. If the build argument is set or detected in the environment by Buildx, then BuildKit will set timestamps in the image config and layers to be the specified Unix timestamp. This helps you get perfect bit-for-bit reproducibility in your builds.

SOURCE_DATE_EPOCH is automatically detected by Buildx from the environment. To force all the timestamps in the image to the Unix epoch:

$ SOURCE_DATE_EPOCH=0 docker buildx build -t / .

Alternatively, to set it to the timestamp of the most recent commit:

$ SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct) docker buildx build -t / .

Read the documentation to find out more about how BuildKit handles SOURCE_DATE_EPOCH.

4. OCI image layouts as named contexts

BuildKit has been able to export OCI image layouts for a while now. As of v0.11, BuildKit can import those results again using named contexts. This makes it easier to build contexts entirely locally — without needing to push intermediate results to a registry.

For example, suppose you want to build your own custom intermediate image based on Alpine that contains some development tools:

$ docker buildx build . -f intermediate.Dockerfile --output type=oci,dest=./intermediate,tar=false

This builds the contents of intermediate.Dockerfile and exports it into an OCI image layout into the intermediate/ directory (using the new tar=false option for OCI exports). To use this intermediate result in a Dockerfile, refer to it using any name you like in the FROM statement in your main Dockerfile:

FROM base
RUN ... # use the development tools in the intermediate image

You can then connect this Dockerfile to your OCI layout using the new oci-layout:// URI schema for the --build-context flag:

$ docker buildx build . -t / --build-context base=oci-layout://intermediate

Instead of resolving the image base to Docker Hub, BuildKit will instead read it from oci-layout://intermediate in the current directory, so you don’t need to push the intermediate image to a remote registry to be able to use it.

Refer to the documentation to find out more about using oci-layout:// with the --build-context flag.

5. Cloud cache backends

To get good build performance when building in ephemeral environments, such as CI pipelines, you need to store the cache in a remote backend. The newest release of BuildKit supports two new storage backends: Amazon S3 and Azure Blob Storage.

When you build images, you can provide the details of your S3 bucket or Azure Blob store to automatically store your build cache to pull into future builds. This build cache means that even though your CI or local runners might be destroyed and recreated, you can still access your remote cache to get quick builds when nothing has changed.

To use the new backends, you can specify them using the --cache-to and --cache-from flags:

$ docker buildx build --push -t / \
  --cache-to type=s3,region=,bucket=,name=[,parameters...] \
  --cache-from type=s3,region=,bucket=,name= .

$ docker buildx build --push -t / \
  --cache-to type=azblob,name=[,parameters...] \
  --cache-from type=azblob,name=[,parameters...] .

You also don’t have to choose between one cache backend or the other. BuildKit v0.11 supports multiple cache exports at a time so you can use as many as you’d like.

Find more information about the new S3 backend in the Amazon S3 cache and the Azure Blob Storage cache backend documentation.

6. OCI Image annotations

OCI image annotations allow attaching metadata to container images at the manifest level. They’re an alternative to labels that are more generic, and they can be more easily attached to multi-platform images.

All BuildKit image exporters now allow setting annotations to the image exporters. To set the annotations of your choice, use the --output flag:

$ docker buildx build ... \
    --output "type=image,name=foo,annotation.org.opencontainers.image.title=Foo"

You can set annotations at any level of the output, for example, on the image index:

$ docker buildx build ... \
    --output "type=image,name=foo,annotation-index.org.opencontainers.image.title=Foo"

Or even set different annotations for each platform:

$ docker buildx build ... \
    --output "type=image,name=foo,annotation[linux/amd64].org.opencontainers.image.title=Foo,annotation[linux/arm64].org.opencontainers.image.title=Bar"

You can find out more about creating OCI annotations on BuildKit images in the documentation.

7. Build inspection with `--print`

If you are starting in a codebase with Dockerfiles, understanding how to use them can be tricky. Buildx supports the new --print flag to print details about a build. This flag can be used to get quick and easy information about required build arguments and secrets, and targets that you can build.

For example, here’s how you get an outline of BuildKit’s Dockerfile:

$ BUILDX_EXPERIMENTAL=1 docker buildx build --print=outline https://github.com/moby/buildkit.git
TARGET:  	buildkit
DESCRIPTION: builds the buildkit container image

BUILD ARG              	   VALUE   DESCRIPTION
RUNC_VERSION           	   v1.1.4   
ALPINE_VERSION         	   3.17	 
BUILDKITD_TAGS                  	defines additional Go build tags for compiling buildkitd
BUILDKIT_SBOM_SCAN_STAGE   true

We can also list all the different targets to build:

$ BUILDX_EXPERIMENTAL=1 docker buildx build --print=targets https://github.com/moby/buildkit.git
TARGET             	DESCRIPTION
alpine-amd64      	 
alpine-arm        	 
alpine-arm64      	 
alpine-s390x

Any frontend that implements the BuildKit subrequests interface can be used with the buildx --print flag. They can even define their own print functions, and aren’t just limited to outline or targets.

The --print feature is still experimental, so the interface may change, and we may add new functionality over time. If you have feedback, please open an issue or discussion on the docker/buildx GitHub, we’d love to hear your thoughts!

8. Bake features

The Bake file format for orchestrating builds has also been improved.

Bake now supports more powerful variable interpolation, allowing you to use fields from the same or other blocks. This can reduce duplication and make your bake files easier to read:

target "foo" {
  dockerfile = target.foo.name + ".Dockerfile"
  tags       = [target.foo.name]
}

Bake also supports null values for build arguments and allows labels to use the defaults set in your Dockerfile so your bake definition doesn’t override those:

variable "GO_VERSION" {
  default = null
}
target "default" {
  args = {
    GO_VERSION = GO_VERSION
  }
}

Read the Bake documentation to learn more.

More improvements and bug fixes

In this post, we’ve only scratched the surface of the new features in the latest release. Along with all the above features, the latest releases include quality-of-life improvements and bug fixes. Read the full changelogs to learn more:

We welcome bug reports and contributions, so if you find an issue in the releases, let us know by opening a GitHub issue or pull request, or get in contact in the #buildkit channel in the Docker Community Slack.

How to Fix and Debug Docker Containers Like a Superhero

Tyler Charboneau — Wed, 19 Oct 2022 14:00:00 +0000

While containers help developers rapidly build and run cross-platform applications, creating error-free apps remains a constant challenge. And while it’s not always obvious how container errors occur, this mystery is even harder for newer developers to unravel. Figuring out how to debug Docker containers can seem daunting.

In this Community All-Hands session, Ákos Takács demonstrated how to solve many of these pesky problems and gain the superpower of fixing containers.

Each issue can impact your image builds and final applications. Some bugs may not trigger clear error messages. To further complicate things, source-code inspection isn’t always helpful.

But, common container issues don’t have to be your kryptonite! We’ll share Ákos’ favorite tips and show you how to conquer these development challenges.

In this tutorial:

Finding and fixing common container mistakes

Finding and fixing common container mistakes

Everyone is prone to the occasional silly mistake. You code when you’re tired, suffer from the occasional keyboard slip, or sometimes fail to copy text correctly between steps. These missteps can carry forward from one command to the next. And because easy-to-miss things like spelling errors or character omissions can fly under the radar, you’re left doing plenty of digging to solve basic problems. Nobody wants that, so what tools are at your disposal?

Using the CLI for extra container visibility

Say we have an image downloaded from Docker Hub — any image at all — and use some variation of the docker run command to run it. The resulting container will be running the default command. If you want to surface that command, entering docker container ls --all will grab a list of containers with their respective commands.

Users often copy these commands and reuse them within other longer CLI commands. As you’d expect, it’s incredibly easy to highlight incorrectly, copy an incomplete phrase, and run a faulty command that uses it.

While spinning up a new container, you’ll hit a snag. The runtime in this instance will fail since Docker cannot find the executable. It’s not located in the PATH, which indicates a problem:

Running the docker container ls --all command also offers some hints. Note the httpd-foregroun container command paired with its created (but not running) container. Conversely, the v0 container that’s running successfully leverages a valid, complete command:

How do we investigate further? Use the docker run --rm -it --name MYCONTAINER [IMAGE] bash command to open an interactive terminal within your container. Take the container’s default command and attempt to run it again. A “command not found” error message will appear.

This is much more succinct and shows that you’ve likely entered the wrong command — in this case by forgetting a character. While Ákos’ example uses httpd, it’s applicable to almost any container image.

Change your CLI output formatting for visibility and readability

Container commands are clipped once they exceed a certain length in the terminal output. That prevents you from inspecting the command in its entirety.

Luckily, Ákos showed how the --format ‘{{ json . }}’ | jq -C flag can improve how your terminal displays outputs. Instead of cutting off portions of text, here’s how your docker container ls --all result will look:

You can read and compare any parameters in full. Nothing is hidden. If you don’t have jq installed, you could instead enter the following command to display outputs similarly minus syntax highlighting. This beats the default tabular layout for troubleshooting:

docker container ls --all --format ‘{{ json . }}’ | python3 -m json.tool --json-lines

Lastly, why not just expand the original table view while only displaying relevant information? Run the following command with the --no-trunc flag to expand those table rows and completely reveal each cell’s contents:

docker container ls --all --format ‘table {{ .Names }}/t{{ .Status }}/t{{ .Command }}’ --no-trunc

These examples highlight the importance of visibility and transparency in troubleshooting. When you can uncover and easily digest the information you need, making corrections is much easier.

Remember to leverage your logs

By following best practices, any active application running within a Docker container will produce log outputs. While you might view logging as a problem-catching mechanism, many running containers don’t experience issues.

Ákos believes it’s important to understand how normal log entries look. As a result, identifying abnormal log entries becomes that much easier. The docker logs command enables this:

The process of tuning your logs differs between tools and languages. For example, Ákos drew from methods involving httpd — like trace for detailed trace-level messages or LogLevel for filtering error messages — but these practices are widely applicable. You’ll probably want to zero in on startup and runtime errors to diagnose most issues.

Log handling is configurable. Here are some common commands to help you drill down into container issues (and reduce noise):

Grab your container’s last 100 logs:

docker logs --tail 100 [container ID]

Grab all logs for a specific container:

docker logs [container ID]

View all active processes within a running container, should its logs be inaccessible:

docker top [container ID]

Log inspection enables easier remediation. Alongside Ákos, we agree that you should confirm any container changes or fixes after making them. This means you’ve taken the right steps and can move ahead.

Want to view all your logs together within Docker Desktop? Download our Logs Explorer extension, which lets you browse through your logs using filters and advanced search capabilities. You can even view new logs as they populate.

Tackle issues with ENTRYPOINT

When running applications, you’ll need to run executable files within your container. The ENTRYPOINT portion of your Dockerfile sets the main command within a container and basically assigns it a task. These ENTRYPOINT instructions rely on executable files being in the container.

In Ákos’ example, he tackles a scenario where improper permissions can prevent Docker from successfully mounting and running an entrypoint.sh executable. You can copy his approach by doing the following:

Use the ls -l $PWD/examples/v6/entrypoint.sh command to view your file’s permissions, which may be inadequate.
Confirm that permissions are incorrect.
Run a chmod 774 command to let this file read, write, and execute for all users.
Use docker run to spin up a container v7 from the original entrypoint, which may work briefly but soon stop running.
Inspect the entrypoint.sh file to confirm our desired command exists.

We can confirm this again by entering docker container inspect v7-exiting to view our container definition and parameters. While the Entrypoint is specified, its Cmd definition is null. That’s what’s causing the issue:

Why does this happen? Many don’t know that by setting --entrypoint, any image with a default command will empty that command automatically. You’ll need to redefine your command for your container to work properly. Here’s how that CLI command might look:

docker run -d -v $PWD/examples/v7/entrypoint.sh:/entrypoint.sh --entrypoint /entrypoint.sh --name v7-running httpd:2.4 httpd-foreground

This works for any container image but we’re just drawing from an earlier example. If you run this and list your containers again, v7 will be active. Confirm within your logs that everything looks good.

Access and inspect container content

Carefully managing files and system resources is critical during local development. That’s doubly true while working with multiple images, containers, or resource constraints. There are scenarios where your containers bloat as their contents accumulate over time.

Keeping your files tidy is one thing. However, you may also want to copy your files from your container and move them into a temporary folder — using the docker cp command with a specified directory. Using a variation of ls -la ./var/v8, borrowing from Ákos’ example, then produces a list containing every file.

This is great for visibility and confirming your container’s contents. And we can diagnose any issues one step further with docker container diff v8 to view which files have been changed, appended, or even deleted. If you’re experiencing strange container behavior, digging into these files might be useful.

Note: You can also leverage our Resource Usage extension to monitor disk space consumption, network activity, CPU usage, and memory usage in real time!

Dive deeply into files and folders

Close inspection is where hexdump comes in handy. The hexdump function converts your file into hexadecimal code, which is much more readable than binary. Ákos used the following commands:

docker cp v8:/usr/local/apache2/bin/httpd ./var/v8-httpd`
`hexdump -C -n 100 ./var/v8-httpd

You can adjust this -n number to read additional or fewer initial bytes. If your file contains text, this content will stand out and reveal the file’s main purpose. But, say you want to access a folder. While changing your directory and running docker container inspect … is standard, this method doesn’t work for Docker Desktop users. Since Desktop runs things in a VM, the host cannot access the folders within.

Ákos showcased CTO Justin Cormack’s own nsenter1 image on GitHub, which lets us tap into those containers running with Docker Desktop environments. Docker Captain Bret Fisher has since expanded upon nsenter1’s documentation while adding useful commands. With these pieces in place, run the following command:

docker run --rm --privileged --pid=host alpine:3.16.2 nsenter -t 1 -m -u -i -n -p -- sh -c “ cd \”$(docker container inspect v8 --format ‘{{ .GraphDriver.Data.UpperDir }}’}\” \&& find .”

This command’s output mirrors that from our earlier docker container diff command. You can also run a hexdump using that same image above, which gives you the same troubleshooting abilities regardless of your environment. You can also inspect your entrypoint.sh to make important changes.

Solve Docker Build errors

While Docker BuildKit is quick and resilient, you can encounter errors that prevent image build completion. To learn why, run the following command to view each sequential build stage:

docker build $PWD/[MY SOURCE] --tag “MY TAG” --progress plain

BuildKit will provide readable context for each step and display any errors that occur:

If you see a missing file or directory error like the one above, don’t worry! You can use the cat $PWD/[MY SOURCE]/[MY DOCKERFILE] command to view the contents of your Dockerfile. Not only can you see where you misstepped more clearly, but you can also add a new instruction before the failing command to list your folder’s contents.

Maybe those contents need updating. Maybe your folder is empty! In that case, you need to update everything so docker build has something to leverage.

Next, run the build command again with the --no-cache flag added. This flag tells Docker to cleanly build from scratch each time without relying on caching:

You can progressively build updated versions of your Dockerfile and test those changes, given the cascading nature of instructions. Writing new instructions after the last working instruction — or making changes earlier on in your file — can eliminate those pesky build issues. Mechanisms like unlink or cp are helpful. The first behaves like rm while accepting only one argument, while cp copies critical files and folders into your image from a source.

Solve Docker Compose errors

We use Docker Compose to spin up multiple services simultaneously using the docker compose --project-directory $PWD/[MY SOURCE] up -d command.

However, one or more of those containers might unexpectedly exit. By running docker compose --project-directory $PWD/[MY SOURCE] ps to list out our services, you can see which containers are running or exited.

To pinpoint the problem, you’d usually grab logs via the docker compose logs command. You won’t need to specify a project directory in most cases. However, your container produces no logs since it isn’t running.

Next, run the cat $PWD/[MY SOURCE]/docker-compose.yml command to view your Docker Compose file’s contents. It’s likely that your services definitions need fixing, so digging line by line within the CLI is helpful. Enter the following command to make this output even clearer:

docker compose --project-directory $PWD/[MY SOURCE] config

Your container exits when the commands contained within are invalid — just like we saw earlier. You’ll be able to see if you’ve entered a command incorrectly or if that command is empty. From there, you can update your Compose file and re-run docker compose --project-directory $PWD/[MY SOURCE] up -d. You can now confirm that everything is working by listing your services again. Your terminal will also output logs!

Optional: Make direct file edits within running containers

Finally, it’s possible (and tempting) to directly edit your files within your container. This is viable while testing new changes and inspecting your containers. However, it’s usually considered best practice to create a new image and container instead.

If you want to make edits within running containers, an editor like VS Code allows this, while IntelliJ doesn’t by comparison. Install the Docker extension for VS Code. You can then browse through your containers in the left sidebar, expand your collection of resources, and directly access important files. For example, web developers can directly edit their index.html files to change how user content is structured.

Investigate less and develop more

Overall, the process of fixing a container, on the surface, may seem daunting to newer Docker users. The methods we’ve highlighted above can dramatically reduce that troubleshooting complexity — saving you time and effort. You can spend less time investigating issues and more time creating the applications users love. And we think those skills are pretty heroic.

For more information, you can view Ákos Takács’ full presentation on YouTube to carefully follow each step. Want to dive deeper? Check out these additional resources to become a Docker expert:

Learn how to view logs for a container or service
View our Docker Run reference documentation
Learn about best practices for writing Dockerfiles

Image rebase and improved remote cache support in new BuildKit

Tonis Tiigi — Thu, 17 Mar 2022 15:00:00 +0000

We’ve just shipped new versions of the BuildKit builder engine, Dockerfile 1.4 frontend, and Docker Buildx CLI. Each of these comes with many new features. In this blog post, I’ll show one of them, a new copy mode in Dockerfiles, and explain why you should start to use it on your Dockerfiles.

With the Dockerfile 1.4 release, the COPY and ADD commands for copying files from the build context or from another stage now accept a new flag `–link`. Using this flag enables much better cache semantics as well as the ability to perform a fast 2nd-day rebase of your builds on top of new base images without rebuilding them.

In order to use this flag, you will need to add a line containing # syntax=docker/dockerfile:1.4 to the top of your Dockerfile. This makes sure that the proper frontend image with support for this flag is loaded. In order to get the correct cache semantics for the flag, BuildKit v0.10 needs to be used as well.

# syntax=docker/dockerfile:1.4
FROM ...
COPY --link foo bar

docker buildx create --use --name mybuilder
docker buildx build .

Before we get into the details of what this new flag does, let’s go over how the Dockerfile commands work at the moment.

Docker images consist of layers that are tarballs in the registry that make up the container filesystem. When you pull an image, these tarballs get extracted on top of each other. The implementation of how this extraction happens and how files actually get stored on the disk depends on the underlying snapshotter type. If you use the overlay snapshotter, your filesystem can create a special mount that combines multiple directories into one. For other snapshotters the process usually involves making (shallow) copies of files.

Every RUN, COPY or ADD command in Dockerfile also creates a new snapshot that is added on top of previously created contents. Once the build is ready and you want to export an image as a build result, we will run a “differ” component that compares all the snapshots and creates new tarballs containing the new files that were added in each snapshot.

An important concept to understand here is that in order for a new layer to be created, the previous layers(also called parent layers) already need to be created before and exist on disk. Whenever you used COPY command to move some files to a directory, all the previous commands on the same stage needed to be completed before. Without it, you wouldn’t have the destination directory where the files would be copied to.

This limitation changes now with the new --link flag that has been added to COPY and ADD commands. When this flag is present, the COPY command works in a different mode where files are instead copied to a completely new snapshot. Then this new snapshot is turned into a new layer tarball on its own, and that tarball is linked into the chain of previous tarball layers. This linking action is usually just a metadata change where a new item is added to the layers array without the need to access or move any files. As shown in the next examples, it can even happen remotely with the layers existing in the remote registry without ever needing to pull or push them.

To summarize:

COPY --link=false (previous method and default): Files are copied on top of the result of the previous command Layers are created later by comparing snapshots on disk
COPY --link=true: Files are copied to a new location and turned into an independent layer Layer identifier is added on top of previous layers

By removing the dependency from the destination directory, we don’t need to wait for previous commands to finish before completing the COPY command. We also do not need to invalidate our build cache for the current command when previous commands on the same Dockerfile stage change.

Let’s look at some example use-cases that this enables.

Example: Rebasing an existing image

The previous release of BuildKit v0.9 introduced another new feature: lazy image pulling. What this feature means is that whenever BuildKit needs to access a remote image/cache, it will delay the pulling of its layers until there is a task that actually needs to read files from them. For example, when a layer is just used in another image this pulling is not needed and BuildKit can just create a new image referencing the previous layer by its immutable digest.

FROM ubuntu
ENV MYCONFIG=foo
VOLUME /data

For example, if you build this Dockerfile with docker buildx build -t myuser/myubuntu --push . on a clean system without cache, you will notice that the whole build only takes a couple of seconds before your new image is ready in your repository. This is because the layers of the ubuntu image are never pulled to your local machine and never pushed to hub repository. Instead, BuildKit creates a new image config and manifest containing the Ubuntu layer digests and pushes only them. The layers are linked directly from the Ubuntu repository using the cross-repo mount feature of the registry. This pattern can also be used with a remote cache source where your build would only need to validate that remote cache is still up-to-date and not actually pull down any layers.

This method works well for metadata commands like ENV and VOLUME that only modify the image config. If you used a command that created new layers like COPY or RUN, the base image still needed to be pulled first because local files were needed in order to run these commands.

COPY --link removes this requirement. Let’s look at a common multi-stage build Dockerfile that has been updated to use COPY --link:

#syntax=docker/dockerfile:1.4
FROM golang AS build
....
RUN go build -o /myapp .

FROM alpine:3.14
COPY --from=build --link /out/myapp /bin
ENTRYPOINT ["/bin/myapp"]

When you build this file with BuildKit v0.10, the first thing you will notice is that your build completes without ever pulling the Alpine image. This is because copying myapp to the /bin/ directory does not depend on Alpine files anymore. If you push this image to another Docker Hub repository Alpine layers are linked directly. Only if you export the image in some other way, for example into a local OCI tarball with --output type=oci will the layers be actually pulled.

Now when we have built and pushed this image for the first time, we can look at what happens when we need to update this image in the future. Either in the case a new Alpine 3.14 image with security fixes comes out or when we want to update to 3.15.

To avoid rebuilding everything again we can store remote cache from our earlier build. BuildKit supports many cache backend but the easiest, in this case, is to use “inline cache” that just embeds the build cache information into the image config.

To enable inline cache we either run:

docker buildx build --cache-to type=inline --push -t mysuser/myapp .

docker buildx build --build-arg BUILDKIT_INLINE_CACHE=1 --push -t mysuser/myapp .

Now we can use the image itself as a cache source when doing subsequent builds. For example, let’s see what happens when we update our previous Dockerfile to use Alpine 3.15 instead and build using the previous cache.

FROM golang AS build
....
FROM alpine:3.15
COPY --from=build --link /out/myapp /bin/
ENTRYPOINT ["/bin/myapp"]

docker buildx build --cache-from myuser/myapp -t myuser/myapp --push .

Similarily to our initial build, we will see that alpine:3.15 is not actually pulled to the local machine, and instead, the layer blobs were directly moved inside the registry. What might be more interesting is that the golang image was not pulled as well. This is because we can verify that the myapp binary has not changed, and therefore the second layer in our image has not changed as well, and we can just rebase it on top of the new alpine image. This all happens completely remotely without any local layers.

Note that without --link this was not possible before as the COPY operation depended on /bin directory from the base image and its cache was not valid anymore because the base image changed, resulting in pulling both Alpine and Golang image and recompilation of myapp binary.

Example: Better remote cache support

As another example, let’s look at how the cache is handled if you have multiple COPY commands.

#syntax=docker/dockerfile:1.4
FROM golang AS build
....
RUN go build -o /myapp .

FROM ubuntu AS config
...
RUN generate -o /myapp.config

FROM alpine:3.14
COPY --from=config --link /myapp.config /etc/
COPY --from=build --link /myapp /bin/
ENTRYPOINT ["/bin/myapp"]

In this file, we have added a second copy that adds a generated config file from another build stage. It is a very common pattern to use multiple stages for dependencies and then copy them all together in a final stage. This is how you get the best parallelization and cache reuse for your builds.

Let’s say we build and push this Dockerfile with inline cache as before:

docker buildx build --cache-to type=inline -t myuser/myapp2 --push .

Now let’s consider what happens when we need to do a rebuild using our previous inline cache and our config file generation has changed. The stage with our config generation needs to run again, but what happens to the last stage?

Without using --link, if the file myapp.config changed it would mean that Alpine image was pulled and extracted, myapp.config copied over that snapshot, and because that changed the dependencies for the COPY of myapp it would need to be recompiled and copied again as well. Note that the possibility of cache reuse here depended on the order of commands, the cache could be used until the last COPY command that matched cache, and all commands after that would need to run again. If cache for myapp would have been invalidated, we still would have gotten cache for myapp.config because that file was copied before, but not vice versa.

By adding --link, the cache reuse is now much better. All the COPY commands are now independent and none of them depend on the base image. After the new config is generated, it is directly converted into a new layer. Then this layer is replaced inside the previous image. The bottom layers for the base image and the top layer containing myapp are left as is – they never need to be pulled to the local machine at all. Only the new layer is pushed together with the new image manifest.

Differences from `--link=false`

You might wonder why was the new flag added at all instead of changing all the COPY commands to use new semantics automatically. The reason is that it is not completely backward-compatible in some rare cases. For example, let’s say your copy command is COPY myapp /path/to/myapp. If the destination directory you specified in /path/to/myapp contained a symlink in one of the components, it would have been followed and files copied to the symlink target instead. With --link, all the copies are independent, and they are never allowed to see what files the destination path contained. So instead of following a symlink, COPY --link myapp /path/to/myapp would first always create a new directory /path/to and copy the file inside it.

Another case you might see is with a command like COPY myapp /usr/bin. Notice that the destination path does not end with a slash. Without --link the previous semantics would have checked if /usr/bin is a directory. If it was, then the file would be copied as /usr/bin/myapp. If it was not then the new file would have been copied to /usr as regular file bin. These kinds of checks require extracting files on disk so that their types can be verified and are not allowed with --link. Therefore when using --link, you need to make sure that the destination path does not contain a symlink and not use ambiguous destination directory detection.

The cases listed above should be quite rare and easy to fix by simple Dockerfile modifications. If you don’t rely on symlinks in your COPY commands, the recommendation is to always start using --link. The performance of linked copies should always be either better or equivalent to regular copies, and you get much better cache reuse and optimizations for your builds.

If you are interested more about the internals on COPY --link, it is powered by the new MergeOp feature in BuildKit’s LLB definition. You can read more about MergeOp, as well as the companion DiffOp feature that is conceptually a reverse of MergeOp from BuildKit documentation..

Compiling Qt with Docker multi-stage and multi-platform

Viktor Petersson — Wed, 23 Dec 2020 15:00:00 +0000

This is a guest post from Viktor Petersson, CEO of Screenly.io. Screenly is the most popular digital signage product for the Raspberry Pi. Find Viktor on Twitter @vpetersson.

For those not familiar with Qt, it is a cross-platform development framework that is used in a wide range of products, including cars (Tesla), digital signs (Screenly), and airplanes (Lufthansa). Needless to say, Qt is very powerful. One thing you cannot say about the Qt framework, however, is that it is easy to compile — at least for embedded devices. The countless blog posts, forum threads, and Stack Overflow posts on the topic reveal that compiling Qt is a common headache.

As long-term Qt users, we have had our fair share of battles with it at Screenly. We migrated to Qt for our commercial digital signage software a number of years ago, and since then, we have been very happy with both its performance and flexibility. Recently, we decided to migrate our open source digital signage software (Screenly OSE) to Qt as well. Since these projects share no code base, this was a greenfield opportunity that allowed us to start afresh and explore exciting new technologies for the build process.

Because compiling Qt (and QtWebEngine) is a very heavy operation, we would need to pre-compile and distribute Qt so that the Dockerfile could simply download and include it in the build process (rather than compiling as part of the installation process).

We sat down and created the following requirements for our build process:

The process must be fully automated from start to finish.
We need to be able to build Qt/QtWebEngine for all supported Raspberry Pi boards (with the appropriate Qt device profile).
We should use cross compilation on x86 to speed up the process where it makes sense.
We need to be able to run the full process on CI, and thus cannot rely on a Raspberry Pi.
We should confine everything to run inside Docker containers so we do not clutter the host with build packages.

With the above goals in mind, we had a great opportunity to try out the new multi-platform support in Docker. Used in conjunction with multi-stage builds, we were able to get the best of both worlds:

Use emulation where we cannot cross-compile
Switch to cross-compilation for the heavy lifting

How does multi-platform in Docker work?

The easiest way to use multi-platform functionality in Docker is to invoke it from the command line. Using the docker buildx, we can tap into new beta functionalities. By running docker buildx build --platform linux/arm/v7 -t arm-build . This command builds the docker image as per the `Dockerfile` in the current directory using ARMv7 emulation. Behind the scenes, Docker runs the whole Docker build process in a QEMU virtualized environment (qemu-user-static to be precise). By doing this, the complexity of setting up a custom VM is removed. Once built, we can even use docker run to launch containers in ARMv7 mode automagically.

Multi-platform, multi-stage and Qt

While multi-platform functionality is a great stand-alone feature, it gets even more powerful when combined with multi-stage builds. Within a single Dockerfile, we’re able to mix and match platforms and copy between the steps. This functionality is exactly what we ended up doing with the Qt build process for Screenly OSE.

Stage 1: ARM

Thanks to the fine folks over at Balena, we are able to use a Raspbian base image in the first stage. We can invoke this step using:

FROM --platform=linux/arm/v7 balenalib/rpi-raspbian:buster as builder

After the above step, we can use Docker as we normally do and execute various RUN commands, such as installing packages etc.. Do note that this container is running emulated using QEMU if the build is not run on ARMv7 hardware. In our case, we use the command to install the Qt build dependencies. The above step also allows us to fully eliminate the need for copying files from either a disk image (which is what the Qt Wiki suggests) or rsync files from a physical Raspberry Pi.

Stage 2: x86

Once we have installed our dependencies in our ARM step, we can switch over to the builder’s native x86 architecture to avoid emulation and do the cross compile with the following line:

FROM --platform=linux/amd64 debian:buster

Now, we are onto the interesting part. After we have switched over to x86, we can copy files from the previous step. We do this in order to create a sysroot that we can use for Qt. We complete this step by running the following commands:

RUN mkdir -p /sysroot/usr /sysroot/opt /sysroot/lib

COPY --from=builder /lib/ /sysroot/lib/

COPY --from=builder /usr/include/ /sysroot/usr/include/

COPY --from=builder /usr/lib/ /sysroot/usr/lib/

COPY --from=builder /opt/vc/ sysroot/opt/vc/

We now have the best of both worlds. By taking advantage of both multi-step and multi-platform functionality, we generate a sysroot that we can use to build Qt. Since we used a fully functional Raspbian image in our previous step, we are even able to get Qt to pick up all existing libraries.

./configure \

-sysroot /sysroot

As we mentioned in the introduction, compiling Qt is far from straightforward. There are a lot of steps required to compile it successfully. To learn more about the exact steps, you can see the full Dockerfile and script build_qt5.sh.

To emulate or not to emulate…

Being able to emulate a platform like ARM is amazing and provides a lot of flexibility. However, it does come at a cost. There is a big performance penalty. This issue is the reason why we do not actually compile Qt using emulation. Instead, we use cross-compilation. If you have the ability to cross-compile rather than emulate, know that cross-compilation will give you much better performance.

About Screenly

Screenly is the most popular digital signage product for the Raspberry Pi. If you want to turn a physical screen into a secure, remotely-controllable device (over UI or digital signage API) that can display dashboards, images, videos, and webpages, Screenly makes setup a breeze. Screenly is available in two flavors: an open source version and a commercial version.

Speed Up Your Development Flow With These Dockerfile Best Practices

Guillaume Lours — Mon, 27 Apr 2020 15:00:00 +0000

The Dockerfile is the starting point for creating a Docker image. The file format provides a well-defined set of directives that allow you to copy files or folders, run commands, set environment variables, and do other tasks required to create a container image. It’s really important to craft your Dockerfile well to keep the resulting image secure, small, quick to build, and quick to update.

In this post, we’ll see how to write good Dockerfiles to speed up your development flow, ensure build reproducibility and that produce images that can be confidently deployed to production.

Note: for this blog post we’ll base our Dockerfile examples on the react-java-mysql sample from the awesome-compose repository.

Development flow

As developers, we want to match our development environment to the target production context as closely as possible to ensure that what we build will work when deployed.

We also want to be able to develop quickly which means we want builds to be fast and for us to be able to use developer tools like debuggers. Containers are a great way to codify our development environment but we need to define our Dockerfile correctly to be able to interact quickly with our containers.

Incremental builds

A Dockerfile is a list of instructions for building your container image. While the Docker builder caches the result of each step as an image layer, the cache can be invalidated causing the step that invalidated the cache and all subsequent steps to need to be rerun and the corresponding layers to be regenerated.

The cache is invalidated when files in the build context that are referenced by COPY or ADD change. The ordering of the steps can therefore have drastic effects on performance.

Let’s take a look at an example where we build a NodeJs project in the Dockerfile. In this project, there are dependencies specified in the package.json file which are fetched when the npm ci command is run.

The simplest Dockerfile would be:

FROM node:lts

ENV CI=true
ENV PORT=3000

WORKDIR /code
COPY . /code
RUN npm ci

CMD [ "npm", "start" ]

Structuring the Dockerfile as above will cause the cache to be invalidated at the COPY line any time a file in the build context changes. This means that the dependencies will be fetched and the node_modules directory filled when any file is changed instead of just the package.json file which can take a long time.

To avoid this and only fetch the dependencies when they change (i.e.: when package.json or package-lock.json changes), we should consider separating the dependency installation from the build and run of our application.

A more optimized Dockerfile would be this:

FROM node:lts


ENV CI=true

ENV PORT=3000


WORKDIR /code

COPY package.json package-lock.json /code/

RUN npm ci

COPY src /code/src


CMD [ "npm", "start" ]

Using this separation, if there are no changes in package.json or package-lock.json then the cache will be used for the layer generated by the RUN npm ci instruction. This means that when you edit your application source and rebuild, the dependencies won’t be redownloaded which saves time 🎉.

We also limit the second COPY to the src directory as explained in a previous post.

Keep live reload active between the host and the container

This tip is not directly related to the Dockerfile but we often hear this kind of question: How do I keep live reload active while running the app in a container and modifying the source code from my IDE on the host machine?

With our example, we need to mount our project directory in the container and pass an environment variable to enable Chokidar which wraps NodeJS file change events from the host.

$ docker run -e CHOKIDAR_USEPOLLING=true -v ${PWD}/src/:/code/src/ -p 3000:3000 repository/image_name

Consistent builds

One of the most important things with a Dockerfile is to build the exact same image from the same build context (sources, dependencies…)

We’ll continue to improve the Dockerfile defined in the previous section.

Build consistently from sources

As we saw in the previous section, we’re able to build an application by adding the source files and dependencies in the Dockerfile description and then running commands on them.

But in our previous example we aren’t able to confirm that the image generated will be the same each time we run a docker build…Why? Because each time NodeJS is released, we can expect the lts tag to point to the latest LTS version of the NodeJS image, which will change over time and could introduce breaking changes. We can easily fix this by using a more specific tag for the base image (we’ll let you choose between LTS or the latest stable version 😉)

FROM node:13.12.0

ENV CI=true
ENV PORT=3000

WORKDIR /code
COPY package.json package-lock.json /code/
RUN npm ci
COPY src /code/src

CMD [ "npm", "start" ]

We’ll see in the No more latest section that there are other advantages to using more specific base image tags and avoiding the latest tag.

Multi-stage and targets to match the right environment

We made the development build consistent, but how can we do this for the production artifact?

Since Docker 17.05, we can use multi-stage builds to define steps to produce our final image. Using this mechanism in our Dockerfile, we’ll be able to split the image we use for our development flow from that used to build the application and that used in production.

FROM node:13.12.0 AS development

ENV CI=true
ENV PORT=3000

WORKDIR /code
COPY package.json package-lock.json /code/
RUN npm ci
COPY src /code/src

CMD [ "npm", "start" ]

FROM development AS builder

RUN npm run build

FROM nginx:1.17.9 AS production

COPY --from=builder /code/build /usr/share/nginx/html

Each time you see FROM … AS … it’s a build stage.
So we now have a development, a build, and a production stage.
We can continue to use a container for our development flow by building the specific development stage image using the --target flag.

$ docker build --target development -t repository/image_name:development .

And use it as usual

$ docker run -e CHOKIDAR_USEPOLLING=true -v ${PWD}/src/:/code/src/ repository/image_name:development

A docker build without the --target flag will build the final stage which in this case is the production image. Our production image is simply a nginx image with the binaries built in the previous steps put in the correct place that they are served.

Production ready

It’s really important to keep your production image as lean and as secure as possible. Here are a few things to check before running a container in production.

No more latest image version

As we previously saw in the Build consistently from sources section, using a specific tag for build steps help to make the image build reproducible. There are at least two other very good reasons to use more specific tags for your images:

You can easily find all the containers running with an image version in your favorite orchestrator (Swarm, Kubernetes…)

# Search in Docker engine containers using our repository/image_name:development image

$ docker inspect $(docker ps -q) | jq -c '.[] | select(.Config.Image == "repository/image_name:development") |"\(.Id) \(.State) \(.Config)"'


"89bf376620b0da039715988fba42e78d42c239446d8cfd79e4fbc9fbcc4fd897 {\"Status\":\"running\",\"Running\":true,\"Paused\":false,\"Restarting\":false,\"OOMKilled\":false,\"Dead\":false,\"Pid\":25463,\"ExitCode\":0,\"Error\":\"\",\"StartedAt\":\"2020-04-20T09:38:31.600777983Z\",\"FinishedAt\":\"0001-01-01T00:00:00Z\"}
{\"Hostname\":\"89bf376620b0\",\"Domainname\":\"\",\"User\":\"\",\"AttachStdin\":false,\"AttachStdout\":true,\"AttachStderr\":true,\"ExposedPorts\":{\"3000/tcp\":{}},\"Tty\":false,\"OpenStdin\":false,\"StdinOnce\":false,\"Env\":[\"CHOKIDAR_USEPOLLING=true\",\"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\",\"NODE_VERSION=12.16.2\",\"YARN_VERSION=1.22.4\",\"CI=true\",\"PORT=3000\"],\"Cmd\":[\"npm\",\"start\"],\"Image\":\"repository/image_name:development\",\"Volumes\":null,\"WorkingDir\":\"/code\",\"Entrypoint\":[\"docker-entrypoint.sh\"],\"OnBuild\":null,\"Labels\":{}}"

#Search in k8s pods running a container with our repository/image_name:development image (using jq cli)
$ kubectl get pods --all-namespaces -o json | jq -c '.items[] | select(.spec.containers[].image == "repository/image_name:development")| .metadata'

{"creationTimestamp":"2020-04-10T09:41:55Z","generateName":"image_name-78f95d4f8c-","labels":{"com.docker.default-service-type":"","com.docker.deploy-namespace":"docker","com.docker.fry":"image_name","com.docker.image-tag":"development","pod-template-hash":"78f95d4f8c"},"name":"image_name-78f95d4f8c-gmlrz","namespace":"docker","ownerReferences":[{"apiVersion":"apps/v1","blockOwnerDeletion":true,"controller":true,"kind":"ReplicaSet","name":"image_name-78f95d4f8c","uid":"5ad21a59-e691-4873-a6f0-8dc51563de8d"}],"resourceVersion":"532","selfLink":"/api/v1/namespaces/docker/pods/image_name-78f95d4f8c-gmlrz","uid":"5c70f340-05f1-418f-9a05-84d0abe7009d"}

In case of CVE (Common Vulnerabilities and Exposure), you can quickly know if you need to patch or not your containers and image descriptions.

From our example we could specify that our development and production images are alpine versions.

FROM node:13.12.0-alpine AS development

ENV CI=true
ENV PORT=3000

WORKDIR /code
COPY package.json package-lock.json /code/
RUN npm ci
COPY src /code/src

CMD [ "npm", "start" ]

FROM development AS builder

RUN npm run build

FROM nginx:1.17.9-alpine

COPY --from=builder /code/build /usr/share/nginx/html

Use official images

You can use Docker Hub to search for base images to use in your Dockerfile, some of these are the officially supported ones. We strongly recommend to use these images as:

their content has been verified
they’re updated quickly when a CVE is fixed

You can add an image_filter request query param to only get the official images.

https://hub.docker.com/search?q=nginx&type=image&image_filter=official

All the previous examples in this post were using official images of NodeJS and NGINX.

Just enough permissions!

All applications, running in a container or not, should adhere to the principle of least privilege which means an application should only access the resources it needs.

In case of malicious behavior or because of bugs, a process running with too many privileges may have unexpected consequences on the whole system at runtime.

Because the NodeJS official image is well setup, we’ll switch to the backend Dockerfile.

Configuring an image to run as an unprivileged user is very easy:

FROM maven:3.6.3-jdk-11 AS builder
WORKDIR /workdir/server
COPY pom.xml /workdir/server/pom.xml
RUN mvn dependency:go-offline

RUN mvn package

FROM openjdk:11-jre-slim
RUN addgroup -S java && adduser -S javauser -G java
USER javauser

EXPOSE 8080
COPY --from=builder /workdir/server/target/project-0.0.1-SNAPSHOT.jar /project-0.0.1-SNAPSHOT.jar

CMD ["java", "-Djava.security.egd=file:/dev/./urandom", "-jar", "/project-0.0.1-SNAPSHOT.jar"]

Simply by creating a new group, adding a user to it, and using the USER directive we can run our container with a non-root user.

Conclusion

In this blog post we just showed some of the many ways to optimize and secure your Docker images by carefully crafting your Dockerfile. If you’d like to go further you can take a look at:

Our official documentation about Dockerfile best practices
A previous post on the subject by Tibor Vass
A session during the DockerCon 2019 by Tibor Vass and Sebastiaan van Stijn
Another session during Devoxx 2019 by Jérémie Drouet and myself

Top 4 Tactics To Keep Node.js Rockin’ in Docker

Bret Fisher — Tue, 30 Jul 2019 17:00:03 +0000

This is a guest post from Docker Captain Bret Fisher, a long time DevOps sysadmin and speaker who teaches container skills with his popular Docker Mastery courses including Docker Mastery for Node.js, weekly YouTube Live shows, and consults to companies adopting Docker. Join Bret for an online meetup on August 28th, where he’ll give demos and Q&A on Node.js and Docker topics.

Foxy, my Docker Mastery mascot is a fan of Node and Docker

We’ve all got our favorite languages and frameworks, and Node.js is tops for me. I’ve run Node.js in Docker since the early days for mission-critical apps. I’m on a mission to educate everyone on how to get the most out of this framework and its tools like npm, Yarn, and nodemon with Docker.

There’s a ton of info out there on using Node.js with Docker, but so much of it is years out of date, and I’m here to help you optimize your setups for Node.js 10+ and Docker 18.09+. If you’d rather watch my DockerCon 2019 talk that covers these topics and more, check it out on YouTube.

Let’s go through 4 steps for making your Node.js containers sing! I’ll include some quick “Too Long; Didn’t Read” for those that need it.

Stick With Your Current Base Distro

TL;DR: If you’re migrating Node.js apps into containers, use the base image of the host OS you have in production today. After that, my favorite base image is the official node:slim editions rather than node:alpine, which is still good but usually more work to implement and comes with limitations.

One of the first questions anyone asks when putting a Node.js app in Docker, is “Which base image should I start my Node.js Dockerfile from?”

slim and alpine are quite smaller than the default image

There are multiple factors that weigh into this, but don’t make “image size” a top priority unless you’re dealing with IoT or embedded devices where every MB counts. In recent years the slim image has shrunk down in size to 150MB and works the best across the widest set of scenarios. Alpine is a very minimal container distribution, with the smallest node image at only 75MB. However, the level of effort to swap package managers (apt to apk), deal with edge cases, and work around security scanning limitations causes me hold off on recommending node:alpine for most use cases.

When adopting container tech, like anything, you want to do what you can to reduce the change rate. So many new tools and processes come along with containers. Choosing the base image your devs and ops are most used to has many unexpected benefits, so try to stick with it when it makes sense, even if this means making a custom image for CentOS, Ubuntu, etc.

Dealing With Node Modules

TL;DR: You don’t have to relocate node_modules in your containers as long as you follow a few rules for proper local development. A second option is to move mode_modules up a directory in your Dockerfile, configure your container properly, and it’ll provide the most flexible option, but may not work with every npm framework.

We’re all now used to a world where we don’t write all the code we run in an app, and that means dealing with app framework dependencies. One common question is how to deal with those code dependencies in containers when they are a subdirectory of our app. Local bind-mounts for development can affect your app differently if those dependencies were designed to run on your host OS and not the container OS.

The core of this issue for Node.js is that node_modules can contain binaries compiled for your host OS, and if it’s different then the container OS, you’ll get errors trying to run your app when you’re bind-mounting it from the host for development. Note that if you’re a pure Linux developer and you develop on Linux x64 for Linux x64, this bind-mount issue isn’t usually a concern.

For Node.js I offer you two approaches, which come with their own benefits and limitations:

Solution A: Keep It Simple

Don’t move node_modules. It will still sit in the default subdirectory of your app in the container, but this means that you have to prevent the node_modules created on your host from being used in the container during development.

This is my preferred method when doing pure-Docker development. It works great with a few rules you must follow for local development:

Develop only through the container. Why? Basically, you don’t want to mix up the node_modules on your host with the node_modules in the container. On macOS and Windows, Docker Desktop bind-mounts your code across the OS barrier, and this can cause problems with binaries you’ve installed with npm for the host OS, that can’t be run in the container OS.
Run all your npm commands through docker-compose. This means your initial npm install for your project should now be docker-compose run npm install.

Solution B: Move Container Modules and Hide Host Modules

Relocate node_modules up the file path in the Dockerfile so you can develop Node.js in and out of the container, and the dependencies won’t clash which you switch between host-native development and Docker-based development.

Since Node.js is designed to run on multiple OS’s and architectures, you may not want to always develop in containers. If you want the flexibility to sometimes develop/run your Node.js app directly on the host, and then other times spin it up in a local container, then Solution B is your jam.

In this case you need a node_modules on host that is built for that OS, and a different node_modules in the container for Linux.

The basic lines you’ll need to move node_modules up the path

Rules for this solution include:

Move the node_modules up a directory in the container image. Node.js always looks for a node_modules as a subdirectory, but if it’s missing, it’ll walk up the directory path until it finds one. Example of doing that in a Dockerfile here.
To prevent the host node_modules subdirectory from showing up in the container, use a workaround I call an “empty bind-mount” to prevent the host node_modules from ever being used in the container. In your compose YAML it would look like this.
This works with most Node.js code, but some larger frameworks and projects seem to hard-code in the assumption that node_modules is a subdirectory, which will rule out this solution for you.

For both of these solutions, always remember to add node_modules to your .dockerignore file (same syntax as .gitignore) so you’ll never accidentally build your images with modules from the host. You always want your builds to run an npm install inside the image build.

Use The Node User, Go Least Privilege

All the official Node.js images have a Linux user added in the upstream image called node. This user is not used by default, which means your Node.js app will run as root in the container by default. This isn’t the worst thing, as it’s still isolated to that container, but you should enable in all your projects where you don’t need Node to run as root. Just add a new line in your Dockerfile: USER node

Here are some rules for using it:

Location in the Dockerfile matters. Add USER after apt/yum/apk commands, and usually before npm install commands.
It doesn’t affect all commands, like COPY, which has its own syntax for controlling owner of files you copy in.
You can always switch back to USER root if you need to. In more complex Dockerfiles this will be necessary, like my multi-stage example that includes running tests and security scans during optional stages.
Permissions may get tricky during development because now you’ll be doing things in the container as a non-root user by default. The way to often get around this is to do things like npm install by telling Docker you want to run those one-off commands as root: docker-compose run -u root npm install

Don’t Use Process Managers In Production

TL;DR: Except for local development, don’t wrap your node startup commands with anything. Don’t use npm, nodemon, etc. Have your Dockerfile CMD be something like [“node”, “file-to-start.js”] and you’ll have an easier time managing and replacing your containers.

Nodemon and other “file watchers” are necessary in development, but one big win for adopting Docker in your Node.js apps is that Docker takes over the job of what we used to use pm2, nodemon, forever, and systemd for on servers.

Docker, Swarm, and Kubernetes will do the job of running healthchecks and restarting or recreating your container if it fails. It’s also now the job of orchestrators to scale the number of replicas of our apps, which we used to use tools like pm2 and forever for. Remember, Node.js is still single-threaded in most cases, so even on a single server you’ll likely want to spin up multiple container replicas to take advantage of multiple CPU’s.

My example repo shows you how to using node directly in your Dockerfile, and then for local development, either build use a different image stage with docker build --target , or override the CMD in your compose YAML.

Start Node Directly in Dockerfiles

TL;DR I also don’t recommend using npm to start your apps in your Dockerfile. Let me explain.

I recommend calling the node binary directly, largely due to the “PID 1 Problem” where you’ll find some confusion and misinformation online about how to deal with this in Node.js apps. To clear up confusion in the blogosphere, you don’t always need a “init” tool to sit between Docker and Node.js, and you should probably spend more time thinking about how your app stops gracefully.

Node.js accepts and forwards signals like SIGINT and SIGTERM from the OS, which is important for proper shutdown of your app. Node.js leaves it up to your app to decide how to handle those signals, which means if you don’t write code or use a module to handle them, your app won’t shut down gracefully. It’ll ignore those signals and then be killed by Docker or Kubernetes after a timeout period (Docker defaults to 10 seconds, Kubernetes to 30 seconds.) You’ll care a lot more about this once you have a production HTTP app that you have to ensure doesn’t just drop connections when you want to update your apps.

Using other apps to start Node.js for you, like npm for example, often break this signaling. npm won’t pass those signals to your app, so it’s best to leave it out of your Dockerfiles ENTRYPOINT and CMD. This also has the benefit of having one less binary running in the container. Another bonus is it allows you to see in the Dockerfile exactly what your app will do when your container is launched, rather then also having to check the package.json for the true startup command.

For those that know about init options like docker run --init or using tini in your Dockerfile, they are good backup options when you can’t change your app code, but it’s a much better solution to write code to handle proper signal handling for graceful shutdowns. Two examples are some boilerplate code I have here, and looking at modules like stoppable.

Is That All?

Nope. These are concerns that nearly every Node.js team deals with, and there’s lots of other considerations that go along with that. Topics like multi-stage builds, HTTP proxies, npm install performance, healthchecks, CVE scanning, container logging, testing during image builds, and microservice docker-compose setups are all common questions for my Node.js clients and students.

If you’re wanting more info on these topics, you can watch my DockerCon 2019 session video on this topic, or check my 8-hours of Docker for Node.js videos at https://www.bretfisher.com/node

Thanks for reading. You can reach me on Twitter, get my weekly DevOps and Docker newsletter, subscribe to my weekly YouTube videos and Live Show, and check out my other Docker resources and courses.

Keep on Dockering!

Docker Captain @bretfisher dives into 4 Ways to Keep #nodejs Rockin in Docker
Click To Tweet

Want to learn more? Join Bret for an online meetup on August 28th.