Justin Chadwell – Docker
https://www.docker.com

Generating SBOMs for Your Image with BuildKit
https://www.docker.com/blog/generate-sboms-with-buildkit/
Tue, 24 Jan 2023 15:00:00 +0000

Learn how to use BuildKit to generate SBOMs for your images and packages.

The latest release series of BuildKit, v0.11, introduces support for build-time attestations and SBOMs, allowing publishers to create images with records of how the image was built. This makes it easier for you to answer common questions, like which packages are in the image, where the image was built from, and whether you can reproduce the same results locally.

This new data helps you make informed decisions about the security of the images you consume — without needing to do all the manual work yourself.

In this blog post, we’ll discuss what attestations and SBOMs are, how to build images that contain SBOMs, and how to start analyzing the resulting data!

What are attestations?

An attestation is a declaration that a statement is true. With software, an attestation is a record that specifies a statement about a software artifact. For example, it could include who built it and when, what inputs it was built with, what outputs it produced, etc.

By writing these attestations, and distributing them alongside the artifacts themselves, you can see these details that might otherwise be tricky to find. To get this kind of information without attestations, you’d have to try and reverse-engineer how the image was built by trying to locate the source code and even attempting to reproduce the build yourself.

To provide this valuable information to the end-users of your images, BuildKit v0.11 lets you build these attestations as part of your normal build process. All it takes is adding a few options to your build step.

BuildKit supports attestations in the in-toto format (from the in-toto framework). Currently, the Dockerfile frontend produces two types of attestations that answer two different questions:

  • SBOM (Software Bill of Materials) – An SBOM contains a list of software components inside an image. This will include the names of various packages installed, their version numbers, and any other associated metadata. You can use this to see, at a glance, if an image contains a specific package or determine if an image is vulnerable to specific CVEs.
  • SLSA Provenance – The Provenance of the image describes details of the build process, such as what materials (images, URLs, files, and so on) were consumed, what build parameters were set, as well as source maps that allow mapping the resulting image back to the Dockerfile that created it. You can use this to analyze how an image was built, determine whether the sources consumed all appear legitimate, and even attempt to rebuild the image yourself.

Users can also define their own custom attestation types via a custom BuildKit frontend. In this post, we’ll focus on SBOMs and how to use them with Dockerfiles.

Getting the latest release

Building attestations into your images requires the latest releases of both Buildx and BuildKit – you can get the latest versions by updating Docker Desktop to the most recent version.

You can check your version number, and ensure it matches the buildx v0.10 release series:

$ docker buildx version
github.com/docker/buildx 0.10.0 ...
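
If you want to script that check, for example in CI, a small shell helper can parse the version line. This is only a sketch: buildx_supports_attestations is a hypothetical name, and it assumes the version is the second whitespace-separated field of the output, with or without a leading "v".

```shell
# Hypothetical helper: succeeds if a `docker buildx version` line
# reports v0.10 or newer. Assumes the version is the second field.
buildx_supports_attestations() {
  version=$(printf '%s\n' "$1" | awk '{ print $2 }')
  version=${version#v}          # tolerate an optional leading "v"
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  # Any 1.x or later passes; otherwise the minor version must be >= 10.
  [ "$major" -gt 0 ] 2>/dev/null || [ "$minor" -ge 10 ] 2>/dev/null
}

# Usage (requires Docker with Buildx installed):
#   buildx_supports_attestations "$(docker buildx version)" || echo "please update Buildx"
```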

To use the latest release of BuildKit, create a docker-container builder using buildx:

$ docker buildx create --use --name=buildkit-container --driver=docker-container

You can check that the new builder is configured correctly, and ensure it matches the buildkit v0.11 release series:

$ docker buildx inspect | grep -i buildkit
Buildkit:  v0.11.1

If you’re using the docker/setup-buildx-action in GitHub Actions, then you’ll get this all automatically without needing to update.

With that out of the way, you can move on to building an image containing SBOMs!

Adding SBOMs to your images

You’re now ready to generate an SBOM for your image!

Let’s start with the following Dockerfile to create an nginx web server:

# syntax=docker/dockerfile:1.5

FROM nginx:latest
COPY ./static /usr/share/nginx/html

You can build and push this image, along with its SBOM, in one step:

$ docker buildx build --sbom=true -t <myorg>/<myimage> --push .

That’s all you need! In your build output, you should spot a message about generating the SBOM:

...
=> [linux/amd64] generating sbom using docker.io/docker/buildkit-syft-scanner:stable-1                           	0.2s
...

BuildKit generates SBOMs using scanner plugins. By default, it uses buildkit-syft-scanner, a scanner built on top of Anchore’s Syft open-source project, to do the heavy lifting. If you like, you can use another scanner by specifying the generator= option. 

Here’s how you view the generated SBOM using buildx imagetools:

$ docker buildx imagetools inspect <myorg>/<myimage> --format "{{ json .SBOM.SPDX }}"
{
	"spdxVersion": "SPDX-2.3",
	"dataLicense": "CC0-1.0",
	"SPDXID": "SPDXRef-DOCUMENT",
	"name": "/run/src/core/sbom",
	"documentNamespace": "https://anchore.com/syft/dir/run/src/core/sbom-a589a536-b5fb-49e8-9120-6a12ce988b67",
	"creationInfo": {
		"licenseListVersion": "3.18",
		"creators": [
			"Organization: Anchore, Inc",
			"Tool: syft-v0.65.0",
			"Tool: buildkit-v0.11.0"
		],
		"created": "2023-01-05T16:13:17.47415867Z"
	},
	...

SBOMs also work with the local and tar exporters. When you export with these exporters, instead of attaching the attestations directly to the output image, the attestations are exported as separate files into the output filesystem:

$ docker buildx build --sbom=true -o ./image .
$ ls -lh ./image
-rw-------  1 user user 6.5M Jan 17 14:36 sbom.spdx.json
...

Viewing the SBOM in this case is as simple as cat-ing the result:

$ cat ./image/sbom.spdx.json | jq .predicate
{
	"spdxVersion": "SPDX-2.3",
	"dataLicense": "CC0-1.0",
	"SPDXID": "SPDXRef-DOCUMENT",
	…
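
The exported file can be queried with jq as well. The sketch below assumes jq is installed, that the SPDX document sits under .predicate as in the output above, and that it carries the standard SPDX packages array with name and versionInfo fields. The helper name list_sbom_packages is made up for illustration.

```shell
# Hypothetical helper: print "name version" for every package recorded in
# an SBOM attestation file written by the local exporter, sorted by name.
list_sbom_packages() {
  # `[]?` yields nothing (rather than failing) if there is no packages array.
  jq -r '.predicate.packages[]? | "\(.name) \(.versionInfo)"' "$1" | sort
}

# Usage:
#   list_sbom_packages ./image/sbom.spdx.json
```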

Supplementing SBOMs

Generating SBOMs using a scanner is a good start! But some packages won’t be correctly detected because they’ve been installed in a slightly unconventional way.

If that’s the case, you can still get this information into your SBOMs with a bit of manual interaction.

Let’s suppose you’ve installed foo v1.2.3 into your image by downloading it using curl:

RUN curl https://example.com/releases/foo-v1.2.3-amd64.tar.gz | tar xzf - && \
    mv foo /usr/local/bin/

Software installed this way likely won’t appear in your SBOM unless the SBOM generator you’re using has special support for this binary (for example, Syft has support for detecting certain known binaries).

You can manually generate an SBOM for this piece of software by writing an SPDX snippet to a location of your choice on the image filesystem using a Dockerfile heredoc:

COPY <<"EOT" /usr/local/share/sbom/foo.spdx.json
{
	"spdxVersion": "SPDX-2.3",
	"SPDXID": "SPDXRef-DOCUMENT",
	"name": "foo-v1.2.3",
	...
}
EOT

This SBOM should then be picked up by your SBOM generator and included in the final SBOM for the whole image. This behavior is included out-of-the-box in buildkit-syft-scanner, but may not be part of every generator’s toolkit.

Even more SBOMs!

While the approach above is good for scanning a basic image, it might struggle to provide more detailed package and file information. BuildKit can help you scan additional components of your build, including intermediate stages and your build context, using the BUILDKIT_SBOM_SCAN_STAGE and BUILDKIT_SBOM_SCAN_CONTEXT arguments respectively.

In the case of multi-stage builds, this allows you to track dependencies from previous stages, even though that software might not appear in your final image.

For example, for a demo C/C++ program, you might have the following Dockerfile:

# syntax=docker/dockerfile:1.5

FROM ubuntu:22.04 AS build
ARG BUILDKIT_SBOM_SCAN_STAGE=true
RUN apt-get update && apt-get install -y git build-essential
WORKDIR /src
RUN git clone https://example.com/myorg/myrepo.git .
RUN make build

FROM scratch
COPY --from=build /src/build/ /

If you just scanned the resulting image, it wouldn’t reveal that the build tools, like Git or GCC (included in the build-essential package), were ever used in the build process! By integrating SBOM scanning into your build using the BUILDKIT_SBOM_SCAN_STAGE build argument, you can get much richer information that would otherwise have been completely lost.

You can access these additionally generated SBOM documents in imagetools as well:

$ docker buildx imagetools inspect <myorg>/<myimage> --format "{{ range .SBOM.AdditionalSPDXs }}{{ json . }}{{ end }}"
{
	"spdxVersion": "SPDX-2.3",
	"SPDXID": "SPDXRef-DOCUMENT",
	...
}
{
	"spdxVersion": "SPDX-2.3",
	"SPDXID": "SPDXRef-DOCUMENT",
	...
}
...

For the local and tar exporters, these will appear as separate files in your output directory:

$ docker buildx build --sbom=true -o ./image .
$ ls -lh ./image
-rw------- 1 user user 4.3M Jan 17 14:40 sbom-build.spdx.json
-rw------- 1 user user  877 Jan 17 14:40 sbom.spdx.json
...
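
To get a quick feel for how much each of those files adds, you could count the packages recorded in each one. This is a rough sketch assuming jq is installed and that each file keeps its SPDX document under .predicate, as in the earlier local-exporter output. The name summarize_sboms is hypothetical.

```shell
# Hypothetical helper: for every sbom*.spdx.json file in a local-exporter
# output directory, print the file name and the number of packages it lists.
summarize_sboms() {
  for f in "$1"/sbom*.spdx.json; do
    [ -e "$f" ] || continue     # skip the unexpanded pattern if nothing matches
    printf '%s %s\n' "$(basename "$f")" "$(jq '.predicate.packages | length' "$f")"
  done
}

# Usage:
#   summarize_sboms ./image
```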

Analyzing images

Now that you’re publishing images containing SBOMs, it’s important to find a way to analyze them to take advantage of this additional data.

As mentioned above, you can extract the attached SBOM attestation using the imagetools subcommand:

$ docker buildx imagetools inspect <myorg>/<myimage> --format "{{json .SBOM.SPDX}}"
{
	"spdxVersion": "SPDX-2.3",
	"dataLicense": "CC0-1.0",
	"SPDXID": "SPDXRef-DOCUMENT",
	...

If your target image is built for multiple architectures using the --platform flag, then you’ll need a slightly different syntax to extract the SBOM attestation:

$ docker buildx imagetools inspect <myorg>/<myimage> --format '{{ json (index .SBOM "linux/amd64").SPDX }}'
{
	"spdxVersion": "SPDX-2.3",
	"dataLicense": "CC0-1.0",
	"SPDXID": "SPDXRef-DOCUMENT",
	...
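
Because the platform name has to appear as a double-quoted string inside the Go template, the shell quoting is easy to get wrong, especially once the platform comes from a variable. One way to sidestep that is to generate the template with printf. The helper name sbom_template below is hypothetical. It only builds the string and does not call Docker itself.

```shell
# Hypothetical helper: build the --format template that selects the SPDX
# document for one platform of a multi-platform image.
sbom_template() {
  # $1: platform string, for example linux/amd64
  printf '{{ json (index .SBOM "%s").SPDX }}' "$1"
}

# Usage (requires Docker with Buildx and a pushed multi-platform image):
#   docker buildx imagetools inspect <myorg>/<myimage> --format "$(sbom_template linux/arm64)"
```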

Now suppose you want to list all of the packages, and their versions, inside an image. You can modify the value passed to the --format flag to be a Go template that lists the packages:

$ docker buildx imagetools inspect <myorg>/<myimage> --format '{{ range .SBOM.SPDX.packages }}{{ println .name .versionInfo }}{{ end }}' | sort
adduser 3.118
apt 2.2.4
base-files 11.1+deb11u6
base-passwd 3.5.51
bash 5.1-2+deb11u1
bsdutils 1:2.36.1-8+deb11u1
ca-certificates 20210119
coreutils 8.32-4+b1
curl 7.74.0-1.3+deb11u3
...

Alternatively, you might want to get the version information for a piece of software that you know is installed:

$ docker buildx imagetools inspect <myorg>/<myimage> --format '{{ range .SBOM.SPDX.packages }}{{ if eq .name "nginx" }}{{ println .versionInfo }}{{ end }}{{ end }}'
1.23.3-1~bullseye

You can even take the whole SBOM and use it to scan for CVEs using a tool that can use SBOMs to search for CVEs (like Anchore’s Grype):

$ docker buildx imagetools inspect <myorg>/<myimage> --format '{{ json .SBOM.SPDX }}' | grype
NAME  INSTALLED      FIXED-IN     TYPE  VULNERABILITY  SEVERITY
apt   2.2.4                       deb   CVE-2011-3374  Negligible
bash  5.1-2+deb11u1  (won't fix)  deb   CVE-2022-3715
...

These operations should complete super quickly! Because the SBOM was already generated at build, you’re just querying already-existing data from Docker Hub instead of needing to generate it from scratch every time.

Going further

In this post, we’ve only covered the absolute basics of getting started with BuildKit and SBOMs. You can find out more about the things we’ve talked about on docs.docker.com.

And you can find out more about the other features in BuildKit v0.11 in the release highlights post.

Highlights from the BuildKit v0.11 Release
https://www.docker.com/blog/highlights-buildkit-v0-11-release/
Thu, 19 Jan 2023 15:00:00 +0000

BuildKit v0.11 now available.

BuildKit v0.11 is now available, along with Buildx v0.10 and v1.5 of the Dockerfile syntax. We’ve released new features, bug fixes, performance improvements, and improved documentation for all of the Docker Build tools.
Let’s dive into what’s new! We’ll cover the highlights, but you can get the whole story in the full changelogs.

1. SLSA Provenance

BuildKit can now create SLSA Provenance attestations to trace the build back to source and make it easier to understand how a build was created. Images built with new versions of Buildx and BuildKit include metadata like links to source code, build timestamps, and the materials used during the build. To attach the new provenance, BuildKit now defaults to creating OCI-compliant images.

Although docker buildx will add a provenance attestation to all new images by default, you can also opt into more detail. These additional details include your Dockerfile source, source maps, and the intermediate representations used by BuildKit. You can enable all of these new provenance records using the new --provenance flag in Buildx:

$ docker buildx build --provenance=true -t <myorg>/<myimage> --push .

Or manually set the provenance generation mode to either min or max (read more about the different modes):

$ docker buildx build --provenance=mode=max -t <myorg>/<myimage> --push .

You can inspect the provenance of an image using the imagetools subcommand. For example, here’s what it looks like on the moby/buildkit image itself:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ json .Provenance }}'
{
  "linux/amd64": {
    "SLSA": {
      "buildConfig": {

You can use this provenance to find key information about the build environment, such as the git repository it was built from:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ json (index .Provenance "linux/amd64").SLSA.invocation.configSource }}'
{
  "digest": {
	"sha1": "830288a71f447b46ad44ad5f7bd45148ec450d44"
  },
  "entryPoint": "Dockerfile",
  "uri": "https://github.com/moby/buildkit.git#refs/tags/v0.11.0"
}

Or even the CI job that built it in GitHub Actions:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ (index .Provenance "linux/amd64").SLSA.builder.id }}'
https://github.com/moby/buildkit/actions/runs/3878249653

Read the documentation to learn more about SLSA Provenance attestations or to explore BuildKit’s SLSA fields.

2. Software Bill of Materials

While provenance attestations help to record how a build was completed, Software Bills of Materials (SBOMs) record what components are used. This is similar to tools like docker sbom, but, instead of requiring you to perform your own scans, the author of the image can build the results into the image.

You can enable built-in SBOMs with the new --sbom flag in Buildx:

$ docker buildx build --sbom=true -t <myorg>/<myimage> --push .

By default, BuildKit uses docker/buildkit-syft-scanner (powered by Anchore’s Syft project) to build an SBOM from the resulting image. But any scanner that follows the BuildKit SBOM scanning protocol can be used here:

$ docker buildx build --sbom=generator=<custom-scanner> -t <myorg>/<myimage> --push .

Similar to SLSA provenance, you can use imagetools to query SBOMs attached to images. For example, if you list all of the discovered dependencies used in moby/buildkit, you get this:

$ docker buildx imagetools inspect moby/buildkit:latest --format '{{ range (index .SBOM "linux/amd64").SPDX.packages }}{{ println .name }}{{ end }}'
github.com/Azure/azure-sdk-for-go/sdk/azcore
github.com/Azure/azure-sdk-for-go/sdk/azidentity
github.com/Azure/azure-sdk-for-go/sdk/internal
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob
...

Read the SBOM attestations documentation to learn more.

3. SOURCE_DATE_EPOCH

Getting reproducible builds out of Dockerfiles has historically been quite tricky: a fully reproducible build requires bit-for-bit accuracy that produces the exact same result each time. Even builds that were otherwise fully deterministic would get different timestamps between runs.

The new SOURCE_DATE_EPOCH build argument helps resolve this, following the standardized environment variable from the Reproducible Builds project. If the build argument is set or detected in the environment by Buildx, then BuildKit will set timestamps in the image config and layers to be the specified Unix timestamp. This helps you get perfect bit-for-bit reproducibility in your builds.

SOURCE_DATE_EPOCH is automatically detected by Buildx from the environment. To force all the timestamps in the image to the Unix epoch:

$ SOURCE_DATE_EPOCH=0 docker buildx build -t <myorg>/<myimage> .

Alternatively, to set it to the timestamp of the most recent commit:

$ SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct) docker buildx build -t <myorg>/<myimage> .
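
If the same build script has to run both inside and outside a Git checkout, you may want a fallback. The sketch below is one possible policy, not something Buildx does for you: it keeps an already-set SOURCE_DATE_EPOCH, otherwise tries the most recent commit, and otherwise falls back to 0. The function name pick_source_date_epoch is hypothetical.

```shell
# Hypothetical helper: choose a value for SOURCE_DATE_EPOCH.
pick_source_date_epoch() {
  if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
    # Respect a value that the caller or CI system already provided.
    printf '%s\n' "$SOURCE_DATE_EPOCH"
  elif ts=$(git log -1 --pretty=%ct 2>/dev/null) && [ -n "$ts" ]; then
    # Timestamp of the most recent commit, as in the command above.
    printf '%s\n' "$ts"
  else
    # Not a Git checkout (or git is unavailable): use the Unix epoch.
    printf '0\n'
  fi
}

# Usage (requires Docker with Buildx installed):
#   SOURCE_DATE_EPOCH=$(pick_source_date_epoch) docker buildx build -t <myorg>/<myimage> .
```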

Read the documentation to find out more about how BuildKit handles SOURCE_DATE_EPOCH.

4. OCI image layouts as named contexts

BuildKit has been able to export OCI image layouts for a while now. As of v0.11, BuildKit can import those results again using named contexts. This makes it easier to build contexts entirely locally — without needing to push intermediate results to a registry.

For example, suppose you want to build your own custom intermediate image based on Alpine that contains some development tools:

$ docker buildx build . -f intermediate.Dockerfile --output type=oci,dest=./intermediate,tar=false

This builds the contents of intermediate.Dockerfile and exports the result as an OCI image layout into the intermediate/ directory (using the new tar=false option for OCI exports). To use this intermediate result in a Dockerfile, refer to it using any name you like in the FROM statement in your main Dockerfile:

FROM base
RUN ... # use the development tools in the intermediate image

You can then connect this Dockerfile to your OCI layout using the new oci-layout:// URI scheme for the --build-context flag:

$ docker buildx build . -t <myorg>/<myimage> --build-context base=oci-layout://intermediate

Instead of resolving the image base to Docker Hub, BuildKit will instead read it from oci-layout://intermediate in the current directory, so you don’t need to push the intermediate image to a remote registry to be able to use it.

Refer to the documentation to find out more about using oci-layout:// with the --build-context flag.

5. Cloud cache backends

To get good build performance when building in ephemeral environments, such as CI pipelines, you need to store the cache in a remote backend. The newest release of BuildKit supports two new storage backends: Amazon S3 and Azure Blob Storage.

When you build images, you can provide the details of your S3 bucket or Azure Blob store to automatically store your build cache to pull into future builds. This build cache means that even though your CI or local runners might be destroyed and recreated, you can still access your remote cache to get quick builds when nothing has changed.

To use the new backends, you can specify them using the --cache-to and --cache-from flags:

$ docker buildx build --push -t <user>/<image> \
  --cache-to type=s3,region=<region>,bucket=<bucket>,name=<cache-image>[,parameters...] \
  --cache-from type=s3,region=<region>,bucket=<bucket>,name=<cache-image> .

$ docker buildx build --push -t <registry>/<image> \
  --cache-to type=azblob,name=<cache-image>[,parameters...] \
  --cache-from type=azblob,name=<cache-image>[,parameters...] .

You also don’t have to choose between one cache backend or the other. BuildKit v0.11 supports multiple cache exports at a time so you can use as many as you’d like.

Find more information about the new backends in the Amazon S3 cache and Azure Blob Storage cache backend documentation.

6. OCI Image annotations

OCI image annotations allow attaching metadata to container images at the manifest level. They’re an alternative to labels that are more generic, and they can be more easily attached to multi-platform images.

All BuildKit image exporters now allow setting annotations. To set the annotations of your choice, use the --output flag:

$ docker buildx build ... \
    --output "type=image,name=foo,annotation.org.opencontainers.image.title=Foo"

You can set annotations at any level of the output, for example, on the image index:

$ docker buildx build ... \
    --output "type=image,name=foo,annotation-index.org.opencontainers.image.title=Foo"

Or even set different annotations for each platform:

$ docker buildx build ... \
    --output "type=image,name=foo,annotation[linux/amd64].org.opencontainers.image.title=Foo,annotation[linux/arm64].org.opencontainers.image.title=Bar"
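
With more than a couple of platforms, that --output value gets long. If you'd rather generate it, something like the following could work. It's a sketch only: annotation_output is a hypothetical helper, it handles a single annotation key, and it assumes none of the values contain commas or equals signs.

```shell
# Hypothetical helper: build an --output value that sets one annotation key
# to a different value for each platform.
annotation_output() {
  # $1: image name, $2: annotation key, remaining args: platform=value pairs
  name=$1
  key=$2
  shift 2
  out="type=image,name=$name"
  for pair in "$@"; do
    platform=${pair%%=*}        # text before the first "="
    value=${pair#*=}            # text after the first "="
    out="$out,annotation[$platform].$key=$value"
  done
  printf '%s\n' "$out"
}

# Usage (requires Docker with Buildx installed):
#   docker buildx build ... \
#       --output "$(annotation_output foo org.opencontainers.image.title linux/amd64=Foo linux/arm64=Bar)"
```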

You can find out more about creating OCI annotations on BuildKit images in the documentation.

7. Build inspection with --print

If you are starting in a codebase with Dockerfiles, understanding how to use them can be tricky. Buildx supports the new --print flag to print details about a build. This flag can be used to get quick and easy information about required build arguments and secrets, and targets that you can build. 

For example, here’s how you get an outline of BuildKit’s Dockerfile:

$ BUILDX_EXPERIMENTAL=1 docker buildx build --print=outline https://github.com/moby/buildkit.git
TARGET:      buildkit
DESCRIPTION: builds the buildkit container image

BUILD ARG                  VALUE    DESCRIPTION
RUNC_VERSION               v1.1.4
ALPINE_VERSION             3.17
BUILDKITD_TAGS                      defines additional Go build tags for compiling buildkitd
BUILDKIT_SBOM_SCAN_STAGE   true

We can also list all the different targets to build:

$ BUILDX_EXPERIMENTAL=1 docker buildx build --print=targets https://github.com/moby/buildkit.git
TARGET             	DESCRIPTION
alpine-amd64      	 
alpine-arm        	 
alpine-arm64      	 
alpine-s390x      	 

Any frontend that implements the BuildKit subrequests interface can be used with the buildx --print flag. They can even define their own print functions, and aren’t just limited to outline or targets.

The --print feature is still experimental, so the interface may change, and we may add new functionality over time. If you have feedback, please open an issue or discussion on the docker/buildx GitHub repository. We’d love to hear your thoughts!

8. Bake features

The Bake file format for orchestrating builds has also been improved.

Bake now supports more powerful variable interpolation, allowing you to use fields from the same or other blocks. This can reduce duplication and make your bake files easier to read:

target "foo" {
  dockerfile = "${target.foo.name}.Dockerfile"
  tags       = [target.foo.name]
}

Bake also supports null values for build arguments and allows labels to use the defaults set in your Dockerfile so your bake definition doesn’t override those:

variable "GO_VERSION" {
  default = null
}
target "default" {
  args = {
    GO_VERSION = GO_VERSION
  }
}

Read the Bake documentation to learn more. 

More improvements and bug fixes 

In this post, we’ve only scratched the surface of the new features in the latest release. Along with all the above features, the latest releases include quality-of-life improvements and bug fixes. Read the full changelogs to learn more.

We welcome bug reports and contributions, so if you find an issue in the releases, let us know by opening a GitHub issue or pull request, or get in contact in the #buildkit channel in the Docker Community Slack.

Introduction to heredocs in Dockerfiles
https://www.docker.com/blog/introduction-to-heredocs-in-dockerfiles/
Fri, 30 Jul 2021 15:00:00 +0000

Guest post by Docker Community Member Justin Chadwell. This post originally appeared here.

As of a couple weeks ago, Docker’s BuildKit tool for building Dockerfiles now supports heredoc syntax! With these new improvements, we can do all sorts of things that were difficult before, like multiline RUNs without needing all those pesky backslashes at the end of each line, or the creation of small inline configuration files.

In this post, I’ll cover the basics of what these heredocs are, and more importantly what you can use them for, and how to get started with them! 🎉


BuildKit (a quick refresher)

From BuildKit’s own GitHub:

BuildKit is a toolkit for converting source code to build artifacts in an efficient, expressive and repeatable manner.

Essentially, it’s the next generation builder for docker images, neatly separate from the rest of the main docker runtime; you can use it for building docker images, or images for other OCI runtimes.

It comes with a lot of useful (and pretty) features beyond what the basic builder supports, including neater build log output, faster and more cache-efficient builds, concurrent builds, as well as a very flexible architecture to allow easy extensibility (I’m definitely not doing it justice).

You’re most likely either using it already or will want to be! You can enable it locally by setting the environment variable DOCKER_BUILDKIT=1 when performing your docker build, or switch to using the new(ish) docker buildx command.

At a slightly more technical level, buildkit allows easy switching between multiple different “builders”, which can be local or remote, in the docker daemon itself, in docker containers or even in a Kubernetes pod. The builder itself is split up into two main pieces, a frontend and a backend: the frontend produces intermediate Low Level Builder (LLB) code, which is then constructed into an image by the backend.

You can think of LLB as being to BuildKit what LLVM IR is to Clang.

Part of what makes buildkit so fantastic is its flexibility – these components are completely detached from each other, so you can use any frontend in any image. For example, you could use the default Dockerfile frontend, or compile your own self-contained buildpacks, or even develop your own alternative file format like Mockerfile.

Getting set up

To get started with using heredocs, first make sure you’re set up with buildkit. Switching to buildkit gives you a ton of out-of-the-box improvements to your build setup, and should have complete compatibility with the old builder (and you can always switch back if you don’t like it).

With buildkit properly set up, you can create a new Dockerfile: at the top of this file, we need to include a #syntax= directive. This directive informs the parser to use a specific frontend – in this case, the one located at docker/dockerfile:1.3-labs on Docker Hub.

# syntax=docker/dockerfile:1.3-labs

With this line (which has to be the very first line), buildkit will find and download the right image, and then use it to build the image.

We then specify the base image to build from (just like we normally would):

FROM ubuntu:20.04

With all that out of the way, we can use a heredoc, executing two commands in the same RUN!

RUN <<EOF
echo "Hello" >> /hello
echo "World!" >> /hello
EOF

Why?

Now that heredocs are working, you might be wondering – why all the fuss? Well, this feature has kind of, until now, been missing from Dockerfiles.

See moby/moby#34423 for the original issue that proposed heredocs in 2017.

Let’s suppose you want to build an image that requires a lot of commands to set up. For example, a fairly common pattern in Dockerfiles involves wanting to update the system, and then to install some additional dependencies, i.e. apt update, upgrade and install all at once.

Naively, we might put all of these as separate RUNs:

RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y ...

But sadly, like too many intuitive solutions, this doesn’t quite do what we want. It certainly works – but we create a new layer for each RUN, making our image much larger than it needs to be (and making builds take much longer).

So, we can squish this into a single RUN command:

RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y ...

And that’s what most Dockerfiles do today, from the official docker images down to the messy ones I’ve written for myself. It works fine, images are small and fast to build… but it does look a bit ugly. And if you accidentally forget the line continuation symbol \, well, you’ll get a syntax error!

Heredocs are the next step to improve this! Now, we can just write:

RUN <<EOF
apt-get update
apt-get upgrade -y
apt-get install -y ...
EOF

We use the <<EOF to introduce the heredoc (just like in sh/bash/zsh/your shell of choice), and EOF at the end to close it. In between those, we put all our commands as the content of our script to be run by the shell!
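
One difference from the && version is worth knowing about: && stops at the first failing command, whereas a plain shell script keeps going unless you tell it not to. Assuming the heredoc body is handed to the shell as an ordinary script (which is my understanding of how it works), you can see the same behaviour outside Docker with nothing but sh. The function names below are made up for the demo.

```shell
# Without `set -e`, the shell carries on after a failing command.
run_without_set_e() {
  sh <<'EOF'
false
echo "still running"
EOF
}

# With `set -e`, the script stops at the first failing command
# and exits with a non-zero status.
run_with_set_e() {
  sh <<'EOF'
set -e
false
echo "still running"
EOF
}

run_without_set_e                                  # prints "still running"
run_with_set_e || echo "stopped at the first failure"
```

So if you rely on the build stopping at the first error, adding set -e as the first line of the heredoc is a cheap safeguard.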

More ways to run…

So far, we’ve seen some basic syntax. However, the new heredoc support doesn’t just allow simple examples, there’s lots of other fun things you can do.

For completeness, here’s the hello world example using the same syntax we’ve already seen:

RUN <<EOF
echo "Hello" >> /hello
echo "World!" >> /hello
EOF

But let’s say your setup scripts are getting more complicated, and you want to use another language – say, like Python. Well, no problem, you can connect heredocs to other programs!

RUN python3 <<EOF
with open("/hello", "w") as f:
    print("Hello", file=f)
    print("World", file=f)
EOF

In fact, you can use commands as complex as you like with heredocs, simplifying the above to:

RUN python3 <<EOF > /hello
print("Hello")
print("World")
EOF

If that feels like it’s getting a bit fiddly or complicated, you can also always just use a shebang:

RUN <<EOF
#!/usr/bin/env python3
with open("/hello", "w") as f:
    print("Hello", file=f)
    print("World", file=f)
EOF

There’s lots of different ways to connect heredocs to RUN, and hopefully some more ways and improvements to come in the future!

…and some file fun!

Heredocs in Dockerfiles also let us mess around with inline files! Let’s suppose you’re building an nginx site, and want to create a custom index page:

FROM nginx

COPY index.html /usr/share/nginx/html

And then in a separate file index.html, you put your content. But if your index page is just really simple, it feels frustrating to have to separate everything out: heredocs let you keep everything in the same place if you want!

FROM nginx

COPY <<EOF /usr/share/nginx/html/index.html
(your index page goes here)
EOF

You can even copy multiple files at once, in a single layer:

COPY <<robots.txt <<humans.txt /usr/share/nginx/html/
(robots content)
robots.txt
(humans content)
humans.txt

Finishing up

Hopefully, I’ve managed to convince you to give heredocs a try when you can! For now, they’re still only available in the staging frontend, but they should be making their way into a release very soon – so make sure to take a look and give your feedback! If you’re interested, you can find out more from the official buildkit Dockerfile syntax guide.
