Creating minimal Docker images from dynamically linked ELF binaries
Thu 05 February 2015 by Lars Kellogg-Stedman
Tags: docker
In this post, we'll look at a method for building minimal Docker images for dynamically linked ELF binaries, and then at a tool for automating this process.
It is tempting, when creating a simple Docker image, to start with one
of the images provided by the major distributions. For example, if
you need an image that provides
tcpdump for use on your Atomic
host, you might do something like:
FROM fedora
RUN yum -y install tcpdump
And while this will work, you end up consuming 250MB for tcpdump.
In theory, the layering mechanism that Docker uses to build images
will reduce the practical impact of this (because other images based on
the fedora image will share the common layers), but in practice the
size is noticeable, especially if you often find yourself pulling this
image into a fresh environment with no established cache.
You can substantially reduce the space requirements for a Docker image
by including only those things that are absolutely necessary. For
statically linked files, that may only be the binary itself, but the
situation is a little more complex for dynamically linked executables.
You might naively start with this (assuming that you had the tcpdump
binary in your local directory):
FROM scratch
COPY tcpdump /usr/sbin/tcpdump
If you were to build an image with this and tag it tcpdump:
docker build -t tcpdump .
...and then try running it:
docker run tcpdump
You would immediately see:
no such file or directory
FATA Error response from daemon: Cannot start container ...: no such file or directory
And this is because the image is missing two things:
- The Linux dynamic runtime loader, and
- The shared libraries required by the tcpdump binary.
The path to the appropriate loader is stored in the ELF binary in the
.interp section, which we can inspect using the objdump tool:
$ objdump -s -j .interp tcpdump

tcpdump:     file format elf64-x86-64

Contents of section .interp:
 400238 2f6c6962 36342f6c 642d6c69 6e75782d  /lib64/ld-linux-
 400248 7838362d 36342e73 6f2e3200           x86-64.so.2.
Which tells us we need /lib64/ld-linux-x86-64.so.2.
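If decoding a hex dump by eye is inconvenient, the same loader path also appears in the program headers printed by readelf -l. The following is a minimal sketch, not part of the original workflow: it swaps objdump for readelf, and interp_from_readelf is a hypothetical helper name.

```shell
# Hypothetical helper: read `readelf -l` output on stdin and print the
# loader path from the line that looks like
#   [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
interp_from_readelf() {
    sed -n 's/.*\[Requesting program interpreter: \([^]]*\)].*/\1/p'
}

# Usage (assumes binutils is installed and ./tcpdump exists):
#   readelf -l tcpdump | interp_from_readelf
```

Statically linked binaries have no such line, so the helper prints nothing for them.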
We can use the
ldd tool to get a list of shared libraries required
by the binary:
$ ldd tcpdump
    linux-vdso.so.1 => (0x00007fffed1fe000)
    libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007fb2c05a3000)
    libpcap.so.1 => /lib64/libpcap.so.1 (0x00007fb2c0361000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fb2bffa3000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007fb2bfd9f000)
    libz.so.1 => /lib64/libz.so.1 (0x00007fb2bfb89000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fb2c09b7000)
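Only the absolute paths in that output are files we can copy; linux-vdso.so.1 is provided by the kernel and has no file on disk. As a rough sketch (ldd_paths is a hypothetical helper name, and the parsing assumes the glibc ldd output format shown above), the paths can be pulled out with awk:

```shell
# Hypothetical helper: read `ldd` output on stdin and print one absolute
# library path per line, skipping entries with no file (such as the vdso).
ldd_paths() {
    awk '
        # "libfoo.so.1 => /lib64/libfoo.so.1 (0x...)": the path is field 3
        $2 == "=>" && $3 ~ /^\// { print $3; next }
        # "/lib64/ld-linux-x86-64.so.2 (0x...)": the loader, path is field 1
        $1 ~ /^\//               { print $1 }
    '
}

# Usage (assumes ./tcpdump exists):
#   ldd tcpdump | ldd_paths
```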
If we copy all of the dependencies into a local directory, along with
the tcpdump binary itself, and use the following layout:
Dockerfile
usr/sbin/tcpdump
lib64/libcrypto.so.10
lib64/libpcap.so.1
lib64/libc.so.6
lib64/libdl.so.2
lib64/libz.so.1
lib64/ld-linux-x86-64.so.2
And the following Dockerfile content:
FROM scratch
COPY . /
ENTRYPOINT ["/usr/sbin/tcpdump"]
And then we turn this into a Docker image and run it, we get:
$ docker build -t tcpdump .
[...]
$ docker run tcpdump -i eth0 -n
tcpdump: Couldn't find user 'tcpdump'
Well, so let's create an /etc/passwd file with the tcpdump user,
and add that to our collection:
$ mkdir etc
$ grep tcpdump /etc/passwd > etc/passwd
$ grep tcpdump /etc/group > etc/group
$ docker build -t tcpdump .
$ docker run tcpdump -i eth0 -n
tcpdump: Couldn't find user 'tcpdump'
And this is because most programs don't reference files like
/etc/passwd directly, but instead delegate this task to the C
library, which relies on the name service switch (nss)
mechanism to support multiple sources of information. Let's add the
nss libraries necessary for supporting legacy files (/etc/passwd,
etc.) and DNS for hostname lookups:
- lib64/libnss_files.so.2: this includes support for the traditional files in
  /etc, such as /etc/passwd and /etc/group.
- lib64/libnss_dns.so.2: this supports hostname resolution via DNS.
And we'll also need
/etc/nsswitch.conf to go along with that:
passwd: files
shadow: files
group: files
hosts: files dns
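The configuration files can also be generated by a script rather than by hand. This is a minimal sketch under a few assumptions: write_nsswitch and entry_for are hypothetical helper names, and entry_for matches the user or group name exactly on the first field, which is slightly stricter than the plain grep used earlier (grep also matches any line that merely contains the string).

```shell
# Hypothetical helper: write the minimal nsswitch.conf shown above
# into <root>/etc, creating the directory if needed.
write_nsswitch() {
    mkdir -p "$1/etc"
    cat > "$1/etc/nsswitch.conf" <<'EOF'
passwd: files
shadow: files
group: files
hosts: files dns
EOF
}

# Hypothetical helper: print only the entry whose first colon-separated
# field equals <name>, from a passwd- or group-style <file>.
entry_for() {
    awk -F: -v name="$1" '$1 == name' "$2"
}

# Usage, run from the image build directory:
#   write_nsswitch .
#   entry_for tcpdump /etc/passwd > etc/passwd
#   entry_for tcpdump /etc/group  > etc/group
```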
After all this, we have:
Dockerfile
usr/sbin/tcpdump
lib64/libcrypto.so.10
lib64/libpcap.so.1
lib64/libc.so.6
lib64/libdl.so.2
lib64/libz.so.1
lib64/ld-linux-x86-64.so.2
lib64/libnss_files.so.2
lib64/libnss_dns.so.2
etc/passwd
etc/group
etc/nsswitch.conf
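One way to populate a tree like this is to copy each file to the same absolute path underneath the build directory. The sketch below is only an illustration of that idea; copy_into_tree is a hypothetical helper name, and it assumes the list of absolute paths arrives on stdin, one per line.

```shell
# Hypothetical helper: copy each absolute path named on stdin into <root>,
# keeping its directory, so /lib64/libc.so.6 lands at <root>/lib64/libc.so.6.
copy_into_tree() {
    while IFS= read -r path; do
        mkdir -p "$1$(dirname "$path")"
        # -L follows symlinks, so the library contents are copied under the
        # name the loader looks for rather than as a dangling link.
        cp -L "$path" "$1$path"
    done
}

# Usage, run from the image build directory:
#   printf '%s\n' /usr/sbin/tcpdump /lib64/libpcap.so.1 | copy_into_tree .
```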
Let's rebuild the image and run it one more time:
$ docker build -t tcpdump .
$ docker run tcpdump -i eth0 -n
And now, finally, it runs. Wouldn't it be nice if that process were easier?
Dockerize is a tool that largely automates the above process. To
build a minimal
tcpdump image, for example, you would run:
$ dockerize -u tcpdump -t tcpdump /usr/sbin/tcpdump
This would include the tcpdump user (-u tcpdump) from your local
system, as well as /usr/sbin/tcpdump, all its dependencies, and
file-based nss support, and build an image tagged tcpdump
(-t tcpdump). When you build an image from a single command like this,
Dockerize will set up that command as the Docker ENTRYPOINT, so you can
run it like this:
$ docker run tcpdump -i eth0 -n
You can also build images containing multiple binaries. For example:
$ dockerize -t dockerizeme/xmltools \
    /usr/bin/xmllint \
    /usr/bin/xml2 \
    /usr/bin/2xml \
    /usr/bin/tidyp
In this case, you need to provide the command name when running the image:
$ docker run dockerizeme/xmltools tidyp -h
You can find some scripts that generate marginally useful example images in the examples folder of the repository.
These are pushed into the dockerizeme namespace on the Docker Hub, so you can, for example, get yourself a minimal webserver by running:
$ docker run -v $PWD:/content -p 8888:80 dockerizeme/thttpd -d /content
And then browse to http://localhost:8888 and see your current directory.