Cross-posted on Docker Engineering blog
Docker Engine provides a lot of great functionality that is tightly integrated with features of the Linux kernel. For example, part of container isolation is based on Linux namespaces, and creating namespaces in Linux requires privileged capabilities. The same is true for mounting filesystems, which is the basis of Docker’s storage model. For this reason, the Docker daemon has historically had to be started by the root user.
We can now start to relax this requirement under certain conditions, which helps reduce the security footprint of the daemon and brings Docker to systems where users cannot gain root privileges.
Thanks to the great work of Moby and BuildKit maintainer Akihiro Suda, we added rootless support to the BuildKit image builder in 2018. In February 2019, the same rootless support was merged into Moby upstream, and it is now available for all Docker users to try in experimental mode.
To make it very easy to get started with rootless Docker, we have prepared an install script.
curl -sSL https://get.docker.com/rootless | sh
This script is meant to be run as an unprivileged user. It will download the latest nightly build of Docker CE, extract it under your home directory and start up the daemon for you. It will determine if your system is ready to run rootless containers and possibly print out some setup commands if some dependencies are required before the installer can complete.
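To give a feel for what such a readiness check might involve, here is a minimal sketch. It is not the logic of the real installer: the function names, messages, and the choice of checks (the newuidmap and newgidmap binaries plus subordinate ID ranges in /etc/subuid-style files) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical pre-flight check. This is NOT the real installer's logic;
# the function names and messages are illustrative assumptions.

# has_subid_range USER FILE
# Succeeds if FILE (in /etc/subuid format, "name:start:count") contains an
# entry for USER with a non-zero count.
has_subid_range() {
    awk -F: -v user="$1" '
        $1 == user && $3 > 0 { found = 1 }
        END { exit !found }
    ' "$2" 2>/dev/null
}

# check_rootless_prereqs USER SUBUID_FILE SUBGID_FILE
# Prints one line per missing prerequisite; returns non-zero if any is missing.
check_rootless_prereqs() {
    missing=0
    for tool in newuidmap newgidmap; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool (from the uidmap package)"
            missing=1
        fi
    done
    if ! has_subid_range "$1" "$2"; then
        echo "missing: subordinate UID range for $1 in $2"
        missing=1
    fi
    if ! has_subid_range "$1" "$3"; then
        echo "missing: subordinate GID range for $1 in $3"
        missing=1
    fi
    return "$missing"
}
```

A typical call would be check_rootless_prereqs "$(id -un)" /etc/subuid /etc/subgid. Taking the file paths as arguments keeps the functions easy to try against a scratch file instead of the real system files.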
We have tested the installer on Ubuntu 18.04, Ubuntu 16.04, Debian 9, Fedora 28 and CentOS 7.5. Manual installation steps are available at https://github.com/moby/moby/blob/master/docs/rootless.md.
Note that rootless mode cannot currently be considered a replacement for the full suite of regular Docker Engine features. Some examples of things that do not work in rootless mode are cgroups resource controls, AppArmor security profiles, checkpoint/restore, and overlay networks. Exposing ports from containers currently requires a manual socat helper process.
Only Ubuntu-based distros support overlay filesystems in rootless mode. On other systems, rootless mode uses the vfs storage driver, which is suboptimal on many filesystems and not recommended for production workloads.
Also note that rootless mode is currently provided only in nightly builds, which may not be as stable as you are used to.
As mentioned before, a lot of Linux features that Docker needs require privileged capabilities. So how does rootless mode work around that?
The key to the solution is to take advantage of user namespaces. A user namespace maps a range of user IDs so that the root user in the inner namespace maps to an unprivileged range in the parent namespace. A fresh process in a user namespace also picks up a full set of process capabilities within that namespace.
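The mapping itself is simple arithmetic, and a small sketch may make it concrete. The helper below is hypothetical (the name to_host_uid is not part of Docker or any tool). It reads a map in the three-column layout the kernel exposes in /proc/<pid>/uid_map (first ID inside, first ID outside, length) and translates an inner UID to the parent namespace. The example values used with it, such as an outer UID of 1000 and a range starting at 100000, are assumptions chosen for illustration.

```shell
#!/bin/sh
# to_host_uid UID MAP_FILE
# Translate a UID seen inside a user namespace to the UID it corresponds to
# in the parent namespace. MAP_FILE uses the /proc/<pid>/uid_map layout:
#   <first-uid-inside> <first-uid-outside> <length>
# Prints the outer UID, or fails if the inner UID is not mapped.
to_host_uid() {
    awk -v uid="$1" '
        uid + 0 >= $1 + 0 && uid + 0 < $1 + $3 {
            print $2 + (uid - $1)
            found = 1
            exit
        }
        END { exit !found }
    ' "$2"
}
```

With a map file containing the two lines "0 1000 1" and "1 100000 65536", to_host_uid 0 prints 1000 (root inside is an ordinary user outside) and to_host_uid 1000 prints 100999. On systems where unprivileged user namespaces are enabled, running unshare --user --map-root-user cat /proc/self/uid_map should show a real single-line map of this kind.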
The user namespaces feature has been present in Docker for a long time through the --userns-remap flag, which maps the users inside containers to a different range on the host. This provides better security in cases where the container has access to the same external resources. Rootless mode works in a similar way, except that we create a user namespace first and start the daemon already inside the remapped namespace. The daemon and the containers both use the same user namespace, which is different from the host one.
Although Linux allows creating user namespaces without extended privileges, these namespaces only map a single user and therefore do not work with many existing containers. To overcome that, rootless mode depends on the uidmap package, which can do the remapping of users for us. The binaries in the uidmap package use the setuid bit (or file capabilities) and therefore always run as root internally.
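As a rough illustration of what a multi-user mapping looks like, the sketch below builds the argument triples that newuidmap accepts after the target PID (ID inside, ID outside, count). The helper name subid_map_args is hypothetical, and this is not rootlesskit's code. It assumes the common layout where ID 0 inside maps to the caller's own ID and IDs from 1 upward map to the caller's subordinate range from an /etc/subuid-style file.

```shell
#!/bin/sh
# subid_map_args USER OWN_ID SUBID_FILE
# Print the mapping triples (inside-id outside-id count) that a helper could
# pass to newuidmap or newgidmap after the target PID:
#   - ID 0 inside maps to the caller's own ID (one entry)
#   - IDs from 1 upward map to the subordinate range listed in SUBID_FILE
# Fails if USER has no entry in SUBID_FILE ("name:start:count" format).
subid_map_args() {
    awk -F: -v user="$1" -v id="$2" '
        $1 == user { print 0, id, 1, 1, $2, $3; found = 1; exit }
        END { exit !found }
    ' "$3"
}
```

For a user alice with UID 1000 and the entry alice:100000:65536, this prints "0 1000 1 1 100000 65536". A launcher could then run something like newuidmap "$pid" $(subid_map_args "$(id -un)" "$(id -u)" /etc/subuid) against a process it has placed in a new user namespace; the same helper works for newgidmap with /etc/subgid and the caller's GID.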
To make launching the different namespaces and integrating with uidmap simpler, Akihiro created a project called rootlesskit. Rootlesskit also takes care of setting up networking for rootless containers. By default, rootless Docker uses networking based on the moby/vpnkit project, which is also used for networking in the Docker Desktop products. Alternatively, users can install slirp4netns and use that instead.
We want to thank Akihiro Suda again for all the outstanding work he has done to make this feature happen. Make sure to also check out his Usernetes project that aims to provide Kubernetes support without requiring root.