Debian Containers with systemd-nspawn

Note: This tutorial gives an introduction to containerization with systemd-nspawn on Debian Jessie with systemd 215. Systemd’s container management features have improved considerably in more recent versions, but systemd-nspawn in Jessie is stable and usable.

While Docker is great for implementing a container-based deployment workflow, sometimes all I need is a plain old virtual server container.
This has been possible for years with tools like Linux-VServer and OpenVZ, and more recently LXC. While Linux-VServer and OpenVZ rely on kernel patches that aren’t included in stock kernels, LXC (like Docker) uses kernel features available in every modern kernel to contain processes.

Setting up LXC can be complicated though, and on systemd-based distributions there’s a much simpler tool included to achieve the same goal: systemd-nspawn.

On newer systems like Debian stretch (testing), systemd-nspawn is spun off into its own package, systemd-container, which is not installed by default.

Systemd-nspawn looks and feels like chroot, but it fully virtualizes the spawned process’s environment, just like LXC, and can optionally boot a complete Linux system (/sbin/init) inside a container. As every container is managed by its own running systemd-nspawn process, no management daemon common to all containers is needed.

Setting up a container

By convention, systemd-nspawn containers live in /var/lib/container/<name>. This is not a technical restriction, though: containers can be installed and spawned anywhere in the filesystem.

Bootstrapping Debian

A bootable Debian root filesystem can be bootstrapped from scratch using debootstrap or cdebootstrap. While debootstrap is a shell script, cdebootstrap is a smaller reimplementation in C that is used by the Debian Installer. As debootstrap has no dependencies except for a shell and wget, it can also be used to bootstrap Debian filesystems from hosts running other distributions like Fedora or Arch.

To install a Debian filesystem using debootstrap/cdebootstrap, you only need to know the desired release codename, a target directory and the Debian mirror to use:

debootstrap jessie /var/lib/container/mynewdebian http://ftp.de.debian.org/debian

If the target directory doesn’t exist, it will be created by debootstrap.
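The three inputs named above map directly onto debootstrap’s positional arguments. As a minimal sketch, the hypothetical helper bootstrap_cmd below (not part of debootstrap) only prints the command for review instead of running it. The defaults for release and mirror are the ones used in this tutorial:

```shell
# Hypothetical helper: print the debootstrap command for a new container.
# usage: bootstrap_cmd NAME [SUITE] [MIRROR]
bootstrap_cmd() (
    name=$1
    suite=${2:-jessie}
    mirror=${3:-http://ftp.de.debian.org/debian}
    if [ -z "$name" ]; then
        echo "usage: bootstrap_cmd NAME [SUITE] [MIRROR]" >&2
        exit 1
    fi
    # debootstrap takes its arguments in the order: SUITE TARGET MIRROR
    echo "debootstrap $suite /var/lib/container/$name $mirror"
)

# Review the printed command first, then run it as root, e.g.:
#   bootstrap_cmd mynewdebian | sh
```

Printing the command rather than executing it keeps the sketch safe to try on any machine.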

Changing the root password

To be able to log in to the Debian container just created, we need to set a root password. This is easily done without booting the container, by running a single command (/usr/bin/passwd) inside the container’s isolated environment. When the spawned process (passwd) exits, systemd-nspawn exits as well:

systemd-nspawn --directory=/var/lib/container/mynewdebian passwd

Networking inside the container

By default, containers spawned by systemd-nspawn do not have their own isolated network interface. Processes inside the container see and use the host’s networking.

To separate the container’s network from the host, the options --network-bridge= and --network-macvlan= can be used. --network-bridge=br0 creates a virtual Ethernet link between host and container and adds the host side to the existing bridge device br0. If you just want to give the container a virtual interface on top of the host’s eth0 interface, you don’t need to set up a bridge and can use MACVLAN instead with --network-macvlan=eth0.
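Which of the two options fits depends on whether the host interface is a bridge. The following is only a sketch: the helper name net_opt and the SYSFS_NET override are assumptions made for illustration. It relies on the fact that Linux bridge devices expose a bridge directory under /sys/class/net/<interface>:

```shell
# Hypothetical helper: pick the systemd-nspawn network option for a host interface.
# Bridge devices have a "bridge" directory in sysfs; everything else gets MACVLAN.
# SYSFS_NET is an assumption of this sketch so the lookup path can be overridden.
net_opt() (
    iface=$1
    sysfs=${SYSFS_NET:-/sys/class/net}
    if [ -z "$iface" ] || [ ! -e "$sysfs/$iface" ]; then
        echo "no such interface: $iface" >&2
        exit 1
    elif [ -d "$sysfs/$iface/bridge" ]; then
        echo "--network-bridge=$iface"
    else
        echo "--network-macvlan=$iface"
    fi
)

# Example (as root on the host):
#   systemd-nspawn --directory=/var/lib/container/mynewdebian "$(net_opt eth0)" --boot
```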

Booting up the container

To change the root password, we spawned just a single process inside the container. With --boot, systemd-nspawn can be instructed to find the init binary inside the container’s filesystem and use it to boot a virtual Linux system:

systemd-nspawn --directory=/var/lib/container/mynewdebian --network-macvlan=eth0 --boot

After logging in, the container will not have its network interface configured yet. Inside the container, you can activate systemd-networkd to configure the container’s networking using DHCP:

echo -e "[Match]\nName=mv-eth0\n\n[Network]\nDHCP=yes" > /etc/systemd/network/mv-eth0.network
ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
systemctl enable systemd-networkd.service systemd-resolved.service
systemctl start systemd-networkd.service systemd-resolved.service
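The echo -e one-liner above can also be written as a here-document, which is easier to read and to extend. A minimal sketch, assuming a hypothetical helper write_dhcp_network whose optional second argument (the target directory) exists only so the function can be tried outside a container:

```shell
# Hypothetical helper: write a DHCP .network file for one interface.
# usage: write_dhcp_network IFACE [DIR]   (DIR defaults to /etc/systemd/network)
write_dhcp_network() (
    iface=$1
    dir=${2:-/etc/systemd/network}
    [ -n "$iface" ] || { echo "usage: write_dhcp_network IFACE [DIR]" >&2; exit 1; }
    mkdir -p "$dir" || exit 1
    # Same content as the echo -e command, one setting per line
    cat > "$dir/$iface.network" <<EOF
[Match]
Name=$iface

[Network]
DHCP=yes
EOF
)

# Inside the container started with --network-macvlan=eth0:
#   write_dhcp_network mv-eth0
```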

If you want to be able to log in using SSH later, don’t forget to install SSH now:

apt-get update
apt-get install --no-install-recommends ssh

Correcting timezone and locales

apt-get install --no-install-recommends dbus locales
timedatectl set-timezone Europe/Berlin
dpkg-reconfigure locales

Automatically starting up / shutting down a container using systemd

I create a new systemd unit for each container I want Systemd to manage.

/etc/systemd/system/mynewdebian.service:

[Unit]
Description=Container mynewdebian

[Service]
ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit --boot --link-journal=try-guest --directory=/var/lib/container/mynewdebian --network-macvlan=eth0
KillMode=mixed
Type=notify
RestartForceExitStatus=133
SuccessExitStatus=133

[Install]
WantedBy=multi-user.target

The new unit has to be enabled and started:

systemctl daemon-reload
systemctl enable mynewdebian.service
systemctl start mynewdebian.service

Note: systemd also comes with a unit template called systemd-nspawn@.service, which can be used or adapted to manage a number of containers using exactly the same options. The functionality of this unit and systemd’s containerization features have changed and improved in newer systemd versions, so for now I am using individual .service files for every container.
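With one .service file per container, a small generator saves retyping. The sketch below reproduces the unit shown above with the container name and host interface filled in. The function name nspawn_unit is hypothetical, and the options are simply the ones used in this tutorial:

```shell
# Hypothetical helper: print a systemd unit for one container to stdout.
# usage: nspawn_unit NAME [HOST_IFACE]   (HOST_IFACE defaults to eth0)
nspawn_unit() (
    name=$1
    iface=${2:-eth0}
    [ -n "$name" ] || { echo "usage: nspawn_unit NAME [HOST_IFACE]" >&2; exit 1; }
    cat <<EOF
[Unit]
Description=Container $name

[Service]
ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit --boot --link-journal=try-guest --directory=/var/lib/container/$name --network-macvlan=$iface
KillMode=mixed
Type=notify
RestartForceExitStatus=133
SuccessExitStatus=133

[Install]
WantedBy=multi-user.target
EOF
)

# Example (as root on the host):
#   nspawn_unit mynewdebian > /etc/systemd/system/mynewdebian.service
#   systemctl daemon-reload
```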

Managing containers

Startup and shutdown can be initiated by starting and stopping the corresponding systemd unit. Live containers can be monitored using systemd’s included virtual machine and container registration manager. It allows inspecting and controlling all running containers, and also virtual machines (using libvirt), on the system with the machinectl command:

root@nuc:~# machinectl list
MACHINE     CLASS     SERVICE
mynewdebian container nspawn 

1 machines listed.

The output of machinectl status mynewdebian will give a detailed summary of the running container, including running processes, console log and assigned addresses.
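To run machinectl status for every container at once, the list output shown above can be filtered on its CLASS column. This is a sketch that assumes the three-column layout printed above. The function name container_names is hypothetical:

```shell
# Hypothetical helper: print the names of running containers, one per line.
# Reads `machinectl list` output on stdin. The header line is skipped, and
# the "N machines listed." summary and any virtual machines are dropped
# because their second column is not "container".
container_names() {
    awk 'NR > 1 && $2 == "container" { print $1 }'
}

# Example (on the host):
#   machinectl list | container_names | while read -r name; do
#       machinectl status "$name"
#   done
```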

Security considerations

Systemd-nspawn’s manpage still includes this disclaimer:

Note that even though these security precautions are taken systemd-nspawn is not suitable for secure container setups. Many of the security features may be circumvented and are hence primarily useful to avoid accidental changes to the host system from the container. The intended use of this program is debugging and testing as well as building of packages, distributions and software involved with boot and systems management.

In a recent interview Systemd’s author Lennart Poettering stated that the intention has shifted since:

systemd also contains the systemd-nspawn container manager. It’s a relatively minimal, yet powerful implementation of a container manager. Initially we wrote it for testing purposes, but nowadays we consider it ready for many production uses. In fact CoreOS’ rkt container tool makes use of it as the lower level container backend.

I personally use this simple rule as a precautionary measure: I treat root privileges inside a container as just as dangerous as root privileges on the host. This is also recommended for Docker containers.