Many of us know that you can run Go binaries in “scratch” containers. Your container doesn’t need to be based on Alpine or Ubuntu. It can be based on nothing and contain just the binary you built from Go source. This is largely because Go code can be statically linked, and so requires no installed libraries.
But what about VMs? Normally you start from Ubuntu, or Alpine or whatever and then you install your stuff on top. What would happen if you didn’t? Could you have a VM that’s just a linux kernel and your Go binary?
I thought I’d find out.
When a linux machine starts, first some low-level magic happens to load and run the kernel and mount the root file system. Once the kernel is ready it hands control to user-space by running /sbin/init as process ID 1. Everything else that happens on the machine then happens because /sbin/init makes it happen. Every other user-space process is started by init or by a process started by init. And the OS only keeps running while process 1 keeps running.
If I replace /sbin/init with a static Go binary I’ve effectively replaced all the user-space components of the distribution.
So, what happens if we replace /sbin/init with a statically linked Go binary that just prints “Hello World!” and then sleeps a lot?
I’m going to start with the simplest linux distribution I can find, replace /sbin/init with my Go binary, then try to work out what else I need to do to get a running system.
Vagrant gives me a very convenient way to do this. This Vagrantfile is all I need to configure a local VM.
Vagrant.configure("2") do |config|
  config.vm.box = "alpine/alpine64"
  config.vm.network "forwarded_port", guest: 80, host: 8080, host_ip: "127.0.0.1"
end
This gives me an easy-to-recycle local VM to play with. I can start it with vagrant up, and if things go wrong I can completely delete it with vagrant destroy -f.
I chose Alpine linux as my distribution as it has a reputation for being small & simple, which hopefully will make it easier to understand.
Once I start experimenting with this I expect lots of things will stop working, so I won’t be able to look at logs written to file or connect to the VM over a network. My debugging is likely to depend on getting access to the VM console. So I use VirtualBox to run my VM, as I know that will show me the console via the VirtualBox app.
This is our first attempt at a new world of distribution-less linux. A simple “hello world” program that I’ll build as a statically-linked binary. The program repeatedly sleeps rather than exiting, as the kernel will panic if process 1 exits.
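A minimal sketch of what such a program could look like. The PID check is an assumption added here so the program is safe to try on an ordinary machine; it is not something the experiment itself depends on.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// greeting is the message printed to the console at boot.
func greeting() string {
	return "Hello World!"
}

func main() {
	fmt.Println(greeting())

	// Assumption for convenience: when we are not process 1 (for example
	// when trying this on a laptop) just print the greeting and exit.
	if os.Getpid() != 1 {
		return
	}

	// As process 1 we must never return: the kernel panics if init exits.
	for {
		time.Sleep(time.Hour)
	}
}
```

A sleep loop is used rather than an empty select because, with no other goroutines running, the Go runtime would treat a bare select as a deadlock and abort the program.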
I can build a linux version of this on my Mac using GOOS=linux go build. Since I’ve called my directory scratchmachine the output binary is called scratchmachine. I then do vagrant up followed by vagrant ssh and suddenly I’m in the Alpine VM, with my Mac directory mounted as /vagrant. I then run sudo cp /vagrant/scratchmachine /sbin/init to replace the init binary, followed by sudo reboot to restart the machine.
When the machine reboots, first the linux kernel will load, then the kernel will start the first user-space process, process 1, using my “hello world” binary in place of the original init.
If we open VirtualBox and look at the machine console we can see the output of this experiment. It’s a success!
But this is all our machine can do. Our new init is the only thing running in user-space on this machine. And all it does is say hello and go to sleep.
What I’d really like to do is run a web-server. For that I need a network connection.
My new mantra is vagrant destroy -f; vagrant up; vagrant ssh, which quickly restores a fully working alpine machine.
To get the network working I know I will need an active network interface. Perhaps I should just copy what happens when running alpine normally?
ifconfig -a shows me the interfaces on the VM.
alpine:~$ ifconfig -a
eth0 Link encap:Ethernet HWaddr 08:00:27:9E:9E:E5
inet addr:10.0.2.15 Bcast:0.0.0.0 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe9e:9ee5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
The VM has a single network interface eth0 using IP address 10.0.2.15.
So, what would happen if I tried to just assign 10.0.2.15 to eth0 and set the UP and RUNNING flags from my code? Some digging turned up that the linux netdevice interface was what I needed to do this.
This netdevice interface is extremely weird. To use it you open any old internet socket, then send commands to the kernel using the SYS_IOCTL syscall referencing the socket (IOCTL stands for input/output control).
Luckily there’s existing Go support for making the syscalls and for some of the structures I needed.
Unfortunately it’s not that easy. The eth0 device I’ve tried to configure does not exist. /sbin/init must normally do something to make the device appear.
I can now be heard muttering vagrant destroy -f; vagrant up; vagrant ssh as I stomp around trying to think how to make eth0 appear. It must be something /sbin/init does when the machine boots.
So what does /sbin/init do when the machine boots? Well, one thing it does is run “init scripts”. These are arcane scripts that have been handed down by the ancient ones to make machines start. The scripts usually live in /etc but the exact details vary between unixes. Using ancient wisdom, I go looking in /etc for files and directories related to “init”, to “rc” and to “run levels”.
And it turns out /etc/runlevels exists and has subdirectories such as boot and sysinit, each with a bunch of scripts that get run to start the system. I try deleting scripts and rebooting to see what’s crucial for setting up ethernet. Cutting a long story short, the interesting file is /etc/runlevels/sysinit/hwdrivers. This is quite a short script that boils down to the following.
find /sys -name modalias -type f -print0 | xargs -0 sort -u \
| xargs modprobe -b -a
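This finds every modalias file under /sys, de-duplicates their contents (each one is a module alias describing a piece of hardware), and passes the resulting list to modprobe.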
man modprobe tells us
modprobe — program to add and remove modules from the Linux Kernel
So perhaps we need to load a driver for eth0? If we poke around in /sys for things related to eth0 we find /sys/class/net/eth0/device. And from there we can discover the name of the driver.
So how do we load the driver? I don’t want modprobe in my final system, so I need to load the driver directly from my Go code.
Looking for clues, I found some source code for modprobe here. This shows modprobe reading the bytes of a driver binary, then calling init_module, which turns out to be another syscall. The man page says there’s a newer version called finit_module. So, obviously, I go with the f’ing one.
The modprobe code also contains another hint. It looks for modules under /lib/modules. A quick find /lib/modules -print | grep e1000 shows us the driver we want is /lib/modules/4.9.73-0-virthardened/kernel/drivers/net/ethernet/intel/e1000/e1000.ko. All I need to do is open this file and pass the file descriptor to the new syscall.
Ever optimistic, I add some code to start an HTTP server after the code to load the ethernet driver and configure the interface. This is what the code looks like now:
I rebuild, copy the binary over /sbin/init and reboot. And wait a minute. And then…
$ curl localhost:8080
Hello from Scratch Machine!
So I can build a Go web server and install it as /sbin/init in a linux VM. The web server is the only user-space process running on the VM, and I can convince myself that it’s really the only user-space code that counts. But I really wanted a VM with only my code & the kernel on it and nothing else. How can I achieve that?
This turns out to be really quite hard. Not many people do this kind of thing, so there aren’t many clues out in the world. And all the clues that are there are arcane and somewhat contradictory.
In the end (several weeks later!) I find a working formula.
- I build a CD/DVD image (.iso) using the xorriso package.
- I configure this to boot linux using what’s called an “initial RAM file system” and the isolinux boot loader.
- Because I don’t need much else I stop there. Normally the “initial RAM file system” is just enough code to work out where the real root file system is, mount it and boot from it. In my case the “initial RAM file system” contains just my binaries and the ethernet driver, and I have no “real root” with additional stuff.
- I boot a VirtualBox VM from this iso with no other hard drive configured.
To reiterate, the initial RAM file system contains the following.
- e1000.ko (the ethernet driver).
- My Go program, renamed to ‘init’ (the /sbin prefix is unnecessary for an initramfs).
My image just contains the following.
- isolinux.bin & ldlinux.c32 (the ISOLINUX bootloader)
- an isolinux.cfg configuration file.
- vmlinuz-virthardened (the linux kernel copied from alpine).
- initramfs.gz, which is the gzipped cpio archive of the initial RAM file system.
The “initial RAM file system” is just a gzipped cpio archive with the files I need. I can build it as follows. All these commands are run inside the alpine virtual machine.
# build our initial RAM file system
mkdir -p ramfs
cp /vagrant/scratchmachine ramfs/init
cp /lib/modules/4.9.73-0-virthardened/kernel/drivers/net/ethernet/intel/e1000/e1000.ko ramfs/e1000.ko
# Make our own initramfs, with just our binary and the driver.
# cpio reads the list of file names from stdin, so run it inside ramfs.
cd ramfs
cat <<EOF | cpio -o -H newc | gzip > initramfs.gz
init
e1000.ko
EOF
cd ..
To build the ISO I again just need to build a directory containing the files I need in my alpine VM and run a command.
# Copy the kernel from alpine
mkdir -p cdroot
cp /boot/vmlinuz-virthardened cdroot/kernel
# Copy the initramfs.gz file just created
cp ramfs/initramfs.gz cdroot
# Copy in the ISOLINUX bootloader
mkdir -p cdroot/isolinux
cp /usr/share/syslinux/isolinux.bin cdroot/isolinux
cp /usr/share/syslinux/ldlinux.c32 cdroot/isolinux
# Create the ISOLINUX config file
# (the DEFAULT, LABEL, KERNEL and INITRD lines are a reconstruction, assumed from the file names used above)
cat <<EOF > cdroot/isolinux/isolinux.cfg
SERIAL 0 115200
DEFAULT linux
SAY Now booting the kernel from ISOLINUX...
LABEL linux
  KERNEL /kernel
  INITRD /initramfs.gz
  APPEND root=/dev/ram0 ro console=tty0 console=ttyS0,115200
EOF
Finally we can build the iso.
mkisofs -o /vagrant/output.iso \
-cache-inodes -J -l \
-b isolinux/isolinux.bin -c isolinux/boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table \
cdroot
My Go program is 6,749,734 bytes. My ISO boot image is 7,114,752 bytes, which compares well with the ~38 MB for the alpine VM iso. It takes about 24s to boot under VirtualBox on my laptop (which I think is far too slow!). I suspect it is not vulnerable to many known linux security issues as it contains zero standard user-space components.
On the down side it isn’t very configurable (hardwired IP address!) or debuggable.
Personally I think this might not be a crazy avenue to pursue. It wouldn’t be too difficult to add a few things like a DHCP client or a log forwarder as either libraries or additional executables. Then you might have a useful system that’s trivial to audit for known security vulnerabilities.
If you want to take a closer look, the code is on github.com
None of this is a terribly new idea. If you’re interested in this area, you might want to take a look at some of the following.