We're So Back: Docker-in-a-Container Edition

2024-07-11

As I've written about numerous times, I like to use a container as my development environment. I change machines just often enough that it's annoying to have to re-install all of the tools that I need. But recently, as I was getting set up with a new machine for my new job, I ran into some issues. It's time for another episode of "Solving Problems I Created for Myself".

My work issued me a Mac Book Pro, but I've only ever tested out my development environment on Linux machines. My first attempt to run the build script failed immediately due to some differences between the how the date command works on Linux vs OS X. Those were pretty easy to deal with though. The bigger problem came when trying to get the group id of the docker group on my Mac.

As you may remember from my previous post on Self Hosting My Developement Environment, I had a pretty hacky setup in order to use Docker from within the container. Since I wanted to be able to use docker without having to always type sudo, the /var/run/docker.sock socket needed to belong to the docker group (i.e. the typical root:docker owner). However, the group id for the docker group inside the container was different from the group id of the docker group on the host machine. So, on Linux, when you mounted the socket into the container, what you would see is something like:

kurtis@container # ls -al /var/run/docker.sock
srw-rw---- 1 root 134 0 Jul  3 15:24 /var/run/docker.sock

Even though my container user was in the container's docker group, the actual group membership check whenever running docker was against this 134 group (i.e. the id of the docker group on my host machine).

To address this, I did some hacky stuff in my build script that would force the group id of the docker group in the container to be the same as the group id on the host.

I never liked this because:

It means the container image being built is specific to the machine it's built on, making the image non-portable (not that I necessarily needed that).
There was never any guarantee that the group id on the host wouldn't conflict with some pre-existing group id in the container I was building (e.g. some group already setup in the Arch Linux base image).

Little did I know this whole hack also completely broke when building the container on a Mac. The problem I immediately ran into was that the methods for determining a group id on Linux and Mac are different (getent on Linux vs dscl on Mac). Rather than try to solve the build script issue, I decided to just chuck out the whole "access Docker from within my container" feature. I hadn't used it in a long time and I wanted get to work. I ripped out the problematic part of the build script unceremoniously.

Actually, I Need That Piece

A few weeks into my job, I found out that some of our internal tools actually need to run containers locally for certain operations (more on this later). This meant I either needed give up on using my containerized development environment, or figure out a way to make Docker work again from within the container. It's debatable on whether or not the juice is worth the squeeze here, but I'm no quitter.

I realized that I'd actually forgotten to stop mounting the Docker socket when starting the container, so it was still mounted in there. On my Mac, in the container, an ls of the socket revealed:

kurtis@container # ls -al /var/run/docker.sock
srw-rw---- 1 root root 0 Jul  3 15:24 /var/run/docker.sock

I did some research and played around with a few things before just trying sudo chown root:docker on the socket. It worked! I could run Docker commands from inside the container no problem. I made a commit and felt proud of myself.

However, something was irking me. I could swear this was one of the first things I tried when I'd previously attempted this. And the results had been bad. Changing the owner of the socket from within the container had resulted in the owner of the of the socket also being changed on the host. This is bad because it borked the socket for anything trying to access it from the host (since the group id was now some random group id that only the container understood).

But things seemed to be working on my Mac. Maybe something had changed about Docker since I'd last tried to set this all up. The owner was root:root now, which was different from before. I went home to test out the change on my personal Linux machine and sure enough, it broke things just like before.

Before changing ownership:

kurtis@container # ls -al /var/run/docker.sock
srw-rw---- 1 root 134 0 Jul  3 15:24 /var/run/docker.sock
kurtis@container # exit
kurtis@host # ls -al /var/run/docker.sock
srw-rw---- 1 root docker 0 Jul  3 15:24 /var/run/docker.sock

After changing ownership from within the container (this borks Docker entirely):

kurtis@container # ls -al /var/run/docker.sock
srw-rw---- 1 root docker 0 Jul  3 15:24 /var/run/docker.sock
kurtis@container # exit
kurtis@host # ls -al /var/run/docker.sock
srw-rw---- 1 root 973 0 Jul  3 15:24 /var/run/docker.sock

I thought about it for a minute and realized something. The way containers run on Linux vs Mac is fundamentally different. On Linux, the container just runs as another process on the host OS. This is why we are able to notice (and be bothered by) the difference in group ids.

But on a Mac, the container is run within a purpose built Linux Virtual Machine. Access to the host file system is done through something called VirtioFS. My guess is that on a Mac, the ownership change from within the container doesn't end up getting propagated to the host for reasons having to do with all the virtualisation. Or rather, the change might be propagated back to the host, but in this case the "host" is the VM that's running the container. The Mac "host" is fine and unbothered.

At this point I was pretty discouraged. I wasn't sure I'd be able to get anything that worked across both Mac and Linux. This was the "Dark Night of the Soul" portion of our story.

socat To The Rescue

After a little wallowing, I took a step back. It felt like I needed a fundamentally different approach to the problem. One of the options I'd encountered for using the Docker socket inside a container involved not using /var/run/docker.sock at all. Instead of Docker exposing a Unix socket, you can configure it to expose a TCP socket. This is, generally speaking, discouraged. You're essentially exposing the Docker daemon (i.e. root permissions) to anyone who has network access to your machine. For this reason, I'd been trying to avoid the TCP socket option. But, now my back was up against a wall. My use-case here is running my Docker daemon behind a NAT'd firewall, i.e. no exposure to the public internet. So I figured I should be fine.

It wasn't immediately obvious how to enable the TCP socket on a Mac. I was starting to suspect that the feature might not exist for Macs when I ran across this Stackoverflow post. Apparently the functionality had for sure existed at one point and had also been broken. A workaround was to use something called socat. I decided to give that a spin, and it did indeed work!

On my Mac, I had one terminal open running:

kurtis@host # socat TCP-LISTEN:2376,reuseaddr,fork,bind=127.0.0.1 UNIX-CLIENT:/var/run/docker.sock

This essentially set up a pipe, directing traffic sent to localhost:2376 to /var/run/docker.sock.

Then from within the container I could do:

kurtis@container # DOCKER_HOST="host.docker.internal:2376" docker ps
CONTAINER ID   IMAGE           COMMAND               CREATED         STATUS         PORTS                                           NAMES
a1d94235ad9b   kdevenv:0.9.1   "/usr/sbin/sshd -D"   9 minutes ago   Up 9 minutes   0.0.0.0:8080->8080/tcp, 0.0.0.0:3000->22/tcp,   liveenv

(note that host.docker.internal is a way to access localhost on the host from within the container)

This was a good proof-of-concept but it had a large drawback. The whole point of having a containerized development environment is to avoid having to install dependencies on the host system. socat did not come pre-installed on my Mac and I had to install it via Homebrew. Also, this meant I needed to have some way of ensuring the socat process was running whenever my container was running. It was not ideal.

Further Refinement

But then I had an epiphany. What if mounted /var/run/docker.sock like I had always been doing, but then ran socat in the background from within the container. As long as I ran the socat process as root, the ownership of /var/run/docker.sock should be irrelevant. I created a little script as an entrypoint for the container that would run both the socat process in the background, as well as start the ssh server I used:

socat TCP-LISTEN:2376,reuseaddr,fork,bind=127.0.0.1 UNIX-CLIENT:/var/run/docker.sock &
/usr/sbin/sshd -D

I then started the container and ran:

kurtis@container # DOCKER_HOST="localhost:2376" docker ps
CONTAINER ID   IMAGE           COMMAND                  CREATED          STATUS          PORTS 
                                                                             NAMES
4ecc4c5f9ef5   kdevenv:0.9.1   "/usr/local/sbin/ser…"   14 minutes ago   Up 14 minutes   0.0.0.
0:8080->8080/tcp, :::8080->8080/tcp, 0.0.0.0:3000->22/tcp, :::3000->22/tcp   liveenv

Voilà! A completely platform independent way of sudo-less Docker access from within a container. What's more, since we're running socat from within the container, there shouldn't be any of the security concerns one might have with typically exposing the Docker socket via TCP.

But Can We Do Better?

There are two things you could quibble with here:

Because I'm exposing a generic TCP socket, we effectively loose all access control over the Docker socket. Any process in the container can access Docker now.
The use of Docker with a TCP socket is non-standard and likely to break random things in surprising ways.

Upon thinking about the problem a little more, I wondered, "why do we need to limit ourselves to a TCP socket?". Can we just redirect from one Unix socket to another? Yes, dear reader, indeed we can. Some quick googling found almost an exact answer to, essentially, what we've been trying to do all along. We can have a "new" socket with the desired ownership,root:docker, and then pipe traffic from that socket to the "real" Docker socket (with owner root:<problem>)

We'll do two things. First, when mounting the host Docker socket into the container, we'll mount it at a special location with --volume "/var/run/docker.sock:/var/run/host_docker.sock". This will put the real socket in a slightly different location within the container, with its standard but problematic ownership. We can then create a new socket at the correct /var/run/docker.sock location with the correct ownership that points to /var/run/host_docker.sock:

socat UNIX-CONNECT:/var/run/host_docker.sock UNIX-LISTEN:/var/run/docker.sock,fork,user=root,group=docker,mode=660

Put it all together and everything works¹!

Have We Gone Too Far?

Was it really worth it, all this work, the 3 burned afternoons, just to avoid having to install a few binaries on the rare occasion when you get a new machine. Probably not. Was it fun and did I learn a lot? Absolutely. The container image is now truly independent of what ever machine it's built on. It means we can build the image once, on an arbitrary machine (we could even use CI!), and then use it anywhere. I'm pretty excited about the opportunities this opens up. Maybe now I'll finally get around to trying out remote development on a beefy cloud VM...

Footnotes

[1]: Note this initial commit had a bug in the socat call I needed to fix. However, this blog post has correct call.