Moving to SSH for my Dev Container

2024-02-02

Recently, while working in my Dev Container, I felt the urge to open up a second terminal window to do some work. This, however, is not possible with my current setup. The Dev Container currently starts with the following command:

docker run -it --rm \
    -h kdevenv \
    --name liveenv \
    --env SHELL="$CONTAINER_SHELL" \
    --volume "$HOME_VOLUME:$HOME_DIR" \
    --volume "/var/run/docker.sock:/var/run/docker.sock" \
    --workdir "$HOME_DIR" \
    --user "$CONTAINER_USER" \
    -p 8080:8080 \
    "kdevenv:${KDEVENV_VERSION}" \
    $CONTAINER_SHELL

The thing to note here is that this starts the container and immediately runs $CONTAINER_SHELL (zsh for now) as its command. That means the container instance is tied to this one invocation of docker run. Three consequences of this are:

  1. As soon as I exit the shell, the container itself stops, so I can't leave anything running. Tmux doesn't help here because the whole container shuts down with it.
  2. It's kind of weird to run two of these at the same time since they would both mount my home volume. While sharing a volume between containers is actually fully supported with Docker, it doesn't feel appropriate for my particular use case.
  3. I need to include a bunch of interesting arguments when starting up the container to ensure the terminal works the way I want it to, namely specifying the $SHELL environment variable, the user I want to run as, and setting the workdir so I land in my home directory. Dockerfile tweaks might be able to address this, but they'd be weird.

This initial solution for starting the container had always felt a little janky. Typically you start a container and then leave it running. So for all these reasons, I started thinking about alternatives.

A Different Approach

The first thing I considered was starting the container and having it run a "nothing daemon". Essentially, I'd add this to the end of the Dockerfile:

CMD ["/usr/bin/sleep", "infinity"]

The container would start up and then its primary process would immediately sleep forever. The container would keep "running" until we explicitly stopped it. I could then docker exec liveenv /bin/zsh into the container however many times I wanted. This addresses issue number 2, and maybe addresses issue 1 (I'm actually not sure if tmux would work in this case). It doesn't really address issue 3, as at least some of the weird arguments would need to move from the docker run command to the docker exec command in one form or another, as sketched below.
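For illustration, a docker exec invocation against that always-sleeping container might look something like the following. This is just a sketch reusing the variable names from the docker run command above, not something I actually shipped:

# sketch: exec a shell into the sleeping container, re-specifying the
# arguments the original docker run command used to provide
docker exec -it \
    --user "$CONTAINER_USER" \
    --workdir "$HOME_DIR" \
    --env SHELL="$CONTAINER_SHELL" \
    liveenv \
    $CONTAINER_SHELL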

Instead, I started thinking about how I normally do work. It almost always involves me ssh'ing into another machine. And in fact, if my container weren't running locally, the natural way to access it would be SSH. But as it so happens, SSH is still a completely valid option even for a locally running container. We could have the container start up by running sshd. In the Dockerfile we could put:

EXPOSE 22/tcp
CMD ["/usr/sbin/sshd", "-D"]

and the ssh daemon will keep running until we shut down the container. This means I can definitely leave things like tmux running, open as many terminals/ssh sessions as I want, and get rid of the weird docker run arguments since we'd be ssh'ing into the container as a proper user. On top of everything, SSH just feels right for some reason. I titled this section "A Different Approach", but dare I say, maybe it's a better approach?
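One practical note: that CMD assumes the OpenSSH server is actually present in the image. On the Arch base used elsewhere in this post, that should just be a one-liner in the Dockerfile (package name assumed):

# sketch: make sure the OpenSSH daemon is installed in the image (Arch package name assumed)
RUN pacman -Syu --noconfirm openssh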

Challenges

Authorized Keys

I could have used a password for ssh but:

  1. My current user doesn't have a password set, which is nice from a security perspective.
  2. I'd have to type in a password every time I ssh'd into the container.

Public/private keys are the better way to go in general when using SSH. This meant I needed to get a public key into the user's $HOME/.ssh/authorized_keys. But the home directory is explicitly not part of the container build process. I make sure that the home directory resides in a volume so that the contents of $HOME are independent of the container image (see my Dev Container post for details).

Thankfully, Docker has a nice bit of behavior when it comes to empty volumes. Specifically:

If you mount an empty volume into a directory in the container in which files or directories exist, these files or directories are propagated (copied) into the volume.

Perfect. As part of the container image build process, we can copy the desired public key to $HOME/.ssh/authorized_keys on the container image. Then, when we build the container for the first time on a new machine, that key gets copied onto the new, empty home-directory volume the first time it's mounted.
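In Dockerfile terms that could look something like the following sketch. The user name comes from later in this post, but the home path and key file name are assumptions, and it presumes the devuser account was created earlier in the Dockerfile:

# sketch: bake the public key into the image's copy of the home directory so it
# gets propagated into the empty home volume on first mount (paths assumed)
COPY --chown=devuser:devuser keys/dev_ed25519.pub /home/devuser/.ssh/authorized_keys
RUN chmod 700 /home/devuser/.ssh && chmod 600 /home/devuser/.ssh/authorized_keys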

Kitty Terminfo

As I was testing out the SSH solution, I immediately noticed that my backspace key was not working. Pressing it instead resulted in a space being inserted. A little bit of googling revealed that my terminal, Kitty, was the culprit. Or more specifically, the container image was missing Kitty's terminfo entry. And that's the story of how I learned about Terminfo and Terminal Capabilities.
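The fix is just making sure the xterm-kitty terminfo entry exists inside the container. One way to do that, assuming the Arch package name, is to install it as part of the image build; Kitty's own ssh kitten (kitty +kitten ssh), which copies the terminfo over on connect, would presumably also work:

# sketch: ship the xterm-kitty terminfo entry in the image (Arch package name assumed)
RUN pacman -Syu --noconfirm kitty-terminfo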

Stable Host Keys

In order for sshd to run, the container itself needs a set of public/private keys (so-called "host" keys). In fact, when I first attempted to run the container with sshd, it crashed because I was missing these keys. My first attempt at solving the problem was to add:

RUN ssh-keygen -A

to the Dockerfile (a standard way to initialize host keys).

However, throughout my initial testing I encountered an interesting problem. Since that ssh-keygen runs on any build where the layer isn't cached, I could end up with a different set of host keys each time I built the image. Changing keys, however, would (and did) trigger ssh's "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!" when I tried to ssh into the container after multiple builds. The solution here was to generate the public/private keys outside of the Dockerfile, save them, and then reuse them for each successive build.
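As an aside, whenever that warning does show up, clearing the stale known_hosts entry for the forwarded port (3000 in the final setup below) is a one-liner:

# remove the stale known_hosts entry for the container's forwarded ssh port
ssh-keygen -R "[localhost]:3000"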

If I was building out something more robust, I might actually put both the public/private key for the user and the public/private host keys in a secret store. This would allow me to always use the same keys for every build, no matter where I built the image. I briefly considered integrating my build script with my 1password password manager, since they have really good support for SSH keys. However, I decided that it wasn't quite worth it at this time.

For now, both sets of keys are stored in a (git ignored) directory aptly named keys. The build script initializes the keys if they aren't present and otherwise reuses them when building the image.
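A minimal sketch of that bootstrap, with assumed file names, might look like this in the build script; the host key pair can then be copied into /etc/ssh/ during the image build in place of the ssh-keygen -A call:

# sketch: generate host and user keys once, then reuse them on every build
mkdir -p keys
[ -f keys/ssh_host_ed25519_key ] || \
    ssh-keygen -t ed25519 -N "" -f keys/ssh_host_ed25519_key
[ -f keys/dev_ed25519 ] || \
    ssh-keygen -t ed25519 -N "" -f keys/dev_ed25519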

Rust Toolchain

The Rust toolchain likes to be installed in your home directory. But I really wanted Rust to be installed as part of my build process, allowing it to be automatically updated over time as part of new image builds. This meant it needed to live in the root file system. After getting some inspiration from this GitHub issue, I landed on the following for my Dockerfile:

ENV RUSTUP_HOME=/opt/rust
RUN pacman -Syu --noconfirm rustup rust-analyzer; \
    rustup default stable; \
    rustup component add rust-src; \
    rustup toolchain install nightly

This ensures the Rust toolchain gets installed into the container image itself, since ENV RUSTUP_HOME=/opt/rust points rustup at the root filesystem rather than my home directory. It also notably doesn't set $CARGO_HOME, letting that default to somewhere in my home directory. This is desirable because $CARGO_HOME contains a bunch of caches, and since they live in my home directory on the persistent volume, they don't get blown away on image rebuilds. Rust toolchain installed on the image, cargo caches in the persistent volume. 👍

This all seemingly broke with the move to SSH. Running cargo run gave me an error message saying "rustup could not choose a version of cargo to run". After some debugging I realized that the $RUSTUP_HOME environment variable was not set. Previously, when I was running the container directly, my terminal session inherited RUSTUP_HOME=/opt/rust from the Dockerfile's ENV. That still happens, but only for the sshd process the container now runs by default. When I ssh in, my shell gets no such environment variable.

I didn't want to put this environment variable setting in my dotfiles since I try to keep those machine/setup/environment agnostic. And since the setting is specific to this container image, it made sense to make it part of the image. Some quick googling turned up a pretty straightforward way to set environment variables for all users on a Linux system: drop a small script into /etc/profile.d/, which login shells source at startup. So we just add this to our Dockerfile:

    echo "export RUSTUP_HOME=$RUSTUP_HOME" >> /etc/profile.d/rustenv.sh

and we're off to the races.

End Result and Next Steps

Et voilà, starting our container is now much simpler:

    docker run -d --rm \
        -h kdevenv \
        --name $CONTAINER_NAME \
        --volume "$HOME_VOLUME:$HOME_DIR" \
        --volume "/var/run/docker.sock:/var/run/docker.sock" \
        -p 8080:8080 \
        -p 3000:22 \
        "kdevenv:${KDEVENV_VERSION}"

We're able to get rid of the weird arguments I mentioned above (the $SHELL environment variable, --user, --workdir, and the trailing shell command), and all we have to add is the new port mapping -p 3000:22. This lets us ssh to localhost on port 3000, which Docker forwards to port 22 inside the container. SSH'ing can be done with:

ssh -i keys/dev_ed25519 -p 3000 devuser@localhost

where keys/dev_ed25519 is the private key corresponding to the public key we baked into the image.
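To avoid typing all of that every time, a host entry in ~/.ssh/config works nicely. The names below match the setup above, but the identity file path is a placeholder for wherever the keys directory lives on the host:

# host alias so a plain `ssh kdevenv` does the right thing
Host kdevenv
    HostName localhost
    Port 3000
    User devuser
    IdentityFile ~/path/to/keys/dev_ed25519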

If you want to see how everything fits together now, you can check out the git repository.

We're now only a hop, skip, and a jump from being able to run the container remotely, say on a super beefy VM in the cloud. We'd have to sort out how the self-hosting bit works, but to be honest I'm not really using that feature much these days. We could probably just cut it. It would also be a cool excuse to try out Tailscale and use it to connect to the remote VM without assigning it an external IP address. We'll see if something in the future motivates me to get this all set up remotely.