You've read a bunch of online tutorials and you've managed to containerize your application. You have even exposed a port so you can reach the application from the outside world, but when you connect, you're greeted with an error page: "Cannot connect to the database". It's time to start debugging! Below you will find some methods for debugging containers, as well as some information about the crashcart tool we have developed at Oracle to make debugging easier.
Containers can be a challenge to debug, especially when you are a little fuzzy on exactly what a container is and how it works. Some people treat containers like miniature VMs, and go so far as to run an SSH daemon inside their container so that they can log in when things go crazy. Others stick a bunch of useful tools inside their container and use `docker exec` to get a shell inside it. But for those of us with slightly-more-sane operational practices, what do we do when things go wrong?
If you are using a microcontainer, your container only contains a single application and its dependencies. That means no debugging tools, no shell, no help at all! Fortunately, a lot of debugging can be done from the host. One of the most important tools in your arsenal is nsenter. A Linux container is a combination of quite a few isolation and protection primitives, but the most important of these to understand are namespaces. Namespaces isolate your containerized process from other processes on the system, and nsenter lets you enter existing namespaces.
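To make the idea of namespaces concrete, here is a minimal sketch that lists the namespaces a process belongs to by reading the symlinks under its `ns` directory in the proc filesystem. The function names and the optional `PROC_DIR` argument are assumptions made for illustration; they are not part of nsenter or Docker.

```shell
# list_namespaces PID [PROC_DIR]
# Print one "name -> identifier" line per namespace the process belongs to.
# PROC_DIR defaults to /proc; the extra argument exists only so the sketch
# can be pointed at a different directory (an assumption of this example).
list_namespaces() {
    ns_dir="${2:-/proc}/$1/ns"
    for link in "$ns_dir"/*; do
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink "$link")"
    done
}

# same_namespace PID_A PID_B NAME [PROC_DIR]
# Succeeds when both processes share the named namespace (for example "net").
# Returns 2 when either link cannot be read.
same_namespace() {
    a=$(readlink "${4:-/proc}/$1/ns/$3") || return 2
    b=$(readlink "${4:-/proc}/$2/ns/$3") || return 2
    [ "$a" = "$b" ]
}
```

Two processes are in the same namespace when their links carry the same identifier, so `same_namespace $$ "$PID" net` should fail for a container that has its own network namespace. Reading the links of a process you do not own usually requires root.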
For example, let's say you wanted to debug some networking issues in your container. You could start by entering the network namespace of your container. To do this, first determine the pid of a process in your container:
PID=$(docker inspect -f '{{.State.Pid}}' <container-id>)
To get a shell in the network namespace you can use nsenter with the pid:
sudo nsenter -t "$PID" -n
Files that represent the namespaces can be found in the proc filesystem, and you can also use the location directly:
sudo nsenter -n/proc/$PID/ns/net
Nsenter is pretty powerful, especially for dealing with network issues. You can list interfaces or dump traffic with tools that you have installed on your host. If you need access to most of the other namespaces of the container, you can enter them in a similar way.
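As a rough sketch of what that looks like in practice, the wrapper below runs a single host command inside the container's network namespace instead of starting a shell. The function name `netns_run` is made up for this example, and the interface name and port in the comments are assumptions you would replace with your own.

```shell
# netns_run PID COMMAND [ARGS...]
# Run a binary from the host inside the network namespace of process PID.
# Only the network namespace is entered, so the host filesystem (and with it
# every tool installed on the host) is still visible to the command.
netns_run() {
    pid=$1
    shift
    sudo nsenter -t "$pid" -n "$@"
}

# Examples (not executed here; eth0 and port 5432 are assumptions):
#   netns_run "$PID" ip addr show
#   netns_run "$PID" ss -tln
#   netns_run "$PID" tcpdump -ni eth0 port 5432
```

For the "Cannot connect to the database" case, watching the container's own interface while the application retries quickly shows whether the connection attempt ever leaves the container.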
There is, however, one type of debugging that is challenging: accessing the container's filesystem. You can't enter the mount namespace of the container without losing access to the mount namespace of the host, which is where all your tools live. There are various ways to get access to the container's files depending on which version of Docker and which storage driver you are using. One fairly straightforward technique is to look in /proc/$PID/root, although absolute symlinks will be broken (they resolve against the host's root rather than the container's) and you will have to manually translate file locations between the two views.
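The manual translation for symlinks can be sketched in a few lines of shell. This is only an illustration: `resolve_in_root` is a hypothetical helper, it looks at the final path component only, and it follows a single level of symlink.

```shell
# resolve_in_root ROOT PATH
# Translate a path as seen inside the container into a path usable from the
# host, where ROOT is the container's root as seen from the host
# (for example /proc/$PID/root). If PATH is a symlink with an absolute
# target, the target is re-rooted under ROOT instead of the host's /.
resolve_in_root() {
    root=${1%/}
    path=$2
    if target=$(readlink "$root$path" 2>/dev/null); then
        case $target in
            /*) printf '%s\n' "$root$target" ;;                     # absolute target
            *)  printf '%s\n' "$(dirname "$root$path")/$target" ;;  # relative target
        esac
    else
        # Not a symlink (or unreadable): the prefixed path is already right.
        printf '%s\n' "$root$path"
    fi
}

# Example (not executed here):
#   sudo cat "$(resolve_in_root "/proc/$PID/root" /etc/resolv.conf)"
```

Symlinks in intermediate directories and chains of links are not handled, so treat this as a starting point rather than a complete resolver.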
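The perfect solution would involve somehow mounting your debug tools inside the container when you need them, and then removing them when you are finished so you don't leave around any security vulnerabilities. There are two problems with this idea: the tools have to be happy living in an alternative location, and we need a way to mount new things into the container's namespace.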
The first thing we need is a set of debugging tools that are happy living in an alternative location. In order to be sure that things are going to run without a hitch, the entire build chain from binutils on should be built with a non-standard prefix. In addition, library dependencies should be static to make sure any libraries in the container don't conflict with our debugging tools and cause problems.
It turns out there is a pretty cool packaging system that builds in an alternative location: Nix. Using Nix allows us to load the /nix directory with our tools, and as long as the container itself wasn't built with Nix, we are free from conflicts. To also support debugging containers built with Nix, we could choose an alternate directory, like /dev/crashcart. It can be useful to prepend /dev because /dev is almost always a writable tmpfs in containers, which means we can mount things there even if the root filesystem happens to be read-only.
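Whether /dev really is a tmpfs in a given container can be checked from the host by reading that process's mountinfo file. The helper below is a small sketch with a made-up name, `mount_fstype`; it relies only on the documented mountinfo layout, where the mount point is the fifth field and the filesystem type follows the `-` separator.

```shell
# mount_fstype MOUNTINFO_FILE MOUNT_POINT
# Print the filesystem type mounted at MOUNT_POINT according to a mountinfo
# file such as /proc/$PID/mountinfo. If several mounts are stacked on the
# same mount point, the last one listed wins.
mount_fstype() {
    awk -v mp="$2" '
        $5 == mp {
            # Optional fields start at field 7 and end at a lone "-".
            for (i = 7; i <= NF; i++)
                if ($i == "-") { fstype = $(i + 1); break }
        }
        END { if (fstype != "") print fstype }
    ' "$1"
}

# Example (not executed here):
#   mount_fstype "/proc/$PID/mountinfo" /dev
```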
To clear the second roadblock we need a way to mount new things into the container namespace. One option for this is to create an rslave mount when you create the container. For example, you could load an rslave mount into your container namespace with docker's volume command:
docker run -v /tmp/mymaster:/dev/crashcart:rslave mycontainer
This makes /dev/crashcart in the container a slave mount of /tmp/mymaster on the host. That means if you bind mount a directory over /tmp/mymaster it will be propagated to /dev/crashcart in the container. This technique means we can bind mount in tools on demand and remove them later. We can then use nsenter to enter the mount namespace and run our tools. There is still one drawback with this method. To use it you must create a special volume mount for every container that you run at start time. If you didn't run your container with the rslave mount, you still have to restart to do your debugging. Wouldn't it be great if there was some way to do it without starting the container with a volume?
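The on-demand part of the rslave approach might look like the sketch below. The helper names, the `DRY_RUN` switch, and the /opt/debug-tools directory are all assumptions for illustration; the only pieces taken from the text above are /tmp/mymaster and /dev/crashcart.

```shell
# run CMD [ARGS...]
# Execute the command, or only print it when DRY_RUN is set to a non-empty
# value. DRY_RUN is a convention of this sketch, not a feature of mount.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# attach_tools TOOLS_DIR MASTER_DIR
# Bind mount the tools over the master directory on the host. Because the
# container's mount is a slave of it, the tools appear inside the container.
attach_tools() {
    run sudo mount --bind "$1" "$2"
}

# detach_tools MASTER_DIR
# Remove the bind mount again; the unmount propagates to the container too.
detach_tools() {
    run sudo umount "$1"
}

# Example session (not executed here; /opt/debug-tools is hypothetical):
#   attach_tools /opt/debug-tools /tmp/mymaster
#   sudo nsenter -t "$PID" -m -n /dev/crashcart/bin/bash
#   detach_tools /tmp/mymaster
```

Setting `DRY_RUN=1` first prints the commands without touching any mounts, which is a cheap way to check the paths before running them for real.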
It turns out there is a method that can be used to mount tools in the container on demand. It involves some tricky hacks, and it should be noted that it will not work if user namespaces are in use unless you are on kernel 4.8 or later. The strategy is to mknod a block device inside the container's mount namespace and then use the mount syscall to mount the block device to the filesystem. In order to have a block device, we can package up our binaries into an ext3 filesystem and create a loopback block device using /dev/loop.
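The host-side preparation for this method can be sketched as follows: pack a tools directory into an ext3 image, then attach the image to a loop device. All names, sizes, and paths here are assumptions, and the mknod and mount steps inside the container's mount namespace are deliberately not shown.

```shell
# run CMD [ARGS...]
# Execute the command, or only print it when DRY_RUN is set (a convention
# of this sketch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_image TOOLS_DIR IMAGE SIZE_MB MOUNT_POINT
# Create an ext3 filesystem image and copy the tools into it. MOUNT_POINT is
# a temporary, already existing directory on the host. TOOLS_DIR is expected
# to contain bin/bash so that it ends up at /dev/crashcart/bin/bash later.
build_image() {
    run dd if=/dev/zero of="$2" bs=1M count="$3" &&
    run mkfs.ext3 -q -F "$2" &&
    run sudo mount -o loop "$2" "$4" &&
    run sudo cp -a "$1/." "$4/" &&
    run sudo umount "$4"
}

# attach_image IMAGE
# Attach the image to the first free loop device; losetup prints its name.
attach_image() {
    run sudo losetup --find --show "$1"
}

# Example (not executed here; all paths are hypothetical):
#   build_image /opt/debug-tools /tmp/crashcart.img 256 /mnt/crashcart-build
#   attach_image /tmp/crashcart.img
```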
Doing this manually is almost impossible if you don't already have mknod and mount inside your container, so we developed a Rust utility called crashcart to do it for you. Crashcart will mount the image into your container's namespace and run /dev/crashcart/bin/bash for you, using either a method similar to nsenter or by calling `docker exec`. This gives you access to any tools that have been put into the crashcart image.
All of the reasons for using Rust covered in Building a Container Runtime in Rust apply to crashcart as well. Even though crashcart has under 700 lines of code and could be written in C, it is always nice to be memory safe to avoid potential security vulnerabilities. Rust can be a bit tough to read for newcomers, but we encourage people who are new to Rust to dive in and collaborate on this project. It is a fascinating language and has some very useful characteristics.
Crashcart offers a unique way to debug containers today, but things could definitely get better. We hope that making this tool available to the community will help things improve. Some ideas for potential improvements to the techniques follow: