46 How to fix a broken Docker-in-Docker socket
When using Docker-in-Docker, dind may not have started by the time a build is requested. In that case, the volume mount that loads /var/run/dind/docker.sock into the build container may occur before dind has created the socket, and the mount creates a directory at the mount point instead (which we don’t want to happen). Docker-in-Docker then remains inaccessible until /var/run/dind is manually deleted and the dind pod is restarted.
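The failure mode described above can be checked on a node by looking at what kind of file sits at the socket path. The following is a minimal sketch, not part of the deployed tooling; the function name classify_socket_path is hypothetical, and only the path /var/run/dind/docker.sock comes from this chapter.

```python
import os
import stat


def classify_socket_path(path):
    """Report what exists at `path` without following symlinks.

    Returns one of "socket", "directory", "missing", or "other".
    A healthy dind node is expected to show "socket"; "directory"
    matches the broken state described in this chapter.
    """
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"
```

Calling classify_socket_path("/var/run/dind/docker.sock") as root on the affected node should return "directory" when the mount has raced ahead of dind.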
46.1 Spotting the problem
Build pods stop working, and the dind pods are stuck in CrashLoopBackOff.
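The same symptom can be looked for from the command line. The sketch below assumes that kubectl is installed and configured for the cluster; the namespace argument and both function names are hypothetical, and only the binderhub-dind- pod name prefix is taken from this chapter. It reads the pod list as JSON and reports dind pods that have a container waiting with reason CrashLoopBackOff.

```python
import json
import subprocess


def crashlooping_dind_pods(pod_list, name_prefix="binderhub-dind-"):
    """Return (pod name, node name, restart count) for crash-looping dind pods.

    `pod_list` is the parsed output of `kubectl get pods --output json`.
    """
    hits = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        if not name.startswith(name_prefix):
            continue
        for container in pod.get("status", {}).get("containerStatuses", []):
            waiting = container.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                node = pod.get("spec", {}).get("nodeName")
                hits.append((name, node, container.get("restartCount", 0)))
                break  # one report per pod is enough
    return hits


def fetch_pod_list(namespace):
    """Ask kubectl for all pods in `namespace` and parse the JSON reply."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--namespace", namespace, "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For example, crashlooping_dind_pods(fetch_pod_list("my-namespace")) lists the affected pods together with the node each one runs on, which is the node that needs the clean-up.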
46.2 Band-aiding the problem
46.2.1 Bots
We implemented a bot that monitors for this issue. Its source code is available at https://github.com/gesiscss/orc2/blob/main/ansible/usr/bin/orc2-fix-dind-bot.py.
46.2.2 OpenLens
For an introduction to using OpenLens, read Chapter 37.
1. Open OpenLens and connect to the cluster.
2. In the navigation bar on the left, click on Workloads and then Pods.
3. Search for the binderhub-dind- pod that has many restarts.
4. Click on the node name for the binderhub-dind- pod of interest to open the node details.
5. On the node details navigation bar at the top right corner, click on the first icon (Node shell).
6. OpenLens opens a terminal as the root user on the node. Execute rm -rf /var/run/dind/docker.sock/
7. Select the binderhub-dind- and binderhub-image-cleaner- pods.
8. Remove the selected pods by clicking the minus button at the bottom right corner of the list of pods.
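The manual steps above could also be scripted. The following is only a sketch under stated assumptions and is not the code of the bot linked in the previous section: all function names and the namespace argument are hypothetical, the directory removal has to run as root on the affected node, and the kubectl command has to run where kubectl is configured for the cluster. It deliberately refuses to delete anything that is not a directory, so a healthy socket is left alone.

```python
import shutil
from pathlib import Path

# Pod name prefixes taken from the OpenLens steps in this chapter.
POD_PREFIXES = ("binderhub-dind-", "binderhub-image-cleaner-")


def remove_stale_socket_dir(path):
    """Delete `path` only if it is a real directory; return True if removed."""
    target = Path(path)
    if target.is_dir() and not target.is_symlink():
        shutil.rmtree(target)
        return True
    return False


def pods_to_delete(pod_names, prefixes=POD_PREFIXES):
    """Keep only the pod names that start with one of the given prefixes.

    Restricting the list to pods on the affected node is left to the caller.
    """
    return [name for name in pod_names if name.startswith(prefixes)]


def delete_command(pod_names, namespace):
    """Build the kubectl argument list that deletes the given pods."""
    return ["kubectl", "delete", "pod", "--namespace", namespace, *pod_names]
```

On the node, remove_stale_socket_dir("/var/run/dind/docker.sock") mirrors the rm -rf step; the list returned by delete_command can then be passed to subprocess.run to remove the pods so that they are recreated.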
46.3 Fixing the problem
No fix is available.