A postmortem of a disaster caused by Synology's btrfs + docker:dind

Original link: https://www.dosk.win/2022/06/04/fu-pan-yi-ci-qun-hui-de-btrfs-docker-dind-yin-fa-de-zai-nan/

Background

  1. I mainly use k3s to manage my homelab now, but Synology's kernel lacks complete support for features such as cgroup v2 and overlayfs, so on the Synology itself I mainly use docker-compose.
  2. Three main things are deployed on the Synology: gitea + drone + drone-docker-runner. I have a repository that uses drone to build a docker image and push it to the registry on my intranet. The build is done with drone, drone-docker-runner and dind, and it did accomplish what I wanted.
  3. But then I found a few extra, automatically generated docker volumes on the Synology. I thought I could simply delete them, but it turned out I could not: the deletion failed with a permission error…

Foolish operations

User group lost

  1. First, I assumed the problem was caused by user groups such as docker/containerd. I did not want to delete things directly at the file system level, so I never verified this; I simply took it for granted that user groups were the cause and ran: sudo synogroup --member administrator docker . This is where the foolishness began, because this command expects the complete list of users that should be in the group. I passed only docker , which left the docker user as the sole member of the administrator group. Then I exited ssh and rebooted the system, after which I could not log in by any means…

  2. The fix was quite a hack. I had been forwarding /var/run/docker.sock through socat , so I could still operate the Synology's docker from other machines on the intranet. I started an alpine container with /etc/group mounted, edited the user group, and finally recovered…
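
Both steps can be sketched in shell. This is a minimal sketch under stated assumptions, not the commands that were actually run: the function names, the /host-group mount path and the member names in the comments are hypothetical. The first two functions build the full member list before synogroup --member is called; the third rewrites the member field of a group file that has been mounted into a rescue container.

```shell
#!/bin/sh
# Sketch only: function names, paths and member names are hypothetical.

# Print the current members of a group, one per line, from a group(5)-format file.
# $1 = group name, $2 = group file (defaults to /etc/group)
group_members() {
    awk -F: -v g="$1" '
        $1 == g {
            n = split($4, m, ",")
            for (i = 1; i <= n; i++) if (m[i] != "") print m[i]
        }' "${2:-/etc/group}"
}

# Print the existing members plus one new user, de-duplicated, space-separated.
# $1 = group name, $2 = user to add, $3 = group file (optional)
merged_members() {
    { group_members "$1" "$3"; printf '%s\n' "$2"; } |
        awk '!seen[$0]++' | paste -s -d ' ' -
}

# Rewrite the member field (4th column) of one group in a group file.
# The result is written back with `cat >` so the file keeps its inode,
# which matters when the file is a single-file bind mount.
# $1 = group file, $2 = group name, $3 = comma-separated member list
set_group_members() {
    tmp=$(mktemp) || return 1
    awk -F: -v OFS=: -v g="$2" -v m="$3" '$1 == g { $4 = m } { print }' "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Adding a user without dropping the others (review the list before running):
#   sudo synogroup --member administrator $(merged_members administrator docker)
#
# Recovery from another machine, assuming the forwarded socket is reachable
# and the host file is mounted at the hypothetical path /host-group:
#   docker -H tcp://<nas>:<port> run --rm -it -v /etc/group:/host-group alpine sh
#   set_group_members /host-group administrator admin,docker
```

The point of merged_members is that the new user is appended to whatever the group already contains, so the call can never shrink the group to a single member.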

(Deliberate) disk corruption

  1. This is where the nightmare began. In short, I tried all kinds of ways to delete the volume and, without understanding btrfs , unilaterally concluded that the volume was broken, so I decided to rebuild the storage space/storage pool. I could afford this decision because my Synology has more than one storage space, and the one holding docker was carved out of the first hard disk (a Synology characteristic: the first hard disk stores most of the system data and is used first).

  2. The nightmare was that I could not delete the storage space/storage pool either. I unilaterally concluded that the file system was irreversibly damaged and made an even worse decision: to damage the data deliberately, because only then would the system stop holding on to the file system's information. So I pulled the first hard drive while the machine was powered on and formatted it…

  3. The next step was to delete the now-damaged storage space/storage pool and rebuild it by following the procedure for adding a new hard disk. In the end only package data was affected; my important data was untouched, because it is kept in a separate storage space/storage pool…

Review

Here I will go straight to the real cause I found after investigating. Please study the layered structure of drone and dind on your own.

  1. Synology uses btrfs , not overlayfs , as docker 's storage driver, so there is no choice here

  2. According to docker 's official documentation, with this driver an image is stored locally as a btrfs subvolume and every upper layer is a snapshot . So in my scenario the drone-docker-runner and docker:dind images sit directly in btrfs subvolumes on the host , that is, on the Synology

  3. The docker:dind Dockerfile declares VOLUME /var/lib/docker , so an anonymous volume is created on the host at run time; this is the volume I could not delete. At this point it is still an ordinary volume, backed by the btrfs file system

  4. When the pipeline runs, it pulls other images through the /var/run/docker.sock shared by docker:dind . Those images naturally land in docker:dind 's /var/lib/docker , which is the anonymous volume on the host, and a btrfs subvolume is created there

  5. After the build finishes, drone-docker-runner is responsible for cleaning up temporary images and containers. It only needs to carry out its steps and does not care whether anything was really cleaned up. When the docker:dind service exits, it does not actively remove the images it pulled (it has neither the chance nor a reason to), so the end result is an anonymous volume on the host that contains btrfs subvolumes

  6. Some may wonder why the volume cannot be deleted once it has been emptied. Here is another pitfall. According to the Arch Wiki article on Btrfs , a btrfs subvolume has to be deleted with the btrfs command; only on Linux 4.18 and later can it be deleted with ordinary file system commands such as rm/rmdir . As it happens, the Synology's current kernel is only 4.4.180+ (Linux Nazi 4.4.180+ #42661 SMP Fri Apr 1 15:31:10 CST 2022 x86_64 GNU/Linux synology_v1000_1621+) , so it cannot be deleted that way…

  7. In other words, something I could have solved with btrfs subvolume list -p xxxx followed by btrfs subvolume delete yyyy took me on this enormous detour…
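
The list-then-delete approach from the last item can be sketched as a dry run. This is a minimal sketch with hypothetical function names; it assumes the paths printed by btrfs subvolume list resolve relative to the mount point passed as the second argument, which should be checked on the actual system before any printed command is executed. Nested subvolumes are ordered before their parents, since a subvolume that still contains another one cannot be removed first.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; nothing here deletes anything.

# Read `btrfs subvolume list` output on stdin and print only the path column.
# Each line looks like "ID <n> gen <n> [parent <n>] top level <n> path <path>".
subvol_paths() {
    awk '{ i = index($0, " path "); if (i) print substr($0, i + 6) }'
}

# Order paths so that nested subvolumes come before their parents
# (more path components first).
deepest_first() {
    awk -F/ '{ print NF "\t" $0 }' | sort -rn | cut -f2-
}

# Print, but do not run, one delete command per subvolume below directory $1.
# $1 = directory to inspect (for example the anonymous volume's directory)
# $2 = mount point the listed paths are assumed to be relative to
plan_subvol_cleanup() {
    btrfs subvolume list -o "$1" | subvol_paths | deepest_first |
        while IFS= read -r p; do
            printf 'btrfs subvolume delete "%s/%s"\n' "$2" "$p"
        done
}
```

Printing the commands first keeps the destructive step under manual review: the output can be read, corrected if the paths do not resolve as assumed, and only then run as root.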

Follow-up

  1. I tried changing the build parameters so that /var/lib/docker would be mounted on some other supported, non- btrfs file system, but the Synology kernel's support is limited, so this did not work

  2. Some readers will point out that there is in fact a way to delete the subvolumes; it is just a bit tedious. Call it a programmer's obsessive-compulsiveness: I want each role to do what it is supposed to do, not overstep its responsibilities and leave loose ends behind. So in the end I decided to flatten the hierarchy: docker:latest gets the host 's /var/run/docker.sock mapped in, and I write the build steps myself. The philosophy here is: either let each role do its own job, or hand everything to the developer (that is, me) to DIY completely
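
The flattened setup can be sketched as a plain docker invocation. This is a minimal sketch and not the actual pipeline: the image name, the registry host and the function name are hypothetical placeholders, and the real drone step may look different. What it shows is that the build talks to the host daemon through the mounted socket, so image layers land in the host's own /var/lib/docker and no nested daemon or anonymous volume is involved.

```shell
#!/bin/sh
# Sketch only: IMAGE, the registry host and the function name are hypothetical.
IMAGE="${IMAGE:-registry.example.lan/demo/app:latest}"

# Print (DRY_RUN=1, the default) or run a build that uses the host's docker
# daemon through its socket instead of a docker:dind daemon.
host_socket_build() {
    set -- docker run --rm \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -v "$PWD":/src -w /src \
        docker:latest docker build -t "$IMAGE" .
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

The trade-off is that the build container gets full control of the host daemon through the socket, which is acceptable only when the pipeline and its repositories are trusted.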

