Hi Giuseppe,
Thanks, some useful points there. However, my question was more specifically about how "special" mounts get created in containers, given that the container process itself cannot create them. A concrete example using rootless podman follows:
> podman run --rm -it --name ubuntu --privileged ubuntu:20.04
root@b2069e97cd13:/# findmnt -R /sys/fs/cgroup/freezer
TARGET SOURCE FSTYPE OPTIONS
/sys/fs/cgroup/freezer cgroup cgroup rw,nosuid,nodev,noexec,relatime,seclabel,freezer
root@b2069e97cd13:/# umount /sys/fs/cgroup/freezer
root@b2069e97cd13:/# mount -t cgroup cgroup /sys/fs/cgroup/freezer -o rw,nosuid,nodev,noexec,relatime,seclabel,freezer
mount: /sys/fs/cgroup/freezer: permission denied.
This shows that cgroup mounts are present in the container, and yet the container does not have permission to create the mount.
However, I've realised these are perhaps just bind mounts from the host mount namespace. I can simulate this as follows:
> podman run --rm -it --name ubuntu --privileged -v /sys/fs/cgroup:/tmp/host/cgroup:ro ubuntu:20.04
root@495f11acdd5b:/# findmnt -R /tmp/host/cgroup/freezer/
TARGET SOURCE FSTYPE OPTIONS
/tmp/host/cgroup/freezer cgroup cgroup rw,nosuid,nodev,noexec,relatime,seclabel,freezer
root@495f11acdd5b:/# umount /sys/fs/cgroup/freezer
root@495f11acdd5b:/# mount --bind /tmp/host/cgroup/freezer /sys/fs/cgroup/freezer
root@495f11acdd5b:/# findmnt -R /sys/fs/cgroup/freezer/
TARGET SOURCE FSTYPE OPTIONS
/sys/fs/cgroup/freezer cgroup cgroup rw,nosuid,nodev,noexec,relatime,seclabel,freezer
One further thing I'm unclear on: it seems that when a new mount namespace is created, the mount list is copied from the parent process's namespace. However, some of the container's cgroup mounts are bind mounts of a subdirectory of the cgroup hierarchy (e.g. /user.slice in the output that follows) rather than being identical to the host mounts. Perhaps the container runtime first unmounts /sys/fs/cgroup in the container mount namespace before creating these bind mounts?
root@495f11acdd5b:/# findmnt /sys/fs/cgroup/devices
TARGET SOURCE FSTYPE OPTIONS
/sys/fs/cgroup/devices cgroup[/user.slice] cgroup rw,nosuid,nodev,noexec,relatime,seclabel,devices
root@495f11acdd5b:/# findmnt /tmp/host/cgroup/devices
TARGET SOURCE FSTYPE OPTIONS
/tmp/host/cgroup/devices cgroup cgroup rw,nosuid,nodev,noexec,relatime,seclabel,devices
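As far as I understand it, the cgroup[/user.slice] source that findmnt prints reflects the "root" field of /proc/self/mountinfo, i.e. which directory of the filesystem is exposed at the mount point. A rough sketch of checking this programmatically is below. The helper names and the sample line (including its mount IDs and device number) are made up for illustration and are not taken from the system above.

```python
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    root: str         # directory within the filesystem that is exposed
    mount_point: str  # where it appears in this mount namespace
    fstype: str
    source: str


def parse_mountinfo_line(line: str) -> MountEntry:
    """Parse one line in the /proc/<pid>/mountinfo format."""
    fields = line.split()
    # Optional fields (e.g. "shared:N") end at a lone "-" separator;
    # the filesystem type and source follow it.
    sep = fields.index("-")
    return MountEntry(
        root=fields[3],
        mount_point=fields[4],
        fstype=fields[sep + 1],
        source=fields[sep + 2],
    )


def find_subtree_mounts(mountinfo_text: str) -> List[MountEntry]:
    """Return mounts whose root is not "/", i.e. mounts exposing only a
    subdirectory of the filesystem, as a bind mount of a subdirectory does."""
    entries = [
        parse_mountinfo_line(line)
        for line in mountinfo_text.splitlines()
        if line.strip()
    ]
    return [entry for entry in entries if entry.root != "/"]


# Illustrative sample only: the IDs and device number are invented.
SAMPLE = (
    "573 572 0:30 /user.slice /sys/fs/cgroup/devices "
    "rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,devices\n"
    "574 572 0:31 / /sys/fs/cgroup/freezer "
    "rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,freezer\n"
)
```

Inside the container, something like find_subtree_mounts(open("/proc/self/mountinfo").read()) should list the mounts whose root is a subdirectory, assuming the file has the usual format.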
Thanks,
Lewis