This is a long, convoluted tale. I’ve spent hours searching the interwebs and trying
various things. If anyone can offer any insights at all, I would appreciate it.
The goal is to create an Atlassian Bamboo Agent running in a container (base image
<https://hub.docker.com/r/atlassian/bamboo-agent-base/>, source
<https://bitbucket.org/atlassian-docker/docker-bamboo-agent-base/src/maste...>).
The purpose of this agent is to enable NVIDIA CUDA jobs to run on a server with a GPU.
I’m running RHEL 8.8 with podman 4.9. This machine is using cgroups v1 (more on that
later) and runc.
I’m assuming that this agent container will run podman commands as part of various build
jobs (but I don’t have insight into how build jobs are managed). I’ve never run podman in
a container before. I did find Dan Walsh’s blog
<https://www.redhat.com/en/blog/podman-inside-container> and drew upon it heavily.
Starting from the Atlassian Dockerfile.ubi and a UBI9 base image, I added Dan’s steps
for creating podman-stable and created my image, bamboo-agent. I don’t think the exact
details of my Containerfile matter at this time; the podman-related additions are
sketched below.
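Roughly (my own approximation of the blog’s recipe, not my actual file; package names
and the storage.conf tweak may need adjusting for UBI9):

# appended to the Atlassian Dockerfile.ubi (approximate)
RUN dnf -y install podman fuse-overlayfs && dnf clean all
# have the inner podman use fuse-overlayfs for its overlay storage
RUN sed -i 's|^#mount_program|mount_program|' /etc/containers/storage.conf
# keep inner container storage on a volume
VOLUME /var/lib/containers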
Normally, I expose the GPU to my containers with the --device nvidia.com/gpu=all flag.
The easiest way to run containers in containers appears to be running the outer container
with --privileged, but when that is paired with the NVIDIA CDI, the container fails to
run: there is a device, /dev/dri/card1, that is not world-readable (for reasons), so it
can’t be mounted in the container. Since I can’t use --privileged, I tried to find the
“Goldilocks” solution. Starting here:
/usr/bin/podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable \
    --cap-add=sys_admin,perfmon --group-add keep-groups \
    --mount type=bind,src=/home/jmelton/projects/bamboo/work,dst=/usr/src/project \
    --workdir /usr/src/project \
    --log-driver k8s-file \
    --log-opt path=/home/jmelton/projects/bamboo/log/bamboo-agent.20250820120312.log \
    --network host \
    -v podman_containers:/var/lib/containers,rw \
    -v bambooAgentVolume:/var/atlassian/application-data/bamboo-agent \
    --hostname=agent-cuda-12_3 -it --name=bamboo-agent bamboo-agent:latest bash
My outer container is rootless, running as me. In my (outer) container, I’m only going to
run rootful containers. I might be naive, but since the outer container is rootless, my
“rootful” inner containers are still confined to my user namespace. If I try to (for
example) use the podman user in the podman-stable image, I end up with files in my home
directory that I don’t own and can’t delete.
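(Aside: such leftover files can usually be cleaned up from the host by re-entering the
user namespace that created them, along the lines of
podman unshare rm -rf ~/some/leftover/dir
where the path is just a placeholder for wherever the files ended up.)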
In this running container, if I try to run anything:
[root@agent-cuda-12_3 project]# podman run --rm -it alpine ls /
WARN[0000] Using cgroups-v1 which is deprecated in favor of cgroups-v2 with Podman v5 and
will be removed in a future version. Set environment variable
`PODMAN_IGNORE_CGROUPSV1_WARNING` to hide this warning.
Trying to pull docker.io/alpine:latest...
Getting image source signatures
Copying blob 9824c27679d3 done |
Copying config 9234e8fb04 done |
Writing manifest to image destination
Error: mounting storage for container
ba508c13fcdde1a5cd40de68aad8c62cedecb86d7001d79190b1a3205cae114c: creating overlay mount
to
/var/lib/containers/storage/overlay/1bd977c57d4226fd4146762e90974f6cd7aed84a9a94716087fa63a205340e5c/merged,
mount_data="lowerdir=/var/lib/containers/storage/overlay/l/7EYYG6BVMHUWP3XB5NMRWSYZFL,upperdir=/var/lib/containers/storage/overlay/1bd977c57d4226fd4146762e90974f6cd7aed84a9a94716087fa63a205340e5c/diff,workdir=/var/lib/containers/storage/overlay/1bd977c57d4226fd4146762e90974f6cd7aed84a9a94716087fa63a205340e5c/work,nodev,fsync=0,volatile":
using mount program /usr/bin/fuse-overlayfs: unknown argument ignored: lazytime
fuse: device not found, try 'modprobe fuse' first
fuse-overlayfs: cannot mount: No such file or directory
Well, that was annoying. But after a bit of mucking about and searching various posts, I
came up with running the outer container like this:
/usr/bin/podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable \
    --cap-add=sys_admin,perfmon --group-add keep-groups \
    --mount type=bind,src=/home/jmelton/projects/bamboo/work,dst=/usr/src/project \
    --workdir /usr/src/project \
    --log-driver k8s-file \
    --log-opt path=/home/jmelton/projects/bamboo/log/bamboo-agent.20250820133608.log \
    --network host --device /dev/fuse \
    -v podman_containers:/var/lib/containers,rw \
    -v bambooAgentVolume:/var/atlassian/application-data/bamboo-agent \
    --hostname=agent-cuda-12_3 -it --name=bamboo-agent bamboo-agent bash
Close, but no cigar:
[root@agent-cuda-12_3 project]# podman run --rm -it alpine ls /
Trying to pull docker.io/alpine:latest...
Getting image source signatures
Copying blob 9824c27679d3 done |
Copying config 9234e8fb04 done |
Writing manifest to image destination
Error: OCI runtime error: crun: mount_setattr `/sys`: Function not implemented
[root@agent-cuda-12_3 project]#
exit
Ultimately, I discovered that if I ran the inner container like this:
[root@agent-cuda-12_3 project]# podman run -v /sys:/sys -it busybox echo hello
Trying to pull docker.io/busybox:latest...
Getting image source signatures
Copying blob 80bfbb8a41a2 done |
Copying config 0ed463b26d done |
Writing manifest to image destination
hello
It works. Note that since the /sys in the inner container belongs to the outer container,
the security risk is low. So, how can I configure podman to always add that mount when I
run a container? After some frustrating dialogue with ChatGPT, one would think that
/etc/containers/mounts.conf would do the trick. Unfortunately, it doesn’t. My guess is
that those mounts happen too late for crun, but that’s idle speculation. The best I
could coax out of ChatGPT was a wrapper script that inserts -v /sys:/sys into the podman
invocation (a sketch follows). I would love to find a podman-native solution.
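For completeness, the wrapper amounts to something like this (an untested sketch; it
assumes the script is installed ahead of the real binary in PATH, e.g. as
/usr/local/bin/podman inside the agent image):

#!/usr/bin/env bash
# hypothetical wrapper: force "-v /sys:/sys" onto every "podman run"
real_podman=/usr/bin/podman
if [ "$1" = "run" ]; then
    shift
    exec "$real_podman" run -v /sys:/sys "$@"
fi
exec "$real_podman" "$@"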
My head is sufficiently flattened from beating it against the mount wall that I give up
looking for the perfect solution and move on to creating a Quadlet service to run my
agent (sketched below, after the grub change). Whoops, Quadlet requires cgroups v2. Ok,
edit /etc/default/grub:
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/rhel_gpuserver1-swap
rd.lvm.lv=rhel_gpuserver1/root rd.lvm.lv=rhel_gpuserver1/swap rhgb quiet \
rd.driver.blacklist=nouveau \
systemd.unified_cgroup_hierarchy=1"
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
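(After the reboot, something like stat -fc %T /sys/fs/cgroup should report cgroup2fs,
confirming the unified hierarchy is active.) For reference, the Quadlet unit I was aiming
for looks roughly like this; it’s my own untested sketch that simply mirrors the run
command above:

# ~/.config/containers/systemd/bamboo-agent.container (sketch)
[Container]
Image=localhost/bamboo-agent:latest
ContainerName=bamboo-agent
Network=host
AddDevice=nvidia.com/gpu=all
AddDevice=/dev/fuse
AddCapability=sys_admin perfmon
SecurityLabelDisable=true
Volume=podman_containers:/var/lib/containers
Volume=bambooAgentVolume:/var/atlassian/application-data/bamboo-agent
Volume=/home/jmelton/projects/bamboo/work:/usr/src/project
PodmanArgs=--hostname=agent-cuda-12_3 --group-add keep-groups

[Install]
WantedBy=default.target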
Then all hell breaks loose.
/usr/bin/podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable \
    --cap-add=sys_admin,perfmon --group-add keep-groups \
    --mount type=bind,src=/home/jmelton/projects/bamboo/work,dst=/usr/src/project \
    --workdir /usr/src/project \
    --log-driver k8s-file \
    --log-opt path=/home/jmelton/projects/bamboo/log/bamboo-agent.20250820174923.log \
    --network host --cap-add chown --device /dev/fuse --security-opt 'unmask=/sys/*' \
    -v podman_containers:/var/lib/containers,rw \
    -v bambooAgentVolume:/var/atlassian/application-data/bamboo-agent \
    --hostname=agent-cuda-12_3 -dt --name=bamboo-agent bamboo-agent
Error: runc: runc create failed: unable to start container process: error during container
init: error setting cgroup config for procHooks process: openat2
/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/user.slice/libpod-047db35e8dc99bee6681f9042215c7760c6ec6fb7a295a5bdb779a17b5711fb7.scope/pids.max:
no such file or directory: OCI runtime attempted to invoke a command that was not found
ChatGPT suggests updating podman, so I do, to 4.9.4, which is the latest I can find in
AppStream. Unfortunately, things continue to go downhill:
/usr/bin/podman run --rm=true --device nvidia.com/gpu=all --security-opt=label=disable \
    --cap-add=sys_admin,perfmon --network=host --workdir=/usr/src/project \
    --mount type=bind,source=/home/jmelton/projects/lbe/muse-gpu,destination=/usr/src/project \
    --mount type=bind,source=/home/data,dst=/data,ro=true \
    -it artifactory.iss.snc:80/docker/lbe/dev ./build.bash
Error: OCI runtime error: runc: runc create failed: unable to start container process:
unable to apply cgroup configuration: unable to start unit
"libpod-15c84f640fb7343c120c39c468775ebf873a9033ba5e8ed4dcf4bba295700be2.scope"
(properties [{Name:Description Value:"libcontainer container
15c84f640fb7343c120c39c468775ebf873a9033ba5e8ed4dcf4bba295700be2"} {Name:Slice
Value:"user.slice"} {Name:Delegate Value:true} {Name:PIDs Value:@au [70881]}
{Name:MemoryAccounting Value:true} {Name:CPUAccounting Value:true} {Name:IOAccounting
Value:true} {Name:TasksAccounting Value:true} {Name:DefaultDependencies Value:false}]):
Interactive authentication required.
What in the world does “Interactive authentication required” mean? I do find indications
of a similar error showing up on Fedora 37, and theoretically it was fixed by some change
to systemd. I’ll try anything at this point, but the newest systemd I can get to is
239-82, which is quite a bit older than the systemd where the issue purportedly appeared.
By now, even ChatGPT is grasping at straws. It says to switch to crun. There doesn’t
appear to be a downside, so I install crun-1.14.3-2 and edit
/usr/share/containers/containers.conf to switch the runtime.
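As far as I understand it, that edit is just the runtime key in the [engine] section:

[engine]
runtime = "crun"

Verified: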
$ podman info --format '{{.Host.OCIRuntime}}'
{crun crun-1.14.3-2.module+el8.10.0+23250+94af2c8e.x86_64 /usr/bin/crun crun version
1.14.3
commit: 1961d211ba98f532ea52d2e80f4c20359f241a98
rundir: /run/user/1001/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL}
My results of the most trivial invocation are different, but not better:
$ podman run --rm busybox ls
Error: OCI runtime error: crun: sd-bus call: Process org.freedesktop.systemd1 exited with
status 1: Input/output error
At this point, I reverted to cgroup v1 so other users could continue working, but I’m
completely baffled. While I know lots of people do “podman in podman”, has anyone else
tried to do “podman with CUDA in podman”?
I have a RHEL 9 machine that natively runs cgroups v2 that I will try this on tomorrow,
but I have run out of avenues to pursue.
—
“God can't give us peace and happiness apart from Himself because there is no such
thing.”
― C.S. Lewis
Jim Melton
http://blogs.melton.space/pharisee/