This is a long, convoluted tale. I’ve spent hours searching the interwebs and trying various things. If anyone can offer any insights at all, I would appreciate it.

The goal is to create an Atlassian Bamboo agent running in a container (base image source). The purpose of this agent is to enable NVIDIA CUDA jobs to run on a server with a GPU.

I’m running RHEL 8.8 with podman 4.9. This machine is using cgroups v1 (more on that later) and runc.

I’m assuming that this agent container will run podman commands as part of various build jobs (but I don’t have insight into how build jobs are managed). I’ve never run podman in a container before. I did find Dan Walsh’s blog and drew upon it heavily.

Starting from the Atlassian Dockerfile.ubi and a UBI9 base image, I added Dan’s steps for building podman-stable and created my image, bamboo-agent. I don’t think the details of the Containerfile are important at this time, but its rough shape is sketched below.
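(A from-memory sketch, not the exact file; the podman layer follows Dan’s podman-stable recipe, and on UBI the repositories for these packages may need enabling:)

FROM registry.access.redhat.com/ubi9/ubi
# Atlassian's Dockerfile.ubi layer: JDK, bamboo-agent user, agent installer
# ...
# Dan's podman-stable layer: podman-in-podman plumbing
RUN dnf -y install podman fuse-overlayfs --exclude container-selinux && dnf clean all
# subordinate UID/GID ranges so rootless podman works inside the container
RUN echo -e "podman:1:999\npodman:1001:64535" > /etc/subuid && \
    echo -e "podman:1:999\npodman:1001:64535" > /etc/subgid
VOLUME /var/lib/containers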

Normally, I expose the GPU to my containers with the --device nvidia.com/gpu=all flag. It appears the easiest way to run containers in containers is to run the outer container with --privileged, but when paired with the NVIDIA CDI device, the container fails to run: there is a device, /dev/dri/card1, that is not world-readable (for reasons), so it can’t be mounted into the container. Since I can’t use --privileged, I tried to find the “Goldilocks” solution. Starting here:

/usr/bin/podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable \
  --cap-add=sys_admin,perfmon --group-add keep-groups \
  --mount type=bind,src=/home/jmelton/projects/bamboo/work,dst=/usr/src/project \
  --workdir /usr/src/project --log-driver k8s-file \
  --log-opt path=/home/jmelton/projects/bamboo/log/bamboo-agent.20250820120312.log \
  --network host \
  -v podman_containers:/var/lib/containers,rw \
  -v bambooAgentVolume:/var/atlassian/application-data/bamboo-agent \
  --hostname=agent-cuda-12_3 -it --name=bamboo-agent \
  bamboo-agent:latest bash
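(An aside for anyone reproducing this: the nvidia.com/gpu=all device name resolves against a CDI spec on the host, generated with NVIDIA’s container toolkit. Something like the following, where /etc/cdi/nvidia.yaml is the conventional output path:)

sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia-ctk cdi list    # should list nvidia.com/gpu=all among the devices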


My outer container is rootless, running as me. In the (outer) container, I’m only going to run rootful containers. I might be naive, but since the outer container is rootless, my “rootful” inner containers still live entirely in user space. If I instead try to use the podman user from the podman-stable image, I end up with files in my home directory that I don’t own and can’t delete. The UID mapping below shows why.
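The mapping is easy to see from inside the outer container (standard command; the exact ranges depend on /etc/subuid, so these numbers are illustrative):

[root@agent-cuda-12_3 project]# cat /proc/self/uid_map
         0       1001          1
         1     100000      65536

Container UID 0 is just my host UID (1001) wearing a different hat; every other UID maps into my subordinate range, which is why files created by other users inside the container show up in my home directory under UIDs that aren’t mine.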

In this running container, if I try to run anything:

[root@agent-cuda-12_3 project]# podman run --rm -it alpine ls /
WARN[0000] Using cgroups-v1 which is deprecated in favor of cgroups-v2 with Podman v5 and will be removed in a future version. Set environment variable `PODMAN_IGNORE_CGROUPSV1_WARNING` to hide this warning.
Trying to pull docker.io/alpine:latest...
Getting image source signatures
Copying blob 9824c27679d3 done   |
Copying config 9234e8fb04 done   |
Writing manifest to image destination
Error: mounting storage for container ba508c13fcdde1a5cd40de68aad8c62cedecb86d7001d79190b1a3205cae114c: creating overlay mount to /var/lib/containers/storage/overlay/1bd977c57d4226fd4146762e90974f6cd7aed84a9a94716087fa63a205340e5c/merged, mount_data="lowerdir=/var/lib/containers/storage/overlay/l/7EYYG6BVMHUWP3XB5NMRWSYZFL,upperdir=/var/lib/containers/storage/overlay/1bd977c57d4226fd4146762e90974f6cd7aed84a9a94716087fa63a205340e5c/diff,workdir=/var/lib/containers/storage/overlay/1bd977c57d4226fd4146762e90974f6cd7aed84a9a94716087fa63a205340e5c/work,nodev,fsync=0,volatile": using mount program /usr/bin/fuse-overlayfs: unknown argument ignored: lazytime
fuse: device not found, try 'modprobe fuse' first
fuse-overlayfs: cannot mount: No such file or directory
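The host itself has fuse; the missing piece turns out to be the device node inside the outer container. Quick host-side checks (nothing exotic; /dev/fuse is character device 10,229):

$ lsmod | grep fuse
$ ls -l /dev/fuse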


Well, that was annoying. But after a bit of mucking about and searching various posts, I come up with running the outer container like this:

/usr/bin/podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable \
  --cap-add=sys_admin,perfmon --group-add keep-groups \
  --mount type=bind,src=/home/jmelton/projects/bamboo/work,dst=/usr/src/project \
  --workdir /usr/src/project --log-driver k8s-file \
  --log-opt path=/home/jmelton/projects/bamboo/log/bamboo-agent.20250820133608.log \
  --network host --device /dev/fuse \
  -v podman_containers:/var/lib/containers,rw \
  -v bambooAgentVolume:/var/atlassian/application-data/bamboo-agent \
  --hostname=agent-cuda-12_3 -it --name=bamboo-agent \
  bamboo-agent bash
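With --device /dev/fuse, the node shows up inside the agent container (it should list as character device 10, 229) and the fuse-overlayfs complaint goes away:

[root@agent-cuda-12_3 project]# ls -l /dev/fuse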


Close, but no cigar:

[root@agent-cuda-12_3 project]# podman run --rm -it alpine ls /
Trying to pull docker.io/alpine:latest...
Getting image source signatures
Copying blob 9824c27679d3 done   |
Copying config 9234e8fb04 done   |
Writing manifest to image destination
Error: OCI runtime error: crun: mount_setattr `/sys`: Function not implemented



Ultimately, I discovered that if I ran the inner container like this:

[root@agent-cuda-12_3 project]# podman run -v /sys:/sys -it busybox echo hello
Trying to pull docker.io/busybox:latest...
Getting image source signatures
Copying blob 80bfbb8a41a2 done   |
Copying config 0ed463b26d done   |
Writing manifest to image destination
hello


It works. Note that since /sys in the inner container belongs to the outer container, the security risk is low. So, how can I configure podman to always add that mount when I run a container? After some frustrating dialogue with ChatGPT, one would think that /etc/containers/mounts.conf would do the trick. Unfortunately, it doesn’t; my guess is that those mounts happen too late for crun, but that’s idle speculation. The best I could coax out of ChatGPT was a wrapper script that inserts -v /sys:/sys into the podman invocation (sketched below). I would love to find a podman-native solution.
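For reference, the shape of that wrapper (my sketch of ChatGPT’s idea; the shim just has to sit ahead of the real binary in the agent’s PATH):

#!/bin/bash
# /usr/local/bin/podman -- shim in front of /usr/bin/podman.
# Force-mount the outer container's /sys into every inner "podman run".
if [ "$1" = "run" ]; then
    shift
    exec /usr/bin/podman run -v /sys:/sys "$@"
fi
exec /usr/bin/podman "$@"

(The volumes list in the [containers] section of containers.conf looks like it could be the podman-native knob for this, but I have not verified it.)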

My head is sufficiently flattened from beating it against the mount wall that I give up looking for the perfect solution and move on to creating a Quadlet service to run my agent. Whoops: Quadlet requires cgroups v2. OK, edit /etc/default/grub:

GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/rhel_gpuserver1-swap rd.lvm.lv=rhel_gpuserver1/root rd.lvm.lv=rhel_gpuserver1/swap rhgb quiet \
  rd.driver.blacklist=nouveau \
  systemd.unified_cgroup_hierarchy=1"


sudo grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
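After the reboot, the switch is easy to confirm (standard commands; cgroup2fs is what a v2 mount reports):

$ stat -fc %T /sys/fs/cgroup
cgroup2fs
$ podman info --format '{{.Host.CgroupsVersion}}'
v2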


Then all hell breaks loose.

/usr/bin/podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable \
  --cap-add=sys_admin,perfmon --group-add keep-groups \
  --mount type=bind,src=/home/jmelton/projects/bamboo/work,dst=/usr/src/project \
  --workdir /usr/src/project --log-driver k8s-file \
  --log-opt path=/home/jmelton/projects/bamboo/log/bamboo-agent.20250820174923.log \
  --network host --cap-add chown --device /dev/fuse --security-opt 'unmask=/sys/*' \
  -v podman_containers:/var/lib/containers,rw \
  -v bambooAgentVolume:/var/atlassian/application-data/bamboo-agent \
  --hostname=agent-cuda-12_3 -dt --name=bamboo-agent \
  bamboo-agent

Error: runc: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: openat2 /sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/user.slice/libpod-047db35e8dc99bee6681f9042215c7760c6ec6fb7a295a5bdb779a17b5711fb7.scope/pids.max: no such file or directory: OCI runtime attempted to invoke a command that was not found


ChatGPT suggests updating podman, so I do, to 4.9.4, the latest AppStream build I can find.

Unfortunately, things continue to go downhill:

/usr/bin/podman run --rm=true --device nvidia.com/gpu=all --security-opt=label=disable \
  --cap-add=sys_admin,perfmon --network=host --workdir=/usr/src/project \
  --mount type=bind,source=/home/jmelton/projects/lbe/muse-gpu,destination=/usr/src/project \
  --mount type=bind,source=/home/data,dst=/data,ro=true \
  -it artifactory.iss.snc:80/docker/lbe/dev ./build.bash

Error: OCI runtime error: runc: runc create failed: unable to start container process: unable to apply cgroup configuration: unable to start unit "libpod-15c84f640fb7343c120c39c468775ebf873a9033ba5e8ed4dcf4bba295700be2.scope" (properties [{Name:Description Value:"libcontainer container 15c84f640fb7343c120c39c468775ebf873a9033ba5e8ed4dcf4bba295700be2"} {Name:Slice Value:"user.slice"} {Name:Delegate Value:true} {Name:PIDs Value:@au [70881]} {Name:MemoryAccounting Value:true} {Name:CPUAccounting Value:true} {Name:IOAccounting Value:true} {Name:TasksAccounting Value:true} {Name:DefaultDependencies Value:false}]): Interactive authentication required.


What in the world does “Interactive authentication required” mean? I do find reports of a similar error on Fedora 37, theoretically fixed by some change to systemd. I’ll try anything at this point, but the newest systemd I can get is 239-82, quite a bit older than the versions where that issue purportedly appeared.
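For context, that error is systemd talking: with the systemd cgroup manager, rootless podman asks my systemd user manager (over D-Bus) to create the container’s scope, and that request is being refused. The session plumbing can be sanity-checked like this (standard commands; lingering is only my guess at the relevant knob):

$ systemctl --user is-system-running
$ loginctl show-user $USER --property=Linger
$ loginctl enable-linger $USER    # keep the user manager alive without a login session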

By now, even ChatGPT is grasping at straws. It says switch to crun. There doesn’t appear to be a downside, so I install crun-1.14.3-2, and edit /usr/share/containers/containers.conf to switch the runtime. Verified:

$ podman info --format '{{.Host.OCIRuntime}}'
{crun crun-1.14.3-2.module+el8.10.0+23250+94af2c8e.x86_64 /usr/bin/crun crun version 1.14.3
commit: 1961d211ba98f532ea52d2e80f4c20359f241a98
rundir: /run/user/1001/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL}
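For anyone repeating this, the runtime switch is a single key in containers.conf (shown as I set it; [engine] is the standard section):

[engine]
runtime = "crun"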


Even the most trivial invocation now fails differently, but not better:

$ podman run --rm busybox ls
Error: OCI runtime error: crun: sd-bus call: Process org.freedesktop.systemd1 exited with status 1: Input/output error
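That sd-bus failure looks like the same D-Bus/user-manager plumbing again. Before giving up on v2, one more knob worth noting: podman can manage cgroups itself instead of asking systemd (a real global flag; whether it would have helped here is speculation):

$ podman --cgroup-manager=cgroupfs run --rm busybox ls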



At this point, I reverted to cgroup v1 so other users could continue working, but I’m completely baffled. While I know lots of people do “podman in podman”, has anyone else tried to do “podman with CUDA in podman”? 

I have a RHEL 9 machine that natively runs cgroups v2 that I will try tomorrow, but I have run out of avenues to pursue.
   


“God can't give us peace and happiness apart from Himself because there is no such thing.” 
― C.S. Lewis

Jim Melton
http://blogs.melton.space/pharisee/