Hi,
I am looking for some guidance on how to securely containerize an
application that depends on the `CAP_SYS_NICE` capability to work.
Outside of the container world, one would probably just set the capability
on the binary so that a non-privileged user could run it:
```
$ my_app
Error!
$ sudo setcap 'cap_sys_nice+ep' my_app
$ my_app
Success!
```
When working with containers, the easiest solution would be to execute
Podman as root with the `--cap-add` parameter:
```
$ sudo podman run --rm --cap-add "sys_nice" \
    -v "$PWD/my_app:/my_app" \
    fedora:34 /my_app
Success!
```
A somewhat more secure option would be to switch to a
non-privileged user with the `--user` parameter:
```
$ sudo podman run --rm --cap-add "sys_nice" \
    -v "$PWD/my_app:/my_app" \
    --user nobody fedora:34 /my_app
Success!
```
Now, in order to mitigate potential container-breakout vulnerabilities, I
would like to go a bit further and set up a rootless container.
I have recently learned about ambient capabilities and I have started
experimenting with the `capsh` command. This seems to work:
```
$ sudo capsh \
    --caps="cap_sys_nice+eip cap_setpcap,cap_setuid,cap_setgid+ep" \
    --keep=1 --user="${USER}" --addamb=cap_sys_nice \
    -- -c ./my_app
Success!
```
But this does not. The ambient capability is not set inside the container, and
`strace` shows the `setpriority` system call failing with
`Permission denied`:
```
$ sudo capsh \
    --caps="cap_sys_nice+eip cap_setpcap,cap_setuid,cap_setgid+ep" \
    --keep=1 --user="${USER}" --addamb=cap_sys_nice \
    -- -c "HOME=${HOME} podman run --rm --cap-add sys_nice -v $PWD/my_app:/my_app fedora:34 /my_app"
Error!
```
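For reference, here is a minimal sketch of how the capability sets can be compared between the host shell and the container. It reads the `Cap*` lines from `/proc/self/status` and tests the bit for `CAP_SYS_NICE` (number 23 in `<linux/capability.h>`). The helper names `cap_bit_set` and `cap_mask` are only illustrative, and the script assumes `awk` is available wherever it runs:

```shell
#!/bin/sh
# Sketch: report whether CAP_SYS_NICE is present in each capability set of
# a freshly exec'ed process (awk here), as read from /proc/self/status.

CAP_SYS_NICE=23   # bit number from <linux/capability.h>

# Succeed if bit number $2 is set in the hexadecimal mask $1.
cap_bit_set() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the hexadecimal mask of the capability set named $1 (e.g. CapAmb).
cap_mask() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/self/status
}

for set in CapInh CapPrm CapEff CapBnd CapAmb; do
    # Skip sets that cannot be read (e.g. no /proc, or an old kernel).
    mask=$(cap_mask "$set" 2>/dev/null) && [ -n "$mask" ] || continue
    if cap_bit_set "$mask" "$CAP_SYS_NICE"; then
        echo "$set: cap_sys_nice present ($mask)"
    else
        echo "$set: cap_sys_nice absent ($mask)"
    fi
done
```

Running it once inside the `capsh` shell and once inside the container (mounted the same way as `/my_app`) should show at which step the `CapAmb` bit disappears.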
Is this a limitation of Podman, and if so, could it be improved? Is there a
better approach?
Thank you,
Vincent Quéméner.