Hello all,
When I upgraded podman from 2.1.1 to 2.2.1, my wireguard container stopped
working because wg-quick returned a 'permission denied' error. After some
bug hunting, I found out that starting a (rootful) container like this...
podman run --rm --uidmap 0:60000:1000 --gidmap 0:60000:1000 \
docker.io/library/alpine:3.12 \
grep ^Cap /proc/self/status
...on podman 2.1.1 returns a capability mask of a80425fb for all of CapInh,
CapPrm, CapEff, CapBnd and CapAmb, and on podman 2.2.1 returns 800405fb for all
of the capability masks except for CapAmb, which is all zeroes.
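
For anyone who wants to compare these masks programmatically instead of by eye, here is a minimal sketch that parses the `Cap*` lines into integers. The helper name `parse_cap_masks` is hypothetical, and the sample text is reconstructed from the 2.2.1 values reported above rather than copied from real output.

```python
def parse_cap_masks(status_text):
    """Return a dict such as {'CapEff': 0x800405fb} from /proc/<pid>/status-style text."""
    masks = {}
    for line in status_text.splitlines():
        if line.startswith("Cap"):
            # Each line looks like "CapEff:<tab>00000000800405fb"
            name, _, value = line.partition(":")
            masks[name] = int(value.strip(), 16)
    return masks


# Assumption: sample rebuilt from the masks quoted in this mail (podman 2.2.1).
SAMPLE_STATUS = (
    "CapInh:\t00000000800405fb\n"
    "CapPrm:\t00000000800405fb\n"
    "CapEff:\t00000000800405fb\n"
    "CapBnd:\t00000000800405fb\n"
    "CapAmb:\t0000000000000000\n"
)

if __name__ == "__main__":
    for name, mask in parse_cap_masks(SAMPLE_STATUS).items():
        print(f"{name}: {mask:#018x}")
```

Inside a container the same function could be fed the contents of `/proc/self/status` directly.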
So on podman 2.1.1 my rootful usernamespaced container ran with these capabilities:
0x00000000a80425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
And on podman 2.2.1 with these:
0x00000000800405fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap
So, containers no longer run with CAP_NET_RAW, CAP_MKNOD and CAP_AUDIT_WRITE,
which for my wireguard container means I now have to `--cap-add`
CAP_NET_RAW in addition to CAP_NET_ADMIN, because `wg-quick` runs `iptables`,
which opens some raw sockets to communicate with the kernel.
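
The difference between the two masks can also be computed rather than read off the decoded lists. The sketch below is only an illustration: the `CAP_NAMES` table is typed in by hand from the bit order in linux/capability.h and covers bits 0 to 31 only, and the helper name `decode` is hypothetical.

```python
# Capability names indexed by bit number (bits 0..31 only; higher bits omitted).
CAP_NAMES = [
    "chown", "dac_override", "dac_read_search", "fowner",
    "fsetid", "kill", "setgid", "setuid",
    "setpcap", "linux_immutable", "net_bind_service", "net_broadcast",
    "net_admin", "net_raw", "ipc_lock", "ipc_owner",
    "sys_module", "sys_rawio", "sys_chroot", "sys_ptrace",
    "sys_pacct", "sys_admin", "sys_boot", "sys_nice",
    "sys_resource", "sys_time", "sys_tty_config", "mknod",
    "lease", "audit_write", "audit_control", "setfcap",
]


def decode(mask):
    """List the names of the capabilities whose bits are set in mask."""
    return [
        f"cap_{CAP_NAMES[bit]}"
        for bit in range(len(CAP_NAMES))
        if (mask >> bit) & 1
    ]


OLD_MASK = 0xA80425FB  # reported for podman 2.1.1
NEW_MASK = 0x800405FB  # reported for podman 2.2.1

# Bits set in the old mask but cleared in the new one.
dropped = decode(OLD_MASK & ~NEW_MASK)

if __name__ == "__main__":
    print("dropped:", ",".join(dropped))
```

With the two masks from this mail, the dropped set works out to cap_net_raw, cap_mknod and cap_audit_write.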
What is the background of this change? Is it intentional? I also
noticed that CapAmb is now all zeroes, which seems to be related to commit
bce8f8 [1], but reading the code I don't understand why CapAmb is now empty
when the intention seems to be to populate it.
Regards,
Joost Molenaar
[1]
https://github.com/containers/podman/commit/bce8f851c1e891aa6159e61c56ccd...
_______________________________________________
Podman mailing list -- podman(a)lists.podman.io
To unsubscribe send an email to podman-leave(a)lists.podman.io
First, you can set your own defaults in containers.conf if you want.
The issue was that several CVEs were reported against certain VPNs that
allowed users to escape the network when CAP_NET_RAW was enabled. We
felt that the only legitimate reason to have CAP_NET_RAW on by default was
ping, so we added a sysctl, set by default, that lets users ping without
CAP_NET_RAW. Since we were tightening security for containers overall, we
also decided to remove the other questionable capabilities, CAP_MKNOD and
CAP_AUDIT_WRITE. We have heard very few complaints about these being
removed, and the change makes containers considerably more secure by
default.
You can argue that this should have waited until Podman 3.0,
but we felt it was important to get it out quickly because of the
potential vulnerabilities.