Thanks Dan, that explains a lot. I'm actually fine with this
reduced set; I didn't suspect CAP_NET_RAW had anything to do with
my problem until I ran strace and found that iptables creates raw
sockets to talk to the kernel. If anything, it made me aware
that I should really `--cap-drop ALL` first and then
`--cap-add` whatever is needed to actually run the container.
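A minimal sketch of that drop-everything-then-add-back approach, assuming
the container only needs NET_ADMIN and NET_RAW as discussed in this thread.
The helper name `cap_args` is hypothetical, and the leading `echo` only
prints the command instead of running it:

```shell
#!/bin/sh
# cap_args CAP...: print podman flags that drop every capability and
# then add back only the ones named as arguments.
cap_args() {
    args="--cap-drop ALL"
    for cap in "$@"; do
        args="$args --cap-add $cap"
    done
    printf '%s\n' "$args"
}

# Dry run: remove "echo" to actually start the container.
# Word splitting of the cap_args output is intentional here.
echo podman run --rm $(cap_args NET_ADMIN NET_RAW) \
    docker.io/library/alpine:3.12 grep '^Cap' /proc/self/status
```

Starting from an empty set means a change in podman's defaults can no
longer silently add or remove a capability the container relies on.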
Joost
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, 6 January 2021 19:15, Daniel Walsh <dwalsh(a)redhat.com> wrote:
On 1/6/21 12:40, Joost Molenaar wrote:
> Hello all,
> When I upgraded podman from 2.1.1 to 2.2.1, my wireguard container stopped
> working because wg-quick returned a 'permission denied' error. After some
> bug hunting, I found out that starting a (rootful) container like this...
>
> podman run --rm --uidmap 0:60000:1000 --gidmap 0:60000:1000 \
> docker.io/library/alpine:3.12 \
> grep ^Cap /proc/self/status
>
>
> ...on podman 2.1.1 returns a capability mask of a80425fb for all of CapInh,
> CapPrm, CapEff, CapBnd and CapAmb, and on podman 2.2.1 returns 800405fb for all
> of the capability masks except for CapAmb, which is all zeroes.
> So on podman 2.1.1 my rootful usernamespaced container ran with these capabilities:
>
> 0x00000000a80425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
> And on podman 2.2.1 with these:
>
> 0x00000000800405fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap
> So, containers no longer run with CAP_NET_RAW, CAP_MKNOD and CAP_AUDIT_WRITE,
> which for my wireguard container means I now have to `--cap-add`
> CAP_NET_RAW in addition to CAP_NET_ADMIN, because `wg-quick` runs `iptables`,
> which opens some raw sockets to communicate with the kernel.
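The two masks quoted above can be decoded and compared bit by bit. The
following is a minimal sketch, assuming the bit numbering from
linux/capability.h (bit 0 is CAP_CHOWN, bit 13 is CAP_NET_RAW, and so on).
The function names `decode_caps` and `diff_caps` are made up for this
example; `capsh --decode=MASK` from libcap does the decoding part for you:

```shell
#!/bin/sh
# Capability names in bit order (bit 0 first), per linux/capability.h.
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid
setuid setpcap linux_immutable net_bind_service net_broadcast net_admin
net_raw ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace
sys_pacct sys_admin sys_boot sys_nice sys_resource sys_time
sys_tty_config mknod lease audit_write audit_control setfcap
mac_override mac_admin syslog wake_alarm block_suspend audit_read
perfmon bpf checkpoint_restore"

# decode_caps HEXMASK: print the capabilities set in the mask
# (hex digits without a 0x prefix, as shown in /proc/self/status).
decode_caps() {
    dc_mask=$(( 0x$1 ))
    dc_bit=0
    dc_out=""
    for dc_name in $CAP_NAMES; do
        if [ $(( (dc_mask >> dc_bit) & 1 )) -eq 1 ]; then
            dc_out="${dc_out:+$dc_out,}cap_$dc_name"
        fi
        dc_bit=$(( dc_bit + 1 ))
    done
    printf '%s\n' "$dc_out"
}

# diff_caps OLD NEW: print capabilities set in OLD but not in NEW.
diff_caps() {
    removed=$(( 0x$1 & ~0x$2 ))
    decode_caps "$(printf '%x' "$removed")"
}

decode_caps 800405fb
diff_caps a80425fb 800405fb
```

The second call lists the capabilities that are in the 2.1.1 mask but
missing from the 2.2.1 mask.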
> I wonder what the background of this change is; is it intentional? I also
> noticed CapAmb is now all zeroes, which seems to have to do with commit bce8f8
> [1], but reading the code I don't understand why CapAmb is now empty when it
> seems to be the intention to populate it.
> Regards,
> Joost Molenaar
> [1] https://github.com/containers/podman/commit/bce8f851c1e891aa6159e61c56ccd...
>
> Podman mailing list -- podman(a)lists.podman.io
> To unsubscribe send an email to podman-leave(a)lists.podman.io
First, you can set your own defaults in containers.conf if you want.
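As a rough sketch of such an override, the script below prints a
containers.conf fragment that would restore the fourteen capabilities
listed for podman 2.1.1 earlier in this thread. It assumes the
`default_capabilities` key in the `[containers]` table and capability
names written without the CAP_ prefix; check containers.conf(5) for your
version before relying on it. The function name `print_caps_conf` is
hypothetical:

```shell
#!/bin/sh
# Print a containers.conf fragment with the podman 2.1.1 capability set:
# the eleven 2.2.1 defaults plus NET_RAW, MKNOD and AUDIT_WRITE.
print_caps_conf() {
    cat <<'EOF'
[containers]
default_capabilities = [
  "AUDIT_WRITE",
  "CHOWN",
  "DAC_OVERRIDE",
  "FOWNER",
  "FSETID",
  "KILL",
  "MKNOD",
  "NET_BIND_SERVICE",
  "NET_RAW",
  "SETFCAP",
  "SETGID",
  "SETPCAP",
  "SETUID",
  "SYS_CHROOT",
]
EOF
}

# Review the output, then merge it by hand into
# /etc/containers/containers.conf or ~/.config/containers/containers.conf.
print_caps_conf
```

Restoring the old list brings back the exposure described below, so a
per-container `--cap-add` is usually the narrower fix.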
The issue was that several CVEs were reported against certain VPNs that
allowed users to escape the network when CAP_NET_RAW was enabled. We
felt that the only legitimate reason to have CAP_NET_RAW on by default
was ping, so we added a sysctl, set by default, that lets users ping
without CAP_NET_RAW. Since we were tightening the security of
containers altogether, we decided to also remove the other
questionable capabilities, CAP_MKNOD and CAP_AUDIT_WRITE. We have heard
very few complaints about these being removed, and the change made the
world of containers considerably more secure by default.
You can argue that this should not have happened until Podman 3.0,
but we felt it was important to get it out quickly because of the
potential vulnerabilities.