Thanks Dan, that explains a lot. I'm actually fine with this
reduced set; I just didn't suspect CAP_NET_RAW had anything to do with
my problem until I ran strace and found that iptables creates raw
sockets to talk to the kernel. If anything it made me
aware that I should really `--cap-drop ALL` first and then
`--cap-add` whatever is needed to actually run the container.
Joost
That is the most secure way to do things that require a bit of
"root".
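
A minimal sketch of that drop-everything-then-add-back pattern, assuming a POSIX shell. `run_with_caps` is a hypothetical helper, not part of podman; the image and the `grep` command are borrowed from the original message quoted below, and the `PODMAN` variable is an assumption added so the command line can be previewed (for example with `PODMAN=echo`) instead of executed:

```shell
# run_with_caps IMAGE CAP...: start IMAGE with all capabilities dropped,
# then add back only the capabilities named on the command line.
# (Hypothetical helper for illustration; not a podman feature.)
run_with_caps() {
    image="$1"
    shift
    cap_args=""
    for cap in "$@"; do
        cap_args="$cap_args --cap-add $cap"
    done
    # $cap_args is left unquoted on purpose so it splits into separate words.
    "${PODMAN:-podman}" run --rm --cap-drop ALL $cap_args "$image" \
        grep '^Cap' /proc/self/status
}
```

For the wireguard case discussed here, `run_with_caps docker.io/library/alpine:3.12 NET_ADMIN NET_RAW` would show the masks a container gets with only those two capabilities added back.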
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, 6 January 2021 19:15, Daniel Walsh <dwalsh(a)redhat.com> wrote:
> On 1/6/21 12:40, Joost Molenaar wrote:
>
>> Hello all,
>> When I upgraded podman from 2.1.1 to 2.2.1, my wireguard container stopped
>> working because wg-quick returned a 'permission denied' error. After some
>> bug hunting, I found out that starting a (rootful) container like this...
>>
>>     podman run --rm --uidmap 0:60000:1000 --gidmap 0:60000:1000 \
>>         docker.io/library/alpine:3.12 \
>>         grep ^Cap /proc/self/status
>>
>>
>> ...on podman 2.1.1 returns a capability mask of a80425fb for all of CapInh,
>> CapPrm, CapEff, CapBnd and CapAmb, and on podman 2.2.1 returns 800405fb for all
>> of the capability masks except for CapAmb, which is all zeroes.
>> So on podman 2.1.1 my rootful usernamespaced container ran with these capabilities:
>>
>> 0x00000000a80425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
>>
>> And on podman 2.2.1 with these:
>>
>> 0x00000000800405fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_sys_chroot,cap_setfcap
>> So, containers no longer run with CAP_NET_RAW, CAP_MKNOD and CAP_AUDIT_WRITE,
>> which for my wireguard container means I now have to `--cap-add`
>> CAP_NET_RAW in addition to CAP_NET_ADMIN, because `wg-quick` runs `iptables`,
>> which opens some raw sockets to communicate with the kernel.
>> I wonder what is the background of this change, is it intentional? And I also
>> noticed CapAmb is now all zeroes, which seems to have to do with commit bce8f8
>> [1], but reading the code I don't understand why CapAmb is now empty when it
>> seems to be the intention to populate it.
>> Regards,
>> Joost Molenaar
>> [1] https://github.com/containers/podman/commit/bce8f851c1e891aa6159e61c56ccd...
>>
>> Podman mailing list -- podman(a)lists.podman.io
>> To unsubscribe send an email to podman-leave(a)lists.podman.io
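
The two masks in the quoted message differ in exactly three bits, which can be checked without extra tooling. The following is a sketch, assuming a POSIX shell with 64-bit arithmetic; `decode_caps` and `diff_caps` are hypothetical helper names, and the bit order follows capabilities(7) and linux/capability.h up to CAP_CHECKPOINT_RESTORE:

```shell
# Capability names in bit order (bit 0 first).
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid setuid
setpcap linux_immutable net_bind_service net_broadcast net_admin net_raw
ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace sys_pacct
sys_admin sys_boot sys_nice sys_resource sys_time sys_tty_config mknod lease
audit_write audit_control setfcap mac_override mac_admin syslog wake_alarm
block_suspend audit_read perfmon bpf checkpoint_restore"

# decode_caps HEXMASK: print the names of all capabilities set in HEXMASK
# (hex digits without a 0x prefix, as shown in /proc/self/status).
decode_caps() {
    mask=$(( 0x$1 ))
    bit=0
    out=""
    for name in $CAP_NAMES; do
        if [ $(( ($mask >> $bit) & 1 )) -eq 1 ]; then
            out="${out:+$out,}cap_$name"
        fi
        bit=$(( $bit + 1 ))
    done
    printf '%s\n' "$out"
}

# diff_caps OLD NEW: print the capabilities set in OLD but missing from NEW.
diff_caps() {
    decode_caps "$(printf '%x' $(( 0x$1 & ~0x$2 )))"
}
```

Running `diff_caps a80425fb 800405fb` lists the capabilities that 2.2.1 no longer grants by default.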
> First, you can set your own defaults in containers.conf if you want.
> The issue was that several CVEs were reported against certain VPNs that
> allowed users to escape the network when CAP_NET_RAW was enabled. We
> felt that the only legitimate reason to have CAP_NET_RAW on by default was
> ping, so we added a sysctl that is turned on by default, and now
> users can ping without CAP_NET_RAW. Since we were tightening the
> security of containers altogether, we decided to also remove the other
> questionable capabilities, CAP_MKNOD and CAP_AUDIT_WRITE. We have heard
> very few complaints about these being removed, and it made the world of
> containers considerably more secure by default.
>
> You could argue that this should have waited until Podman 3.0,
> but we felt it was important to get it out quickly because of the
> potential vulnerabilities.
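
The containers.conf route mentioned in the quoted reply could look roughly like this. It is a sketch under assumptions: `write_caps_conf` is a hypothetical helper that only writes a fragment to the path you give it, the list is the reduced 2.2.1 set from the original message plus NET_RAW, and `default_capabilities` under `[containers]` is the key documented in containers.conf(5); check the man page for your version before merging it into a real configuration.

```shell
# write_caps_conf PATH: write an example containers.conf fragment to PATH.
# (Hypothetical helper; it does not touch any system configuration itself.)
write_caps_conf() {
    cat > "$1" <<'EOF'
[containers]
# Reduced default set from podman 2.2.1, with NET_RAW added back.
default_capabilities = [
  "CHOWN",
  "DAC_OVERRIDE",
  "FOWNER",
  "FSETID",
  "KILL",
  "NET_BIND_SERVICE",
  "NET_RAW",
  "SETFCAP",
  "SETGID",
  "SETPCAP",
  "SETUID",
  "SYS_CHROOT"
]
EOF
}
```

For example, `write_caps_conf ./containers.conf` produces a file to review; for rootful podman the system-wide file is typically `/etc/containers/containers.conf`. Adding capabilities per container with `--cap-add` keeps the safer defaults for everything else.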