On 7/27/23 14:12, Mark Raynsford via Podman wrote:
Hello!
I'm aware of the age-old advice of not running services as root; I've
been administering UNIX-like systems for decades now.
If you follow the advice given in, for example, this page:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_at...
... What you'll get is a redis container running as root (unless the
redis image drops privileges itself - I don't know, I've never run it).
I've set up a few production systems running services that are inside
podman containers. I'm lucky in that about 98% of the software I use can
run inside completely unprivileged containers. For all of these
containers, I've run each container under its own user ID. The systemd
unit for each, for example, does something along these lines:
[Service]
Type=exec
User=_cardant
Group=_cardant
ExecStart=/usr/bin/podman run ...
However, doing things this way is a little messy. For example, if for
some reason I want to do something like `podman exec` in a container, I
have to `sudo -u _cardant podman exec ...`. `podman ps` will obviously
only show me the containers running for the current user. Additionally,
any images downloaded from the registry for each service will
effectively end up in the home directory of each service user,
complicating storage accounting somewhat. The UIDs/GIDs are yet another
thing I have to manage, even though they don't have any useful meaning
(they don't identify people, they're solely there because the
containers have to run as _something_). Containers also leak internal
UID/GID values (from the /etc/subuid ranges) into the filesystem, which
can complicate things.
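
To make the storage-accounting point concrete, here is a minimal sketch. It assumes each service user keeps its rootless storage in the default location under its home directory (.local/share/containers/storage). The function names are hypothetical, reading other users' directories would normally need root, and hard-linked files are counted once per link, so treat the result as a rough figure only.

```python
import os
import stat
from pathlib import Path

# Default rootless storage location, relative to a user's home directory.
ROOTLESS_STORAGE = Path(".local/share/containers/storage")


def tree_size(root):
    """Sum the apparent size in bytes of regular files under root.

    Symlinks are not followed; unreadable entries are skipped.
    A missing directory simply yields 0.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def storage_usage(homes):
    """Map each home directory to the bytes used by its rootless storage."""
    return {str(h): tree_size(Path(h) / ROOTLESS_STORAGE) for h in homes}
```

Calling storage_usage() with the home directories of the service users (for example, whatever home _cardant was given) returns one total per service, which is the bookkeeping that a single root-owned store would make unnecessary.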
Additionally, there are some containers that stubbornly make it awkward
to run as a non-root user despite not actually needing privileges. The
PostgreSQL image is a good example; you can run it as a non-root user
and it'll switch to another UID inside the container and then that
UID/GID will end up on the database files that are inevitably mounted
inside the container. You'll also have to match these unpredictable
weird UID/GIDs if you want to supply the container with TLS keys/certs,
because postgres will refuse to open them unless the UID/GID matches.
You can't get around this by telling postgres to run as UID 0; it'll
refuse, even though UID 0 inside the container isn't UID 0 outside of
it when running unprivileged.
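
The host-side UID that ends up on those files can at least be computed in advance. The sketch below assumes podman's default rootless mapping with no --userns or --uidmap options: container UID 0 maps to the service user's own UID, and container UIDs 1..N map onto that user's /etc/subuid range in order. The same arithmetic applies to GIDs via /etc/subgid. The function names, the UID 1001, the subuid entry, and the in-container UID 999 are all illustrative assumptions; check what your image actually uses. Only the first subuid entry for a user is considered.

```python
def parse_subuid(text, user):
    """Return (start, count) from the first entry for `user`.

    `text` is in /etc/subuid format: one "name:start:count" per line.
    """
    for line in text.splitlines():
        fields = line.strip().split(":")
        if len(fields) == 3 and fields[0] == user:
            return int(fields[1]), int(fields[2])
    raise KeyError(user)


def host_id(container_id, own_id, sub_start, sub_count):
    """Map a container UID/GID to the host ID under the default rootless mapping."""
    if container_id == 0:
        # Root inside the container is the unprivileged user outside it.
        return own_id
    if 1 <= container_id <= sub_count:
        # IDs 1..sub_count are laid over the subordinate range in order.
        return sub_start + container_id - 1
    raise ValueError(f"container ID {container_id} is not mapped")


if __name__ == "__main__":
    # Hypothetical values: service user UID 1001 with one subuid entry.
    start, count = parse_subuid("_cardant:165536:65536\n", "_cardant")
    print(host_id(999, 1001, start, count))  # prints 166534
```

Under these assumptions, a TLS key intended for a process running as UID 999 inside the container would need to be owned by host UID 166534 before being mounted.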
I'm running all of these services on systems that have SELinux in
enforcing mode. My understanding is that containers will all have the
container_t domain and therefore even if they all ran as root, a
compromised container would not be able to do any meaningful harm to
the system.
I'm therefore not certain if the usual "don't run as root" advice
applies as containers don't have the same security properties
(especially when combined with SELinux).
I feel like it'd simplify things if I could safely run all of
the containers as root. At the very least, I'd be able to predict
UID/GID values inside the containers from outside!
I can't get any clear advice on how people are expected to run podman
containers in production. All of the various bits of documentation in
Linux distributions that talk about running under systemd never
bother to talk about UIDs or GIDs. Any documentation on running podman
rootless seems to only talk about it in the context of developers
running unprivileged containers on their local machines for
experimentation/development. If you set up containers via Fedora
Server's cockpit UI, you'll get containers running as root everywhere.
What is the best practice here?

Running containers with the least privileges possible is always the
goal, but how far you can take that really is up to the application.
I wrote a blog post a couple of years ago, which I never published, about
running containers as root but with --userns=auto, which automatically
picks a separate user namespace for each container. I kind of like this
idea, but it has one key weakness: the podman command itself would be
running as root, so there is a potential attack from just pulling the
image and writing it to disk. A bug in the podman command or the
storage layer could potentially allow a malformed image to overwrite
content on the system outside of containers/storage.
The issue with running all of your containers as a non-root user is
that, if a container were to escape confinement (SELinux), it would be
able to attack that user account and every other container running
under it.