I'm a bit surprised to hear it's not suggested to use a static name; if we
remove the name from the equation, it would create a new container every
time the service starts and the old one would be "orphaned".
In the case that the conmon process gets killed, the container would keep
running. Maybe I'm not 100% sure of how it's supposed to work: would
removing the PID file stop the container if it uses --conmon-pidfile?
If it's already stopped, I don't see how podman could detect the
PID file being removed and proceed to remove the container. From my POV
it doesn't seem like there is any clean-up.
Adding ExecStartPre=/usr/bin/podman rm -f $NAME seems to have
worked. Thanks for the suggestion.
I tried killing conmon and the container manually to see how it would
behave, and there weren't any issues. I did get a locking issue in
ExecStop on podman rm, though I haven't been able to reproduce it.
I'm running on a relatively tiny VM, which might explain hitting the
default timeout of 10s. I tried upping it to 30 and there were no more
issues - thanks :)
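In case it helps anyone else, the change amounts to something like the following in the unit file (a sketch only; I'm assuming the stop timeout was the one being hit, and the container name and values are illustrative):

```ini
[Service]
# Give the container 30s to stop gracefully before SIGKILL,
# and let systemd wait slightly longer than podman does.
ExecStop=/usr/bin/podman stop -t 30 bitwarden
TimeoutStopSec=40
```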
Eric Gustavsson, RHCSA
He/Him/His
Software Engineer
Red Hat
IM: Telegram: @SpyTec
E1FE 044A E0DE 127D CBCA E7C7 BD1B 8DF2 C5A1 5384
On Mon, 3 Feb 2020 at 11:53, Valentin Rothberg <rothberg(a)redhat.com> wrote:
On Mon, Feb 3, 2020 at 11:39 AM Eric Gustavsson <egustavs(a)redhat.com> wrote:
>
> Hi Valentin,
>
> Thanks for the suggestion, I gave it a try. Although it creates the container now, if
> the main process ever gets killed or we restart, it will complain that the container is
> already in use.
If possible, I suggest not using a static name for the container and letting podman
choose a random one. This way we can prevent conflicts.
In case we need to assign a name to the container, we could add the following line to the
service:
ExecStartPre=/usr/bin/podman rm -f $NAME
With $NAME being the container's name.
Would that work for you?
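In unit-file terms, the idea is something like this (a sketch; "bitwarden" stands in for $NAME, and the leading "-" tells systemd to ignore the non-zero exit when no old container exists):

```ini
[Service]
# Remove any leftover container with the same name before starting a new one.
ExecStartPre=-/usr/bin/podman rm -f bitwarden
ExecStart=/usr/bin/podman run --name bitwarden bitwardenrs/server:latest
```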
>
> If we now run it, get the Main PID and kill it, the container will keep running.
> spytec@KeyraGuest1:~$ systemctl --user start container-bitwarden
> spytec@KeyraGuest1:~$ systemctl --user status container-bitwarden
> [...]
> Main PID: 8828 (conmon)
> [...]
> spytec@KeyraGuest1:~$ sudo kill -9 8828
> spytec@KeyraGuest1:~$
>
> Looking at journalctl (the output is identical when running podman stop bitwarden
> instead of kill):
> [...]
> Feb 03 10:28:01 KeyraGuest1 podman[8812]: 59932a3cb11ac5a95518fb5b016de23851d189483caf525ef2d4c1a67f3525da
> Feb 03 10:28:01 KeyraGuest1 systemd[837]: Started Bitwarden Podman container.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: container-bitwarden.service: Main process exited, code=killed, status=9/KILL
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: container-bitwarden.service: Failed with result 'signal'.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: container-bitwarden.service: Service RestartSec=100ms expired, scheduling restart.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: container-bitwarden.service: Scheduled restart job, restart counter is at 1.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: Stopped Bitwarden Podman container.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: container-bitwarden.service: Found left-over process 8839 (bitwarden_rs) in control group while starting unit. Ignoring.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: Starting Bitwarden Podman container...
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: container-bitwarden.service: Found left-over process 8839 (bitwarden_rs) in control group while starting unit. Ignoring.
> Feb 03 10:28:21 KeyraGuest1 systemd[837]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
> Feb 03 10:28:22 KeyraGuest1 podman[8918]: Error: error creating container storage: the container name "bitwarden" is already in use by "59932a3cb11ac5a95518fb5b016de23851d189483caf525ef2d4c1a67f3525da"
>
> The new service file:
> [Unit]
> Description=Bitwarden Podman container
>
> [Service]
> Restart=on-failure
> ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
> ExecStart=/usr/bin/podman run --conmon-pidfile /%t/%n-pid --env-file=/home/spytec/Bitwarden/bitwarden.conf -d -p 8080:8080 -v /home/spytec/Bitwarden/bw-data:/data/:Z --name 'bitwarden' bitwardenrs/server:latest
> ExecStop=/usr/bin/podman rm -f --cid-file /%t/%n-cid
> KillMode=none
> Type=forking
> PIDFile=/%t/%n-pid
>
> [Install]
> WantedBy=multi-user.target
>
> I tried running podman rm myself as well, and I had to run it twice.
> spytec@KeyraGuest1:~$ podman rm -f bitwarden
> Error: cannot remove container 59932a3cb11ac5a95518fb5b016de23851d189483caf525ef2d4c1a67f3525da as it could not be stopped: timed out waiting for file /run/user/1000/libpod/tmp/exits/59932a3cb11ac5a95518fb5b016de23851d189483caf525ef2d4c1a67f3525da: internal libpod error
> spytec@KeyraGuest1:~$ podman rm -f bitwarden
> 59932a3cb11ac5a95518fb5b016de23851d189483caf525ef2d4c1a67f3525da
Interesting. We have a default timeout of 10 seconds. Depending on the host's load
and the container, it might take longer than 10 seconds, which would explain your
observation.
>
> Thanks,
> Eric
>
> On Mon, 3 Feb 2020 at 09:19, Valentin Rothberg <rothberg(a)redhat.com> wrote:
>>
>>
>>
>> On Sat, Feb 1, 2020 at 1:32 AM Eric Gustavsson <egustavs(a)redhat.com> wrote:
>>>
>>> Given that one container worked right off the bat, I removed the bitwarden
>>> container and started up a new one from an image. Though now there are systemd issues.
>>> conmon.pid seems to disappear as soon as I start the container service. The only
>>> modification I made to container-bitwarden.service was to add User and Group.
>>>
>>> spytec@KeyraGuest1:~$ podman run --name bitwarden [...]
>>> spytec@KeyraGuest1:~$ podman generate systemd --name bitwarden -f
>>> /home/spytec/container-bitwarden.service
>>> spytec@KeyraGuest1:~$ ls -l /run/user/1000/vfs-containers/ed80bfc884aac5ba8c3046f148d686b891b05e21585be8461997c82fa2909223/userdata | grep conmon
>>> -rw-r--r--. 1 spytec spytec 4 Feb 1 00:22 conmon.pid
>>> spytec@KeyraGuest1:~$ sudo systemctl enable /usr/lib/systemd/system/container-bitwarden.service
>>> Created symlink /etc/systemd/system/multi-user.target.wants/container-bitwarden.service → /usr/lib/systemd/system/container-bitwarden.service.
>>> spytec@KeyraGuest1:~$ sudo systemctl start container-bitwarden
>>> Job for container-bitwarden.service failed because the control process exited with error code.
>>> See "systemctl status container-bitwarden.service" and "journalctl -xe" for details.
>>> spytec@KeyraGuest1:~$ ls -l /run/user/1000/vfs-containers/ed80bfc884aac5ba8c3046f148d686b891b05e21585be8461997c82fa2909223/userdata | grep conmon
>>> spytec@KeyraGuest1:~$
>>>
>>> from journalctl of the service:
>>> Feb 01 00:12:14 KeyraGuest1 systemd[1]: Starting Podman container-bitwarden.service...
>>> Feb 01 00:12:14 KeyraGuest1 podman[5867]: bitwarden
>>> Feb 01 00:12:14 KeyraGuest1 systemd[1]: container-bitwarden.service: Can't open PID file /run/user/1000/vfs-containers/f58e338ec3ff083fff993c97c715665bbc243eaf48ddf115095209d997982182/userdata/conmon.pid (yet?) after start: No such file>
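The "(yet?)" in that message suggests systemd read the PIDFile before conmon had written it, i.e. a race. The shape of that race can be simulated in plain shell (paths, timings, and the pid value are illustrative only; this is not how podman itself behaves):

```shell
#!/bin/sh
# A background job stands in for conmon, writing its pid file slightly late;
# the loop stands in for a reader that polls instead of failing immediately.
PIDFILE=$(mktemp -u)                       # stand-in for .../userdata/conmon.pid
( sleep 0.3; echo 4242 > "$PIDFILE" ) &    # "conmon" writes its pid after a delay
for i in $(seq 1 50); do                   # poll up to ~5s for the file to appear
    [ -s "$PIDFILE" ] && break
    sleep 0.1
done
cat "$PIDFILE"                             # a plain read here would have raced
```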
>>
>>
>> I haven't seen this error yet. It could very well be a race. We recently
>> published a blog post about improved systemd services [1] that create new containers
>> on each start. I suggest using that approach instead. With the next release of Podman,
>> podman generate systemd comes with a --new flag to generate such service files.
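For context, a --new-style unit creates a fresh container on every start and removes it on stop, along these lines (a sketch, not the exact generated file; the image and name are assumed):

```ini
[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f %t/%n-pid %t/%n-cid
ExecStart=/usr/bin/podman run --conmon-pidfile %t/%n-pid --cidfile %t/%n-cid -d --name bitwarden bitwardenrs/server:latest
ExecStop=/usr/bin/podman stop --ignore --cidfile %t/%n-cid -t 10
ExecStopPost=/usr/bin/podman rm --ignore -f --cidfile %t/%n-cid
PIDFile=%t/%n-pid
KillMode=none
Type=forking
```

Because the container is created and removed on each run, the "name already in use" conflict cannot occur.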
>>
>> Kind regards,
>> Valentin
>>
>> [1] https://www.redhat.com/sysadmin/podman-shareable-systemd-services
>>
>>>
>>> Thanks,
>>> Eric
>>>
>>>
>>> On Fri, 31 Jan 2020 at 15:42, Eric Gustavsson <egustavs(a)redhat.com> wrote:
>>>>
>>>> Hi Matt,
>>>>
>>>> Seems I forgot to include transcripts of me killing those processes. Even
>>>> though I do kill the processes and try to run again, it still hangs.
>>>> Doing podman ps after killing the processes works, but starting my
>>>> bitwarden container just doesn't go anywhere no matter if I kill all processes or
>>>> restart - it always hangs.
>>>> For slirp4netns I had 0.3.0-2.git4992082.fc30. Upgrading to
>>>> 0.4.0-4.git19d199a.fc30 did make podman run work, and testing on that new container
>>>> seems to work fine. Would my bitwarden container be corrupted somehow?
>>>>
>>>> spytec@KeyraGuest1:~$ podman run -p 5432:5432 -d --name test postgres
>>>> d73a11bc83b1de7811b8a6eb393e7c7de2ea98dda968ae11c4b490b1c16eb444
>>>> spytec@KeyraGuest1:~$ podman ps
>>>> CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
>>>> d73a11bc83b1 docker.io/library/postgres:latest postgres 8 seconds ago Up 3 seconds ago 0.0.0.0:5432->5432/tcp test
>>>> spytec@KeyraGuest1:~$ podman stop test
>>>> d73a11bc83b1de7811b8a6eb393e7c7de2ea98dda968ae11c4b490b1c16eb444
>>>> spytec@KeyraGuest1:~$ podman start test
>>>> test
>>>> spytec@KeyraGuest1:~$ podman start bitwarden
>>>> ^C
>>>>
>>>>
>>>> On Fri, 31 Jan 2020 at 15:33, Matt Heon <mheon(a)redhat.com> wrote:
>>>>>
>>>>> On 2020-01-31 15:11, Eric Gustavsson wrote:
>>>>> >Hi all,
>>>>> >
>>>>> >I have a unit file generated by podman running, though as soon as I run it
>>>>> >there are issues with running any other command that needs to do something
>>>>> >with containers. podman ps for example will be completely unresponsive and
>>>>> >not return anything, even after waiting minutes. Not only that, even
>>>>> >running podman start x by itself will hang, as will creating new containers.
>>>>> >
>>>>> >This is with Fedora 30 and Kernel 5.1.8-300.fc30.x86_64
>>>>> >
>>>>> >spytec@KeyraGuest1:~$ podman --version
>>>>> >podman version 1.7.0
>>>>> >spytec@KeyraGuest1:~$ podman start bitwarden -a
>>>>> >^C
>>>>> >spytec@KeyraGuest1:~$ sudo systemctl start bitwarden
>>>>> >^C
>>>>> >spytec@KeyraGuest1:~$ sudo systemctl status bitwarden
>>>>> >[... output omitted...]
>>>>> >Jan 31 13:53:14 KeyraGuest1 systemd[1]: Starting Podman container-bitwarden.service...
>>>>> >spytec@KeyraGuest1:~$ sudo systemctl stop bitwarden
>>>>> >spytec@KeyraGuest1:~$ podman ps
>>>>> >^C
>>>>> >spytec@KeyraGuest1:~$ ps auxww | grep podman
>>>>> >spytec 1097 0.0 0.8 62816 33808 ? S 13:52 0:00 podman
>>>>> >spytec 1171 0.0 1.3 681944 55064 ? Ssl 13:53 0:00 /usr/bin/podman start bitwarden
>>>>> >spytec 1178 0.0 1.4 755824 56680 ? Sl 13:53 0:00 /usr/bin/podman start bitwarden
>>>>> >spytec 1224 0.0 0.0 9360 880 pts/0 S+ 13:54 0:00 grep --color=auto podman
>>>>> >spytec@KeyraGuest1:~$ journalctl -u bitwarden | tail -n 5
>>>>> >Jan 31 13:51:50 KeyraGuest1 systemd[1]: bitwarden.service: Failed with result 'exit-code'.
>>>>> >Jan 31 13:51:50 KeyraGuest1 systemd[1]: Failed to start Podman container-bitwarden.service.
>>>>> >Jan 31 13:53:14 KeyraGuest1 systemd[1]: Starting Podman container-bitwarden.service...
>>>>> >Jan 31 13:54:26 KeyraGuest1 systemd[1]: bitwarden.service: Succeeded.
>>>>> >Jan 31 13:54:26 KeyraGuest1 systemd[1]: Stopped Podman container-bitwarden.service.
>>>>> >spytec@KeyraGuest1:~$ ps auxww | grep podman
>>>>> >spytec 1097 0.0 0.8 62816 33808 ? S 13:52 0:00 podman
>>>>> >spytec 1171 0.0 1.3 682008 55064 ? Ssl 13:53 0:00 /usr/bin/podman start bitwarden
>>>>> >spytec 1178 0.0 1.4 755824 56680 ? Sl 13:53 0:00 /usr/bin/podman start bitwarden
>>>>> >spytec 1235 0.0 0.0 9360 816 pts/0 S+ 13:55 0:00 grep --color=auto podman
>>>>> >spytec@KeyraGuest1:~$ kill 1181
>>>>> >spytec@KeyraGuest1:~$ kill 1097
>>>>> >spytec@KeyraGuest1:~$ podman ps -a
>>>>> >CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
>>>>> >baa2f3d6ed39 docker.io/bitwardenrs/server:latest /bitwarden_rs 3 weeks ago Created 0.0.0.0:8080->8080/tcp bitwarden
>>>>> >
>>>>> >And creating a whole new container
>>>>> >spytec@KeyraGuest1:~$ podman run -d --name test postgres
>>>>> >Trying to pull docker.io/library/postgres...
>>>>> >[... output omitted...]
>>>>> >Writing manifest to image destination
>>>>> >Storing signatures
>>>>> >Error: slirp4netns failed: "/usr/bin/slirp4netns: unrecognized option '--netns-type=path'\nUsage: /usr/bin/slirp4netns [OPTION]... PID TAPNAME\nUser-mode networking for unprivileged network namespaces.\n\n-c, --configure bring up the interface\n-e, --exit-fd=FD specify the FD for terminating slirp4netns\n-r, --ready-fd=FD specify the FD to write to when the network is configured\n-m, --mtu=MTU specify MTU (default=1500, max=65521)\n--cidr=CIDR specify network address CIDR (default=10.0.2.0/24)\n--disable-host-loopback prohibit connecting to 127.0.0.1:* on the host namespace\n-a, --api-socket=PATH specify API socket path\n-6, --enable-ipv6 enable IPv6 (experimental)\n-h, --help show this help and exit\n-v, --version show version and exit\n"
>>>>> >
>>>>> >Thanks,
>>>>> >
>>>>>
>>>>> >_______________________________________________
>>>>> >Podman mailing list -- podman(a)lists.podman.io
>>>>> >To unsubscribe send an email to podman-leave(a)lists.podman.io
>>>>>
>>>>> It seems like you have some Podman processes hanging in the background,
>>>>> judging from what you sent - if you kill those, do things go back to normal
>>>>> with regards to creating new containers, `podman ps`, etc.? This
>>>>> sounds like a hanging process that is holding a lock, preventing
>>>>> anything else from running.
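That "one stuck process blocks everything" behaviour can be illustrated with flock(1). This is only an analogy for a process holding a per-container lock, not libpod's actual locking code:

```shell
#!/bin/sh
# Illustration only: process A takes an exclusive lock and then "hangs",
# so process B's attempt on the same lock fails, mirroring a stuck podman
# process holding a container lock while `podman ps` waits on it.
LOCK=$(mktemp)
( flock -x 9; sleep 2 ) 9>"$LOCK" &   # the "hung" lock holder
sleep 0.2                             # let it acquire the lock first
if flock -n -x 9 9>"$LOCK"; then      # non-blocking attempt on the same lock
    result=acquired
else
    result=blocked
fi
echo "$result"
```

Killing the holder (as suggested above) releases the lock and everything proceeds again.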
>>>>>
>>>>> The slirp error seems like a version mismatch - what are the RPM
>>>>> versions of Podman and slirp4netns over there? I suspect this is not
>>>>> the cause of the hang; it seems like slirp4netns exits immediately,
>>>>> which would not hold a lock (or does it not exit, and hang instead?).
>>>>>
>>>>> Thanks,
>>>>> Matt Heon
>>>