Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
Steps to reproduce the issue:
sudo podman run --name=test_port -d -p 9999:9999 ubuntu:18.04 sleep infinity
sudo podman --log-level debug restart test_port

Describe the results you received:
When restarting, it shows "address already in use".
If I stop and then start the container, however, it works.
DEBU[0000] Initializing boltdb state at /home/docker-data/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /home/docker-data
DEBU[0000] Using run root /var/run/containers/storage
DEBU[0000] Using static dir /home/docker-data/libpod
DEBU[0000] Using tmp dir /var/run/libpod
DEBU[0000] Using volume path /home/docker-data/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] cached value indicated that overlay is supported
DEBU[0000] cached value indicated that metacopy is not being used
DEBU[0000] cached value indicated that native-diff is usable
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Initializing event backend journald
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist
DEBU[0000] Setting maximum workers to 2
DEBU[0000] Stopping ctr 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 (timeout 10)
DEBU[0000] Stopping container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 (PID 1253)
DEBU[0000] Sending signal 15 to container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767
DEBU[0010] container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 did not die within timeout 10000000000
WARN[0010] Timed out stopping container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767, resorting to SIGKILL
DEBU[0010] Created root filesystem for container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 at /home/docker-data/overlay/f1399b213c963b07f07ccd091f8b3ce133d4aaacec12b859c99fd1b106c2a024/merged
DEBU[0010] Recreating container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 in OCI runtime
DEBU[0010] Successfully cleaned up container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767
DEBU[0010] /etc/system-fips does not exist on host, not mounting FIPS mode secret
DEBU[0010] Setting CGroups for container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 to machine.slice:libpod:5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767
DEBU[0010] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0010] reading hooks from /etc/containers/oci/hooks.d
DEBU[0010] Created OCI spec for container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 at /home/docker-data/overlay-containers/5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767/userdata/config.json
DEBU[0010] /usr/libexec/podman/conmon messages will be logged to syslog
DEBU[0010] running conmon: /usr/libexec/podman/conmon args="[-s -c 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 -u 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 -n test_port -r /usr/sbin/runc -b /home/docker-data/overlay-containers/5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767/userdata -p /var/run/containers/storage/overlay-containers/5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767/userdata/pidfile --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/docker-data --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 --socket-dir-path /var/run/libpod/socket -l k8s-file:/home/docker-data/overlay-containers/5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767/userdata/ctr.log --log-level debug --syslog]"
DEBU[0010] Cleaning up container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767
DEBU[0010] Tearing down network namespace at /var/run/netns/cni-a36cd478-3857-e576-6e7d-face72a879d1 for container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767
INFO[0010] Got pod network &{Name:test_port Namespace:test_port ID:5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 NetNS:/var/run/netns/cni-a36cd478-3857-e576-6e7d-face72a879d1 PortMappings:[{HostPort:9999 ContainerPort:9999 Protocol:tcp HostIP:}] Networks:[] NetworkConfig:map[]}
INFO[0010] About to del CNI network podman (type=bridge)
DEBU[0010] unmounted container "5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767"
DEBU[0010] Failed to restart container 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767: cannot listen on the TCP port: listen tcp4 :9999: bind: address already in use
DEBU[0010] Worker#0 finished job [(*LocalRuntime) Restart func1]/5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767 (cannot listen on the TCP port: listen tcp4 :9999: bind: address already in use)
DEBU[0010] Pool[restart, 5871e240b73da769154be80cc039e9fdbd0bbe8167f0f3d02e3649cd0e7d5767: cannot listen on the TCP port: listen tcp4 :9999: bind: address already in use]
ERRO[0010] cannot listen on the TCP port: listen tcp4 :9999: bind: address already in use
Describe the results you expected:
Should be able to restart successfully.
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version:
podman version 1.4.4
Output of podman info --debug:
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.8
  podman version: 1.4.4
host:
  BuildahVersion: 1.9.0
  Conmon:
    package: podman-1.4.4-2.el7.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 0.3.0, commit: unknown'
  Distribution:
    distribution: '"rhel"'
    version: "7.6"
  MemFree: 176705536
  MemTotal: 1920000000
  OCIRuntime:
    package: containerd.io-1.2.5-3.1.el7.x86_64
    path: /usr/sbin/runc
    version: |-
      runc version 1.0.0-rc6+dev
      commit: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 1
  hostname: MASKED
  kernel: 3.10.0-957.21.3.el7.MASKED.20190617.34.x86_64
  os: linux
  rootless: false
  uptime: 114h 57m 18.7s (Approximately 4.75 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.centos.org
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 8
  GraphDriverName: overlay
  GraphOptions: null
  GraphRoot: /home/docker-data
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 15
  RunRoot: /var/run/containers/storage
  VolumePath: /home/docker-data/volumes
Additional environment details (AWS, VirtualBox, physical, etc.):
OpenStack VM
@mheon Looks like a race condition where the container is not fully stopped before it is started again?
Looks like it. We're hitting the timeout, sending SIGKILL, and then trying to start.
Perhaps in restart we should wait for the container cleanup to complete before starting?
@rhatdan, we're getting closer. The issue is that the network isn't cleaned up when stopping the container during restart.
Ah, there's more behind it. Still tracking it down. Maybe @mheon or @baude know where to look?
Looks like we need to find a way to avoid binding the ports (and to store them in the container instead).
The ports are held open by Conmon. We don't want to avoid new bindings entirely - this is likely the old container's Conmon not dying in time, and we won't have the ports held open if we don't try to reserve them again.
Per the logs, we timed out killing with the normal stop signal, and resorted to SIGKILL - but the container did die. Restart should be waiting for the exit file to be created before transitioning to the 'start' part, so we should have a guarantee that conmon is dead (or very close to it) by the time we're getting to restarting the container. However, seemingly, the old conmon is managing to stick around.
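For anyone reproducing this, a quick way to confirm that a stale conmon is what still holds the published port (a diagnostic sketch; 9999 is the host port from the reproducer above):

# Show the listener and the owning process after the failed restart
sudo ss -ltnp 'sport = :9999'
# Equivalent with lsof:
sudo lsof -iTCP:9999 -sTCP:LISTEN

If the PID shown belongs to a conmon process, that is the old container's monitor outliving the SIGKILL.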
@mheon I am on exactly that track atm, playing around with killing conmon. Looks like we're hitting a race.
It's a bug in conmon; I'm preparing a fix.
This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.
We fixed this one by forcibly killing Conmon on restart, I believe
I seem to be running into this issue. When starting a (root) container again after stopping it, I get: unable to start container "dcn": cannot listen on the TCP port: listen tcp4 :39123: bind: address already in use
EDIT: What conmon version was this fixed in? Maybe I don't have the fix in the latest podman release yet.
EDIT2:
Logs:
DEBU[0000] using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/lib/containers/storage
DEBU[0000] Using run root /var/run/containers/storage
DEBU[0000] Using static dir /var/lib/containers/storage/libpod
DEBU[0000] Using tmp dir /var/run/libpod
DEBU[0000] Using volume path /var/lib/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] cached value indicated that overlay is supported
DEBU[0000] cached value indicated that metacopy is not being used
DEBU[0000] cached value indicated that native-diff is usable
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Initializing event backend journald
DEBU[0000] using runtime "/usr/sbin/runc"
WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist
DEBU[0000] Made network namespace at /var/run/netns/cni-ab20fd67-055b-0cf3-89cc-180ad7222413 for container 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8
INFO[0000] Got pod network &{Name:dcn Namespace:dcn ID:0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 NetNS:/var/run/netns/cni-ab20fd67-055b-0cf3-89cc-180ad7222413 Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[{HostPort:39123 ContainerPort:22 Protocol:tcp HostIP:}] Bandwidth:<nil> IpRanges:[]}]}
INFO[0000] About to add CNI network cni-loopback (type=loopback)
DEBU[0000] overlay: mount_data=lowerdir=/var/lib/containers/storage/overlay/l/CNS7KFQZNRKEVVKUEZQUCEYZ4Y:/var/lib/containers/storage/overlay/l/U74DVPD2TW4QCUCVGFYDMDEKDQ:/var/lib/containers/storage/overlay/l/QV6WF7LAWVMLMGUYHHQ2GW4QYC:/var/lib/containers/storage/overlay/l/UJY6PXZTMY37QO7KIRSEFLIQVB:/var/lib/containers/storage/overlay/l/XOTIGD4LKSS4DHRWWWA4OYPLCW:/var/lib/containers/storage/overlay/l/UHY7O5NDAP4OWEJV4ICTJLDHCN:/var/lib/containers/storage/overlay/l/KD27AID45U7QBLDLACU23VKYSB:/var/lib/containers/storage/overlay/l/VOGUIGMYO74L23BF4CP3U74PHC:/var/lib/containers/storage/overlay/l/QZVOALXN67FMJJEOL7H4BXIOWF:/var/lib/containers/storage/overlay/l/AOQO6WE5LKDGQ2HJVSHJ3YBKQY:/var/lib/containers/storage/overlay/l/TMLF45SQGBYNUWEG3ETXOYPXSM:/var/lib/containers/storage/overlay/l/PQ43PKUKJNFTUJ3536QRVMB4JD:/var/lib/containers/storage/overlay/l/Y7ZJFNE2K2BANCPJGX2M6RYV6X,upperdir=/var/lib/containers/storage/overlay/44553b727e9b7fdf13f9a5e4168867b9618a60f81d72ac96d6a7f9556b62738b/diff,workdir=/var/lib/containers/storage/overlay/44553b727e9b7fdf13f9a5e4168867b9618a60f81d72ac96d6a7f9556b62738b/work
DEBU[0000] mounted container "0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8" at "/var/lib/containers/storage/overlay/44553b727e9b7fdf13f9a5e4168867b9618a60f81d72ac96d6a7f9556b62738b/merged"
DEBU[0000] Created root filesystem for container 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 at /var/lib/containers/storage/overlay/44553b727e9b7fdf13f9a5e4168867b9618a60f81d72ac96d6a7f9556b62738b/merged
INFO[0000] Got pod network &{Name:dcn Namespace:dcn ID:0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 NetNS:/var/run/netns/cni-ab20fd67-055b-0cf3-89cc-180ad7222413 Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[{HostPort:39123 ContainerPort:22 Protocol:tcp HostIP:}] Bandwidth:<nil> IpRanges:[]}]}
INFO[0000] About to add CNI network podman (type=bridge)
DEBU[0000] [0] CNI result: Interfaces:[{Name:cni-podman0 Mac:ee:9a:95:0b:cb:ef Sandbox:} {Name:veth18b0249d Mac:1e:35:fe:9c:04:01 Sandbox:} {Name:eth0 Mac:ea:07:a4:52:05:95 Sandbox:/var/run/netns/cni-ab20fd67-055b-0cf3-89cc-180ad7222413}], IP:[{Version:4 Interface:0xc4203cb408 Address:{IP:10.88.0.22 Mask:ffff0000} Gateway:10.88.0.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:<nil>}], DNS:{Nameservers:[] Domain: Search:[] Options:[]}
INFO[0000] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]
INFO[0000] IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode secret
DEBU[0000] Setting CGroups for container 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 to machine.slice:libpod:0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0000] reading hooks from /etc/containers/oci/hooks.d
DEBU[0000] Created OCI spec for container 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 at /var/lib/containers/storage/overlay-containers/0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8/userdata/config.json
DEBU[0000] /usr/bin/conmon messages will be logged to syslog
DEBU[0000] running conmon: /usr/bin/conmon args="[--api-version 1 -s -c 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 -u 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8/userdata -p /var/run/containers/storage/overlay-containers/0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8/userdata/pidfile -l k8s-file:/var/lib/containers/storage/overlay-containers/0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8/userdata/ctr.log --exit-dir /var/run/libpod/exits --socket-dir-path /var/run/libpod/socket --log-level debug --syslog --conmon-pidfile /var/run/containers/storage/overlay-containers/0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8]"
DEBU[0000] Cleaning up container 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8
DEBU[0000] Tearing down network namespace at /var/run/netns/cni-ab20fd67-055b-0cf3-89cc-180ad7222413 for container 0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8
INFO[0000] Got pod network &{Name:dcn Namespace:dcn ID:0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8 NetNS:/var/run/netns/cni-ab20fd67-055b-0cf3-89cc-180ad7222413 Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[{HostPort:39123 ContainerPort:22 Protocol:tcp HostIP:}] Bandwidth:<nil> IpRanges:[]}]}
INFO[0000] About to del CNI network podman (type=bridge)
DEBU[0000] unmounted container "0338c8ce9ab4aa02a433018157ae8f255a2510a3c1d3868faae436ba0e831da8"
ERRO[0000] unable to start container "dcn": cannot listen on the TCP port: listen tcp4 :39123: bind: address already in use
Netstat:
tcp 1 0 localhost:39123 localhost:40300 CLOSE_WAIT
tcp 0 0 localhost:40352 localhost:39123 FIN_WAIT2
tcp 1 0 localhost:39123 localhost:40282 CLOSE_WAIT
tcp 1 0 localhost:39123 localhost:40274 CLOSE_WAIT
tcp 1 0 localhost:39123 localhost:40374 CLOSE_WAIT
tcp 1 0 localhost:39123 localhost:40316 CLOSE_WAIT
tcp 1 0 localhost:39123 localhost:40278 CLOSE_WAIT
tcp 0 0 localhost:40374 localhost:39123 FIN_WAIT2
tcp 1 0 localhost:39123 localhost:40270 CLOSE_WAIT
tcp 1 0 localhost:39123 localhost:40352 CLOSE_WAIT
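For what it's worth, netstat can also name the owning process, which should point straight at the stale conmon (a sketch; in a rootless setup like this one, no sudo should be needed to see your own processes):

# List the listening socket together with the owning PID/program name
netstat -tlnp | grep 39123
# or:
lsof -iTCP:39123 -sTCP:LISTEN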
Can you get a podman info to see what version Conmon is?
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v1
  Conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.3, commit: unknown'
  Distribution:
    distribution: ubuntu
    version: "18.04"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 29040357376
  MemTotal: 33498562560
  OCIRuntime:
    name: runc
    package: 'runc: /usr/sbin/runc'
    path: /usr/sbin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 2147479552
  SwapTotal: 2147479552
  arch: amd64
  cpus: 12
  eventlogger: journald
  hostname: pcl001477a
  kernel: 5.0.0-37-generic
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: 'slirp4netns: /usr/bin/slirp4netns'
    Version: |-
      slirp4netns version 0.4.2
      commit: unknown
  uptime: 1h 50m 53.42s (Approximately 0.04 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - quay.io
store:
  ConfigFile: /home/daan_dm/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: vfs
  GraphOptions: {}
  GraphRoot: /home/daan_dm/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 0
  RunRoot: /run/user/1000
  VolumePath: /home/daan_dm/.local/share/containers/storage/volumes
I'm pretty sure my issue is caused by the conmon issue I just created (it's a side effect of the winsz bug).
This issue is affecting my system as well; it affects several containers.
This is worked around by killing the stale conmon processes in the middle of a podman stop/start cycle:
podman stop containername
for i in `netstat -pnl|sed '/8112/!d; /conmon/!d' -|awk '{split($NF,a,"/"); print a[1]}'`; do kill -9 $i; done
podman start containername
Where 8112 is the port in question. This is not ideal, as it complicates restarting services with systemd:
ExecStartPre=bash -c 'for i in `netstat -pnl|sed '/8112/!d; /conmon/!d' -|awk '{split($NF,a,"/"); print a[1]}'`; do kill -9 $i; done'
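A shorter equivalent for that ExecStartPre, assuming the stale conmon is the only process still bound to the port (fuser -k sends SIGKILL by default, the leading '-' tells systemd to tolerate a non-zero exit when nothing holds the port, and the fuser path may differ per distro):

ExecStartPre=-/usr/sbin/fuser -k 8112/tcp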
conmon --version
conmon version 2.0.2
commit: 186a550ba0866ce799d74006dab97969a2107979
podman info
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v2
  Conmon:
    package: conmon-2.0.2-1.fc31.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.2, commit: 186a550ba0866ce799d74006dab97969a2107979'
  Distribution:
    distribution: fedora
    version: "31"
  MemFree: 616427520
  MemTotal: 12562640896
  OCIRuntime:
    name: crun
    package: crun-0.10.6-1.fc31.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.10.6
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 1455120384
  SwapTotal: 1719660544
  arch: amd64
  cpus: 8
  eventlogger: journald
  hostname: svhqdownload01.0xcbf.net
  kernel: 5.3.14-300.fc31.x86_64
  os: linux
  rootless: false
  uptime: 17h 23m 19.14s (Approximately 0.71 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - quay.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 8
  GraphDriverName: overlay
  GraphOptions:
    overlay.mountopt: nodev,metacopy=on
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  ImageStore:
    number: 11
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes
cat /etc/redhat-release
Fedora release 31 (Thirty One)
I cannot replicate this on master on Fedora ...
$ sudo podman info
host:
  BuildahVersion: 1.11.5
  CgroupVersion: v2
  Conmon:
    package: conmon-2.0.2-1.fc31.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.2, commit: 186a550ba0866ce799d74006dab97969a2107979'
  Distribution:
    distribution: fedora
    version: "31"
  MemFree: 14543925248
  MemTotal: 33021677568
  OCIRuntime:
    name: crun
    package: crun-0.10.6-1.fc31.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.10.6
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 8
  eventlogger: journald
  hostname: DESKTOP-SH5EG3J.localdomain
  kernel: 5.3.15-300.fc31.x86_64
  os: linux
  rootless: false
  uptime: 10h 24m 23.57s (Approximately 0.42 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 3
  GraphDriverName: overlay
  GraphOptions:
    overlay.mountopt: nodev,metacopy=on
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  ImageStore:
    number: 3
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes
From the podman info output, it looks like @ACiDGRiM has a more than new enough version to have both the Conmon fix and @vrothberg's Podman patches to manually kill Conmon.
I wonder if we're talking about a different issue here. I see mention of systemd unit files in his example; perhaps an improperly formatted unit isn't working as advertised, causing Podman to die but not killing Conmon.
@ACiDGRiM Can you provide the systemd unit files you're using? Were they generated by podman generate systemd?
I've regenerated the systemd units with podman generate systemd.
They previously looked approximately like this:
[Unit]
Description=Daemon
After=network.target mnt-content.mount mnt-data.mount
Requires=mnt-content.mount mnt-data.mount
[Service]
Restart=always
ExecStart=/usr/bin/podman start -a container
ExecStop=/usr/bin/podman stop -t 10 container
[Install]
WantedBy=multi-user.target
I'm testing with the newly generated systemd units now; the noticeable difference is that the type is forking instead of a standard unit.
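(For reference, regenerating and installing such a unit amounts to something like the following; the container name and unit path here are placeholders:)

podman generate systemd container > /etc/systemd/system/container.service
systemctl daemon-reload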
Getting the service files right is _very_ tough. We will publish a blog next Monday getting into some details and presenting a new best practice.
If you want to create new containers (in contrast to using existing ones as podman generate systemd currently does), we recommend the following pattern:
[Unit]
Description=Podman in Systemd
[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
ExecStart=/usr/bin/podman run --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid -d alpine:latest top
ExecStop=/usr/bin/sh -c "/usr/bin/podman rm -f `cat /%t/%n-cid`"
KillMode=none
Type=forking
PIDFile=/%t/%n-pid
[Install]
WantedBy=multi-user.target
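The %t and %n specifiers do the heavy lifting in this template. How they expand is standard systemd behavior (an illustration, not output from this thread):

# %t -> the runtime directory, i.e. /run for system services
# %n -> the full unit name, e.g. podman-in-systemd.service
# So the conmon pidfile and the cidfile land under /run, named after the
# unit, and ExecStartPre wipes them before each start so stale state from
# a previous run cannot interfere.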
The article has been published: https://www.redhat.com/sysadmin/podman-shareable-systemd-services
I've created the services with podman generate systemd and reliability is improved; however, if the container dies without cleaning up, conmon still hangs around and must be killed via ExecStartPre or manually.
Furthermore, I also run into this issue after releasing the port by killing conmon:
Error: unable to start container "plex": sd-bus call: File exists: OCI runtime error
Also of note: podman generate systemd doesn't include the ExecStartPre line your template shows.
# container-ac522937d6e06fd0500b75e2e94608dae452f3a5b75500dad38a9edeefa60b5a.service
# autogenerated by Podman 1.6.2
# Wed Dec 18 18:48:49 MST 2019
[Unit]
Description=Podman container-ac522937d6e06fd0500b75e2e94608dae452f3a5b75500dad38a9edeefa60b5a.service
Documentation=man:podman-generate-systemd(1)
[Service]
Restart=on-failure
ExecStart=/usr/bin/podman start ac522937d6e06fd0500b75e2e94608dae452f3a5b75500dad38a9edeefa60b5a
ExecStop=/usr/bin/podman stop -t 10 ac522937d6e06fd0500b75e2e94608dae452f3a5b75500dad38a9edeefa60b5a
KillMode=none
Type=forking
PIDFile=/var/run/containers/storage/overlay-containers/ac522937d6e06fd0500b75e2e94608dae452f3a5b75500dad38a9edeefa60b5a/userdata/conmon.pid
[Install]
WantedBy=multi-user.target
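Until podman generate systemd can emit such a line itself, a cleanup step can be bolted onto a generated unit with a drop-in, leaving the generated file untouched (a sketch; the unit name is the generated one above, 8112 stands in for the container's published port as in the earlier fuser one-liner, and the fuser path may differ per distro):

# /etc/systemd/system/container-ac522937d6e06fd0500b75e2e94608dae452f3a5b75500dad38a9edeefa60b5a.service.d/override.conf
[Service]
ExecStartPre=-/usr/sbin/fuser -k 8112/tcp

Run systemctl daemon-reload afterwards so the drop-in takes effect.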
Oh... it's closed.