Podman: Have to use podman system migrate after every reboot

Created on 18 Sep 2019 · 10 Comments · Source: containers/podman

/kind bug

Description

I use Jenkins to build root-less containers using Podman. However, I notice that every time I reboot, I'm presented with the following message when trying to interact with Podman as the Jenkins user through the command line:

-bash-4.2$ podman ps --all
ERRO[0000] cannot join pause process.  You may need to remove /tmp/run-996/libpod/pause.pid and stop all containers 
ERRO[0000] you can use `system migrate` to recreate the pause process 
ERRO[0000] open /proc/3959/ns/user: no such file or directory

Steps to reproduce the issue:

1. Get Podman in a working state
2. Create a Jenkins job which performs certain interactions with Podman through a SHELL step
3. Try interacting with Podman as the Jenkins user through the command line

Describe the results you received:
Podman complains about not being able to join the pause process.
Running podman system migrate usually seems to solve the problem, but having to do that after every reboot is not very convenient.

Describe the results you expected:
Interacting with Podman as the Jenkins user within the Java process should be the same as interacting with Podman as the Jenkins user through the command line.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

-bash-4.2$ podman version
Version:            1.4.4
RemoteAPI Version:  1
Go Version:         go1.10.3
OS/Arch:            linux/amd64

Output of podman info --debug:

ERRO[0000] cannot join pause process.  You may need to remove /tmp/run-996/libpod/pause.pid and stop all containers 
ERRO[0000] you can use `system migrate` to recreate the pause process 
ERRO[0000] open /proc/3959/ns/user: no such file or directory

After getting it back to work by deleting the contents of /tmp/run-996/libpod:

```
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.3
  podman version: 1.4.4
host:
  BuildahVersion: 1.9.0
  Conmon:
    package: podman-1.4.4-4.el7.centos.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 0.3.0, commit: unknown'
  Distribution:
    distribution: '"centos"'
    version: "7"
  MemFree: 750735360
  MemTotal: 1927163904
  OCIRuntime:
    package: runc-1.0.0-65.rc8.el7.centos.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 2147479552
  SwapTotal: 2147479552
  arch: amd64
  cpus: 2
  hostname: jenkins
  kernel: 3.10.0-1062.1.1.el7.x86_64
  os: linux
  rootless: true
  uptime: 10m 17.93s
registries:
  blocked: null
  insecure:
  - localhost:5000
  search:
  - localhost:5000
  - registry.access.redhat.com
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.centos.org
store:
  ConfigFile: /var/lib/jenkins/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /var/lib/jenkins/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 6
  RunRoot: /tmp/run-996
  VolumePath: /var/lib/jenkins/.local/share/containers/storage/volumes
```

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.4.4-4.el7.centos.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):

-bash-4.2$ rpm -q slirp4netns
slirp4netns-0.3.0-1.el7.x86_64

-bash-4.2$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)

-bash-4.2$ cat /etc/subuid
jenkins:110000:655360000
-bash-4.2$ cat /etc/subgid
jenkins:110000:655360000

sysctl -w user.max_user_namespaces=15076

carroarmato0

All 10 comments

Is /tmp really a tmpfs? I'd suggest making sure /tmp is really cleaned up after each reboot, as there are other things that rely on that behaviour.
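
One way to answer that question is to ask the kernel which filesystem backs /tmp. The following is a minimal sketch, assuming GNU coreutils stat on Linux; the helper names fs_type and is_tmpfs are made up for illustration and are not part of Podman:

```shell
#!/bin/sh
# fs_type PATH: print the type of the filesystem that backs PATH
# (relies on GNU coreutils stat; "-f" queries the filesystem, "%T" is its type).
fs_type() {
    stat -f -c %T -- "$1"
}

# is_tmpfs PATH: succeed only when PATH lives on a tmpfs.
is_tmpfs() {
    [ "$(fs_type "$1")" = "tmpfs" ]
}

if is_tmpfs /tmp; then
    echo "/tmp is a tmpfs and starts out empty after a reboot"
else
    echo "/tmp is on $(fs_type /tmp); stale files can survive a reboot"
fi
```

If /tmp turns out to be on a disk-backed filesystem such as xfs, leftovers like /tmp/run-996/libpod/pause.pid can outlive a reboot.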

giuseppe on 18 Sep 2019

> is /tmp really a tmpfs? I'd suggest to ensure /tmp is really cleaned up after each reboot, as there are other things that rely on that behaviour.

Aha, you're right. /tmp is just a directory on /, and hence not cleaned up.
How come it relies on being clean though? Wouldn't the things that depend on it just pick up where they left off?

/dev/sda1 on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)

[root@jenkins ~]# mount | grep tmp
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=933556k,nr_inodes=233389,mode=755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=188200k,mode=700,uid=1000,gid=1000)

carroarmato0 on 18 Sep 2019

I used the following commands on CentOS 7.7 to enable tmpfs for /tmp and then rebooted.
Seems to have solved my issue, thanks for the pointer @giuseppe !

systemctl enable tmp.mount
systemctl start tmp.mount

carroarmato0 on 18 Sep 2019

> How come it relies on being clean though?

In this particular case, we store the PID of the pause process there, so after a reboot that process might no longer exist (or worse, the PID might now belong to a different process).

Other state that is not supposed to persist across reboots can be stored there as well.
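
The failure in the report can be checked by hand: read the PID recorded in pause.pid and see whether a process with that PID still exists. Below is a rough sketch, assuming the pause.pid location from the error message in this report; the helper name pause_pid_alive is hypothetical. It cannot tell whether a live PID has been reused by an unrelated process, which is the worse case mentioned above.

```shell
#!/bin/sh
# pause_pid_alive FILE: succeed if FILE holds a numeric PID that exists right now.
pause_pid_alive() {
    [ -r "$1" ] || return 1
    recorded_pid=$(cat "$1")
    # Reject empty or non-numeric content.
    case $recorded_pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Every running process has a directory under /proc on Linux.
    [ -d "/proc/$recorded_pid" ]
}

# Path pattern taken from the error message above; adjust it to your RunRoot.
pid_file="/tmp/run-$(id -u)/libpod/pause.pid"
if pause_pid_alive "$pid_file"; then
    echo "a process with the recorded PID exists"
else
    echo "no live process for $pid_file"
fi
```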

XDG_RUNTIME_DIR is usually under /run. Since you have it and it is a tmpfs, why not force XDG_RUNTIME_DIR to be under /run?
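
As a sketch of that idea, the snippet below could go at the top of a Jenkins shell step. It assumes /run/user/<uid> already exists for the build user, which is not guaranteed for a service account that never logs in; the helper name choose_runtime_dir is hypothetical.

```shell
#!/bin/sh
# choose_runtime_dir DIR: print DIR if it is an existing, writable directory; fail otherwise.
choose_runtime_dir() {
    [ -d "$1" ] && [ -w "$1" ] && printf '%s\n' "$1"
}

# /run/user/<uid> is the conventional per-user runtime directory on a tmpfs.
if dir=$(choose_runtime_dir "/run/user/$(id -u)"); then
    export XDG_RUNTIME_DIR="$dir"
    echo "XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR"
else
    echo "/run/user/$(id -u) is missing or not writable; XDG_RUNTIME_DIR left unchanged" >&2
fi
```

Checking the directory first avoids pointing XDG_RUNTIME_DIR at a path the build user cannot write to.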

giuseppe on 18 Sep 2019

That's indeed another solution. Thank you for the explanation.

carroarmato0 on 18 Sep 2019

> Just manually re-running a jenkins project with podman commands in it produces the error.

@delenius Please check out issue #4655. Are we encountering the same problem?

kedmison on 6 Dec 2019

> > Just manually re-running a jenkins project with podman commands in it produces the error.
>
> @delenius Please check out issue #4655. Are we encountering the same problem?

Yes, same problem. I ended up removing my comment because I am running an older version of podman (1.4.4, same as in #4655, on RHEL 7.7), and I figured it might have been fixed since then. I also found a workaround, which is to just add

rm /tmp/run-`id -u`/libpod/pause.pid

before the podman command in the Jenkins shell script. Not sure if this has some inherent dangers, mind you ;)
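
A more cautious variant of this workaround deletes pause.pid only when the file predates the current boot, so a pause process started since the reboot is left alone. This is a sketch under two assumptions: that the file's modification time reflects when the pause process was created, and that /proc/stat exposes a btime line (it does on Linux). The helper names boot_time and older_than_boot are hypothetical.

```shell
#!/bin/sh
# boot_time: print the system boot time in seconds since the epoch.
boot_time() {
    awk '$1 == "btime" { print $2 }' /proc/stat
}

# older_than_boot FILE: succeed if FILE exists and was last modified before this boot.
older_than_boot() {
    [ -e "$1" ] || return 1
    booted_at=$(boot_time)
    [ -n "$booted_at" ] || return 1
    modified_at=$(stat -c %Y -- "$1")   # GNU stat: mtime as epoch seconds
    [ "$modified_at" -lt "$booted_at" ]
}

pid_file="/tmp/run-$(id -u)/libpod/pause.pid"
if older_than_boot "$pid_file"; then
    # Leftover from a previous boot: the recorded PID can no longer be trusted.
    rm -f -- "$pid_file"
fi
```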

delenius on 6 Dec 2019

I was having this issue on Fedora 31 after updating from Fedora 30.

$ podman container ls
Error: could not get runtime: open /proc/1445/ns/user: no such file or directory

I fixed it with:

mv /run/user/$(id -u)/libpod{,-000}

wishachu on 16 Dec 2019

I had to do podman system reset -f after reboot.

aucampia on 6 Feb 2020

@wishachu:

That worked well for me. Thumbs up!

barseghyanartur on 9 Mar 2020
