/kind bug
Description
I'm unable to run a rootless container, podman returns the following error:
$ podman run --rm golang:1.14-alpine go version
Error: setrlimit `RLIMIT_NPROC`: Invalid argument: OCI runtime error
I'm on Fedora 32. I'm sure running a rootless container worked with Fedora 31, but I'm not sure whether the problem appeared as soon as I migrated to Fedora 32 or later, since I don't use podman that often.
Steps to reproduce the issue:
podman run --rm golang:1.14-alpine go version (or any image, really)
Describe the results you received:
Error: setrlimit `RLIMIT_NPROC`: Invalid argument: OCI runtime error
Describe the results you expected:
I expect the container to run correctly.
Additional information you deem important (e.g. issue happens only occasionally):
I tried removing $HOME/.local/share/containers and $HOME/.config/containers, but the error remains.
Output of podman version:
Version: 1.9.2
RemoteAPI Version: 1
Go Version: go1.14.2
OS/Arch: linux/amd64
Output of podman info --debug:
debug:
compiler: gc
gitCommit: ""
goVersion: go1.14.2
podmanVersion: 1.9.2
host:
arch: amd64
buildahVersion: 1.14.8
cgroupVersion: v2
conmon:
package: conmon-2.0.16-2.fc32.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.16, commit: 1044176f7dd177c100779d1c63931d6022e419bd'
cpus: 8
distribution:
distribution: fedora
version: "32"
eventLogger: file
hostname: thor
idMappings:
gidmap:
- container_id: 0
host_id: 1000
size: 1
- container_id: 1
host_id: 100000
size: 65536
uidmap:
- container_id: 0
host_id: 1000
size: 1
- container_id: 1
host_id: 100000
size: 65536
kernel: 5.6.14-300.fc32.x86_64+debug
memFree: 3684970496
memTotal: 16685846528
ociRuntime:
name: crun
package: crun-0.13-2.fc32.x86_64
path: /usr/bin/crun
version: |-
crun version 0.13
commit: e79e4de4ac16da0ce48777afb72c6241de870525
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
os: linux
rootless: true
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.0.0-1.fc32.x86_64
version: |-
slirp4netns version 1.0.0
commit: a3be729152a33e692cd28b52f664defbf2e7810a
libslirp: 4.2.0
swapFree: 2147479552
swapTotal: 2147479552
uptime: 1h 11m 1.23s (Approximately 0.04 days)
registries:
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- registry.centos.org
- docker.io
store:
configFile: /home/vincent/.config/containers/storage.conf
containerStore:
number: 4
paused: 0
running: 0
stopped: 4
graphDriverName: overlay
graphOptions:
overlay.mount_program:
Executable: /usr/bin/fuse-overlayfs
Package: fuse-overlayfs-1.0.0-1.fc32.x86_64
Version: |-
fusermount3 version: 3.9.1
fuse-overlayfs: version 1.0.0
FUSE library version 3.9.1
using FUSE kernel interface version 7.31
graphRoot: /home/vincent/.local/share/containers/storage
graphStatus:
Backing Filesystem: btrfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "false"
imageStore:
number: 8
runRoot: /run/user/1000/containers
volumePath: /home/vincent/.local/share/containers/storage/volumes
Package info (e.g. output of rpm -q podman or apt list podman):
podman-1.9.2-1.fc32.x86_64
It's a desktop PC.
Are you using cgroup v1?
Do you have a libpod.conf in your homedir? If so remove it.
rm ~/.config/containers/libpod.conf
Also what does this command show?
$ podman run --help | grep pids-limit
--pids-limit int Tune container pids limit (set 0 for unlimited, -1 for server defaults)
As far as I can tell I'm not using cgroups v1: I don't see systemd.unified_cgroup_hierarchy=0 in my /proc/cmdline, and I'm pretty sure I never changed it.
I don't have a libpod.conf file; in fact, right now I don't even have the directory $HOME/.config/containers.
The output from your command:
$ podman run --help | grep pids-limit
--pids-limit int Tune container pids limit (set 0 for unlimited) (default 2048)
$ ls -l /usr/share/containers/containers.conf /etc/containers/containers.conf
$ rpm -q podman
podman-1.9.2-1.fc32.x86_64
$ grep pids_limit /etc/containers/containers.conf
$ grep pids_limit /usr/share/containers/containers.conf
# pids_limit = 2048
# cat /proc/self/cgroup
11:perf_event:/
10:cpu,cpuacct:/
9:pids:/user.slice/user-3267.slice/[email protected]
8:cpuset:/
7:devices:/user.slice
6:freezer:/
5:memory:/user.slice/user-3267.slice/[email protected]
4:blkio:/
3:hugetlb:/
2:net_cls,net_prio:/
1:name=systemd:/user.slice/user-3267.slice/[email protected]/apps.slice/apps-org.gnome.Terminal.slice/vte-spawn-960193fe-002b-412d-909f-e1ee7bdde126.scope
0::/user.slice/user-3267.slice/[email protected]/apps.slice/apps-org.gnome.Terminal.slice/vte-spawn-960193fe-002b-412d-909f-e1ee7bdde126.scope
$ podman info | grep cgroup
cgroupVersion: v1
Actually, looking at your podman info, I see you are on cgroup v2, which makes this even more strange. @giuseppe thoughts?
I'm assuming you want me to run the command you posted above?
$ ls -l /usr/share/containers/containers.conf /etc/containers/containers.conf
ls: cannot access '/etc/containers/containers.conf': No such file or directory
-rw-r--r--. 1 root root 12725 Apr 9 22:11 /usr/share/containers/containers.conf
$ rpm -q podman
podman-1.9.2-1.fc32.x86_64
$ grep pids_limit /usr/share/containers/containers.conf
# pids_limit = 2048
[root@thor vincent]# cat /proc/self/cgroup
0::/user.slice/user-1000.slice/[email protected]/gnome-launched-Alacritty.desktop-12748.scope
$ podman info | grep cgroup
cgroupVersion: v2
can you show me the output for $ cat /proc/self/limits?
Also, do you have any override for default_ulimits? You can easily find it out with something like grep -A 10 default_ulimits /etc/containers/* /usr/share/containers/* ~/.config/containers/*.
$ cat /proc/self/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size unlimited unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 262144 524288 processes
Max open files 1024 2097152 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 63491 63491 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
I only have the file /usr/share/containers/containers.conf with this:
# default_ulimits = [
# "nofile"="1280:2560",
# ]
Everything looks fine, and it works for me, but not for you.
My user account has:
Max processes 62461 62461 processes
Max open files 1024 524288 files
Which are smaller than yours.
Running with strace I'm getting this:
$ strace -e setrlimit podman run --rm -ti golang:1.14-alpine go version
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=88170, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
setrlimit(RLIMIT_NOFILE, {rlim_cur=1024*1024, rlim_max=1024*1024}) = 0
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=88171, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=88160, si_uid=0} ---
Error: setrlimit `RLIMIT_NPROC`: Invalid argument: OCI runtime error
The man page says this about EPERM:
EPERM
An unprivileged process tried to raise the hard limit; the CAP_SYS_RESOURCE capability is required to do this. Or, the caller tried to increase the hard RLIMIT_NOFILE limit above the current kernel maximum (NR_OPEN). Or, the calling process did not have permission to set limits for the process specified by pid.
I checked with getcap /usr/bin/podman but it doesn't return anything, which I assume means there are no capabilities set. However, I don't know if I'm on the right path here; are any of the podman/conmon/runc binaries supposed to have CAP_SYS_RESOURCE?
Ok so I figured out a workaround.
Based on the setrlimit calls above, it's trying to set the soft/hard nproc limit to 4194304, but my hard limit was set to 524288. I raised the hard limit to 8388608 in /etc/security/limits.conf and now it works.
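For reference, this is roughly what the change looks like in /etc/security/limits.conf (the domain and value here reflect my setup; pick any value at or above pid_max):
# domain   type  item   value
vincent    hard  nproc  8388608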
The code is not supposed to do this though. It looks like we have a bug. The code is supposed to set the limit to the limit of the user.
Before the settings change, did the podman unshare ulimits show anything interesting?
I reverted my change and ran podman unshare ulimit -a (ulimits doesn't seem to exist):
$ podman unshare ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63491
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1048576
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 262144
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I think the big number must be coming from here?
cat /proc/sys/kernel/pid_max
const (
    oldMaxSize = uint64(1048576)
)

// getDefaultProcessLimits returns the nproc for the current process in ulimits format
// Note that nfile sometimes cannot be set to unlimited, and the limit is hardcoded
// to (oldMaxSize) 1048576 (2^20), see: http://stackoverflow.com/a/1213069/1811501
// In rootless containers this will fail, and the process will just use its current limits
func getDefaultProcessLimits() []string {
    rlim := unix.Rlimit{Cur: oldMaxSize, Max: oldMaxSize}
    oldrlim := rlim
    // Attempt to set file limit and process limit to pid_max in OS
    dat, err := ioutil.ReadFile("/proc/sys/kernel/pid_max")
    if err == nil {
        val := strings.TrimSuffix(string(dat), "\n")
        max, err := strconv.ParseUint(val, 10, 64)
        if err == nil {
            rlim = unix.Rlimit{Cur: uint64(max), Max: uint64(max)}
        }
    }
    defaultLimits := []string{}
    if err := unix.Setrlimit(unix.RLIMIT_NPROC, &rlim); err == nil {
        defaultLimits = append(defaultLimits, fmt.Sprintf("nproc=%d:%d", rlim.Cur, rlim.Max))
    } else {
        if err := unix.Setrlimit(unix.RLIMIT_NPROC, &oldrlim); err == nil {
            defaultLimits = append(defaultLimits, fmt.Sprintf("nproc=%d:%d", oldrlim.Cur, oldrlim.Max))
        }
    }
    return defaultLimits
}
But the first one should fail, and then we should set the second.
This is indeed what I have:
$ cat /proc/sys/kernel/pid_max
4194304
but looking at the strace:
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
It tries to set 4096*1024 and then 1024*1024, which seems to be oldMaxSize in your code. But given that my hard limit is 524288, that will also fail.
Right if they both fail, the code should be returning.
defaultLimits := []string{}
Which should tell the system to not set the rlimits at all.
Oh right.
Right now I can confirm that the following workarounds work:
- raising the hard nproc limit in /etc/security/limits.conf (as described above)
- creating $HOME/.config/containers/containers.conf with this:

default_ulimits = [
  "nproc=200000:400000",
]
@rhatdan What's the verdict here - does this look like a Podman bug?
I want to try to get it to happen locally but have not had time. Perhaps an issue for an Intern.
@sujil02 Could you see if you can get this to fail?
Sure thing, will have a look.
FWIW, I also started getting this error on Arch Linux once Linux 5.7 hit the core repositories a few days ago. I'd previously been running rootless containers on this system just fine for months.
I am also using cgroups v1:
$ podman info | grep cgroup
cgroupVersion: v1
Edit: Oops, I just downgraded my kernel to 5.6.15 on Arch and I still can't run any containers. Something else must have broken it. For reference:
$ podman version
Version: 1.9.3
RemoteAPI Version: 1
Go Version: go1.14.3
Git Commit: 5d44534fff6877b1cb15b760242279ae6293154c
Built: Mon May 25 22:25:50 2020
OS/Arch: linux/amd64
I've definitely run containers in the past week, so it must have broken sometime around then.
This error/bug has just started for me as well... none of my containers will start :/
@sujil02 Were you able to check this out?
Could not reproduce it. I used Podman 2.0 dev on Fedora 32 with cgroups v1.
Just to note, I have the problem with cgroup v2. I didn’t test with cgroup v1.
I'm also on cgroup v2... the only fix was to do what @vrischmann suggested: https://github.com/containers/libpod/issues/6389#issuecomment-634258120
I had this issue as well. I found I had an old file in /etc/security/limits.d/ which set nproc. I deleted that file and the issue went away for me.
The way the code is supposed to work is to examine your current settings and then set the limit no higher than what the user already has.
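Roughly, that means querying the caller's current limits first and never requesting more than the existing hard limit. A minimal sketch of the intent (illustrative only, not the actual podman code):

package main

import (
    "fmt"

    "golang.org/x/sys/unix"
)

// defaultNprocLimit returns an "nproc=soft:hard" string capped at the
// caller's current hard limit, so a rootless process never asks the
// kernel for more than it is allowed to have.
func defaultNprocLimit(desired uint64) (string, error) {
    var cur unix.Rlimit
    if err := unix.Getrlimit(unix.RLIMIT_NPROC, &cur); err != nil {
        return "", err
    }
    // Raising the hard limit would need CAP_SYS_RESOURCE, which a
    // rootless user does not hold, so clamp to the existing maximum.
    if desired > cur.Max {
        desired = cur.Max
    }
    return fmt.Sprintf("nproc=%d:%d", desired, desired), nil
}

func main() {
    limit, err := defaultNprocLimit(4194304) // e.g. the value of /proc/sys/kernel/pid_max
    if err != nil {
        fmt.Println("could not read RLIMIT_NPROC:", err)
        return
    }
    fmt.Println(limit)
}

The returned string could then be used the same way getDefaultProcessLimits feeds defaultLimits today.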
@rhatdan I hear ya... I am not certain, but I think it happened with a kernel update. Both my home and work machines were affected.
I update everything on my desktop weekly, and this phenomenon (existing containers start, but new ones can't be created) surfaced right after updating libpod to 2.0; downgrading to 1.9.3 immediately made it work again. (I'm running Gentoo and kernel version 5.6.19 with cgroups v1 - I've been using 5.6 for a while now.)
I have the same issue with podman version 1.9.3 and podman 2.0.0 on Arch Linux, kernels 5.6.15, 5.7.2, and 5.7.4. In short, the error is:
$ strace -e setrlimit podman start postgres10
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=535583, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=535583, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=535583, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=535583, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=535583, si_uid=0} ---
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=535583, si_uid=0} ---
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=535583, si_uid=0} ---
...
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024}) = -1 EPERM (Operation not permitted)
setrlimit(RLIMIT_NPROC, {rlim_cur=1024*1024, rlim_max=1024*1024}) = -1 EPERM (Operation not permitted)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=535591, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=535615, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=535636, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
Error: unable to start container "085b86b34400b94915889b1175af2c1a40f06aba73d0ff004f8b741d4cea107f": container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:378: setting rlimits for ready process caused \\\"error setting rlimit type 6: operation not permitted\\\"\"": OCI runtime permission denied error
+++ exited with 125 +++
Full strace here: podman-setrlimit.txt. Other information:
$ podman info | grep cgroup
cgroupVersion: v1
$ cat /proc/self/limits | grep -E "Max (open files|processes)"
Max processes 62972 62972 processes
Max open files 1024 524288 files
Could you try this with crun instead of runc?
setrlimit(RLIMIT_NPROC, {rlim_cur=4096*1024, rlim_max=4096*1024})
Why is there a multiplier of 1024 on these?
Ok with some of my older containers, I am seeing these fields being set.
$ podman inspect charming_ride --format '{{ .HostConfig.Ulimits }}'
[{RLIMIT_NOFILE 300 300} {RLIMIT_NPROC 50 50}]
But newer containers created with podman 2.0 do not get the Ulimits set:
$ podman create -ti alpine sh
fc7abf0fddfa30e1c375e44f0c70f180ff63be8a19f816cbbcb46d292a76f750
$ podman inspect -l --format '{{ .HostConfig.Ulimits }}'
[]
@rhatdan yes you're right. I see the same thing here: old containers have ulimits set, new containers created with podman 2.0 don't. I re-created my old containers that had been refusing to start and now they start properly in user mode. :partying_face:
Since 2.0 is being released, I am going to close this as fixed in 2.0.
@rhatdan to be clear, we have to re-create our containers, though. Perhaps mention that in the release notes or a tweet or something?
I believe this is only affecting people who have /etc/security/limits.conf entries set for their rootless users.
So I just got back to my computer where this bug is happening and there's nothing set in /etc/security/limits.conf.
With your inspect command, @rhatdan, I think I found the bug:
$ podman inspect --format '{{ printf "%+v" .HostConfig.Ulimits }}' determined_boyd
[{Name:RLIMIT_NOFILE Soft:1048576 Hard:1048576} {Name:RLIMIT_NPROC Soft:524288 Hard:262144}]
but ulimit says this:
$ ulimit -u --hard
524288
$ ulimit -u --soft
262144
So basically it looks like the container was created with the soft and hard limits the wrong way around.
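If I'm reading setrlimit(2) right, a soft limit above the hard limit is exactly what the kernel rejects with EINVAL, which would explain the "Invalid argument" in the original error. A tiny sketch to illustrate the failure mode (illustrative only, not podman code):

package main

import (
    "fmt"

    "golang.org/x/sys/unix"
)

func main() {
    // A soft limit (Cur) greater than the hard limit (Max) is rejected
    // by the kernel with EINVAL, i.e. "invalid argument". These are the
    // same swapped values reported by podman inspect above.
    bad := unix.Rlimit{Cur: 524288, Max: 262144}
    if err := unix.Setrlimit(unix.RLIMIT_NPROC, &bad); err != nil {
        fmt.Println("setrlimit RLIMIT_NPROC:", err) // expected: invalid argument
    }
}

With the two values the right way around (soft <= hard), the same call succeeds.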
for anyone out there still having this issue, you have to recreate the affected containers
Sorry, what does that mean? Can I recreate the containers while keeping all their data?
I got my existing toolbox container working by setting the nproc limit for my user in /etc/security/limits.conf to the value reported for the existing container with podman inspect --format '{{ printf "%+v" .HostConfig.Ulimits }}'
@llunved please share your limits.conf
thanks
I got the same error message when I try to restart a running pod. It started happening just today; it was working totally fine a few days before. No changes in limits.conf.
for anyone out there still having this issue, you have to recreate the affected containers
Sorry, what does that mean? Can I recreate the containers while keeping all their data?
FWIW, I managed to backup my containers in F32, then rebase on F33 and restore the backups using this tutorial: https://fedoramagazine.org/backup-and-restore-toolboxes-with-podman/
@llunved please share your limits.conf
thanks
The container in question had NPROC set to 62509, while the system default was 62508:
# podman inspect --format '{{ printf "%+v" .HostConfig.Ulimits }}' fedora-toolbox-32
[{Name:RLIMIT_NOFILE Soft:524288 Hard:524288} {Name:RLIMIT_NPROC Soft:62509 Hard:62509}]
So I added the following line to limits.conf (my user is in the group wheel, but you could use another group or the user itself):
@wheel hard nproc 62509