Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
A while ago, I had some issues with a container image I was sending changes to, which the maintainer wasn't able to reproduce on his machine. We were both running rootless podman, but I would consistently get EPERM errors when starting the container. At the time I was on Fedora 31. I was able to mostly work around the issue with bizarre combinations of --cap-add=DAC_OVERRIDE and edits to the entrypoint (which I can't find anymore, sadly).
Today, I pulled down the redis and rabbitmq docker images and hit similar-seeming issues. When I dug into it, it appeared that their use of gosu to setuid/setgid to an unprivileged user in the container was being denied by something. I figured it might be SELinux denying some transition, but there don't appear to be any denied entries in my audit.log. Adding the SETUID and SETGID caps doesn't help either (in fact, adding ALL doesn't help).
My understanding is that docker images sometimes failing under podman isn't unheard of, but most of those issues are documented as being related to cgroups v2 and manifest differently. This feels like something seccomp-y or SELinux-y is getting in the way.
Steps to reproduce the issue:
1.
$ podman run -t redis
error: exec: "/usr/local/bin/docker-entrypoint.sh": stat /usr/local/bin/docker-entrypoint.sh: permission denied
$ podman run -t --cap-add=SETUID,SETGID redis
error: exec: "/usr/local/bin/docker-entrypoint.sh": stat /usr/local/bin/docker-entrypoint.sh: permission denied
2.
$ cat >Dockerfile <<EOF
FROM redis
RUN sed -i 's/exec gosu/##/' /usr/local/bin/docker-entrypoint.sh
EOF
$ podman build -t redis:debug .
podman run -t redis:debug now works normally.
podman run -ti --user redis --cap-add=DAC_OVERRIDE redis also works, but this trick doesn't work for rabbitmq.
Describe the results you received:
Unmodified redis image fails to run
Describe the results you expected:
It would be nice if these containers Just Worked (TM)
Additional information you deem important (e.g. issue happens only occasionally):
SELinux labels on my ${GRAPHROOT}/storage/
$ ls -lZ ~/.local/share/containers/storage
total 152
drwx------+ 2 user group unconfined_u:object_r:container_var_lib_t:s0 4096 Jan 20 19:30 cache
drwx------+ 2 user group unconfined_u:object_r:container_var_lib_t:s0 4096 Jan 20 19:30 libpod
drwx------+ 2 user group unconfined_u:object_r:container_var_lib_t:s0 4096 Jan 20 19:30 mounts
drwx--x--x+ 57 user group unconfined_u:object_r:container_ro_file_t:s0 28672 Jun 30 10:19 overlay
drwx--x--x+ 19 user group unconfined_u:object_r:container_var_lib_t:s0 12288 Jun 30 10:19 overlay-containers
drwx------+ 19 user group unconfined_u:object_r:container_ro_file_t:s0 12288 Jun 30 10:16 overlay-images
drwx------+ 2 user group unconfined_u:object_r:container_ro_file_t:s0 24576 Jun 30 10:19 overlay-layers
-rw-------. 1 user group unconfined_u:object_r:container_var_lib_t:s0 64 Jun 30 10:24 storage.lock
drwx------+ 2 user group unconfined_u:object_r:container_var_lib_t:s0 4096 Jan 20 19:30 tmp
-rw-------. 1 user group unconfined_u:object_r:container_var_lib_t:s0 0 May 4 14:41 userns.lock
drwx--x--x+ 91 user group unconfined_u:object_r:container_var_lib_t:s0 12288 Jun 30 10:19 volumes
Trying to run rabbitmq with --user:
$ podman run -ti --user rabbitmq --cap-add=DAC_OVERRIDE rabbitmq
:eacces
00:28:38.131 [error]
00:28:38.132 [error] BOOT FAILED
BOOT FAILED
00:28:38.133 [error] ===========
===========
00:28:38.133 [error] Exception during startup:
Exception during startup:
00:28:38.133 [error]
00:28:38.133 [error] supervisor:'-start_children/2-fun-0-'/3 line 355
supervisor:'-start_children/2-fun-0-'/3 line 355
00:28:38.133 [error] supervisor:do_start_child/2 line 371
00:28:38.133 [error] supervisor:do_start_child_i/3 line 385
00:28:38.133 [error] rabbit_prelaunch:run_prelaunch_first_phase/0 line 27
00:28:38.133 [error] rabbit_prelaunch:do_run/0 line 111
00:28:38.133 [error] rabbit_prelaunch_dist:setup/1 line 12
00:28:38.133 [error] rabbit_nodes_common:do_ensure_epmd/2 line 93
00:28:38.133 [error] erlang:open_port({spawn_executable,"/usr/local/lib/erlang/erts-11.0.2/bin/erl"}, [{args,["-boot","no_dot_erlang","-sname","epmd-starter-112824848","-noinput","-s","erlang","hal..."]},...])
supervisor:do_start_child/2 line 371
supervisor:do_start_child_i/3 line 385
rabbit_prelaunch:run_prelaunch_first_phase/0 line 27
rabbit_prelaunch:do_run/0 line 111
rabbit_prelaunch_dist:setup/1 line 12
rabbit_nodes_common:do_ensure_epmd/2 line 93
erlang:open_port({spawn_executable,"/usr/local/lib/erlang/erts-11.0.2/bin/erl"}, [{args,["-boot","no_dot_erlang","-sname","epmd-starter-112824848","-noinput","-s","erlang","hal..."]},...])
00:28:38.133 [error] error:eacces
error:eacces
00:28:38.133 [error]
00:28:39.135 [error] Supervisor rabbit_prelaunch_sup had child prelaunch started with rabbit_prelaunch:run_prelaunch_first_phase() at undefined exit with reason eacces in context start_error
00:28:39.136 [error] CRASH REPORT Process <0.153.0> with 0 neighbours exited with reason: {{shutdown,{failed_to_start_child,prelaunch,eacces}},{rabbit_prelaunch_app,start,[normal,[]]}} in application_master:init/4 line 138
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,eacces}},{rabbit_prelaunch_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,eacces}},{rabbit_prelaunch_app,start,[normal,[]]}}})
Crash dump is being written to: erl_crash.dump...done
Output of podman version:
Using my distro podman at the moment:
Version: 1.9.3
RemoteAPI Version: 1
Go Version: go1.14.2
OS/Arch: linux/amd64
Output of podman info --debug:
debug:
compiler: gc
gitCommit: ""
goVersion: go1.14.2
podmanVersion: 1.9.3
host:
arch: amd64
buildahVersion: 1.14.9
cgroupVersion: v2
conmon:
package: conmon-2.0.18-1.fc32.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.18, commit: 6e8799f576f11f902cd8a8d8b45b2b2caf636a85'
cpus: 8
distribution:
distribution: fedora
version: "32"
eventLogger: file
hostname: host
idMappings:
gidmap:
- container_id: 0
host_id: 31337
size: 1
- container_id: 1
host_id: 165536
size: 65536
uidmap:
- container_id: 0
host_id: 1001
size: 1
- container_id: 1
host_id: 165536
size: 65536
kernel: 5.6.14-300.fc32.x86_64
memFree: 677883904
memTotal: 8235126784
ociRuntime:
name: crun
package: crun-0.13-2.fc32.x86_64
path: /usr/bin/crun
version: |-
crun version 0.13
commit: e79e4de4ac16da0ce48777afb72c6241de870525
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
os: linux
rootless: true
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.1.1-1.fc32.x86_64
version: |-
slirp4netns version 1.1.1
commit: bbf27c5acd4356edb97fa639b4e15e0cd56a39d5
libslirp: 4.2.0
SLIRP_CONFIG_VERSION_MAX: 2
swapFree: 7357460480
swapTotal: 8392798208
uptime: 169h 4m 51.29s (Approximately 7.04 days)
registries:
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- registry.centos.org
- docker.io
store:
configFile: /home/user/.config/containers/storage.conf
containerStore:
number: 17
paused: 0
running: 0
stopped: 17
graphDriverName: overlay
graphOptions:
overlay.mount_program:
Executable: /usr/bin/fuse-overlayfs
Package: fuse-overlayfs-1.0.0-1.fc32.x86_64
Version: |-
fusermount3 version: 3.9.1
fuse-overlayfs: version 1.0.0
FUSE library version 3.9.1
using FUSE kernel interface version 7.31
graphRoot: /home/user/.local/share/containers/storage
graphStatus:
Backing Filesystem: extfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "false"
imageStore:
number: 17
runRoot: /run/user/1001/containers
volumePath: /home/user/.local/share/containers/storage/volumes
Package info (e.g. output of rpm -q podman or apt list podman):
podman-1.9.3-1.fc32.x86_64
Additional environment details (AWS, VirtualBox, physical, etc.):
Physical host
With podman 2.0 redis works fine for me
$ podman run -t redis
Trying to pull registry.fedoraproject.org/redis...
manifest unknown: manifest unknown
Trying to pull registry.access.redhat.com/redis...
name unknown: Repo not found
Trying to pull registry.centos.org/redis...
manifest unknown: manifest unknown
Trying to pull docker.io/library/redis...
Getting image source signatures
Copying blob 8559a31e96f4 skipped: already exists
Copying blob 5ce7b314b19c done
Copying blob b69876b7abed done
Copying blob 85a6a5c53ff0 done
Copying blob 04c4bfb0b023 done
Copying blob a72d84b9df6a done
Copying config 2355926154 done
Writing manifest to image destination
Storing signatures
1:C 30 Jun 2020 12:14:04.007 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 30 Jun 2020 12:14:04.007 # Redis version=6.0.5, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 30 Jun 2020 12:14:04.007 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 30 Jun 2020 12:14:04.007 * Increased maximum number of open files to 10032 (it was originally set to 1024).
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 6.0.5 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in standalone mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 6379
| `-._ `._ / _.-' | PID: 1
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
1:M 30 Jun 2020 12:14:04.008 # Server initialized
1:M 30 Jun 2020 12:14:04.008 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 30 Jun 2020 12:14:04.008 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 30 Jun 2020 12:14:04.008 * Ready to accept connections
Both
$ podman run -ti --user rabbitmq --cap-add=DAC_OVERRIDE rabbitmq
and
$ podman run -ti --user rabbitmq rabbitmq
worked fine on Fedora 32.
$ podman info
host:
arch: amd64
buildahVersion: 1.15.0
cgroupVersion: v1
conmon:
package: conmon-2.0.18-1.fc32.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.18, commit: 6e8799f576f11f902cd8a8d8b45b2b2caf636a85'
cpus: 8
distribution:
distribution: fedora
version: "32"
eventLogger: file
hostname: localhost.localdomain
idMappings:
gidmap:
- container_id: 0
host_id: 3267
size: 1
- container_id: 1
host_id: 100000
size: 65536
uidmap:
- container_id: 0
host_id: 3267
size: 1
- container_id: 1
host_id: 100000
size: 65536
kernel: 5.6.18-300.fc32.x86_64
linkmode: dynamic
memFree: 849010688
memTotal: 16416161792
ociRuntime:
name: runc
package: containerd.io-1.2.10-3.2.fc31.x86_64
path: /usr/bin/runc
version: |-
runc version 1.0.0-rc8+dev
commit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
spec: 1.0.1-dev
os: linux
remoteSocket:
path: /run/user/3267/podman/podman.sock
rootless: true
slirp4netns:
executable: /bin/slirp4netns
package: slirp4netns-1.1.1-1.fc32.x86_64
version: |-
slirp4netns version 1.1.1
commit: bbf27c5acd4356edb97fa639b4e15e0cd56a39d5
libslirp: 4.2.0
SLIRP_CONFIG_VERSION_MAX: 2
swapFree: 6074658816
swapTotal: 8296329216
uptime: 280h 29m 8.18s (Approximately 11.67 days)
registries:
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- registry.centos.org
- docker.io
store:
configFile: /home/dwalsh/.config/containers/storage.conf
containerStore:
number: 108
paused: 0
running: 0
stopped: 108
graphDriverName: overlay
graphOptions:
overlay.ignore_chown_errors: "false"
overlay.mount_program:
Executable: /usr/bin/fuse-overlayfs
Package: fuse-overlayfs-1.1.1-1.fc32.x86_64
Version: |-
fusermount3 version: 3.9.1
fuse-overlayfs: version 1.1.0
FUSE library version 3.9.1
using FUSE kernel interface version 7.31
graphRoot: /home/dwalsh/.local/share/containers/storage
graphStatus:
Backing Filesystem: extfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "false"
imageStore:
number: 35
runRoot: /run/user/3267/containers
volumePath: /home/dwalsh/.local/share/containers/storage/volumes
version:
APIVersion: 1
Built: 1593512221
BuiltTime: Tue Jun 30 06:17:01 2020
GitCommit: b54a24499facb701ee76198ef6af17af1a172dfa
GoVersion: go1.14.3
OsArch: linux/amd64
Version: 2.1.0-dev
Yesterday I tried one of your semanage incantations from a different issue, @rhatdan, since I noticed that the SELinux labels on my graphroot seemed wrong (a lot of user_home_t labels), but it didn't fix things :(
Any tips on what I might be able to do to debug further on my machine? It's been a nagging issue and my alternative is to blow away my laptop and see if a fresh install ends up being happier.
What AVCs are you seeing?
ausearch -m avc -ts recent
Looks like none, which is a bit surprising. I've got a few more observations which might be relevant:
podman system reset, and even using a different $HOME to force it to have no state, doesn't make the issue go away.
More info on the path to now:
So it's obviously some bizarre user misconfiguration on my main account which I can't spot. I think my plan will be to just pivot to a new account and steal my homedir back. I'll keep the original account in case there's anything else you think I can check out.
Well, this was certainly a fun adventure. It turns out this is all due to my use of extended ACLs to restrict permissions in my home directory. Essentially, I mask everything with d:user:rwx, and this obviously causes issues with the user namespace mapping that takes place when we run containers. For any namespaced ID which doesn't just map to my real UID outside the namespace (i.e. anything other than root in the container), things wouldn't play nicely: any RWX access would get rejected, and that's why things like su/gosu would explode.
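To make the mapping concrete, here is a rough sketch that translates a container UID into the host UID it runs as, using the uidmap values from the podman info output earlier in this report. The names `UIDMAP` and `map_uid` are made up for illustration, and UID 999 is only an assumed example of an unprivileged in-container user, not something taken from this report.

```sh
#!/bin/sh
# uidmap copied from the report's `podman info` output, one range per line:
#   container_id  host_id  size
UIDMAP='0 1001 1
1 165536 65536'

# Print the host UID that a given container UID maps to.
# Prints nothing when the container UID falls outside every mapped range.
# (map_uid is a hypothetical helper, not part of podman.)
map_uid() {
    cuid=$1
    echo "$UIDMAP" | while read -r cstart hstart size; do
        if [ "$cuid" -ge "$cstart" ] && [ "$cuid" -lt $((cstart + size)) ]; then
            echo $((hstart + cuid - cstart))
        fi
    done
}
```

With this map, `map_uid 0` prints 1001 (the real account), while `map_uid 999` prints 166534, a host UID that is not the owner of the storage directory. Any file access by a process running as that ID is checked as a different host user, which is where a restrictive ACL on the storage tree can get in the way.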
I just had to run setfacl -Rb ~/.local/share/containers (I probably should have done it to the storage/ subdir only) and suddenly everything works quite happily. I'm going to close this issue since it's clearly not a libpod thing to have to deal with user-set ACLs. It might be worth adding to the troubleshooting doc though, since it's fairly esoteric and manifests in really opaque permission errors.
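For anyone wanting to check whether they are in the same situation before stripping anything, a minimal sketch follows. It relies on the trailing "+" that ls -l prints in the mode column for entries with an extended ACL, as seen in the ls -lZ listing above. The helper names `has_acl_marker` and `list_acl_entries` are hypothetical, and the approach is just one possible way to look, not an official podman diagnostic.

```sh
#!/bin/sh
# Succeeds when an `ls -l` mode string ends in "+", the marker for an
# entry that carries an extended ACL (for example "drwx------+").
has_acl_marker() {
    case $1 in
        *+) return 0 ;;
        *)  return 1 ;;
    esac
}

# Print every entry under the given directory whose mode string has the
# ACL marker. The directory argument is whichever path you want to inspect.
list_acl_entries() {
    find "$1" -exec ls -ld {} + 2>/dev/null | while read -r mode rest; do
        if has_acl_marker "$mode"; then
            printf '%s %s\n' "$mode" "$rest"
        fi
    done
}
```

Running `list_acl_entries ~/.local/share/containers/storage` would list the entries that still carry ACLs; getfacl on any of them shows the actual rules. If the list is empty, the ACL explanation probably does not apply and the cause lies elsewhere.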
Worked for me. Thanks!