/kind bug
Description
We plan to run our Nagios monitoring checks from containers via Docker or Podman.
So far, we have no issues with standard Docker, but we can't figure out a proper
solution for CentOS/Podman setups.
We want to run our plugins from Nagios via podman run.
As a test, we configured podman run --rm hello-world as our plugin call and got the following error message in
our monitoring system:
Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied
So I tried to reproduce this error without running within Nagios.
Steps to reproduce the issue:
Logged in as root, I switch the user with su nagios.
Running podman run --rm hello-world then gives me the error mentioned above.
Describe the results you received:
Running the podman command always gets me:
Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied
Describe the results you expected:
Output should be:
Hello from Docker!
This message shows that your installation appears to be working correctly.
......
Additional information you deem important (e.g. issue happens only occasionally):
I have already spent a lot of time looking for solutions online. This issue (5049) gave me some hints and also referred to the troubleshooting document, but it doesn't really help me.
The main difference is: when I switch the user via su - nagios, the proper
environment is populated and the podman run command works. When I switch the user with su nagios (which is probably comparable to how the nagios daemon makes the call), the problem appears.
A good hint from this discussion was to check the XDG_RUNTIME_DIR environment variable.
With su - nagios the value is empty;
with su nagios the value is /run/user/0 (which is probably the problem).
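The comparison above can be turned into a small check. This is only a sketch: the function name check_runtime_dir is made up for illustration, and it assumes the usual convention that a user's runtime directory is /run/user/<uid>.

```shell
#!/bin/sh
# Classify XDG_RUNTIME_DIR relative to the current UID.
# Assumption: the conventional per-user runtime dir is /run/user/<uid>.
check_runtime_dir() {
    uid=$(id -u)
    dir=${XDG_RUNTIME_DIR:-}
    if [ -z "$dir" ]; then
        echo "unset"
    elif [ "$dir" = "/run/user/$uid" ]; then
        echo "ok"
    else
        echo "mismatch: $dir does not belong to uid $uid"
    fi
}

check_runtime_dir
```

Run as nagios after su nagios, this should report a mismatch for /run/user/0; after su - nagios it should report unset.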
Output of podman version:
With su - nagios it is:
Version: 1.6.4
RemoteAPI Version: 1
Go Version: go1.13.4
OS/Arch: linux/amd64
With su nagios it is (always the same error of course):
Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied
Output of podman info --debug:
With su - nagios it is:
debug:
  compiler: gc
  git commit: ""
  go version: go1.13.4
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module_el8.1.0+298+41f9343a.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 2721f230f94894671f141762bd0d1af2fb263239'
  Distribution:
    distribution: '"centos"'
    version: "8"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 702074880
  MemTotal: 2035539968
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module_el8.1.0+298+41f9343a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 1
  eventlogger: journald
  hostname: uat1
  kernel: 4.18.0-147.8.1.el8_1.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.2-3.git21fdece.module_el8.1.0+298+41f9343a.x86_64
    Version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  uptime: 144h 15m 41.41s (Approximately 6.00 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - registry.fedoraproject.org
  - registry.centos.org
  - docker.io
store:
  ConfigFile: /home/nagios/.config/containers/storage.conf
  ContainerStore:
    number: 3
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-5.module_el8.1.0+298+41f9343a.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  GraphRoot: /home/nagios/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 1
  RunRoot: /tmp/run-1000
  VolumePath: /home/nagios/.local/share/containers/storage/volumes
With su nagios it is (always the same error of course):
Error: could not get runtime: error generating default config from memory: cannot mkdir /run/user/0/libpod: mkdir /run/user/0/libpod: permission denied
Package info (output of rpm -q podman):
podman-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64
The problem is that you are not using a full user account. So XDG_RUNTIME_DIR is not enabled.
@giuseppe @mheon @vrothberg Can't we at least make the error message more helpful, since this is such a common issue that people hit?
I think that's a good idea :+1: Let's see what our systemd friends come up with. I still hope we find a way to properly support that (or give guidelines).
su nagios probably leaves XDG_RUNTIME_DIR set to the wrong value, so that is used.
Can you try su nagios printenv XDG_RUNTIME_DIR? What is the output? Any difference with su -l?
Yes, that's the problem. su nagios does not change XDG_RUNTIME_DIR.
The output is /run/user/0, which comes from the previously logged-in user.
With su -l, XDG_RUNTIME_DIR is no longer set.
I know that everything works with a proper login via su -l, and that's not the real issue. I'm just trying to reproduce the error on the CLI to find out what the real issue could be and what a workaround might look like.
The actual setup is the following:
We have a running nagios systemd service, which runs as the nagios daemon user.
So the systemd service is running the actual podman run ... command. It always gives me a 125 exit code, so I tried to reproduce this error outside of systemd. I thought using su nagios would give me more or less the same environment, but I'm not 100% sure.
So I guess the real problem is running podman run from a systemd service.
Thank you for the quick responses!
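Rather than approximating the daemon's environment with su nagios, the environment of the running process can be read directly on Linux. The sketch below is one possible way to do that; show_proc_env is a hypothetical helper name, and reading /proc/<pid>/environ requires being the same user or root.

```shell
#!/bin/sh
# Print selected variables from the environment a process was started with.
# Linux-specific: reads /proc/<pid>/environ (NUL-separated KEY=VALUE pairs).
show_proc_env() {
    tr '\0' '\n' < "/proc/$1/environ" | grep -E '^(HOME|USER|XDG_RUNTIME_DIR)='
}

# Example: inspect the current shell instead of a real daemon.
show_proc_env "$$" || echo "none of the variables are set"
```

To inspect the service, pass the PID of the Nagios daemon instead of $$; how that PID is obtained depends on the setup.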
Are you trying to achieve something similar to https://github.com/containers/libpod/issues/6400 ?
Could we close this issue as duplicate of #6400 ?
@giuseppe I don't think it's really a duplicate
I now have some more insight and the real error message from nagios.
The error message from podman run is actually
Error: could not get runtime: error generating default config from memory: cannot stat /root/.config/containers/storage.conf: stat /root/.config/containers/storage.conf: permission denied
So my question now is: is there any other environment variable that leads to the wrong storage.conf file being read?
The executing user is not root, and the execution is done by a process that runs under systemd.
Some additional information:
The id command returns
uid=1000(nagios) gid=1000(nagios) groups=1000(nagios),1001(nagioscore) context=system_u:system_r:unconfined_service_t:s0
in that context.
maybe HOME is still pointing to /root?
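A quick way to test that guess is to print the path that the current environment implies. This sketch assumes that rootless podman looks for storage.conf under $XDG_CONFIG_HOME and falls back to $HOME/.config; storage_conf_path is a hypothetical helper name, not a podman command.

```shell
#!/bin/sh
# Print the per-user storage.conf path implied by the current environment.
# Assumption: $XDG_CONFIG_HOME is preferred, with $HOME/.config as fallback.
storage_conf_path() {
    echo "${XDG_CONFIG_HOME:-${HOME:-}/.config}/containers/storage.conf"
}

echo "HOME=${HOME:-}"
echo "expected storage.conf: $(storage_conf_path)"
```

If this prints /root/.config/containers/storage.conf while id reports the nagios user, HOME was inherited from root.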
Yes, I just figured out that HOME is not set properly by systemd!
I found https://github.com/systemd/systemd/issues/9652 which somehow describes a similar problem.
For us, we now use this fix to get podman running properly from systemd:
export XDG_RUNTIME_DIR=
export HOME=/home/`id -u -n`
I will investigate a little bit more into systemd, maybe I can find a proper solution.
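The two exports can also be wrapped in a small helper that the plugin script calls before podman. This is a sketch, not the setup used in this thread: fix_podman_env is a hypothetical name, and it looks the home directory up in the passwd database via getent instead of assuming /home/<user>.

```shell
#!/bin/sh
# Hypothetical helper: normalize the environment before calling podman.
fix_podman_env() {
    # Same effect as the workaround: an empty XDG_RUNTIME_DIR.
    XDG_RUNTIME_DIR=
    export XDG_RUNTIME_DIR
    # Assumption: take the home directory from the passwd database.
    home=$(getent passwd "$(id -u)" | cut -d: -f6)
    if [ -n "$home" ]; then
        HOME=$home
        export HOME
    fi
}

fix_podman_env
# The plugin call would follow here, for example:
# exec podman run --rm hello-world
```

If the lookup returns nothing, HOME is left untouched, so the helper never makes the environment worse than it was.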
@ck-schmidi could you give machinectl --uid $USER-UID shell a try and use that environment?
You may need to enable linger mode so that the containers are kept around when you terminate the user session.
A friendly reminder that this issue had no activity for 30 days.
I think we have a workaround, so I am closing this. Reopen if I am mistaken.