Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
The Podman healthcheck command never changes the status from the 'starting' state.
$ podman inspect rqlite-5.4.0 | jq '.[0]["State"]["Healthcheck"]'
{
"Status": "starting",
"FailingStreak": 0,
"Log": null
}
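One way to watch for a transition out of 'starting' is to keep polling the same jq query; a sketch (the 5-second interval is arbitrary):
$ watch -n 5 "podman inspect rqlite-5.4.0 | jq '.[0].State.Healthcheck.Status'"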
Steps to reproduce the issue:
podman pull rqlite/rqlite:5.4.0
podman create --name rqlite-5.4.0 rqlite/rqlite:5.4.0
podman generate systemd --files --new --name --restart-policy=always rqlite-5.4.0
Edit the generated unit file's ExecStart to:
ExecStart=/usr/bin/podman run \
--conmon-pidfile %t/container-rqlite-5.4.0.pid \
--cidfile %t/container-rqlite-5.4.0.ctr-id \
--cgroups=no-conmon \
-d \
--replace \
--publish 4001:4001 \
--publish 4002:4002 \
--healthcheck-command 'CMD-SHELL curl http://localhost:4001 && curl http://localhost:4002 && echo "Okay" && exit 0 || exit 1' \
--healthcheck-start-period 5s \
--healthcheck-retries 5 \
--name rqlite-5.4.0 \
rqlite/rqlite:5.4.0
cp container-rqlite-5.4.0.service $HOME/.config/systemd/user/container-rqlite-5.4.0.service
systemctl --user enable container-rqlite-5.4.0
systemctl --user start container-rqlite-5.4.0
systemctl --user status container-rqlite-5.4.0
Describe the results you received:
$ podman inspect rqlite-5.4.0 | jq '.[0]["State"]["Healthcheck"]'
{
"Status": "starting",
"FailingStreak": 0,
"Log": null
}
Describe the results you expected:
The healthcheck status eventually transitions from 'starting' to 'healthy' (or to 'unhealthy' once the failing streak exceeds the retry limit).
Additional information you deem important (e.g. issue happens only occasionally):
Reproducible.
Output of podman version:
$ podman version
Version: 2.0.4
API Version: 1
Go Version: go1.14.4
Built: Thu Jan 1 10:00:00 1970
OS/Arch: linux/amd64
Output of podman info --debug:
host:
  arch: amd64
  buildahVersion: 1.15.0
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.18, commit: '
  cpus: 2
  distribution:
    distribution: ubuntu
    version: "18.04"
  eventLogger: file
  hostname: desktop.local.lan
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.4.0-42-generic
  linkmode: dynamic
  memFree: 258641920
  memTotal: 16652058624
  ociRuntime:
    name: runc
    package: 'containerd.io: /usr/bin/runc'
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc10
      commit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
      spec: 1.0.1-dev
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 0.4.3
      commit: unknown
  swapFree: 4149723136
  swapTotal: 4156551168
  uptime: 42h 46m 44.24s (Approximately 1.75 days)
registries:
  search:
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/<redacted>/.config/containers/storage.conf
  containerStore:
    number: 15
    paused: 0
    running: 1
    stopped: 14
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/<redacted>/.local/share/containers/storage
  graphStatus: {}
  imageStore:
    number: 77
  runRoot: /run/user/1000/containers
  volumePath: /home/<redacted>/.local/share/containers/storage/volumes
version:
  APIVersion: 1
  Built: 0
  BuiltTime: Thu Jan 1 10:00:00 1970
  GitCommit: ""
  GoVersion: go1.14.4
  OsArch: linux/amd64
  Version: 2.0.4
Package info (e.g. output of rpm -q podman or apt list podman):
$ apt list podman
Listing... Done
podman/unknown,now 2.0.4~1 amd64 [installed]
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):
Physical
Does systemctl --user status <full ctr id>.service work and show healthy in the log?
As requested, I believe this shows the service is started without error:
$ systemctl --user status container-rqlite-5.4.0.service
* container-rqlite-5.4.0.service - Podman container-rqlite-5.4.0.service
Loaded: loaded (/home/<redacted>/.config/systemd/user/container-rqlite-5.4.0.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2020-08-31 14:36:34 AEST; 15s ago
Docs: man:podman-generate-systemd(1)
Process: 8016 ExecStopPost=/usr/bin/podman rm --ignore -f --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id (code=exited, status=0/SUCCESS)
Process: 8056 ExecStart=/usr/bin/podman run --conmon-pidfile /run/user/1000/container-rqlite-5.4.0.pid --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id --cgroups=no-conmon --detach --replace --publish 4001:4001 --publish 4002:4002 --healthcheck-command CMD-SHELL curl http://localhost:4001 && curl http://localhost:4002 && echo "Healthy" && exit 0 || exit 1 --healthcheck-start-period 5s --healthcheck-retries 5 --name rqlite-5.4.0 rqlite/rqlite:5.4.0 (code=exited, status=0/SUCCESS)
Process: 8055 ExecStartPre=/bin/rm -f /run/user/1000/container-rqlite-5.4.0.pid /run/user/1000/container-rqlite-5.4.0.ctr-id (code=exited, status=0/SUCCESS)
Main PID: 8250 (conmon)
CGroup: /user.slice/user-1000.slice/[email protected]/container-rqlite-5.4.0.service
|-8213 /usr/bin/slirp4netns --disable-host-loopback --mtu 65520 --enable-sandbox --enable-seccomp -c -e 3 -r 4 --netns-type=path /run/user/1000/netns/cni-b8e8efbd-1387-3d19-c79c-582a0af8794d tap0
|-8215 containers-rootlessport
|-8223 containers-rootlessport-child
|-8250 /usr/libexec/podman/conmon --api-version 1 -c 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85 -u 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85 -r /usr/bin/runc -b /home/<redacted>/.local/share/containers/storage/vfs-containers/612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85/userdata -p /run/user/1000/containers/vfs-containers/612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85/userdata/pidfile -n rqlite-5.4.0 --exit-dir /run/user/1000/libpod/tmp/exits --socket-dir-path /run/user/1000/libpod/tmp/socket -l k8s-file:/home/<redacted>/.local/share/containers/storage/vfs-containers/612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85/userdata/ctr.log --log-level error --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/1000/containers/vfs-containers/612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85/userdata/oci-log --conmon-pidfile /run/user/1000/container-rqlite-5.4.0.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/<redacted>/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1000/containers --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg vfs --exit-command-arg --events-backend --exit-command-arg file --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85
`-8267 rqlited -http-addr 0.0.0.0:4001 -raft-addr 0.0.0.0:4002 /rqlite/file/data
Aug 31 14:36:30 desktop.local.lan systemd[1569]: Starting Podman container-rqlite-5.4.0.service...
Aug 31 14:36:34 desktop.local.lan podman[8056]: time="2020-08-31T14:36:34+10:00" level=error msg="exit status 1"
Aug 31 14:36:34 desktop.local.lan podman[8056]: time="2020-08-31T14:36:34+10:00" level=error msg="Unit 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85.service not found."
Aug 31 14:36:34 desktop.local.lan podman[8056]: 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85
Aug 31 14:36:34 desktop.local.lan systemd[1569]: Started Podman container-rqlite-5.4.0.service.
On the host I am able to check that ports 4001 (and 4002) respond:
$ curl localhost:4001
$ echo $?
0
I am not sure what to make of the two systemctl --user status ... entries:
Aug 31 14:36:34 desktop.local.lan podman[8056]: time="2020-08-31T14:36:34+10:00" level=error msg="exit status 1"
Aug 31 14:36:34 desktop.local.lan podman[8056]: time="2020-08-31T14:36:34+10:00" level=error msg="Unit 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85.service not found."
No, I really mean systemctl --user status <full ctr id>.service, because podman creates a transient .service and .timer unit with this name to run the healthcheck. I get output like this:
$ systemctl --user status eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20
● eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20.service - /home/paul/go/src/github.com/containers/libpod/bin/podman healthcheck run eca453dab7b219b3f43cf726da4e12f08f66e08>
Loaded: loaded (/run/user/1000/systemd/transient/eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20.service; transient)
Transient: yes
Active: failed (Result: exit-code) since Mon 2020-08-31 13:11:14 CEST; 9s ago
TriggeredBy: ● eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20.timer
Process: 12280 ExecStart=/home/paul/go/src/github.com/containers/libpod/bin/podman healthcheck run eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20 (code=exited, status=1>
Main PID: 12280 (code=exited, status=1/FAILURE)
CPU: 88ms
Aug 31 13:11:14 paul-pc systemd[1265]: Started /home/paul/go/src/github.com/containers/libpod/bin/podman healthcheck run eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20.
Aug 31 13:11:14 paul-pc podman[12280]: 2020-08-31 13:11:14.407376131 +0200 CEST m=+0.062986276 container exec eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20 (image=docker.i>
Aug 31 13:11:14 paul-pc podman[12280]: unhealthy
Aug 31 13:11:14 paul-pc systemd[1265]: eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20.service: Main process exited, code=exited, status=1/FAILURE
Aug 31 13:11:14 paul-pc systemd[1265]: eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20.service: Failed with result 'exit-code'.
$ podman inspect eca453dab7b219b3f43cf726da4e12f08f66e08c72e9d65b9c8f6f1d4bd56d20 | jq '.[0]["State"]["Healthcheck"]'
{
"Status": "unhealthy",
"FailingStreak": 8,
"Log": [
{
"Start": "2020-08-31T13:09:10.382942105+02:00",
"End": "2020-08-31T13:09:10.481246474+02:00",
"ExitCode": 1,
"Output": ""
},
{
"Start": "2020-08-31T13:09:41.382298249+02:00",
"End": "2020-08-31T13:09:41.488733922+02:00",
"ExitCode": 1,
"Output": ""
},
{
"Start": "2020-08-31T13:10:12.385250849+02:00",
"End": "2020-08-31T13:10:12.461637263+02:00",
"ExitCode": 1,
"Output": ""
},
{
"Start": "2020-08-31T13:10:43.386354714+02:00",
"End": "2020-08-31T13:10:43.492748757+02:00",
"ExitCode": 1,
"Output": ""
},
{
"Start": "2020-08-31T13:11:14.382488726+02:00",
"End": "2020-08-31T13:11:14.451178764+02:00",
"ExitCode": 1,
"Output": ""
}
]
}
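For reference, the transient units can be enumerated directly; a sketch assuming the container from this report ($ctr is just shorthand for the full container ID):
$ ctr=$(podman inspect rqlite-5.4.0 --format '{{.Id}}')
$ systemctl --user list-timers "$ctr.timer"
$ systemctl --user cat "$ctr.service"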
I am not sure what to make of the two systemctl --user status ... entries:
Aug 31 14:36:34 desktop.local.lan podman[8056]: time="2020-08-31T14:36:34+10:00" level=error msg="exit status 1"
Aug 31 14:36:34 desktop.local.lan podman[8056]: time="2020-08-31T14:36:34+10:00" level=error msg="Unit 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85.service not found."
The error seems to indicate that this special unit did not get created by Podman in your case.
@baude PTAL
Outside the context of the systemd business, it seemed to work perfectly. I'm going to ask @vrothberg to take a peek at this to see if anything systemd-related is interfering.
@Luap99 correct, while the container is running ...
$ podman container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
612c66a9b900 docker.io/rqlite/rqlite:5.4.0 rqlited -http-add... 14 hours ago Up 14 hours ago 0.0.0.0:4001-4002->4001-4002/tcp rqlite-5.4.0
... the intermediate service is not created.
$ systemctl --user status 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85.service
Unit 612c66a9b900d4736d03647a7328a009f20d4695355488f81295d9bb8a2c4e85.service could not be found.
I cannot reproduce the issue. The .service and .timer are always created on my F32 workstation with the latest Podman.
@bbros-dev, can you try running the container manually and see if healthchecks work outside of a systemd unit?
@baude, if I use CMD-SHELL ls / the run fails, but it succeeds without the CMD-SHELL prefix. I don't see this documented. What's the purpose of it?
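For background: in the Docker/OCI healthcheck config, a test of the form ["CMD-SHELL", "<string>"] is run via /bin/sh -c inside the container (so && and || work), whereas ["CMD", "arg", ...] execs the argv directly with no shell involved. A sketch for checking which form a container ended up with, reusing the container name from above:
$ podman inspect rqlite-5.4.0 --format '{{.Config.Healthcheck.Test}}'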
@vrothberg
First
$ systemctl --user stop container-rqlite-5.4.0.service
* container-rqlite-5.4.0.service - Podman container-rqlite-5.4.0.service
Loaded: loaded (/home/<redacted>/.config/systemd/user/container-rqlite-5.4.0.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2020-09-08 10:45:42 AEST; 8min ago
Docs: man:podman-generate-systemd(1)
Process: 2811 ExecStopPost=/usr/bin/podman rm --ignore -f --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id (code=exited, status=0/SUCCESS)
Process: 2744 ExecStop=/usr/bin/podman stop --ignore --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id -t 10 (code=exited, status=0/SUCCESS)
Process: 26251 ExecStart=/usr/bin/podman run --conmon-pidfile /run/user/1000/container-rqlite-5.4.0.pid --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id --cgroups=no-conmon --detach --replace --publish 4001:4001 --publish 4002:4002 --healthcheck-command CMD-SHELL curl http://localhost:4001 && curl http://localhost:4002 && echo "Healthy" && exit 0 || exit 1 --healthcheck-start-period 5s --healthcheck-retries 5 --name rqlite-5.4.0 rqlite/rqlite:5.4.0 (code=exited, status=0/SUCCESS)
Process: 26242 ExecStartPre=/bin/rm -f /run/user/1000/container-rqlite-5.4.0.pid /run/user/1000/container-rqlite-5.4.0.ctr-id (code=exited, status=0/SUCCESS)
Main PID: 26496 (code=exited, status=2)
CGroup: /user.slice/user-1000.slice/[email protected]/container-rqlite-5.4.0.service
`-2572 /usr/bin/podman
Sep 08 10:41:39 desktop.local.lan podman[26251]: time="2020-09-08T10:41:39+10:00" level=error msg="exit status 1"
Sep 08 10:41:39 desktop.local.lan podman[26251]: time="2020-09-08T10:41:39+10:00" level=error msg="Unit 9b592768bcca63ccf736bc66aedded6ea8ba543c7c798decb099fc83aa447d37.service not found."
Sep 08 10:41:39 desktop.local.lan podman[26251]: 9b592768bcca63ccf736bc66aedded6ea8ba543c7c798decb099fc83aa447d37
Sep 08 10:41:39 desktop.local.lan systemd[2126]: Started Podman container-rqlite-5.4.0.service.
Sep 08 10:45:40 desktop.local.lan systemd[2126]: Stopping Podman container-rqlite-5.4.0.service...
Sep 08 10:45:41 desktop.local.lan podman[2744]: 9b592768bcca63ccf736bc66aedded6ea8ba543c7c798decb099fc83aa447d37
Sep 08 10:45:41 desktop.local.lan systemd[2126]: container-rqlite-5.4.0.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Sep 08 10:45:42 desktop.local.lan podman[2811]: 9b592768bcca63ccf736bc66aedded6ea8ba543c7c798decb099fc83aa447d37
Sep 08 10:45:42 desktop.local.lan systemd[2126]: container-rqlite-5.4.0.service: Failed with result 'exit-code'.
Sep 08 10:45:42 desktop.local.lan systemd[2126]: Stopped Podman container-rqlite-5.4.0.service.
then
$ /usr/bin/podman run --conmon-pidfile /run/user/1000/container-rqlite-5.4.0.pid --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id --cgroups=no-conmon --detach --replace --publish 4001:4001 --publish 4002:4002 --healthcheck-command CMD-SHELL curl http://localhost:4001 && curl http://localhost:4002 && echo "Healthy" && exit 0 || exit 1 --healthcheck-start-period 5s --healthcheck-retries 5 --name rqlite-5.4.0 qlite/rqlite:5.4.0
Error: container id file exists. Ensure another container is not using it or delete /run/user/1000/container-rqlite-5.4.0.ctr-id
exit
bash: exit: too many arguments
then
$ cat /run/user/1000/container-rqlite-5.4.0.ctr-id
9b592768bcca63ccf736bc66aedded6ea8ba543c7c798decb099fc83aa447d37
I won't clean this up in case you require some additional data from the current state.
Standing-by.
Thanks for checking, @bbros-dev!
The Error: container id file exists. [...] forces us to remove the specified id file before we can run the container. Could you remove the file(s) and try again? Once the container is up and running, try running healthchecks again and please also check if the transient systemd timer and service exist (e.g., via systemctl --user status $containerID.{service,timer}).
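A minimal cleanup sketch along those lines, reusing the paths from the generated unit (run while the service is stopped):
$ rm -f /run/user/1000/container-rqlite-5.4.0.pid /run/user/1000/container-rqlite-5.4.0.ctr-id
$ podman rm --ignore -f rqlite-5.4.0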
Possibly more informative error:
$ /usr/bin/podman run --conmon-pidfile /run/user/1000/container-rqlite-5.4.0.pid --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id --cgroups=no-conmon --detach --replace --publish 4001:4001 --publish 4002:4002 --healthcheck-command CMD-SHELL 'curl http://localhost:4001 && curl http://localhost:4002 && echo "Healthy" && exit 0 || exit 1' --healthcheck-start-period 5s --healthcheck-retries 5 --name rqlite-5.4.0 rqlite/rqlite:5.4.0
Error: invalid reference format
please also check if the transient systemd timer and service exist (e.g., via systemctl --user status $containerID.{service,timer}).
The file /run/user/1000/container-rqlite-5.4.0.ctr-id exists but is empty, so there are no containerID unit files to look up.
Possibly more informative error:
$ /usr/bin/podman run --conmon-pidfile /run/user/1000/container-rqlite-5.4.0.pid --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id --cgroups=no-conmon --detach --replace --publish 4001:4001 --publish 4002:4002 --healthcheck-command CMD-SHELL 'curl http://localhost:4001 && curl http://localhost:4002 && echo "Healthy" && exit 0 || exit 1' --healthcheck-start-period 5s --healthcheck-retries 5 --name rqlite-5.4.0 rqlite/rqlite:5.4.0
Error: invalid reference format
^ this is missing quotes around the --healthcheck-command argument. Adding quotes around it works for me:
/usr/bin/podman run --conmon-pidfile /run/user/1000/container-rqlite-5.4.0.pid --cidfile /run/user/1000/container-rqlite-5.4.0.ctr-id --cgroups=no-conmon --detach --replace --publish 4001:4001 --publish 4002:4002 --healthcheck-command 'CMD-SHELL curl http://localhost:4001 && curl http://localhost:4002 && echo "Healthy" && exit 0 || exit 1' --healthcheck-start-period 5s --healthcheck-retries 5 --name rqlite-5.4.0 rqlite/rqlite:5.4.0
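To confirm the corrected quoting took effect, one can also trigger a single run by hand; exit status 0 means the check passed:
$ podman healthcheck run rqlite-5.4.0 && echo healthy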
@bbros-dev, does it work with the corrected quoting?
Apologies. This will take a few days to get back to.
Apologies @vrothberg, that command was missing quotes - not sure why that was.
But if I go back to the original command, which has the quotes in the correct place, I still see the error:
$ /usr/bin/podman run --conmon-pidfile ./container-rqlite-5.4.0.pid --cidfile ./container-rqlite-5.4.0.ctr-id --cgroups=no-conmon -d --replace --publish 4001:4001 --publish 4002:4002 --healthcheck-command 'CMD-SHELL curl http://localhost:4001 && curl http://localhost:4002 && echo "Okay" && exit 0 || exit 1' --healthcheck-start-period 5s --healthcheck-retries 5 --name rqlite-5.4.0 rqlite/rqlite:5.4.0
Trying to pull docker.io/rqlite/rqlite:5.4.0...
Getting image source signatures
Copying blob 7e6591854262 done
Copying blob 9c461696bc09 done
Copying blob 45085432511a done
Copying blob 089d60cb4e0a done
Copying blob 54aee0b95676 done
Copying blob 9697ac90a2b5 done
Copying blob ae590f327014 done
Copying config 452a727bb4 done
Writing manifest to image destination
Storing signatures
ERRO[0041] exit status 1
ERRO[0041] Unit 8bde150132b0773c3360763ff63636a1d371ff4e28f4eccaf4adc4a8702ef4e0.service not found.
8bde150132b0773c3360763ff63636a1d371ff4e28f4eccaf4adc4a8702ef4e0
Thanks for coming back! I'll set up an Ubuntu VM and see if I can reproduce there.
I can finally reproduce on Ubuntu 18.04.
Systemd does not like the user namespace:
$ podman unshare
root@ubuntu:~/podman# strace -s1000 -e trace=%network systemctl --user
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
getsockopt(3, SOL_SOCKET, SO_RCVBUF, [212992], [4]) = 0
setsockopt(3, SOL_SOCKET, SO_RCVBUFFORCE, [8388608], 4) = -1 EPERM (Operation not permitted)
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [8388608], 4) = 0
getsockopt(3, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
setsockopt(3, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = -1 EPERM (Operation not permitted)
setsockopt(3, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/run/user/1000/systemd/private"}, 32) = 0
getsockopt(3, SOL_SOCKET, SO_PEERCRED, {pid=1475, uid=0, gid=0}, [12]) = 0
getsockopt(3, SOL_SOCKET, SO_PEERSEC, "unconfined", [64->10]) = 0
getsockopt(3, SOL_SOCKET, SO_PEERGROUPS, "\376\377\0\0\376\377\0\0\376\377\0\0\376\377\0\0\376\377\0\0\376\377\0\0\0\0\0\0", [256->28]) = 0
getsockopt(3, SOL_SOCKET, SO_ACCEPTCONN, [0], [4]) = 0
getsockname(3, {sa_family=AF_UNIX}, [128->2]) = 0
sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0AUTH EXTERNAL ", iov_len=15}, {iov_base="30", iov_len=2}, {iov_base="\r\nNEGOTIATE_UNIX_FD\r\nBEGIN\r\n", iov_len=28}], msg_iovlen=3, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 45
getsockopt(3, SOL_SOCKET, SO_PEERCRED, {pid=1475, uid=0, gid=0}, [12]) = 0
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="REJECTED\r\nERROR\r\nERROR\r\n", iov_len=256}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
Failed to list units: Access denied
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=29541, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 1 +++
@giuseppe, you may know what's going on? :)
The systemd version here is 237.
Doing a better search helped. It's a known bug in systemd (see https://bugzilla.redhat.com/show_bug.cgi?id=1838081): it rejects cross-uid-namespace connections.
Unfortunately, there's nothing Podman can do. I suggest opening a bug against Ubuntu. Note that it works as root.
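The distinction can be checked quickly; a sketch contrasting the two cases (the rejected call matches the strace output above):
$ podman unshare systemctl --user
Failed to list units: Access denied
$ systemctl --user
(the same call succeeds outside the user namespace, and as root)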
Thanks again for opening the issue and for your help debugging it!
@baude @rhatdan FYI
Nice work @vrothberg