@cevich is seeing frequent system-test errors [1] with podman-remote:
$ ./bin/podman-remote rmi -f build_test
# Error: error removing containers [10b343023c7d6187aa521fad47934b5d558adc53964ab491017e34d909782751] for image "8659137a98bd8c7f3528c1c2c09a0fc8115bc32ce7659bc7a1d31b7d73dd3029": could not remove container "10b343023c7d6187aa521fad47934b5d558adc53964ab491017e34d909782751": remove /var/lib/containers/storage/overlay-containers/10b343023c7d6187aa521fad47934b5d558adc53964ab491017e34d909782751/userdata/shm: device or resource busy
This is (I think) Ubuntu. I cannot reproduce on my Fedora laptop, even running the varlink server with nice 19 and even with cores disabled to emulate a slow system. My attempted reproducer is a simplified variant of the build system test:
#!/bin/bash
set -xe
tmpdir=$(mktemp -d --tmpdir podman-remote-test.XXXXXX)
cat >$tmpdir/Dockerfile <<EOF
FROM quay.io/libpod/alpine_labels:latest
RUN apk add nginx
RUN echo hi >/myfile
EOF
pr() { /path/to/podman-remote "$@"; }
while :; do
    pr build -t build_test --format=docker $tmpdir
    pr run --rm build_test cat /myfile
    pr rmi -f build_test
done
My only guess is that this might be a race condition with podman-remote run --rm: does anyone know if it is guaranteed to block?
This is master @ 1ad8fe52412c8cef215e5a5af312e5db680c3916. I'm sorry I can't reproduce it myself, but Chris seems to run into it constantly.
Thanks Ed.
Thinking about this more, I'm not seeing those particular failures anywhere else. So there must be something _special_ about PR #3754. That could simply be new/different VM images - including images with embedded bugs.
Note: The two failures are on F29 and F30 (not Ubuntu) images:
fedora-30-libpod-5420341990522880
fedora-29-libpod-5420341990522880

I'll try your reproducer (above) using those images vs. what we have on master...
@edsantiago good news is, I was able to reproduce the error using your script above and the F30 VM image from my PR. Trying the VM image from master (expecting no error) now...
...as expected, no error using the VM image from master (fedora-30-libpod-5751722641719296).
Excellent, so now (hopefully) it's a matter of comparing differences. Since the libpod code is common to both VMs and they're running in the same cloud environment, there must be a configuration or packaging difference, or something like that.
Same on both VMs:
libvarlink-util-18-1.fc30.x86_64
container-selinux-2.113.0-1.dev.git4f7d6bb.fc30.noarch
runc-1.0.0-93.dev.gitb9b6cc6.fc30.x86_64
systemd-241-10.git511646b.fc30.x86_64
slirp4netns-0.4.0-4.git19d199a.fc30.x86_64
containernetworking-plugins-0.8.1-1.fc30.x86_64
containers-common-0.1.37-2.fc30.x86_64

Hrmmmmm.
Ran the reproducer again, and it worked fine for a while, failing after maybe 10 passes. So there's definitely some race condition here that's somehow more likely with the images in my PR vs. the ones from master, even with the exact same libpod code :confounded:
Maybe I should try taking the podman and podman-remote binaries from the breaking-VM over to the working-VM to see if the behavior moves/changes?
Not sure what else could be involved here, @mheon have any ideas?
Confirmed: it's not the binaries; the behavior remains the same after copying them from the breaking VM to the working one.
I did notice one difference, though: the breaking VM is running kernel-5.2.8-200, while the working VM (from master) is using kernel-5.2.7-200. Going to try upgrading that kernel, rebooting, and seeing if the problem starts happening...
JJF($!!343$hg80&#//* :angry:
Updated both VMs to 5.2.9-200.fc30.x86_64 and rebooted, but the pattern remains: the problem reproduces on fedora-30-libpod-5420341990522880 but not on fedora-30-libpod-5751722641719296.
This could definitely cause a race.
pr build -t build_test --format=docker $tmpdir
pr run --rm build_test cat /myfile
When this command finishes, the image is still mounted. conmon will execute podman container cleanup, but only after the run command has finished.
pr rmi -f build_test
If this command executes before podman container cleanup finishes, you can get this error.
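To make the suspected interleaving concrete, here is a minimal self-contained Go sketch (illustrative only, not libpod code): conmon's deferred cleanup is modeled as a goroutine, and an rmi issued before it finishes hits the still-busy mount.

package main

import (
	"fmt"
	"time"
)

func main() {
	cleanupDone := make(chan struct{})

	// Stand-in for conmon running `podman container cleanup`
	// after the attached run command has already returned.
	go func() {
		time.Sleep(50 * time.Millisecond) // cleanup takes nonzero time
		fmt.Println("cleanup: shm unmounted, container removed")
		close(cleanupDone)
	}()

	// Without a wait here, `rmi -f` races with the goroutine above
	// and can fail with "device or resource busy".
	select {
	case <-cleanupDone:
		fmt.Println("rmi: safe, cleanup already finished")
	case <-time.After(10 * time.Millisecond):
		fmt.Println("rmi: raced ahead of cleanup -> device or resource busy")
	}
}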
@mheon Is my assumption correct? Does podman container cleanup take care of the --rm as well?
Thanks @rhatdan. If this is the case, would it be possible to have run --rm block until conmon finishes cleanup?
And if that's not possible, I'd like to request a clear documentation update stating something like:
podman-remote run --rm actually means "remove the container at some time in
the future, not necessarily now", and is not guaranteed to have removed the
container by the time it exits. If you want to deterministically remove the
container, perhaps because you want to rmi its parent image, please run
[some sort of TBI command that blocks until the container is removed]
Yes, I agree it should be cleaned up, but I'm not sure whether my theory is correct.
podman run --rm should, if attached (not run with -d), block until container removal is complete. I don't think there's a potential for races, either.
Well @edsantiago is seeing what looks like a race when using the varlink connection.
Small correction: I've never seen it; only @cevich has seen it.
@mheon wrong answer. The correct answer is: Oh, I saw this problem yesterday and just merged a fix :grinning:
@mheon remember this is podman-remote -> podman. Perhaps podman-remote is exiting before podman/conmon finish their job?
With Ed's reproducer script, this very reliably reproduces for me (using hack/get_ci_vm.sh on the images mentioned). Would running in debug mode help at all? How can we prove/disprove Dan's theory? Or maybe is there a way we can tell the audit subsystem to log process-exits (with timestamps)?
@mheon I managed to catch this in the act by instrumenting the system-test that reproduces it:
--- a/test/system/070-build.bats
+++ b/test/system/070-build.bats
@@ -26,9 +26,11 @@ EOF
PODMAN_TIMEOUT=240 run_podman build -t build_test --format=docker $tmpdir
is "$output" ".*STEP 4: COMMIT" "COMMIT seen in log"
+ run_podman ps -a
run_podman run --rm build_test cat /$rand_filename
is "$output" "$rand_content" "reading generated file in image"
+ run_podman ps -a
run_podman rmi -f build_test
}
You can see what the test is doing in the diff. When it reproduces, it happens on that last rmi -f line:
[+0100s] not ok 28 podman build - basic test
[+0100s] # (from function `die' in file test/system/helpers.bash, line 264,
[+0100s] # from function `run_podman' in file test/system/helpers.bash, line 164,
[+0100s] # in test file test/system/070-build.bats, line 34)
[+0100s] # `run_podman rmi -f build_test' failed with status 2
[+0100s] # $ ./bin/podman-remote rm --all --force
[+0100s] # $ ./bin/podman-remote images --all --format {{.Repository}}:{{.Tag}} {{.ID}}
[+0100s] # quay.io/libpod/alpine_labels:latest 4fab981df737
[+0100s] # $ ./bin/podman-remote build -t build_test --format=docker /tmp/podman_bats.hPzLvX/build-test
[+0100s] # STEP 1: FROM quay.io/libpod/alpine_labels:latest
[+0100s] # STEP 2: RUN apk add nginx
[+0100s] # fetch http://dl-cdn.alpinelinux.org/alpine/v3.8/main/x86_64/APKINDEX.tar.gz
[+0100s] # fetch http://dl-cdn.alpinelinux.org/alpine/v3.8/community/x86_64/APKINDEX.tar.gz
[+0100s] # (1/2) Installing pcre (8.42-r0)
[+0100s] # (2/2) Installing nginx (1.14.2-r1)
[+0100s] # Executing nginx-1.14.2-r1.pre-install
[+0100s] # Executing busybox-1.28.4-r1.trigger
[+0100s] # OK: 6 MiB in 15 packages
[+0100s] # 27af317be9360cdb9cf9ec92e3852518e34831223001012fb28aaf496d2462ff
[+0100s] # STEP 3: RUN echo UkyW3wKhN0Uk8z1zdTzgjXxyUypI0WBjXmaKij34vb2vC9cHA2 > /UdFKEYczgNKHkBKXaNwL
[+0100s] # STEP 4: COMMIT build_test
[+0100s] # 102f5447866173d6e131165b85c0955f970cba4d5f2197cd40e1bd506824ce78
[+0100s] # $ ./bin/podman-remote ps -a
[+0100s] # CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[+0100s] # $ ./bin/podman-remote run --rm build_test cat /UdFKEYczgNKHkBKXaNwL
[+0100s] # UkyW3wKhN0Uk8z1zdTzgjXxyUypI0WBjXmaKij34vb2vC9cHA2
[+0100s] # $ ./bin/podman-remote ps -a
[+0100s] # CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[+0100s] # e32470ce70ef localhost/build_test:latest cat /UdFKEYczgNKH... Less than a second ago Exited (0) Less than a second ago keen_cohen
[+0100s] # $ ./bin/podman-remote rmi -f build_test
[+0100s] # Error: error removing containers [e32470ce70ef1fabf805c6db9d2bdb763bb0b802e79ac6d2a8b38e21171ac8e4] for image "102f5447866173d6e131165b85c0955f970cba4d5f2197cd40e1bd506824ce78": could not remove container "e32470ce70ef1fabf805c6db9d2bdb763bb0b802e79ac6d2a8b38e21171ac8e4": unlinkat /var/lib/containers/storage/overlay-containers/e32470ce70ef1fabf805c6db9d2bdb763bb0b802e79ac6d2a8b38e21171ac8e4/userdata/shm: device or resource busy
[+0100s] # [ rc=2 (** EXPECTED 0 **) ]
[+0100s] # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
[+0100s] # #| FAIL: exit code is 2; expected 0
[+0100s] # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
So clearly a race here is possible, but perhaps (somehow) more likely when podman-remote is being used?
Is it impossible for the run ... --rm to verify removal before it exits?
Have we ever seen this without remote being involved? This should not be possible for local Podman.
@mheon As best as I can recall, I have not seen it. Even with podman-remote it doesn't always happen. Is it possible for podman-remote to special-case this, and verify removal happens before exiting (assuming there was no networking-error)?
Looking at the difference between pkg/adapter/containers.go and pkg/adapter/containers_remote.go.
containers.go has a call to
if ecode, err := ctr.Wait(); err != nil {
And containers_remote.go does not. I would guess this is the issue.
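For illustration, a minimal runnable sketch of that blocking pattern, with hypothetical types (in the real code, ctr is a libpod container and the Wait() call sits in the attached-run path of pkg/adapter/containers.go):

package main

import (
	"fmt"
	"time"
)

// Hypothetical stand-in for the container handle; the real local path
// calls Wait() on a *libpod.Container.
type container struct{ exited chan int32 }

// Wait blocks until the container exits and returns its exit code.
func (c *container) Wait() (int32, error) {
	return <-c.exited, nil
}

func main() {
	ctr := &container{exited: make(chan int32, 1)}
	go func() {
		time.Sleep(20 * time.Millisecond) // container runs, then exits
		ctr.exited <- 0
	}()

	// The pattern containers_remote.go is missing: block on the container
	// before returning, so that --rm cleanup has completed by the time
	// the next command (e.g. rmi -f) runs.
	if ecode, err := ctr.Wait(); err != nil {
		fmt.Println("error waiting for container:", err)
	} else {
		fmt.Println("container exited with code", ecode)
	}
}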
Oh. That sounds like it.
Oh that's a HUGE relief, thanks @rhatdan !
Closing, as #3934 is merged.