Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
When the podman process is SIGKILLed midway, the container storage leaks and it is not obvious how to clean it up.
Steps to reproduce the issue:
Run this script:
#!/bin/python3
import subprocess, time

argv = ["podman", "run", "--rm", "--name", "test-server",
        "-e", "PYTHONUNBUFFERED=1",
        "fedora", "python3", "-m", "http.server"]

while True:
    try:
        print("Press ctrl-C if http server start...")
        proc = subprocess.Popen(argv)
        try:
            proc.communicate()
        except KeyboardInterrupt:
            proc.terminate()
            proc.kill()
            raise
    except KeyboardInterrupt:
        subprocess.Popen(["podman", "kill", "test-server"]).wait()
    time.sleep(1)
Sample output:
$ python3 repro.py
Press ctrl-C if http server start...
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
^C
Keyboard interrupt received, exiting.
Error: no container with name or ID test-server found: no such container
Press ctrl-C if http server start...
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
^C
Keyboard interrupt received, exiting.
Error: no container with name or ID test-server found: no such container
Press ctrl-C if http server start...
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
^C
Keyboard interrupt received, exiting.
Error: no container with name or ID test-server found: no such container
Press ctrl-C if http server start...
Error: error creating container storage: the container name "test-server" is already in use by "478dbcf9e5f5d262ad6b21ad3fe94907260c086b718ce29d54d05a311caf2abf". You have to remove that container to be able to reuse that name.: that name is already in use
Press ctrl-C if http server start...
Error: error creating container storage: the container name "test-server" is already in use by "478dbcf9e5f5d262ad6b21ad3fe94907260c086b718ce29d54d05a311caf2abf". You have to remove that container to be able to reuse that name.: that name is already in use
Describe the results you received:
Once this happens, to be able to re-use the test-server name, the container storage needs to be removed manually:
$ podman container inspect test-server
Error: error looking up container "test-server": no container with name or ID test-server found: no such container
$ podman container inspect 478dbcf9e5f5d262ad6b21ad3fe94907260c086b718ce29d54d05a311caf2abf
Error: error looking up container "478dbcf9e5f5d262ad6b21ad3fe94907260c086b718ce29d54d05a311caf2abf": no container with name or ID 478dbcf9e5f5d262ad6b21ad3fe94907260c086b718ce29d54d05a311caf2abf found: no such container
$ podman rm test-server
Error: no container with name or ID test-server found: no such container
$ podman rm -f test-server
Error: no container with name or ID test-server found: no such container
$ podman rm -f --storage test-server
test-server
Describe the results you expected:
podman inspect should report that the container storage still exists instead of "no such container".
Additional information you deem important (e.g. issue happens only occasionally):
The reproducer is probably not correct, as it SIGKILLs podman while it is cleaning up the container, but perhaps podman could remove the storage first?
Output of podman version:
Version: 1.4.5-dev
RemoteAPI Version: 1
Go Version: go1.12.2
OS/Arch: linux/amd64
Output of podman info --debug:
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.2
  podman version: 1.4.5-dev
host:
  BuildahVersion: 1.9.0
  Conmon:
    package: podman-1.5.1-3.fc30.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.0, commit: d728afa06cd2df86a27f32a4692c7099a56acc97-dirty'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 11724386304
  MemTotal: 16815677440
  OCIRuntime:
    package: containerd.io-1.2.6-3.3.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8
      commit: 425e105d5a03fabd737a126ad93d62a9eeede87f
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 8
  hostname: ocp.rdocloud
  kernel: 5.2.9-200.fc30.x86_64
  os: linux
  rootless: true
  uptime: 1h 31m 26.12s (Approximately 0.04 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/fedora/.config/containers/storage.conf
  ContainerStore:
    number: 9
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /home/fedora/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 6
  RunRoot: /tmp/1000
  VolumePath: /home/fedora/.local/share/containers/storage/volumes
Package info (e.g. output of rpm -q podman or apt list podman):
podman-1.5.1-3.fc30.x86_64
Additional environment details (AWS, VirtualBox, physical, etc.):
The host is running in OpenStack
Hi @TristanCacqueray, thanks for opening the issue. There's something odd with the version, since podman version claims 1.4.5-dev but the package is podman-1.5.1-3.fc30.x86_64. I have the very same package installed on my machine, but podman returns the correct version.
Can you check if you used the correct binary?
Oops, it was a devel version in ~/.local/bin... Using the system podman resulted in the same behavior. Here is the new debug info:
Version: 1.5.1
RemoteAPI Version: 1
Go Version: go1.12.7
OS/Arch: linux/amd64
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.7
  podman version: 1.5.1
host:
  BuildahVersion: 1.10.1
  Conmon:
    package: podman-1.5.1-3.fc30.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.0, commit: d728afa06cd2df86a27f32a4692c7099a56acc97-dirty'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 11273560064
  MemTotal: 16815677440
  OCIRuntime:
    package: containerd.io-1.2.6-3.3.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8
      commit: 425e105d5a03fabd737a126ad93d62a9eeede87f
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 8
  eventlogger: file
  hostname: ocp.rdocloud
  kernel: 5.2.9-200.fc30.x86_64
  os: linux
  rootless: true
  uptime: 1h 45m 53.63s (Approximately 0.04 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/fedora/.config/containers/storage.conf
  ContainerStore:
    number: 10
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /home/fedora/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 6
  RunRoot: /tmp/1000
  VolumePath: /home/fedora/.local/share/containers/storage/volumes
If we finally have a consistent reproducer for this, that's a very good thing - removing storage happens before containers are evicted from the database, so figuring out why it's failing but the container is still not in the Libpod DB might let us fix this for good.
On inspect showing the container - I would not expect inspect to be able to locate containers that are not in our database. There has been some talk of adding a podman ps --storage that would display containers in storage (in a somewhat limited form, as we can't get status, etc.).
@mheon, I can share some of my cycles. Want me to have a look?
Here is a new sample log using system podman:
$ python3 repro.py
Press ctrl-C if http server start...
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
^C
Keyboard interrupt received, exiting.
Error: no container with name or ID test-server found: no such container
Press ctrl-C if http server start...
Error: error creating container storage: the container name "test-server" is already in use by "1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841". You have to remove that container to be able to reuse that name.: that name is already in use
Press ctrl-C if http server start...
Error: error creating container storage: the container name "test-server" is already in use by "1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841". You have to remove that container to be able to reuse that name.: that name is already in use
Press ctrl-C if http server start...
^CError: no container with name or ID test-server found: no such container
^CTraceback (most recent call last):
File "repro.py", line 22, in <module>
time.sleep(1)
KeyboardInterrupt
$ podman container inspect 1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841
Error: error looking up container "1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841": no container with name or ID 1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841 found: no such container
$ podman container inspect test-server
Error: error looking up container "test-server": no container with name or ID test-server found: no such container
$ podman rm test-server
Error: no container with name or ID test-server found: no such container
$ podman rm 1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841
Error: no container with name or ID 1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841 found: no such container
$ podman rm 1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841^C
$ podman run -it --rm --name test-server fedora
Error: error creating container storage: the container name "test-server" is already in use by "1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841". You have to remove that container to be able to reuse that name.: that name is already in use
$ podman rm -f 1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841
Error: no container with name or ID 1c5629586bd866f80c05cc90d38b6e395c0114aac01c000446c52dfe3708d841 found: no such container
$ podman rm -f test-server
Error: no container with name or ID test-server found: no such container
$ podman rm -f --storage test-server
test-server
$ podman run -it --rm --name test-server fedora echo it is working
it is working
@mheon perhaps the podman run failure in that situation could hint at using "podman rm --storage" or something. It currently says "You have to remove that container to be able to reuse that name.: that name is already in use".
podman should treat /var/lib/containers/storage/overlay-containers/containers.json with ACID semantics. SIGTERM or SIGKILL shouldn't corrupt containers.json. Having containers.lock is not enough to implement ACID.
I haven't seen any corrupted containers.json yet. Here, we're seeing a failure to remove for reasons we've yet to determine - but the storage metadata is intact. The container can be removed, but later.
@holser If you have a corrupted containers.json that's a separate issue - and significantly more serious, because that sounds like it would break storage a lot worse than this does.
I've identified something that may be part of the problem - Podman may not be unmounting the SHM mount before the removal. I'm adding some patches that might address it to #3931
I'm going to test this now that #3931 is merged and see if we can call it fixed
Does not appear to be fixed. Looking deeper...
SHM is still mounted, but I'm not seeing any error messages out of Podman.
Alright, I think the reproducer here has introduced a weird race.
podman run --rm invokes container removal twice: once in the Conmon cleanup process, once in the main podman process. This duplication is to ensure that, if the user detaches from the container, it will still be removed in the background; but if the container remains alive, it is guaranteed to be removed before podman run --rm exits. With our locking model, this is quite safe - one of the processes will lose, but the resulting "no such container" error is safely handled, and the container is definitely gone.
In this case, podman run --rm appears to be consistently winning the race as to who gets to remove the container; which makes sense, because it doesn't need to launch a fresh process. It begins container removal, successfully deletes it from the database, and then is immediately whacked by a SIGKILL from the reproducer before it can completely clean up the mount.
The cleanup process then arrives and attempts to remove the container, but it's already gone from the database; that's one of the first steps we take, to ensure that we don't leave unusable partially-deleted containers around. As such, podman cleanup can't find the container, determines it has been removed by the other Podman process, and as such doesn't complain much.
I don't know how well we can handle this. If podman takes a SIGKILL in the middle of removing a container, things are going to be left undone, so we're left with the decision of whether to leave partially-removed, unusable containers around in podman ps, or potentially leave storage containers around that block names from re-use.
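The ordering problem described above can be sketched with a small, self-contained simulation (an illustration only, not podman's actual code; the dictionaries and IDs here are hypothetical stand-ins for the libpod DB and containers/storage):

```python
# Simulation of the removal ordering: the container is evicted from the
# libpod DB first, then its storage is cleaned up. A "crash" (SIGKILL)
# between the two steps leaves an orphaned storage entry that still
# holds the name, matching the symptoms in this issue.

db = {"test-server": "478dbcf9"}        # libpod DB: name -> container ID
storage = {"478dbcf9": "test-server"}   # c/storage: ID -> name

def remove_container(name, crash_after_db_delete=False):
    ctr_id = db.pop(name)               # step 1: evict from the DB
    if crash_after_db_delete:
        return                          # simulated SIGKILL mid-removal
    storage.pop(ctr_id)                 # step 2: remove the storage

remove_container("test-server", crash_after_db_delete=True)

# Gone from the DB, so inspect/rm report "no such container"...
print("in DB:", "test-server" in db)                                # False
# ...but the storage entry survives and blocks reuse of the name.
print("name still in storage:", "test-server" in storage.values())  # True
```

This is why `podman rm -f --storage` succeeds where plain `podman rm` fails: it bypasses the DB lookup and operates on the storage layer directly.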
Perhaps the simplest takeaway here is to not whack podman processes with SIGKILL unless they are actually stuck and timing out (well, really, Podman processes that are expected to remove containers - podman run --rm, podman rm).
I'll think about this more, but I don't know if there's an easy solution on our end.
(For reference - adding a time.sleep(1) before the proc.kill() completely fixes the bug for me.)
Podman will handle other signals via sig-proxy, but we can't catch SIGKILL, and can't sanely handle the situation.
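Following the advice above, the reproducer's terminate()+kill() sequence could give the child a grace period before resorting to SIGKILL. A minimal sketch, demonstrated on a plain `sleep` child so it runs without podman:

```python
# Gentler shutdown pattern: send SIGTERM, wait up to a grace period for
# the child to finish its cleanup, and only SIGKILL if it is stuck.
import subprocess

proc = subprocess.Popen(["sleep", "60"])

proc.terminate()                 # SIGTERM: let the process clean up
try:
    proc.wait(timeout=10)        # grace period instead of instant kill()
except subprocess.TimeoutExpired:
    proc.kill()                  # last resort: SIGKILL
    proc.wait()

# On POSIX, a signal-terminated child reports a negative return code.
print("exit code:", proc.returncode)
```

With podman in place of `sleep`, the grace period gives `podman run --rm` time to finish removing the container before anything sends SIGKILL.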
Please note we cannot, in general, protect podman from a sudden SIGKILL. AFAIK a fenced controller would end up in the same situation, as would a forced power-off instead of SIGKILL. Or systemd might send KILL when it times out shutting down some container via its service unit.
So there needs to be some transaction replaying or similar. Not sure, but Galera DB, for example, recovers well from such crashes...
@mheon SIGKILL is a valid signal which can be used by the user. There is no way to intercept it, that's true. However, there are many ways to handle it. A lot of software guarantees data consistency whatever happens (databases, file systems, clustering software); that is why I mentioned ACID. For instance, there could be a transaction log (journal) of podman operations. If podman was killed by SIGKILL, the transaction log won't be clean, which means extra steps should be taken to ensure the state is consistent and nothing will blow up.
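The journal idea above can be sketched in a few lines (purely illustrative; the file name and helper functions are hypothetical, not podman's actual mechanism):

```python
# Write-ahead intent journal: record an intent entry before a multi-step
# removal, clear it on success, and replay any surviving entry at the
# next startup. A SIGKILL between log_intent() and clear_intent() leaves
# the entry behind, telling the next process what to finish.
import json, os, tempfile

journal_path = os.path.join(tempfile.mkdtemp(), "journal.json")

def log_intent(op, ctr_id):
    with open(journal_path, "w") as f:
        json.dump({"op": op, "id": ctr_id}, f)
        f.flush()
        os.fsync(f.fileno())     # durable before the operation starts

def clear_intent():
    os.remove(journal_path)

def pending_intent():
    # On startup: a surviving entry means the operation never finished
    # and its remaining steps (e.g. removing storage) must be replayed.
    if os.path.exists(journal_path):
        with open(journal_path) as f:
            return json.load(f)
    return None

log_intent("remove", "1c5629586bd8")
# ...process is SIGKILLed here, before clear_intent() runs...
print(pending_intent())
```

On recovery, replaying the "remove" intent would finish deleting the orphaned storage container, so the name would never stay blocked.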
Not sure if my issue is related to the SIGKILL issue here, but I'm encountering a case where the container seems not to have made it to executing the command inside the container. podman ps -a didn't even detect the container.
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="setting container name nginx"
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="created OCI spec and options for new container"
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="Allocated lock 21 for container 6e3076659ef6c90a9c35d7f2e7d43cccb0a25fd7d2ab5edc9859c8deb64c2da0"
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="parsed reference into \"[overlay@/var/lib/containers/storage+/var/run/containers/storage:overlay.mountopt=nodev,metacopy=on]@0dac5b41d811ca6e1bfe68d31bec5fb1f5c37485b741674619a1a2a3dec5cc0e\""
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="exporting opaque data as blob \"sha256:0dac5b41d811ca6e1bfe68d31bec5fb1f5c37485b741674619a1a2a3dec5cc0e\""
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="created container \"6e3076659ef6c90a9c35d7f2e7d43cccb0a25fd7d2ab5edc9859c8deb64c2da0\""
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="container \"6e3076659ef6c90a9c35d7f2e7d43cccb0a25fd7d2ab5edc9859c8deb64c2da0\" has work directory \"/var/lib/containers/storage/overlay-containers/6e3076659ef6c90a9c35d7f2e7d43cccb0a25fd7d2ab5edc9859c8deb64c2da0/userdata\""
Oct 09 19:50:34 prod.moe.ph podman[6966]: time="2019-10-09T19:50:34+08:00" level=debug msg="container \"6e3076659ef6c90a9c35d7f2e7d43cccb0a25fd7d2ab5edc9859c8deb64c2da0\" has run directory \"/var/run/containers/storage/overlay-containers/6e3076659ef6c90a9c35d7f2e7d43cccb0a25fd7d2ab5edc9859c8deb64c2da0/userdata\""
Oct 09 20:17:36 prod.moe.ph systemd[1]: Stopping nginx container service...
Oct 09 20:17:36 prod.moe.ph podman[9147]: Error: no container with name or ID nginx found: no such container
Oct 09 20:17:36 prod.moe.ph systemd[1]: cnginx.service: Succeeded.
Oct 09 20:17:36 prod.moe.ph systemd[1]: Stopped nginx container service.
I'm assuming that if podman didn't start the container successfully, it won't be able to clean up the shm mounts afterwards. In my case, the container was still in containers.json, so I had to remove it using podman rm --storage.
Here's my podman info details.
[root@prod storage]# podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.9
  podman version: 1.6.1
host:
  BuildahVersion: 1.11.2
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.1-1.fc30.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.1, commit: 4346fbe0b2634b05857973bdf663598081240374'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 332312576
  MemTotal: 4133416960
  OCIRuntime:
    package: runc-1.0.0-93.dev.gitb9b6cc6.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: e3b4c1108f7d1bf0d09ab612ea09927d9b59b4e3
      spec: 1.0.1-dev
  SwapFree: 5341835264
  SwapTotal: 5368705024
  arch: amd64
  cpus: 2
  eventlogger: journald
  hostname: prod.moe.ph
  kernel: 5.2.14-200.fc30.x86_64
  os: linux
  rootless: false
  uptime: 19h 17m 42.67s (Approximately 0.79 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 9
  GraphDriverName: overlay
  GraphOptions:
    overlay.mountopt: nodev,metacopy=on
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  ImageStore:
    number: 44
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes
Edit 1:
Never mind. It seems that my case here is related to containers/storage; it may be caused by mounting 3 or more volumes, or something else.
Edit 2:
It's really related to containers/storage. I was able to fix my issue by stopping all containers and running podman system renumber.
Significant changes were made to Podman volumes in this release. If you have pre-existing volumes, it is strongly recommended to run
podman system renumber after upgrading.
This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.
Oops. I thought the bot closed this.
Adding do-not-close label.
Thanks @mheon !
Took us half a year, but it's finally done :smile: