BUG REPORT? : not sure...
/kind bug
Description
When running go get ./cmd/... in an application, the container consistently hangs while fetching a specific module. The container becomes unusable and the only way to kill it is with kill -9 $(pidof podman). This leaves many artifacts behind that affect the behaviour of simple commands like podman images.
Steps to reproduce the issue:
$ podman run -u myuser -it --name app-c -v $PWD:/home/myuser/projects/app:Z -w /home/myuser/projects/app --hostname="app-c" localhost/el7-base-go /bin/bash
[myuser@app-c app]$ go get ./cmd/... --verbose
go: finding github.com/go-sql-driver/mysql v1.4.1
go: finding github.com/jmoiron/sqlx v1.2.0
go: finding github.com/gorilla/handlers v1.4.0
go: finding github.com/gorilla/mux v1.7.0
go: finding gopkg.in/yaml.v2 v2.2.2
go: finding github.com/go-sql-driver/mysql v1.4.0
Although the issue appears to be go-module related, I am unsure how to handle the following situation. The container just hangs with that output.
Describe the results you received:
The following info is available:
$ podman attach app-c
$ podman stats app-c
Error: unable to load cgroup at /libpod_parent/libpod-26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6: cgroups: cgroup deleted
$ podman stop app-c
Error: container 26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6 did not die within timeout
$ podman kill app-c
26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6
$ podman ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
26c70b225c77 localhost/el7-base-go:latest /bin/bash 15 minutes ago Up 15 minutes ago app-c
$ podman events
2019-06-10 20:07:22.732887043 +1000 AEST container stop 26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6 (image=localhost/el7-base-go:latest, name=app-c)
2019-06-10 20:07:51.875878169 +1000 AEST container kill 26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6 (image=localhost/el7-base-go:latest, name=app-c)
I am now forced to kill podman from the host.
$ for p in $(pidof podman);do kill -9 $p;done
$ ps aux | grep podman
deefin 13926 0.0 0.0 77860 1884 ? Ssl 19:52 0:00 /usr/libexec/podman/conmon -c 26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6 -u 26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6 -r /usr/bin/runc -b /home/deefin/.local/share/containers/storage/overlay-containers/26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6/userdata -p /tmp/1000/overlay-containers/26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6/userdata/pidfile -l /home/deefin/.local/share/containers/storage/overlay-containers/26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6/userdata/ctr.log --exit-dir /run/user/1000/libpod/tmp/exits --conmon-pidfile /home/deefin/.local/share/containers/storage/overlay-containers/26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/deefin/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /tmp/1000 --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /run/user/1000/libpod/tmp --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6 --socket-dir-path /run/user/1000/libpod/tmp/socket -t --log-level error
$ kill -9 13926
$ podman images
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x130 pc=0x5611916de493]
goroutine 1 [running]:
panic(0x561192133180, 0x56119320d0f0)
/usr/lib/golang/src/runtime/panic.go:565 +0x2c9 fp=0xc0008e9730 sp=0xc0008e96a0 pc=0x561190afe7b9
runtime.panicmem(...)
/usr/lib/golang/src/runtime/panic.go:82
runtime.sigpanic()
/usr/lib/golang/src/runtime/signal_unix.go:390 +0x415 fp=0xc0008e9760 sp=0xc0008e9730 pc=0x561190b14315
github.com/containers/libpod/libpod/image.(*Runtime).GetImages(0xc0009184b0, 0x2, 0x2, 0xc00072fb80, 0x3e, 0xc0008e9c40)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/libpod/image/image.go:443 +0x43 fp=0xc0008e9bc0 sp=0xc0008e9760 pc=0x5611916de493
github.com/containers/libpod/pkg/adapter.(*LocalRuntime).GetImages(0xc000872e50, 0x0, 0x0, 0x0, 0x0, 0x561191aacdd9)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/pkg/adapter/runtime.go:74 +0x4d fp=0xc0008e9c50 sp=0xc0008e9bc0 pc=0x561191990d2d
main.imagesCmd(0x56119328dec0, 0x0, 0x0)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/cmd/podman/images.go:173 +0x2cf fp=0xc0008e9d70 sp=0xc0008e9c50 pc=0x561191a4f59f
main.glob..func50(0x561193220280, 0x5611932ac908, 0x0, 0x0, 0x0, 0x0)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/cmd/podman/images.go:100 +0x88 fp=0xc0008e9d98 sp=0xc0008e9d70 pc=0x561191a94fa8
github.com/containers/libpod/vendor/github.com/spf13/cobra.(*Command).execute(0x561193220280, 0xc00000e090, 0x0, 0x0, 0x561193220280, 0xc00000e090)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/vendor/github.com/spf13/cobra/command.go:762 +0x467 fp=0xc0008e9e80 sp=0xc0008e9d98 pc=0x561190ca7ed7
github.com/containers/libpod/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x561193229a80, 0xc0000becc0, 0x7ffdbd950031, 0x6)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/vendor/github.com/spf13/cobra/command.go:852 +0x2ee fp=0xc0008e9f50 sp=0xc0008e9e80 pc=0x561190ca897e
github.com/containers/libpod/vendor/github.com/spf13/cobra.(*Command).Execute(...)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/vendor/github.com/spf13/cobra/command.go:800
main.main()
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/cmd/podman/main.go:142 +0x8a fp=0xc0008e9f98 sp=0xc0008e9f50 pc=0x561191a5a58a
runtime.main()
/usr/lib/golang/src/runtime/proc.go:200 +0x214 fp=0xc0008e9fe0 sp=0xc0008e9f98 pc=0x561190b00514
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0008e9fe8 sp=0xc0008e9fe0 pc=0x561190b2c4a1
goroutine 2 [force gc (idle)]:
runtime.gopark(0x56119231d068, 0x561193284410, 0x1410, 0x1)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000074fb0 sp=0xc000074f90 pc=0x561190b00915
runtime.goparkunlock(...)
/usr/lib/golang/src/runtime/proc.go:307
runtime.forcegchelper()
/usr/lib/golang/src/runtime/proc.go:250 +0xbb fp=0xc000074fe0 sp=0xc000074fb0 pc=0x561190b007ab
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000074fe8 sp=0xc000074fe0 pc=0x561190b2c4a1
created by runtime.init.6
/usr/lib/golang/src/runtime/proc.go:239 +0x37
goroutine 3 [GC sweep wait]:
runtime.gopark(0x56119231d068, 0x5611932849c0, 0x140c, 0x1)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc0000757a8 sp=0xc000075788 pc=0x561190b00915
runtime.goparkunlock(...)
/usr/lib/golang/src/runtime/proc.go:307
runtime.bgsweep(0xc00009c000)
/usr/lib/golang/src/runtime/mgcsweep.go:89 +0x138 fp=0xc0000757d8 sp=0xc0000757a8 pc=0x561190af3cc8
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000757e0 sp=0xc0000757d8 pc=0x561190b2c4a1
created by runtime.gcenable
/usr/lib/golang/src/runtime/mgc.go:208 +0x5a
goroutine 4 [finalizer wait]:
runtime.gopark(0x56119231d068, 0x5611932ac868, 0x140f, 0x1)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000075f58 sp=0xc000075f38 pc=0x561190b00915
runtime.goparkunlock(...)
/usr/lib/golang/src/runtime/proc.go:307
runtime.runfinq()
/usr/lib/golang/src/runtime/mfinal.go:175 +0xaa fp=0xc000075fe0 sp=0xc000075f58 pc=0x561190aea80a
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000075fe8 sp=0xc000075fe0 pc=0x561190b2c4a1
created by runtime.createfing
/usr/lib/golang/src/runtime/mfinal.go:156 +0x63
goroutine 5 [timer goroutine (idle)]:
runtime.gopark(0x56119231d068, 0x561193293b20, 0x5611912b1414, 0x1)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000074760 sp=0xc000074740 pc=0x561190b00915
runtime.goparkunlock(...)
/usr/lib/golang/src/runtime/proc.go:307
runtime.timerproc(0x561193293b20)
/usr/lib/golang/src/runtime/time.go:303 +0x277 fp=0xc0000747d8 sp=0xc000074760 pc=0x561190b1d7e7
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000747e0 sp=0xc0000747d8 pc=0x561190b2c4a1
created by runtime.(*timersBucket).addtimerLocked
/usr/lib/golang/src/runtime/time.go:169 +0x110
goroutine 6 [syscall]:
runtime.notetsleepg(0x5611932acf60, 0xffffffffffffffff, 0x0)
/usr/lib/golang/src/runtime/lock_futex.go:227 +0x38 fp=0xc000076798 sp=0xc000076768 pc=0x561190adcf88
os/signal.signal_recv(0x0)
/usr/lib/golang/src/runtime/sigqueue.go:139 +0x9e fp=0xc0000767c0 sp=0xc000076798 pc=0x561190b1511e
os/signal.loop()
/usr/lib/golang/src/os/signal/signal_unix.go:23 +0x24 fp=0xc0000767e0 sp=0xc0000767c0 pc=0x561191062e24
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000767e8 sp=0xc0000767e0 pc=0x561190b2c4a1
created by os/signal.init.0
/usr/lib/golang/src/os/signal/signal_unix.go:29 +0x43
goroutine 7 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000451ac0, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000076f60 sp=0xc000076f40 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc000054000)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc000076fd8 sp=0xc000076f60 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000076fe0 sp=0xc000076fd8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 8 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000451ad0, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000077760 sp=0xc000077740 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc000056500)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc0000777d8 sp=0xc000077760 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000777e0 sp=0xc0000777d8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 9 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000451ae0, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000077f60 sp=0xc000077f40 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc000058a00)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc000077fd8 sp=0xc000077f60 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000077fe0 sp=0xc000077fd8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 10 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000451af0, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000070760 sp=0xc000070740 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc00005af00)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc0000707d8 sp=0xc000070760 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000707e0 sp=0xc0000707d8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 18 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000451b00, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc0004f8760 sp=0xc0004f8740 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc00005d400)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc0004f87d8 sp=0xc0004f8760 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0004f87e0 sp=0xc0004f87d8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 11 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000451b10, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000070f60 sp=0xc000070f40 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc00005f900)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc000070fd8 sp=0xc000070f60 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000070fe0 sp=0xc000070fd8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 12 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000504000, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000071760 sp=0xc000071740 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc000062000)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc0000717d8 sp=0xc000071760 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000717e0 sp=0xc0000717d8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 13 [GC worker (idle)]:
runtime.gopark(0x56119231cf00, 0xc000504010, 0x1417, 0x0)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc000071f60 sp=0xc000071f40 pc=0x561190b00915
runtime.gcBgMarkWorker(0xc000064500)
/usr/lib/golang/src/runtime/mgc.go:1836 +0x105 fp=0xc000071fd8 sp=0xc000071f60 pc=0x561190aee325
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000071fe0 sp=0xc000071fd8 pc=0x561190b2c4a1
created by runtime.gcBgMarkStartWorkers
/usr/lib/golang/src/runtime/mgc.go:1784 +0x79
goroutine 14 [chan receive]:
runtime.gopark(0x56119231d068, 0xc000378058, 0x170d, 0x3)
/usr/lib/golang/src/runtime/proc.go:301 +0xf5 fp=0xc0004fa6d0 sp=0xc0004fa6b0 pc=0x561190b00915
runtime.goparkunlock(...)
/usr/lib/golang/src/runtime/proc.go:307
runtime.chanrecv(0xc000378000, 0xc0004fa7b0, 0xc0002e2001, 0xc000378000)
/usr/lib/golang/src/runtime/chan.go:524 +0x2ee fp=0xc0004fa760 sp=0xc0004fa6d0 pc=0x561190ad806e
runtime.chanrecv2(0xc000378000, 0xc0004fa7b0, 0x0)
/usr/lib/golang/src/runtime/chan.go:411 +0x2b fp=0xc0004fa790 sp=0xc0004fa760 pc=0x561190ad7d6b
github.com/containers/libpod/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x561193285bc0)
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/vendor/github.com/golang/glog/glog.go:882 +0x8d fp=0xc0004fa7d8 sp=0xc0004fa790 pc=0x5611913f056d
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0004fa7e0 sp=0xc0004fa7d8 pc=0x561190b2c4a1
created by github.com/containers/libpod/vendor/github.com/golang/glog.init.0
/builddir/build/BUILD/libpod-7210727e205c333af9a2d0ed0bb66adcf92a6369/_build/src/github.com/containers/libpod/vendor/github.com/golang/glog/glog.go:410 +0x274
goroutine 34 [syscall]:
runtime.notetsleepg(0x561193293c40, 0x6fc234bfe, 0x0)
/usr/lib/golang/src/runtime/lock_futex.go:227 +0x38 fp=0xc0004f5f60 sp=0xc0004f5f30 pc=0x561190adcf88
runtime.timerproc(0x561193293c20)
/usr/lib/golang/src/runtime/time.go:311 +0x2ee fp=0xc0004f5fd8 sp=0xc0004f5f60 pc=0x561190b1d85e
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0004f5fe0 sp=0xc0004f5fd8 pc=0x561190b2c4a1
created by runtime.(*timersBucket).addtimerLocked
/usr/lib/golang/src/runtime/time.go:169 +0x110
[1] 18721 abort (core dumped) podman images
Only after deleting the bolt_state.db does podman become usable again.
$ rm /home/deefin/.local/share/containers/storage/libpod/bolt_state.db
I then have to force-remove the image:
$ buildah rmi a6beb02e9bd8
Could not remove image "a6beb02e9bd8" (must force) - container "26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6" is using its reference image: image is in use by a container
ERRO[0000] exit status 1
$ buildah rmi -f a6beb02e9bd8
a6beb02e9bd8ca0da34a034e51d28a36762fb2df29017718cfb113febb0b2600
Describe the results you expected:
Is this what's expected when a container freezes?
Is there a standard or better way to handle this?
Output of podman version:
$ podman version
Version: 1.3.1
RemoteAPI Version: 1
Go Version: go1.12.2
OS/Arch: linux/amd64
Output of podman info --debug:
$ podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.2
  podman version: 1.3.1
host:
  BuildahVersion: 1.8.2
  Conmon:
    package: podman-1.3.1-1.git7210727.fc30.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.12.0-dev, commit: c9a4c48d1bff85033b7fc9b62d25961dd5048689'
  Distribution:
    distribution: fedora
    version: "30"
  MemFree: 3593756672
  MemTotal: 16670965760
  OCIRuntime:
    package: runc-1.0.0-93.dev.gitb9b6cc6.fc30.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: e3b4c1108f7d1bf0d09ab612ea09927d9b59b4e3
      spec: 1.0.1-dev
  SwapFree: 8405381120
  SwapTotal: 8405381120
  arch: amd64
  cpus: 8
  hostname: deefin
  kernel: 5.1.6-300.fc30.x86_64
  os: linux
  rootless: true
  uptime: 3h 45m 17s (Approximately 0.12 days)
registries:
  blocked: null
  insecure:
  - registry.local:5000
  search:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/deefin/.config/containers/storage.conf
  ContainerStore:
    number: 6
  GraphDriverName: overlay
  GraphOptions:
  - overlay.mount_program=/usr/bin/fuse-overlayfs
  GraphRoot: /home/deefin/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 2
  RunRoot: /tmp/1000
  VolumePath: /home/deefin/.local/share/containers/storage/volumes
Additional environment details (AWS, VirtualBox, physical, etc.):
$ cat /etc/*-release
Fedora release 30 (Thirty)
NAME=Fedora
VERSION="30 (Workstation Edition)"
ID=fedora
VERSION_ID=30
VERSION_CODENAME=""
PLATFORM_ID="platform:f30"
PRETTY_NAME="Fedora 30 (Workstation Edition)"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:30"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f30/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=30
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=30
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation
Fedora release 30 (Thirty)
Fedora release 30 (Thirty)
@mheon PTAL
Update: I can only reproduce this behaviour when using go get with a go.mod. Outside the project dir, I can go get github.com/go-sql-driver/mysql without issues...
[myuser@app-c1 app]$ go version
go version go1.12.5 linux/amd64
There seem to be two issues here.
The first is a stalled go get in a container, which could be a bug.
The second is kill -9 leaving artifacts around. That one's a lot easier to answer. We recommend that you never manually SIGKILL a container. If you have a nonresponsive container, podman kill and podman stop should be used instead; when the container exits, the Podman process will detect this and exit. If you're hitting Podman itself with a -9, you're also not killing the container, just the frontend; the container is daemonized with Conmon monitoring it.
That said, it seems like your container has managed to survive SIGKILL from podman stop and is, as near as Podman can tell, still running. Since that shouldn't be possible under normal circumstances, it's probably zombified, stuck in uninterruptible I/O sleep. That's definitely not good, but I don't think it's Podman's fault - more likely whatever is running in the container.
The segfault on podman images looks like a real, serious bug. I'll poke around.
(Also, if you podman stop a container and it's still running, kill -9 isn't going to help - we've already done that as part of podman stop)
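As a hedged aside, one way to check from the host whether a process really is zombified in this sense is to look for tasks in uninterruptible sleep (ps state "D") - the one state that survives SIGKILL. This sketch assumes a procps-style ps:

```shell
#!/bin/sh
# Sketch: list processes stuck in uninterruptible sleep (state "D").
# SIGKILL cannot remove these; a go get hung inside a wedged
# fuse-overlayfs mount would typically show up here.
ps -eo pid,stat,comm | awk 'NR > 1 && $2 ~ /^D/ {print $1, $3}'
```

On a healthy system this normally prints nothing; any pids it does print will need the underlying I/O to complete (or a reboot) to go away.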
The images segfault is probably a nil pointer in the image runtime's store - which can potentially happen if we're running rootless, but probably shouldn't.
Hm. @giuseppe he's manually killing conmon - aren't we using conmon to hold open the namespaces for rootless? What happens if that gets a SIGKILL?
Hm. @giuseppe he's manually killing conmon - aren't we using conmon to hold open the namespaces for rootless? What happens if that gets a SIGKILL?
that should not be an issue anymore with the pause process.
The issue seems related to fuse-overlayfs; could you try the latest version from bodhi? It fixes exactly such an issue, where flock(2) would hang the fuse-overlayfs process and the container: https://bodhi.fedoraproject.org/updates/FEDORA-2019-fff1ded16e
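To illustrate the failure mode described here (this is not the actual fuse-overlayfs code, just a minimal sketch using util-linux flock), a second flock(2) caller blocks behind a held lock; without the -w timeout used below it would hang indefinitely, which is roughly what froze the container:

```shell
#!/bin/sh
# Sketch: flock(2) contention. A background subshell holds an exclusive
# lock on a temp file for 2 seconds; a second flock with a 1-second
# timeout fails to acquire it. Drop "-w 1" and the second call blocks
# until the holder exits - the indefinite-hang behaviour.
lockfile=$(mktemp)
( flock -x 9; sleep 2 ) 9>"$lockfile" &
sleep 0.2   # give the holder time to take the lock
if flock -w 1 -x "$lockfile" true; then
    echo "acquired"
else
    echo "timed out"   # expected: the holder still owns the lock
fi
wait
rm -f "$lockfile"
```

The fixed fuse-overlayfs no longer holds the lock in a way that blocks the container's own file operations.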
Thanks for the response @mheon
(Also, if you podman stop a container and it's still running, kill -9 isn't going to help - we've already done that as part of podman stop)
I disagree, podman kill fails to kill the running container and this is evident when issuing a podman ps.
Thus far the only solution I have had is to kill -9 from my host user. I'm more than happy to keep debugging the go get side of things. I'll have a look through podman code and see if I can get it to spew anywhere... If you have any recommendations on where to start, please feel free to advise :)
images segfault probably a nil pointer in the image runtime's store - which can potentially happen if we're running rootless, but probably shouldn't.
I can 100% confirm this implementation has been rootless
Hm. @giuseppe he's manually killing conmon - aren't we using conmon to hold open the namespaces for rootless? What happens if that gets a SIGKILL?
After replicating, I can see no change in namespaces pre/post killing conmon. I do get booted from my podman unshare namespace when I kill with -9.
$ lsns | grep 'fuse\|bash'
4026532674 user 12 17308 deefin /usr/bin/fuse-overlayfs -o lowerdir=/home/deefin/.local/share/containers/storage/overlay/l/UL73MVPSNKKGHH4JPF75PQVPK7:/home/deefin/.local/share/containers/storage/overlay/l/KQ4JDPWGAIXVOTFKW67CSRUL6F,upperdir=/home/deefin/.local/share/containers/storage/overlay/99def45a2bab9ae00900acf0ad7bf2022a6a310be2795622c5d74d29979a2165/diff,workdir=/home/deefin/.local/share/containers/storage/overlay/99def45a2bab9ae00900acf0ad7bf2022a6a310be2795622c5d74d29979a2165/work,context="system_u:object_r:container_file_t:s0:c374,c444" /home/deefin/.local/share/containers/storage/overlay/99def45a2bab9ae00900acf0ad7bf2022a6a310be2795622c5d74d29979a2165/merged
4026532685 mnt 2 17308 deefin /usr/bin/fuse-overlayfs -o lowerdir=/home/deefin/.local/share/containers/storage/overlay/l/UL73MVPSNKKGHH4JPF75PQVPK7:/home/deefin/.local/share/containers/storage/overlay/l/KQ4JDPWGAIXVOTFKW67CSRUL6F,upperdir=/home/deefin/.local/share/containers/storage/overlay/99def45a2bab9ae00900acf0ad7bf2022a6a310be2795622c5d74d29979a2165/diff,workdir=/home/deefin/.local/share/containers/storage/overlay/99def45a2bab9ae00900acf0ad7bf2022a6a310be2795622c5d74d29979a2165/work,context="system_u:object_r:container_file_t:s0:c374,c444" /home/deefin/.local/share/containers/storage/overlay/99def45a2bab9ae00900acf0ad7bf2022a6a310be2795622c5d74d29979a2165/merged
4026532686 mnt 10 17322 100999 /bin/bash
4026532687 uts 10 17322 100999 /bin/bash
4026532688 ipc 10 17322 100999 /bin/bash
4026532689 pid 10 17322 100999 /bin/bash
4026532691 net 10 17322 100999 /bin/bash
In reference to my previous output below:
$ podman stats app-c
Error: unable to load cgroup at /libpod_parent/libpod-26c70b225c7728ae16fba9e764bdf98e4633829653183e22e93b875473f917e6: cgroups: cgroup deleted
This might be related? I get it when debugging podman run:
WARN[0000] Failed to add conmon to cgroupfs sandbox cgroup: mkdir /sys/fs/cgroup/systemd/libpod_parent: permission denied
Hm. @giuseppe he's manually killing conmon - aren't we using conmon to hold open the namespaces for rootless? What happens if that gets a SIGKILL?

that should not be an issue anymore with the pause process.
The issue seems related to fuse-overlayfs; could you try the latest version from bodhi? It fixes exactly such an issue, where flock(2) would hang the fuse-overlayfs process and the container: https://bodhi.fedoraproject.org/updates/FEDORA-2019-fff1ded16e
this resolved the go get issue :)
Is that WARN anything to worry about then?
Thanks for the response @mheon
(Also, if you podman stop a container and it's still running, kill -9 isn't going to help - we've already done that as part of podman stop)

I disagree, podman kill fails to kill the running container and this is evident when issuing a podman ps. Thus far the only solution I have had is to kill -9 from my host user. I'm more than happy to keep debugging the go get side of things. I'll have a look through podman code and see if I can get it to spew anywhere... If you have any recommendations on where to start, please feel free to advise :)
Again, that won't help - we've tried that already as part of podman stop. You're killing Podman's command line process and conmon, our container monitor process, but the container itself is still running - if you grep through ps I bet you'll find a go get still running, zombified.
You can kill the Podman frontend without consequence - it's not actually frozen, just attached to a frozen container. You should be able to detach from it (default keys Control-p Control-q) or hit it with a SIGINT and it will close.
Killing conmon is more problematic. It's monitoring the container, and we use it to determine what the container's status is. The container itself is frozen here, so conmon is sitting there doing nothing, waiting for it to exit. I'd strongly recommend that you leave conmon around in cases like this - it'll keep Podman running smoothly, though you'll probably need a reboot to get rid of the zombie processes.
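Based on that advice, a hedged cleanup sketch for this situation - kill only the podman frontend processes by name and leave conmon alone - might look like the following (pidof matches by process name, so the two binaries are easy to separate):

```shell
#!/bin/sh
# Sketch: remove only the stuck podman frontend, never conmon.
for p in $(pidof podman); do
    kill -9 "$p"            # frontend only; detaching is preferable if it works
done
pidof conmon || true        # these pids should be left running
```

This keeps conmon monitoring the frozen container, so Podman's view of container state stays consistent.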
@mheon Thanks for the reply.
I was never able to drop out of the container with SIGINT (ctrl-c); I always had to open a new terminal and kill it that way.
Killing conmon is more problematic. It's monitoring the container, and we use it to determine what the container's status is. The container itself is frozen here, so conmon is sitting there doing nothing, waiting for it to exit. I'd strongly recommend that you leave conmon around in cases like this - it'll keep Podman running smoothly, though you'll probably need a reboot to get rid of the zombie processes.
So is there a recommendation on how to handle the following conditions:
podman X - seems like the only thing to do (besides rebooting) is to kill all the podman procs and leave conmon alone?
Is that WARN anything to worry about then?
no, that is expected as rootless containers cannot (yet) use cgroups.
This is a fuse-overlayfs issue; I think we can close it, as it has already been addressed and new packages are on their way.
Please update and test against the latest fuse-overlayfs.
Is that WARN anything to worry about then?

no, that is expected as rootless containers cannot (yet) use cgroups.
This is a fuse-overlayfs issue; I think we can close it, as it has already been addressed and new packages are on their way.
Yes, the latest fuse-overlayfs pkg resolved the go get hang.
Still unsure how to handle that container state, though - or is handling that state no longer relevant now that this is resolved?
Ideally this happens only very rarely - processes should seldom get themselves stuck such that SIGKILL doesn't get rid of them. In those cases, I think that manually removing Podman processes and leaving Conmon around is the right call.
Thank you all @mheon @rhatdan @giuseppe for your time. Much appreciated.