Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
Sometimes, podman logs crashes saying it has run out of memory. It looks like it tries to load the whole log file into RAM and crashes when it doesn't fit.
Steps to reproduce the issue:
# podman logs -f --tail=10 some-container
Describe the results you received:
fatal error: runtime: out of memory
runtime stack:
runtime.throw(0x1184bbe, 0x16)
/usr/lib/go-1.10/src/runtime/panic.go:616 +0x60
runtime.sysMap(0x1f990000, 0x17b10000, 0xbee34401, 0x1e21328)
/usr/lib/go-1.10/src/runtime/mem_linux.go:227 +0x11c
runtime.(*mheap).sysAlloc(0x1e13520, 0x17b10000, 0x2)
/usr/lib/go-1.10/src/runtime/malloc.go:470 +0x23c
runtime.(*mheap).grow(0x1e13520, 0xbd88, 0x0)
/usr/lib/go-1.10/src/runtime/mheap.go:907 +0x48
runtime.(*mheap).allocSpanLocked(0x1e13520, 0xbd85, 0x1e21338, 0xfffffade)
/usr/lib/go-1.10/src/runtime/mheap.go:820 +0x30c
runtime.(*mheap).alloc_m(0x1e13520, 0xbd85, 0x101, 0x1)
/usr/lib/go-1.10/src/runtime/mheap.go:686 +0x118
runtime.(*mheap).alloc.func1()
/usr/lib/go-1.10/src/runtime/mheap.go:753 +0x3c
runtime.(*mheap).alloc(0x1e13520, 0xbd85, 0x13010101, 0x0)
/usr/lib/go-1.10/src/runtime/mheap.go:752 +0x60
runtime.largeAlloc(0x17b081e0, 0x13740101, 0x137000e0)
/usr/lib/go-1.10/src/runtime/malloc.go:826 +0x74
runtime.mallocgc.func1()
/usr/lib/go-1.10/src/runtime/malloc.go:721 +0x38
runtime.systemstack(0x0)
/usr/lib/go-1.10/src/runtime/asm_arm.s:349 +0x80
runtime.mstart()
/usr/lib/go-1.10/src/runtime/proc.go:1175
goroutine 1 [running]:
runtime.systemstack_switch()
/usr/lib/go-1.10/src/runtime/asm_arm.s:294 +0x4 fp=0x13ba7af4 sp=0x13ba7af0 pc=0x6dac0
runtime.mallocgc(0x17b081e0, 0xf86268, 0x13ba7b01, 0xbd83ff0)
/usr/lib/go-1.10/src/runtime/malloc.go:720 +0x89c fp=0x13ba7b54 sp=0x13ba7af4 pc=0x287a8
runtime.makeslice(0xf86268, 0x17b081e0, 0x17b081e0, 0x13bc6000, 0xbd83ff0, 0xbd83ff0)
/usr/lib/go-1.10/src/runtime/slice.go:61 +0x68 fp=0x13ba7b68 sp=0x13ba7b54 pc=0x5b828
bytes.makeSlice(0x17b081e0, 0x0, 0x0, 0x0)
/usr/lib/go-1.10/src/bytes/buffer.go:230 +0x58 fp=0x13ba7b84 sp=0x13ba7b68 pc=0xe4bac
bytes.(*Buffer).grow(0x13a936e0, 0x200, 0xbd83ff0)
/usr/lib/go-1.10/src/bytes/buffer.go:144 +0x12c fp=0x13ba7ba8 sp=0x13ba7b84 pc=0xe457c
bytes.(*Buffer).ReadFrom(0x13a936e0, 0x1343580, 0x13917c70, 0x0, 0xc, 0x1a1f50, 0x13ba7c10)
/usr/lib/go-1.10/src/bytes/buffer.go:204 +0x4c fp=0x13ba7be4 sp=0x13ba7ba8 pc=0xe49d0
io/ioutil.readAll(0x1343580, 0x13917c70, 0xbd83ff0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/lib/go-1.10/src/io/ioutil/ioutil.go:36 +0x9c fp=0x13ba7c08 sp=0x13ba7be4 pc=0x1a1de0
io/ioutil.ReadFile(0x13a8ac00, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/lib/go-1.10/src/io/ioutil/ioutil.go:73 +0xb8 fp=0x13ba7c34 sp=0x13ba7c08 pc=0x1a1f9c
github.com/containers/libpod/libpod/logs.getTailLog(0x13a8ac00, 0x80, 0xa, 0x13917c68, 0x8, 0x8, 0x11633f0, 0x0)
src/github.com/containers/libpod/libpod/logs/log.go:80 +0x24 fp=0x13ba7ca4 sp=0x13ba7c34 pc=0xbd1414
github.com/containers/libpod/libpod/logs.GetLogFile(0x13a8ac00, 0x80, 0x13b67830, 0x1e1e8b8, 0x0, 0x13917c68, 0x139961d8, 0xbee34883, 0x2)
src/github.com/containers/libpod/libpod/logs/log.go:59 +0x12c fp=0x13ba7d08 sp=0x13ba7ca4 pc=0xbd1378
github.com/containers/libpod/libpod.(*Container).readFromLogFile(0x13a7fdc0, 0x13b67830, 0x13a7fd80, 0xd91d00, 0x30)
src/github.com/containers/libpod/libpod/container.log.go:32 +0x34 fp=0x13ba7d70 sp=0x13ba7d08 pc=0xc8f684
github.com/containers/libpod/libpod.(*Container).ReadLog(0x13a7fdc0, 0x13b67830, 0x13a7fd80, 0x0, 0x0)
src/github.com/containers/libpod/libpod/container.log.go:28 +0x40 fp=0x13ba7d88 sp=0x13ba7d70 pc=0xc8f5d0
github.com/containers/libpod/libpod.(*Runtime).Log(0x13996180, 0x13917c68, 0x1, 0x2, 0x13b67830, 0x13a7fd80, 0x1, 0x2)
src/github.com/containers/libpod/libpod/container.log.go:14 +0x50 fp=0x13ba7da4 sp=0x13ba7d88 pc=0xc8f558
github.com/containers/libpod/pkg/adapter.(*LocalRuntime).Log(0x13acc530, 0x1e0b410, 0x13b67830, 0x137c7d60, 0x0)
src/github.com/containers/libpod/pkg/adapter/containers.go:315 +0x108 fp=0x13ba7dec sp=0x13ba7da4 pc=0xd91e10
main.logsCmd(0x1e0b410, 0x0, 0x0)
src/github.com/containers/libpod/cmd/podman/logs.go:89 +0x148 fp=0x13ba7e68 sp=0x13ba7dec pc=0xe3f908
main.glob..func21(0x1db34c8, 0x13aee160, 0x1, 0x3, 0x0, 0x0)
src/github.com/containers/libpod/cmd/podman/logs.go:28 +0x68 fp=0x13ba7e78 sp=0x13ba7e68 pc=0xe6e284
github.com/containers/libpod/vendor/github.com/spf13/cobra.(*Command).execute(0x1db34c8, 0x13736040, 0x3, 0x4, 0x1db34c8, 0x13736040)
src/github.com/containers/libpod/vendor/github.com/spf13/cobra/command.go:826 +0x360 fp=0x13ba7eec sp=0x13ba7e78 pc=0x1b7b78
github.com/containers/libpod/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x1db6bc8, 0x0, 0x0, 0x1370f100)
src/github.com/containers/libpod/vendor/github.com/spf13/cobra/command.go:914 +0x208 fp=0x13ba7f8c sp=0x13ba7eec pc=0x1b82a0
github.com/containers/libpod/vendor/github.com/spf13/cobra.(*Command).Execute(0x1db6bc8, 0x6, 0x116cda6)
src/github.com/containers/libpod/vendor/github.com/spf13/cobra/command.go:864 +0x1c fp=0x13ba7fa0 sp=0x13ba7f8c pc=0x1b8078
main.main()
src/github.com/containers/libpod/cmd/podman/main.go:160 +0x50 fp=0x13ba7fbc sp=0x13ba7fa0 pc=0xe40034
runtime.main()
/usr/lib/go-1.10/src/runtime/proc.go:198 +0x204 fp=0x13ba7fe4 sp=0x13ba7fbc pc=0x45880
runtime.goexit()
/usr/lib/go-1.10/src/runtime/asm_arm.s:1015 +0x4 fp=0x13ba7fe4 sp=0x13ba7fe4 pc=0x6fac0
goroutine 39 [chan receive]:
github.com/containers/libpod/vendor/k8s.io/klog.(*loggingT).flushDaemon(0x1e06a90)
src/github.com/containers/libpod/vendor/k8s.io/klog/klog.go:990 +0x70
created by github.com/containers/libpod/vendor/k8s.io/klog.init.0
src/github.com/containers/libpod/vendor/k8s.io/klog/klog.go:404 +0x68
goroutine 37 [syscall]:
os/signal.signal_recv(0x6fac0)
/usr/lib/go-1.10/src/runtime/sigqueue.go:139 +0x130
os/signal.loop()
/usr/lib/go-1.10/src/os/signal/signal_unix.go:22 +0x14
created by os/signal.init.0
/usr/lib/go-1.10/src/os/signal/signal_unix.go:28 +0x30
goroutine 15 [syscall]:
syscall.Syscall6(0xfc, 0x6, 0x1386fd90, 0x7, 0xffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/lib/go-1.10/src/syscall/asm_linux_arm.s:48 +0x8
github.com/containers/libpod/vendor/golang.org/x/sys/unix.EpollWait(0x6, 0x1386fd90, 0x7, 0x7, 0xffffffff, 0x0, 0x0, 0x1)
src/github.com/containers/libpod/vendor/golang.org/x/sys/unix/zsyscall_linux_arm.go:1959 +0x58
github.com/containers/libpod/vendor/github.com/fsnotify/fsnotify.(*fdPoller).wait(0x139e4b60, 0x1238300, 0x139e4b60, 0x0)
src/github.com/containers/libpod/vendor/github.com/fsnotify/fsnotify/inotify_poller.go:86 +0x4c
github.com/containers/libpod/vendor/github.com/fsnotify/fsnotify.(*Watcher).readEvents(0x13b677d0)
src/github.com/containers/libpod/vendor/github.com/fsnotify/fsnotify/inotify.go:192 +0x16c
created by github.com/containers/libpod/vendor/github.com/fsnotify/fsnotify.NewWatcher
src/github.com/containers/libpod/vendor/github.com/fsnotify/fsnotify/inotify.go:59 +0x174
goroutine 16 [select]:
github.com/containers/libpod/vendor/github.com/cri-o/ocicni/pkg/ocicni.(*cniNetworkPlugin).monitorConfDir(0x139b5c70, 0x139e4b90)
src/github.com/containers/libpod/vendor/github.com/cri-o/ocicni/pkg/ocicni/ocicni.go:144 +0x14c
created by github.com/containers/libpod/vendor/github.com/cri-o/ocicni/pkg/ocicni.initCNI
src/github.com/containers/libpod/vendor/github.com/cri-o/ocicni/pkg/ocicni/ocicni.go:236 +0x2c8
Describe the results you expected:
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version:
Version: 1.6.2
RemoteAPI Version: 1
Go Version: go1.10.4
OS/Arch: linux/arm
Output of podman info --debug:
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.4
  podman version: 1.6.2
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v1
  Conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.2, commit: unknown'
  Distribution:
    distribution: debian
    version: "9"
  MemFree: 180224000
  MemTotal: 516104192
  OCIRuntime:
    name: runc
    package: 'cri-o-runc: /usr/lib/cri-o-runc/sbin/runc'
    path: /usr/lib/cri-o-runc/sbin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 110460928
  SwapTotal: 258048000
  arch: arm
  cpus: 4
  eventlogger: journald
  hostname: hubot
  kernel: 4.19.62-sunxi
  os: linux
  rootless: false
  uptime: 41h 54m 53.58s (Approximately 1.71 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 8
  GraphDriverName: overlay
  GraphOptions: {}
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 9
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes
Package info (e.g. output of rpm -q podman or apt list podman):
podman/bionic,now 1.6.2-1~ubuntu18.04~ppa1 armhf [installed]
Additional environment details (AWS, VirtualBox, physical, etc.):
OrangePi Zero
Memory at the time of the occurrence:
# free -hm
total used free shared buff/cache available
Mem: 492M 172M 216M 21M 104M 288M
Swap: 246M 140M 105M
Size of the ctr.log at that same time:
# ls -lahi ctr.log
239654 -rw------- 1 root root 190M Feb 7 18:37 ctr.log
Eek. We may be reading the entire file into memory...
@mheon we did that intentionally ... waiting for, and hoping, this would never be the case. I'll take this one.
@mheon let's discuss this on Monday ... it's a reasonably deep hole we will need to go into to fix this. I am not discounting the bug; this is more about how we want to fix it.
Some thoughts before I lose them ...
The failure here is fortunately not in the tail pkg we use. It is specifically in the function that prints the previous entries of the log. When getting the logs of a container, we print the last N log entries. The default is to print them all, but N can also be provided (as in: show me the last 10 lines).
To do this, we were reading the whole file and then working backwards to find the last N lines. A line is something that ends in \n. A log line must both end in \n AND fit a certain format, or it is NOT considered a log line. Reading the whole file is what tripped the OOM when the file was larger than the available memory.
To fix this, we will need to open the file and seek to the end, then read backwards, finding the beginning and end of each line (between \n's). Each line then needs to be qualified as a log line (has the right format), and, if N is set, the count of found lines incremented.
The additional quirk here is that we currently return this information to the caller as a []logline. Given that we are here to fix low-memory constraints, it is entirely possible that the []logline itself could exceed memory capacity (in the case of a large log), so we would only be moving the memory problem from the read to the return.
There is a log channel where we normally send output. The caller of the method in question usually iterates the returned []logline and sends each entry to output. I think, to do this correctly, the log channel will need to be passed deeper into these methods so that each line is output immediately, keeping memory usage minimal.
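Passing the channel deeper could look roughly like the sketch below; LogLine's fields and streamLines are hypothetical stand-ins for the libpod types, and the real code would read lines from the file rather than a string slice:

```go
package main

import "fmt"

// LogLine is a minimal stand-in for libpod's parsed log entry type.
type LogLine struct {
	Msg string
}

// streamLines sends each qualifying line on out as soon as it is found,
// instead of accumulating a []LogLine, so the caller can write output
// immediately and memory stays bounded regardless of log size.
func streamLines(raw []string, out chan<- LogLine) {
	defer close(out)
	for _, l := range raw {
		if l == "" { // the real code would validate the log format here
			continue
		}
		out <- LogLine{Msg: l}
	}
}

func main() {
	out := make(chan LogLine)
	go streamLines([]string{
		"2020-02-07T18:37:00Z stdout F hello",
		"",
		"2020-02-07T18:37:01Z stdout F world",
	}, out)
	// The consumer prints lines as they arrive; nothing is buffered.
	for ll := range out {
		fmt.Println(ll.Msg)
	}
}
```

The design point is that the producer and consumer overlap: at any moment only the lines sitting in the (unbuffered) channel are held in memory, not the whole result.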
One thing we might want to add is a way to truncate the logs, rather than allowing them to grow forever.