/kind bug
Description
Shortly after starting any new container with podman run ... I notice a big spike in hard drive activity that doesn't stop even after several minutes.
The moment I exit the container, disk usage goes back to normal.
Steps to reproduce the issue:
podman run -it alpine /bin/sh
ls -la /
Describe the results you received:
A lot of abnormal disk usage (LED is constantly blinking).
Additional information you deem important (e.g. issue happens only occasionally):
However, if I exec a shell into a detached container, nothing strange happens. This works fine:
➜ $ ~ podman run --detach alpine sh -c 'while true ; do sleep 100000 ; done'
➜ $ ~ podman exec -it whatever_container /bin/sh
I've recently changed the storage driver to overlay and installed fuse-overlayfs on my system. Moving back to vfs makes no difference.
Output of podman version:
Version: 1.8.0
RemoteAPI Version: 1
Go Version: go1.14
Git Commit: 2ced9094d4728dd09f60a177faa32339a8d0f721
Built: Sat Feb 29 14:20:42 2020
OS/Arch: linux/amd64
Output of podman info --debug:
debug:
compiler: gc
git commit: 2ced9094d4728dd09f60a177faa32339a8d0f721
go version: go1.14
podman version: 1.8.0
host:
BuildahVersion: 1.13.1
CgroupVersion: v1
Conmon:
package: Unknown
path: /usr/bin/conmon
version: 'conmon version 2.0.11, commit: ff9d97a08d7a4b58267ac03719786e4e7258cecf'
Distribution:
distribution: arch
version: unknown
IDMappings:
gidmap:
- container_id: 0
host_id: 1000
size: 1
- container_id: 1
host_id: 100000
size: 65536
uidmap:
- container_id: 0
host_id: 1000
size: 1
- container_id: 1
host_id: 100000
size: 65536
MemFree: 3655389184
MemTotal: 8312897536
OCIRuntime:
name: runc
package: Unknown
path: /usr/bin/runc
version: |-
runc version 1.0.0-rc10
commit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
spec: 1.0.1-dev
SwapFree: 4294426624
SwapTotal: 4294963200
arch: amd64
cpus: 4
eventlogger: journald
hostname: myhost
kernel: 5.5.8-arch1-1
os: linux
rootless: true
slirp4netns:
Executable: /usr/bin/slirp4netns
Package: Unknown
Version: |-
slirp4netns version 0.4.3
commit: 2244b9b6461afeccad1678fac3d6e478c28b4ad6
uptime: 7h 42m 0.54s (Approximately 0.29 days)
registries:
search:
- docker.io
- registry.fedoraproject.org
- quay.io
- registry.access.redhat.com
- registry.centos.org
store:
ConfigFile: /home/myuser/.config/containers/storage.conf
ContainerStore:
number: 0
GraphDriverName: overlay
GraphOptions:
overlay.mount_program:
Executable: /usr/bin/fuse-overlayfs
Package: Unknown
Version: |-
fusermount3 version: 3.9.0
fuse-overlayfs: version 0.7.7
FUSE library version 3.9.0
using FUSE kernel interface version 7.31
GraphRoot: /home/myuser/.local/share/containers/storage
GraphStatus:
Backing Filesystem: extfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "false"
ImageStore:
number: 1
RunRoot: /run/user/1000/containers
VolumePath: /home/myuser/.local/share/containers/storage/volumes
Configuration file (~/.config/containers/storage.conf):
I've tried two different configurations but neither works.
Config 1
[storage]
driver = "overlay"
runroot = "/run/user/1000/containers"
graphroot = "/home/myuser/.local/share/containers/storage"
[storage.options]
additionalimagestores = ["/home/myuser/.local/share/containers2/storage"]
size = ""
remap-uids = ""
remap-gids = ""
ignore_chown_errors = ""
remap-user = ""
remap-group = ""
skip_mount_home = ""
mount_program = "/usr/bin/fuse-overlayfs"
mountopt = ""
[storage.options.aufs]
mountopt = ""
[storage.options.btrfs]
min_space = ""
size = ""
[storage.options.thinpool]
autoextend_percent = ""
autoextend_threshold = ""
basesize = ""
blocksize = ""
directlvm_device = ""
directlvm_device_force = ""
fs = ""
log_level = ""
min_free_space = ""
mkfsarg = ""
mountopt = ""
size = ""
use_deferred_deletion = ""
use_deferred_removal = ""
xfs_nospace_max_retries = ""
[storage.options.overlay]
ignore_chown_errors = ""
mountopt = ""
mount_program = ""
size = ""
skip_mount_home = ""
[storage.options.vfs]
ignore_chown_errors = ""
[storage.options.zfs]
mountopt = ""
fsname = ""
size = ""
Config 2
[storage]
driver = "vfs"
runroot = "/run/user/1000/containers"
graphroot = "/home/mysuer/.local/share/containers/storage"
[storage.options]
size = ""
remap-uids = ""
remap-gids = ""
ignore_chown_errors = ""
remap-user = ""
remap-group = ""
skip_mount_home = ""
mount_program = ""
mountopt = ""
[storage.options.aufs]
mountopt = ""
[storage.options.btrfs]
min_space = ""
size = ""
[storage.options.thinpool]
autoextend_percent = ""
autoextend_threshold = ""
basesize = ""
blocksize = ""
directlvm_device = ""
directlvm_device_force = ""
fs = ""
log_level = ""
min_free_space = ""
mkfsarg = ""
mountopt = ""
size = ""
use_deferred_deletion = ""
use_deferred_removal = ""
xfs_nospace_max_retries = ""
[storage.options.overlay]
ignore_chown_errors = ""
mountopt = ""
mount_program = ""
size = ""
skip_mount_home = ""
[storage.options.vfs]
ignore_chown_errors = ""
[storage.options.zfs]
mountopt = ""
fsname = ""
size = ""
Package info (e.g. output of rpm -q podman or apt list podman):
➜ $ ~ pacman -Qi podman
Name : podman
Version : 1.8.0-2
Description : Tool and library for running OCI-based containers in pods
Architecture : x86_64
URL : https://github.com/containers/libpod
Licenses : Apache
Groups : None
Provides : None
Depends On : cni-plugins conmon device-mapper iptables libseccomp runc skopeo
btrfs-progs slirp4netns libsystemd
Optional Deps : podman-docker: for Docker-compatible CLI
Required By : None
Optional For : None
Conflicts With : None
Replaces : None
Installed Size : 99.67 MiB
Packager : Morten Linderud <[email protected]>
Build Date : Sat 29 Feb 2020 14:20:42
Install Date : Tue 03 Mar 2020 21:05:01
Install Reason : Explicitly installed
Install Script : No
Validated By : Signature
Additional environment details (AWS, VirtualBox, physical, etc.):
Physical host.
Today I've received a Podman update on Arch Linux:
Version: 1.8.1
RemoteAPI Version: 1
Go Version: go1.14
Git Commit: 444a19cdd2e6108c75f6c1aadc1a2a9138a8bd73
Built: Wed Mar 11 22:49:18 2020
OS/Arch: linux/amd64
But the issue is still present.
Do you have any idea what's being written? Can you use iotop to see what's generating the I/O?
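For example, something like this (just a suggestion; -o shows only processes actually doing I/O, and -P aggregates threads into processes):
$ sudo iotop -o -P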
Well, iotop doesn't show any extra information besides the command and the I/O metrics.
Command: podman run -it /bin/sh
IO: between 60-70 %
This time I could get a normal run without issues, but on the next ones it happened again.
So it is Podman itself, and not Conmon.
@baude Any thoughts here?
Oh - @miquecg Can you run a podman run command as root and see if that triggers it? It would be useful to know if this is rootless-only.
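For example, just a sketch reusing the alpine image from your reproducer (note that rootful podman keeps its own storage, by default under /var/lib/containers, so the image will be pulled again):
$ sudo podman run --rm -it alpine /bin/sh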
It happens also when running podman as root.
Filesystems aren't my bag... @giuseppe, any ideas?
I am not able to reproduce locally.
@miquecg could you strace the podman process?
Once you are able to reproduce the issue, please attach strace to it with: strace -f -p $PID_PODMAN
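For example, assuming there is only one podman process running (pgrep -n simply picks the newest one):
$ PID_PODMAN=$(pgrep -n podman)
$ strace -f -o /tmp/strace.dump -p "$PID_PODMAN"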
Command:
podman run -it /bin/sh
IO: between 60-70 %
I measured I/O again, this time showing only processes, not threads. The numbers are one order of magnitude smaller (3-6 %).
I left strace running for less than 10 seconds and the dump is huge, 15 MB! I'm attaching a trimmed down version of it:
Update: Spotify on my machine does a lot of I/O on cold start, but it doesn't even reach 3 %.
Thanks, I don't see anything strange in the strace log.
Have you attached the first part of it?
Could you attach the part that is generated during the I/O spike?
Once this strange behaviour begins, I/O activity is constantly at 3-6 %.
@giuseppe Have you had time to check the full log?
These days I don't have access to the machine where this is happening so I cannot check again with newer Podman versions. Sorry.
I'm running podman 1.8.2 on Manjaro, but I don't see any meaningful I/O. iotop doesn't show much for podman. @miquecg, can you provide a copy of your iotop output for clarity?
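If it helps, a batch capture is easier to attach to the issue than a screenshot; something like this (hypothetical file name, run while the spike is happening):
$ sudo iotop -b -o -P -d 1 -n 30 > iotop.log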
I took a look at the strace. It would be great if you could get a new strace with more details - use the following flags: strace -fvttTyyx -s 4096 -o /tmp/strace.txt <command>, or with a PID: strace -fvttTyyx -s 4096 -o /tmp/strace.txt -p PID
The first thing to observe is perhaps which files we're opening - we can't produce I/O without opening files and then reading from or writing to them.
~2000 instances of open in 200,000 syscalls:
libpod$ grep -e 'open(' -e 'openat(' -c strace.dump
2048
libpod$ wc -l strace.dump
200816 strace.dump
Again, we don't have timing flags, which would help us analyze how long any syscall might be taking, but we can at least see which files are being opened - and whether any of them might be on a physical disk versus a nodev device like /proc.
libpod$ grep -e 'open(' -e 'openat(' strace.dump | cut -d " " -f 3- | awk '{print $2}' | sort | uniq -c
128 "/etc/ld.so.cache",
256 "/home/myuser/.local/share/containers/storage/libpod/bolt_state.db",
256 "/proc/52785/stat",
128 "/proc/cpuinfo",
128 "/proc/self/mountinfo",
256 "/proc/self/uid_map",
128 "/proc/sys/kernel/cap_last_cap",
128 "/run/user/1000/runc/66bf94e5ff853ca511e9e1fd1759e5a453e46501cbedb867eb6c5f795dbb149c/state.json",
128 "/sys/kernel/mm/hugepages",
128 "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size",
128 "/usr/lib/libc.so.6",
128 "/usr/lib/libpthread.so.0",
128 "/usr/lib/libseccomp.so.2",
So, on physical disks we have three files in /usr/lib, /etc/ld.so.cache, and /home/myuser/.local/share/containers/storage/libpod/bolt_state.db. Again, we don't know whether opening or reading these files is taking a long time.
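One way to double-check which filesystem each of those paths actually lives on is findmnt, which resolves the containing mount, e.g.:
libpod$ findmnt -T /home/myuser/.local/share/containers/storage/libpod/bolt_state.db
libpod$ findmnt -T /usr/lib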
From man iotop:
iotop displays columns for the I/O bandwidth read and written by each process/thread during the sampling period. It also displays the percentage of time the thread/process spent while swapping in and while waiting on I/O. For each process, its I/O priority (class/level) is shown.
So, I believe the "%" collumn of iotop is similar to sar's IOwait, which is time the process spent waiting on IO _that it wasn't doing something else_. In other words, having high IO might partially mean that the process isn't doing much compute - so instead of idle, it's "blocked" on io....
For reference - was the machine you were running this on using some slow disk technology? I've got an NVMe drive in my machine, so I/O calls are generally very quick.
@tylarb Thanks for taking a look at this. I'll let you know whenever I have access to the machine where I had the issue.
The hard drive is SATA with LVM and several partitions, but I cannot provide more info without looking at it.
It's probably what you suggest: some system configuration or hardware condition causing the issue.
I've got the same problem on Gentoo, on a SATA HDD with LVM+LUKS+Ext4 (libpod-1.8.0).
I can provide any info you need.
@tylarb
Sorry if I misunderstood your tips.
I used strace ... -p PID, where PID is podman's PID on the host system:
$ grep -e 'open(' -e 'openat(' -c strace1.txt
13393
$ wc -l strace1.txt
1013744 strace1.txt
$ grep -e 'open(' -e 'openat(' strace1.txt | cut -d " " -f 3- | awk '{print $2}' | grep AT_FDCWD | wc -l
4378
4378 of those openat() calls use AT_FDCWD.
Some more fragments from the last grep:
openat(AT_FDCWD,
openat(AT_FDCWD,
openat(AT_FDCWD,
openat(AT_FDCWD,
"/etc/ld.so.preload",
"/etc/ld.so.cache",
"/lib64/libpthread.so.0",
"/usr/lib64/libseccomp.so.2",
"/lib64/libc.so.6",
"/sys/kernel/mm/transparent_hugepage/hpage_pmd_size",
"/proc/sys/kernel/cap_last_cap",
"/proc/cpuinfo",
"/proc/self/mountinfo",
"/sys/kernel/mm/hugepages",
"/proc/self/uid_map",
"/proc/self/uid_map",
"/run/user/1000/runc/a635f25315ef3ddfcbd5791e92e6abb0ef8f591c80186d265c48b967b92e7872/state.json",
"/proc/6328/stat",
"/proc/6328/stat",
@reagentoo let's get strace with timings strace -fvttTyyx -s 4096 -o /tmp/strace.txt -p PID
This way, we can see whether the open, read, or write is taking a long time, indicating the syscall itself is slow, which would mean the problem is underneath Podman - at the OS level.
If you provide the full strace file I can take a look, and we can check whether we're reading more bytes than expected or than seems reasonable.
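As a rough sketch, once the trace has -T durations (each line ends in a <seconds> field), the slowest calls can be pulled out with something like:
$ awk -F'<' 'NF>1 {print $NF+0, $0}' /tmp/strace.txt | sort -rn | head -n 20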
@giuseppe FYI
I've got the same problem on Gentoo, on a SATA HDD with LVM+LUKS+Ext4 (libpod-1.8.0).
I can provide any info you need.
Exactly same setup.
IIRC, I don't have this problem when running podman as root. The main difference could be the encryption of the /home partition, but I'd need to double-check it.
Sorry for not being really helpful here, but due to the quarantine everybody must stay home and I cannot access that computer.
@miquecg
I didn't try running it as root. But now I've updated libpod from 1.8.0 to 1.8.2, and so far I don't observe this issue anymore.
Since you confirm the problem doesn't happen anymore, let's close the issue. Thanks for testing it.