Podman: Rootless containers reset connection

Created on 19 Jul 2020 · 18 Comments · Source: containers/podman

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

The default network mode for rootless containers is unstable. When I run a web server inside a rootless container and issue repeated HTTP requests with curl, I often get curl: (56) Recv failure: Connection reset by peer. After around 2 minutes, any attempt to establish a connection to the port hangs. Rootful containers and host network mode do not show the problem, which suggests that it is related to slirp4netns, the default network mode for rootless containers.

Steps to reproduce the issue:

  1. podman run -d -p 8080:80 nginx:alpine

  2. while true; do curl -L http://127.0.0.1:8080; done
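The loop in step 2 runs forever and reports failures only inline. A bounded variant that tallies outcomes can make runs easier to compare, for example rootless against rootful. The following is a minimal Python sketch using only the standard library. The names `classify` and `probe`, the attempt count, and the timeout are illustrative assumptions and not part of the original report.

```python
import http.client
import socket
import urllib.error
import urllib.request
from collections import Counter


def classify(exc):
    """Map an exception from a failed request to a coarse outcome label."""
    # urllib wraps connect-time socket errors in URLError; unwrap the cause.
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason
    if isinstance(exc, ConnectionResetError):
        return "reset"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    return "other"


def probe(url, attempts=100, timeout=5.0):
    """Issue `attempts` GET requests to `url` and count the outcomes.

    Returns a Counter keyed by "ok", "reset", "timeout", and "other".
    """
    results = Counter()
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                resp.read()
        except (OSError, http.client.HTTPException) as exc:
            results[classify(exc)] += 1
        else:
            results["ok"] += 1
    return results
```

With the container from step 1 running, `probe("http://127.0.0.1:8080", attempts=500)` would be the rough equivalent of the curl loop. A non-zero "reset" count should correspond to the connection resets that curl reports as error 56, and a growing "timeout" count to the hang described above.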

Describe the results you received:

Periodically in the output, I can see:

curl: (56) Recv failure: Connection reset by peer

Or in TCP:

46172   88.092580136    127.0.0.1   127.0.0.1   TCP 66  8080 → 39888 [RST, ACK] Seq=1 Ack=79 Win=65536 Len=0 TSval=2905973286 TSecr=2905973097

In the end, the connection cannot be finished:

[Screenshot: screenshot_20200719_114810]

Describe the results you expected:

The connection shouldn't be reset.

Additional information you deem important (e.g. issue happens only occasionally):

I attached a Wireshark capture log of the loopback interface.
podman_network_bug.pcapng.gz

The issue does not occur if the network mode is host or if podman runs as root. I tested podman 1.6.2 and could not reproduce the issue there, so it appears to be a regression.

Output of podman version:

Version:      2.0.2
API Version:  1
Go Version:   go1.14.4
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.15.0
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.18, commit: '
  cpus: 8
  distribution:
    distribution: neon
    version: "18.04"
  eventLogger: file
  hostname: thinkpad-t480s
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.3.0-62-generic
  linkmode: dynamic
  memFree: 1611853824
  memTotal: 16664281088
  ociRuntime:
    name: runc
    package: Unknown
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 0.4.3
      commit: unknown
  swapFree: 897052672
  swapTotal: 1023406080
  uptime: 2h 52m 16.21s (Approximately 0.08 days)
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5000
  search:
  - docker.io
  - registry.access.redhat.com
store:
  configFile: /home/dmisharo/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: Unknown
      Version: |-
        fusermount3 version: 3.6.2
        fuse-overlayfs: version 0.7.8
        FUSE library version 3.4.1
        using FUSE kernel interface version 7.27
  graphRoot: /home/dmisharo/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 20
  runRoot: /run/user/1000/containers
  volumePath: /home/dmisharo/.local/share/containers/storage/volumes
version:
  APIVersion: 1
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.14.4
  OsArch: linux/amd64
  Version: 2.0.2

Package info (e.g. output of rpm -q podman or apt list podman):

Listing... Done
podman/unknown,now 2.0.2~2 amd64 [installed]

Additional environment details (AWS, VirtualBox, physical, etc.):
Local physical machine

kind/bug rootless

All 18 comments

Is this new in 2.0?

@AkihiroSuda your assumption is right. I tested it on Fedora 29 with podman 1.6.2 and I couldn't reproduce the issue. It seems it's a regression.

The issue is on RootlessKit (which was adopted since v1.8 to replace slirp4netns port forwarder), not on slirp4netns.

I confirmed the issue is reproducible with both Rootless Docker and Rootless Podman. (But reproducibility seems extremely low, hard to debug...)
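To check whether a given installation is at or past the release named in the comment above, the output of podman version can be parsed and compared. This is a small sketch under assumptions: the 1.8 threshold is taken from the comment and not independently verified, and the function names are hypothetical.

```python
import re

# Threshold taken from the comment above ("adopted since v1.8"); an assumption.
ROOTLESSKIT_SINCE = (1, 8)


def parse_podman_version(text):
    """Extract (major, minor, patch) from the text printed by `podman version`."""
    # Anchor at line start so "API Version:" and "Go Version:" are skipped.
    match = re.search(r"^Version:[ \t]*(\d+)\.(\d+)(?:\.(\d+))?", text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Version:' line found")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def uses_rootlesskit_port_forwarder(version):
    """Return True if `version` is at or after the assumed adoption release."""
    return tuple(version[:2]) >= ROOTLESSKIT_SINCE
```

Applied to the versions mentioned in this thread, 2.0.2 falls on the RootlessKit side of the threshold and 1.6.2 does not.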

I'm using Fedora 32, I was also hitting this and seems like downgrading it to podman-2:1.8.2-2.fc32.x86_64 solved the problem for me.

> The issue is on RootlessKit (which was adopted since v1.8 to replace slirp4netns port forwarder), not on slirp4netns.
>
> I confirmed the issue is reproducible with both Rootless Docker and Rootless Podman. (But reproducibility seems extremely low, hard to debug...)

We're seeing this intermittently (reported as #7067). Can you explain how you confirmed the issue? I don't want to chase my vendor (GitLab) and claim this is happening within their container if I can verify that it is in fact caused by my choice of podman.

What perplexed me was that I saw only ONE of the two mapped ports dropping (reported in #7067), which made me assume that it was a GitLab subsystem rather than podman/{slirp4netns,rootlesskit}.

I see @Victoremepunto saying a downgrade works, but I was wondering how that podman package downgrade affects RootlessKit, or whether that's a separate package that needs to be downgraded (or whether the downgrade goes back to using slirp4netns).

Another thing to note: this isn't intermittent for us. Once it happens, it stays broken.

I wanted to tag @mheon as I was asking about this issue in IRC. This is a really nasty one that will block adoption for anyone running a network-based service in a rootless podman instance. We're dropping back to 1.8.2 and keeping an eye on this.

Could you give the following a try?

    podman run --network slirp4netns:port_handler=slirp4netns

It forces the slirp4netns listener we had in the past. This option was added to podman recently.
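For scripted comparisons, the suggested option can be combined with the original reproduction command. The sketch below only builds the argument list and does not run podman. The function name `build_run_command` and its parameters are hypothetical, and combining `-p` with this `--network` value in exactly this way is an assumption based on the two commands quoted in this thread.

```python
def build_run_command(image="nginx:alpine", host_port=8080,
                      container_port=80, port_handler=None):
    """Build a `podman run` argument list for the reproducer.

    With port_handler=None the default rootless port forwarder is used;
    port_handler="slirp4netns" requests the older slirp4netns listener.
    """
    argv = ["podman", "run", "-d", "-p", f"{host_port}:{container_port}"]
    if port_handler is not None:
        # Option syntax copied from the comment above.
        argv += ["--network", f"slirp4netns:port_handler={port_handler}"]
    argv.append(image)
    return argv
```

The returned list can be passed to `subprocess.run`. Running the reproducer once with the default and once with `port_handler="slirp4netns"` would show whether the resets follow the port forwarder.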

https://github.com/containers/podman/pull/7101 should fix the issue. Please try.

@AkihiroSuda I am experiencing the issue currently with podman 2.0.2.

What is the best / correct way to update rootlesskit to 0.10.0 in podman?

2.0.2 is several versions out of date - can you update to 2.0.4?

Worst case, it will be in 2.0.5, which I am releasing right now.

I tried again with different versions, including the following:
podman version 2.1.0-dev (built from scratch)
podman 2.0.4

Still no luck - my rootless container is still dropping incoming TCP connections like flies.

I will give it another shot with 2.0.5 once it's out.

Still no luck with 2.0.5 unfortunately. My app that's running in a rootless container is still rejecting remote connections as soon as they are opened. If I run the app in a rootful container, it works - so basically still the same symptoms as described in the above thread.

This is probably a different issue. Please open a new issue with a full reproducer.

@sec841 if you're able to reproduce this reliably, please open an issue. When I was encountering this, it would happen periodically in a way that I couldn't induce (I just had to wait for it to happen).

Sure - I'll try to put together a minimal repro and will open up a new issue. Thanks all.
