/kind bug
Description
Steps to reproduce the issue:

1. Create a Dockerfile:
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.2
COPY entrypoint.sh /
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
2. Create the entrypoint.sh script:

#!/bin/bash
OC_NAMESPACE_ENV="weird"
oc_namespace_file="/run/secrets/kubernetes.io/serviceaccount/namespace"
echo ""
if ! [[ -f ${oc_namespace_file} ]]; then
    echo " [ WRN ] the namespace file doesn't exist (${oc_namespace_file})," \
        "using the OC_NAMESPACE ENV instead" >&2
    if [[ -z ${OC_NAMESPACE_ENV} ]]; then
        echo -e " [ ERR ] could not use the namespace file (${oc_namespace_file})," \
            "nor the environment variable OC_NAMESPACE_ENV\n" >&2
        exit 1
    else
        oc_namespace=${OC_NAMESPACE_ENV}
    fi
else
    oc_namespace=$(cat "${oc_namespace_file}")
fi
echo " [ INF ] found namespace '${oc_namespace}'"
3. Note how the warning and error messages are redirected to stderr (>&2).
4. Build the image:

$ > podman build . -t weird:latest
5. Run the container several times and compare the output:

$ > podman run weird:latest
$ > podman run weird:latest
[ INF ] found namespace 'weird'
[ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
$ > podman run weird:latest
[ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
[ INF ] found namespace 'weird'
$ > podman run weird:latest
[ INF ] found namespace 'weird'
[ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
$ > podman run weird:latest
[ INF ] found namespace 'weird'
[ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
$ > podman run weird:latest
[ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
[ INF ] found namespace 'weird'
Note that if I remove the stderr redirection (>&2) from the warning/error messages in entrypoint.sh, the messages are always printed in the right order (or at least I haven't been able to reproduce the issue).
As you can see from the output above, the order is not consistent, which to me indicates that either something weird is going on or I'm totally misunderstanding something here.
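If consistent ordering matters more than keeping the two streams apart, one possible workaround is to merge stderr into stdout once at the top of entrypoint.sh instead of stripping every single >&2. This is only a sketch of the idea, not anything podman itself requires or provides:

#!/bin/bash
# Sketch of a workaround (assumes nothing downstream needs to tell the two
# streams apart): point fd 2 at fd 1 once, so every later write to stderr
# travels over the same pipe as stdout and keeps its relative order.
exec 2>&1

With this in place the rest of the script, including the >&2 redirections, can stay as it is; the trade-off is that a log collector can no longer separate errors from normal output.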
Describe the results you received:
See point 5 above.
Describe the results you expected:
I expect the messages to be printed in a predictable order, not randomly; i.e. the "[ INF ]" line should always be printed last.
Additional information you deem important (e.g. issue happens only occasionally):
It seems to happen randomly, as if some sort of race is triggered somewhere.
Output of podman version:
$ > podman version
Version:      2.1.1
API Version:  2.0.0
Go Version:   go1.15.2
Built:        Wed Oct 7 15:22:27 2020
OS/Arch:      linux/amd64
Output of podman info --debug:
$ > podman info --debug
host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.21-3.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: 0f53fb68333bdead5fe4dc5175703e22cf9882ab'
  cpus: 1
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.8.13-300.fc33.x86_64
  linkmode: dynamic
  memFree: 612429824
  memTotal: 4119355392
  ociRuntime:
    name: crun
    package: crun-0.15-5.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.15
      commit: 56ca95e61639510c7dbd39ff512f80f626404969
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-4.dev.giteecccdb.fc33.x86_64
    version: |-
      slirp4netns version 1.1.4+dev
      commit: eecccdb96f587b11d7764556ffacfeaffe4b6e11
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 2054680576
  swapTotal: 2059399168
  uptime: 18h 18m 4.83s (Approximately 0.75 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/manpi/.config/containers/storage.conf
  containerStore:
    number: 166
    paused: 0
    running: 0
    stopped: 166
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.1.2-1.fc33.x86_64
      Version: |-
        fusermount3 version: 3.9.3
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /home/manpi/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 410
  runRoot: /run/user/1000/containers
  volumePath: /home/manpi/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 1602076947
  BuiltTime: Wed Oct 7 15:22:27 2020
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 2.1.1
Package info (e.g. output of rpm -q podman or apt list podman):
$ > rpm -q podman
podman-2.1.1-11.fc33.x86_64
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
I've checked the troubleshooting guide but didn't find anything relevant.
I have not built and tested my own (latest) version of Podman, though. =\
Additional environment details (AWS, VirtualBox, physical, etc.):
I'm running this in VirtualBox, with Fedora 33 as guest OS.
My host OS is macOS Catalina 10.15.7 (19H2).
I'm just using a standard terminal (iTerm2), ssh'ing into my virtual Fedora machine and running podman there. I don't think that should be an issue; everything else is working as normal, and as stated above, I get the expected behaviour if I remove the redirection to stderr from within entrypoint.sh.
Thanks for an awesome project! :heart:
A friendly reminder that this issue had no activity for 30 days.
@baude @vrothberg @nalind @mtrmac @mheon @giuseppe
It looks like we are randomly getting the STDERR message back after the STDOUT message, even though STDERR happened first. This looks like an issue with Go threading, and I am not sure we can fix it.
Thoughts?
UNIX pipes/sockets don’t come with timing information attached to every byte, so if stdout/stderr is consumed by conmon, or whatever it is, from two separate file descriptors, it’s not possible to reconstruct the original order. If you want that, you have to use a single pipe/socket for both, like a terminal (and then it’s not possible to distinguish output directed to stdout vs. stderr).
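The effect is easy to reproduce without podman at all. The following is a rough sketch (messages and timings are made up) where each stream is drained by its own cat process through a separate pipe, much like a runtime reading a container's stdout and stderr from two file descriptors; run it a few times and the order of the two lines will occasionally flip:

#!/bin/bash
# Two independent readers, one per stream; which one gets scheduled first
# is up to the kernel, so the printed order is not deterministic.
for run in 1 2 3 4 5; do
    {
        echo " [ WRN ] written to stderr first" >&2
        echo " [ INF ] written to stdout second"
    } > >(cat) 2> >(cat)
    sleep 0.2   # let both readers drain before the next round
done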
Thanks for clearing it up. I guess it's not a huge issue; I was just very confused about the output order.
Basically, with a

while stdout_open && stderr_open:
    stdout_readable, stdout_open, stderr_readable, stderr_open = poll(stdout_fd, stderr_fd)
    if stdout_readable: forward_stdout()
    if stderr_readable: forward_stderr()

it’s possible for the producer to write stdout1, stderr, stdout2, in that order, and _only later_ for the consumer to wake, see both stderr_readable and stdout_readable, and find the stdout buffer containing (trailing parts of) stdout1 + (leading parts of) stdout2, with no way to reconstruct the timing.
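A small self-contained illustration of that scenario (messages and timings invented for the demo): the producer writes stdout1, stderr, stdout2 back to back into two pipes while both readers are still asleep, and by the time the readers drain their pipes the stdout data has already been concatenated, so the position of the stderr line in between is lost:

#!/bin/bash
# Producer writes three lines in quick succession, split across two pipes;
# the readers only wake up afterwards and can no longer tell that the
# stderr line was written between the two stdout lines.
{
    echo "stdout1"
    echo "stderr line" >&2
    echo "stdout2"
} > >(sleep 1; echo "--- stdout pipe, read later ---"; cat) \
  2> >(sleep 2; echo "--- stderr pipe, read later ---"; cat)
sleep 3   # crude wait for the two readers to finish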
Since this cannot be fixed easily and is really not a bug, closing.
Again, thanks for clearing it up and for the awesome work you're doing with podman. Really appreciate your hard work! 🙋🏼♂️