Nvidia-docker: STDIN to container is not working when the nvidia runtime is enabled.

Created on 1 Oct 2019 · 21 comments · Source: NVIDIA/nvidia-docker

1. Issue or feature description

STDIN is not working when the nvidia runtime is in use. I can reproduce this by performing the steps described in item 2. I'm using an EKS cluster on AWS with GPU-capable worker nodes. I first hit the issue through kubectl, then went down to the Docker level directly and ran the same STDIN test, with the same result. STDIN only works as expected once the nvidia runtime is removed from the Docker startup options. I have checked the logs, put Docker in debug mode, and inspected the hooks, but unfortunately I have not been able to find the root cause of why STDIN fails under the nvidia runtime. Any insight is really appreciated.
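For reference, the kubectl-level test I ran first looked roughly like this (pod name is illustrative; the pod has to land on a GPU node):

# Start a throwaway pod and pipe stdin into an exec'd cat
$ kubectl run stdin-test --image=nginx --restart=Never
$ echo foo | kubectl exec -i stdin-test -- cat
Result: no output (the same command prints "foo" when the pod runs on a non-GPU node)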

2. Steps to reproduce the issue

# Create a simple nginx container 
$ docker run -d nginx
352f4ca9b801f08d37afac7fc69066c5f007c815e32918bfb81cb24dc3a83d2d

# Pipe stdin into the container by piping "echo foo" into docker exec in interactive mode, with cat as the command:
$ echo foo | docker exec -i 352f4ca9b801 cat
Result: no output on stdout

# Remove option flag --default-runtime=nvidia from docker startup config file
$ vi /etc/sysconfig/docker

# Restart docker
$ service docker restart
Redirecting to /bin/systemctl restart docker.service

# Create a new container without nvidia runtime
$ docker run -d nginx
941ac65a8f82c2fd631fc6030bdd30f24576bb541c5087d889b40688d76deebf

# Run the same command to use stdin
$ echo foo | docker exec -i 941ac65a8f82 cat
foo

# foo is returned. Worked as expected

3. Information to attach (optional if deemed irrelevant)

  • [X] Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
  • [X] Kernel version from uname -a
  • [ ] Any relevant kernel output lines from dmesg
  • [X] Driver information from nvidia-smi -a
  • [X] Docker version from docker version
  • [X] NVIDIA packages version from dpkg -l '*nvidia*' _or_ rpm -qa '*nvidia*'
  • [X] NVIDIA container library version from nvidia-container-cli -V
  • [ ] NVIDIA container library logs (see troubleshooting)
  • [X] Docker command, image and tag used

outputs.txt

bug

All 21 comments

Sorry for the lack of activity; this bug is pretty surprising.
I'll try to look into it!

This may also be what is preventing kubectl cp from working. I am unable to test locally, but on EKS, the AL2_x86_64 AMI works great while AL2_x86_64_GPU (which has nvidia-docker) has problems. I will either get tar: short read or tar: This does not look like a tar archive when attempting to copy files into the pod. Copying files out works fine.

This problem with kubectl was first posted on StackOverflow. I believe it to be related to this issue.
https://stackoverflow.com/questions/58479650/kubectl-cp-fails-with-tar-this-does-not-look-like-a-tar-archive-on-nodes-runn
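To illustrate the asymmetry (pod name and paths are made up):

# Copying into the pod fails
$ kubectl cp ./settings.yaml gpu-pod:/tmp/settings.yaml
tar: This does not look like a tar archive

# Copying out of the same pod works fine
$ kubectl cp gpu-pod:/etc/hostname ./hostname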

I have also hit a very similar issue: I am unable to run a "kubectl cp" command to copy a file from my machine into a pod when the worker node is running the nvidia-docker runtime. The copy fails with this kind of error:

tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
command terminated with exit code 2

I'm using the EKS-provided AMI amazon-eks-gpu-node-1.14-v20200122 (ami-031814549a4b2892c) with EKS version v1.14.9-eks-1f0ca9, both of which are the latest Amazon-provided versions.

Just an FYI: GPU nodes on AKS do not suffer from this problem, but QoS suffers substantially from well-known disk I/O issues. I am using it in spite of the problems. Also, an on-demand NC6_Promo is less than half the price of a p2.xlarge and has two additional cores.

@RenaudWasTaken did you get a chance to look into this? Are there any updates or even a potential fix or workaround? And is there anything we can do to help?

Hi Renaud, long time no see. :-)
This also breaks https://github.com/windmilleng/tilt on EKS GPU machines. We had been seeing weird issues that only affected pods on GPU machines and, after ruling out the differences between the CPU and GPU images one by one, I tracked it down to the Docker runtime.

I looked everywhere in the sources for runc, NVIDIA/libnvidia-container, NVIDIA/container-toolkit and NVIDIA/nvidia-container-runtime (did I miss any? 😀), but I couldn't find any smoking guns. My suspicion was that somehow a prestart hook like nvidia-container-runtime-hook causes stdin to be consumed (unlikely, because its own stdin is a JSON file with the state) or never passed to the exec process. Any hints? I am not too familiar with these internals.
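For context, the runtime hands prestart hooks the container state as JSON on their stdin, so invoking the hook manually looks roughly like this (fields abbreviated, values made up); that's why I doubt the hook itself is eating stdin:

$ echo '{"ociVersion":"1.0.1","id":"352f4ca9b801","pid":12345,"bundle":"/run/docker/containerd/352f4ca9b801"}' \
    | /usr/bin/nvidia-container-runtime-hook prestart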

This bug is very surprising indeed, and I find it very hard to see how it could be linked back to the nvidia runtime (though it clearly seems to be, in one way or another).

The nvidia-container-runtime-hook only comes into play on initial container creation, and should have nothing to do with the subsequent exec call (where the attachment of STDIN seems to be having problems).

I am also not able to produce it on any of the machines I have available, so it's hard to debug it directly. I will try and spin up an EKS cluster in the next couple of days to take a deeper look into this.

I looked everywhere in the sources for runc, NVIDIA/libnvidia-container, NVIDIA/container-toolkit and NVIDIA/nvidia-container-runtime (did I miss any? 😀), but I couldn't find any smoking guns. My suspicion was that somehow a prestart hook like nvidia-container-runtime-hook causes stdin to be consumed (unlikely, because its own stdin is a JSON file with the state) or never passed to the exec process. Any hints? I am not too familiar with these internals.

Reading through this is helpful. I hadn't thought this issue through much, but I'm pretty sure we could replicate it with runc directly (and a dummy hook), strace the calls, and identify the issue.
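Something along these lines should do it, assuming an OCI bundle prepared under /tmp/bundle whose config.json has terminal set to false, a long-lived process (e.g. sleep 1d), and a dummy prestart hook wired in (all paths and names illustrative):

$ cd /tmp/bundle
$ runc run -d stdin-test
$ echo foo | strace -f -o exec.trace runc exec stdin-test cat
$ grep -E 'read\(0|close\(0|dup' exec.trace   # follow what happens to fd 0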

I'll take a stab this weekend

@therc can you list your nvidia and docker components version? I can't reproduce...

@klueska you're right, the hook and the bug will trigger at different times. I was thinking of something like a side effect, perhaps. Maybe it's a subtle difference in the AppArmor/SELinux/etc. setup on the Amazon side when the nvidia-docker runtime is installed. I tried to get a single, plain EC2 instance up, to reproduce the issue (i.e. distinguishing EKS vs AL2), but I only messed up a bunch of unrelated ones, so I'll have to retry tomorrow. Another even stranger clue I forgot to mention: stdin works when it's a TTY (and you pass -t).
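For completeness, this is the TTY variant that does work (container ID taken from the original repro):

$ docker exec -it 352f4ca9b801 cat
hello        <- typed at the terminal
hello        <- echoed back by cat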

@RenaudWasTaken I'm trying to set up another cluster, since I can't get easy shell access to the existing one. The kubelet claims it's v1.14.9-eks-f459c0 running on docker://18.9.9, but that's not everything you are looking for.

In the meantime, I opened a case with AWS as well. Thanks for your help!

Ah, I found more stuff in the existing logs:

Runtimes:map[nvidia:{Path:/etc/docker-runtimes.d/nvidia Args:[]} runc:{Path:runc Args:[]}] DefaultRuntime:nvidia Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:} LiveRestoreEnabled:true Isolation: InitBinary:docker-init ContainerdCommit:{ID:894b81a4b802e4eb2a91d1ce216b8817763c29fb Expected:894b81a4b802e4eb2a91d1ce216b8817763c29fb} RuncCommit:{ID:dc9208a3303feef5b3839f4323d9beb36df0a9dd Expected:dc9208a3303feef5b3839f4323d9beb36df0a9dd} InitCommit:{ID:fec3683 Expected:fec3683} SecurityOptions:[name=seccomp,profile=default]}

runc is this version: https://github.com/opencontainers/runc/commit/dc9208a3303feef5b3839f4323d9beb36df0a9dd

and containerd: https://github.com/containerd/containerd/commit/894b81a4b802e4eb2a91d1ce216b8817763c29fb

yum logs to the rescue:

Apr 23 19:30:28 Installed: 1:nvidia-418.87.00-0.amzn2.x86_64
Apr 23 19:30:56 Installed: 1:nvidia-dkms-418.87.00-0.amzn2.x86_64
Apr 23 19:31:48 Installed: containerd-1.2.6-1.amzn2.x86_64
Apr 23 19:31:56 Installed: libnvidia-container1-1.0.7-1.x86_64
Apr 23 19:31:56 Installed: libnvidia-container-tools-1.0.0-1.amzn2.x86_64
Apr 23 19:31:56 Installed: nvidia-container-toolkit-1.0.5-2.amzn2.x86_64
Apr 23 19:32:02 Installed: docker-18.09.9ce-2.amzn2.x86_64
Apr 23 19:32:02 Installed: docker-runtime-nvidia-1-1.amzn2.noarch

Found this: https://github.com/emypar/aws/tree/master/eks/gpu-node-stdin-patch

So it seems a combination of an updated docker-ce and an updated nvidia-docker stack will fix the issue. I'd still be curious to know what the underlying issue was, but at least doing an update seems to fix it.

Awesome find @klueska ! Just tested it and it works like a charm. The script just needed a small update which I've created a PR for.

That's very useful, thanks. I noticed that the nodes were pinned to libnvidia-container-tools-1.0.0-1.amzn2.x86_64 even though 1.0.7 was available. Looking at the changes, I saw https://github.com/NVIDIA/libnvidia-container/commit/deccb2801502675bd283c6936861814dbca99ecd and hoped that in some crazy way it fixed things, e.g. ldconfig ran and somehow managed to output a corrupt cache file — which shouldn't be fatal for ld, but follow me for a second — which miraculously caused new processes in the container to drop a non-interactive stdin, and only that.

I forced an update to 1.0.7, then started a new pod on the machine and, as you can imagine, it was still broken. Same after a reboot. (I didn't mention seccomp yesterday, but I had looked at the audit logs and hadn't seen any obvious errors there.)

We still don't know why this is happening, but libnvidia-container-tools doesn't seem to be directly responsible. Versions and build dates for the other packages:

Name        : nvidia-container-toolkit
Version     : 1.0.5
Release     : 2.amzn2
Build Date  : Wed 11 Sep 2019 12:25:23 AM UTC

Name        : docker-runtime-nvidia
Version     : 1
Release     : 1.amzn2
Build Date  : Thu 17 Jan 2019 10:40:47 PM UTC

Name        : docker
Version     : 18.09.9ce
Release     : 2.amzn2
Build Date  : Fri 01 Nov 2019 07:34:30 PM UTC

I "fixed" the bug a bunch of ways, but the simplest I found was this, which can even be run on the same system live, at the cost of losing all running pods:

yum swap -- remove docker-runtime-nvidia \
         -- install nvidia-docker2-2.2.2-1 nvidia-container-runtime-3.1.4-1
systemctl try-restart docker

(reloading the service doesn't change the runtime configurations, it seems, so you need a hard restart)
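To check that the swap took effect and stdin is back, something like this (container ID is a placeholder):

$ docker info | grep -i runtime        # confirm the registered runtimes and the default
$ echo foo | docker exec -i <container-id> cat
foo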

Unlike the other solution, it does not touch the Docker version. So it's down to hooks/runc vs nvidia-container-runtime?

I have no idea what the docker-runtime-nvidia package is; that's probably something the AWS engineers created?

I suggest you open an issue with them, since installing the NVIDIA components seems to fix it :)
Thanks for looking into this!

It's a package with just two configuration files:

/etc/docker-runtimes.d/nvidia:

#!/bin/sh
exec /usr/bin/oci-add-hooks --hook-config-path /usr/share/docker-runtime-nvidia/hook-config.json --runtime-path /usr/bin/docker-runc "$@"

/usr/share/docker-runtime-nvidia/hook-config.json:

{
  "hooks": {
    "prestart": [
      {
        "path": "/usr/bin/nvidia-container-runtime-hook",
        "args": ["/usr/bin/nvidia-container-runtime-hook", "prestart"]
      }
    ]
  }
}

And the hook is a symlink to the binary in

Name : nvidia-container-toolkit
Version : 1.0.5
Release : 2.amzn2

Just for the record: the config files and the package are fine; I had no idea, but it's really oci-add-hooks that is an AWS thing. The name of it made it sound very official. Anyway, I think that's our guy who is dropping stdin: https://github.com/awslabs/oci-add-hooks/issues/5
Sorry for the noise. :-)

We will upgrade to Docker 19.03 and this problem will be solved. The current EKS GPU-optimized AMI doesn't install nvidia-docker2 (we use the internal docker-runtime-nvidia instead). Since NVIDIA GPUs are natively supported as devices in the Docker runtime as of 19.03, we don't need this runtime or nvidia-container-runtime anymore.
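With 19.03 the GPU request moves to the --gpus flag instead of a custom runtime, e.g. (image tag illustrative):

$ docker run --rm --gpus all nvidia/cuda:10.1-base nvidia-smi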

Here's the old stack

rpm -qa | grep nvidia
nvidia-dkms-418.87.00-0.amzn2.x86_64
nvidia-418.87.00-0.amzn2.x86_64
libnvidia-container-tools-1.0.0-1.amzn2.x86_64
libnvidia-container1-1.0.7-1.x86_64
docker-runtime-nvidia-1-1.amzn2.noarch 
nvidia-container-toolkit-1.0.5-2.amzn2.x86_64
==========================================================================================================================================
 Package                  Arch                 Version                                              Repository                       Size
==========================================================================================================================================
Updating:
 docker                   x86_64               18.09.9ce-2.amzn2                                    amzn2extra-docker                30 M
Installing for dependencies:
 containerd               x86_64               1.2.6-1.amzn2                                        amzn2extra-docker                20 M
 runc                     x86_64               1.0.0-0.1.20190510.git2b18fe1.amzn2                  amzn2extra-docker               2.0 M

We will update the stack to

nvidia-dkms-418.87.00-0.amzn2.x86_64
nvidia-418.87.00-0.amzn2.x86_64
libnvidia-container-tools-1.0.0-1.amzn2.x86_64
libnvidia-container-1.0.0-1.amzn2.x86_64
nvidia-container-toolkit-1.0.5-2.amzn2.x86_64

EKS will release a new patch including awslabs/oci-add-hooks#5 to fix the stdin issue. Updating to Docker v19.03 with the default runtime "runc" would also fix it, but https://github.com/NVIDIA/k8s-device-plugin requires "nvidia" as the default runtime.
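For reference, the device plugin's requirement boils down to the usual default-runtime stanza in /etc/docker/daemon.json (the runtime binary path may differ per distro):

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}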
