Nvidia-docker: Error response from daemon: OCI runtime create failed

Created on 22 Jan 2018 · 5 Comments · Source: NVIDIA/nvidia-docker

1. Issue or feature description

nvidia-docker stopped working.
I had a JupyterHub instance running with nvidia-docker support, and it worked well.
Today I logged into the host and ran sudo apt-get update and upgrade, and afterwards nvidia-docker no longer works. That said, I can't recall whether the upgrade actually changed anything, so it might not be the root cause.
The system runs Debian.

2. Steps to reproduce the issue

sudo docker run --rm nvidia/cuda:8.0-devel nvidia-smi

docker: Error response from daemon: OCI runtime create failed: container_linux.go:296: starting container process caused "process_linux.go:398: container init caused \"process_linux.go:381: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=8.0 --pid=25807 /var/lib/docker/overlay2/8127e7486398ec495fc98de2cee1f18e769ee97f43211ccbc455a058d3b3923a/merged]\\\\nnvidia-container-cli: ldcache error: open failed: /sbin/ldconfig.real: no such file or directory\\\\n\\\"\"": unknown.

3. Information to attach (optional if deemed irrelevant)

$ uname -a
Linux donna 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux

$ docker version
Client:
 Version:   17.12.0-ce
 API version:   1.35
 Go version:    go1.9.2
 Git commit:    c97c6d6
 Built: Wed Dec 27 20:11:19 2017
 OS/Arch:   linux/amd64

Server:
 Engine:
  Version:  17.12.0-ce
  API version:  1.35 (minimum version 1.12)
  Go version:   go1.9.2
  Git commit:   c97c6d6
  Built:    Wed Dec 27 20:09:54 2017
  OS/Arch:  linux/amd64
  Experimental: false

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.82                 Driver Version: 375.82                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 0000:41:00.0     Off |                  N/A |
|  0%   23C    P0    55W / 250W |      0MiB / 11170MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

$ nvidia-container-cli -V
version: 1.0.0
build date: 2018-01-11T00:29+00:00
build revision: 4a618459e8ba522d834bb2b4c665847fae8ce0ad
build compiler: x86_64-linux-gnu-gcc-6 6.3.0 20170516
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

All 5 comments

Sorry for the trouble. It seems I had the wrong sources list installed. To everyone running Debian and hitting this issue: make sure you install the packages from here: https://nvidia.github.io/nvidia-docker/

You were probably using the Ubuntu packages instead of the Debian ones.
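A rough way to check for this mismatch is to compare the distribution ID in /etc/os-release with the entries in the apt sources list. The sketch below is not from the thread: the function name check_repo_matches_os is hypothetical, the default list path /etc/apt/sources.list.d/nvidia-docker.list assumes the list was installed under that name, and it assumes the repository URLs contain the distribution name (for example a path segment starting with "debian" or "ubuntu").

```shell
# check_repo_matches_os [OS_RELEASE_FILE] [LIST_FILE]
# Hypothetical helper: succeeds if the sources list has at least one
# non-comment entry whose URL contains "/<distribution ID>".
check_repo_matches_os() {
    os_release="${1:-/etc/os-release}"
    # Assumed default location of the nvidia-docker sources list.
    list_file="${2:-/etc/apt/sources.list.d/nvidia-docker.list}"

    # Read ID (e.g. "debian" or "ubuntu") in a subshell so that sourcing
    # the file does not leak variables into the caller's environment.
    os_id=$(. "$os_release" && printf '%s' "$ID")

    # Ignore commented-out lines, then look for the distribution name.
    if grep -v '^[[:space:]]*#' "$list_file" | grep -q "/$os_id"; then
        echo "ok: $list_file has entries for $os_id"
    else
        echo "mismatch: $list_file has no entries for $os_id" >&2
        return 1
    fi
}
```

Calling check_repo_matches_os with no arguments checks the live system. A mismatch only suggests that the list was written for another distribution; it does not prove that the installed packages came from it.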

Could there be another cause?
I have exactly the same issue on the same platform (Debian stretch), but I installed from the right repository.

@khallaghi I believe so. I first got hit by #677, then this one.
However, this is not Debian stretch but a mix of testing and unstable.

My workaround was to create /sbin/ldconfig.real as a symlink to /sbin/ldconfig.

I hit the same problem. Thanks, @sleveque.
sudo docker run --gpus all nvidia/cuda:9.0-base nvidia-smi

docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused "process_linux.go:432: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: ldcache error: open failed: /sbin/ldconfig.real: no such file or directory\\n\""": unknown.
ERRO[0000] error waiting for container: context canceled
Solution:
ln -s /sbin/ldconfig /sbin/ldconfig.real
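A slightly more defensive version of this workaround creates the symlink only when /sbin/ldconfig.real is actually missing, so it does not fail or overwrite anything on systems where the file already exists. This is a minimal sketch, not something posted in the thread: the function name ensure_ldconfig_real and its optional ROOT prefix argument (useful for trying it in a scratch directory) are hypothetical.

```shell
# ensure_ldconfig_real [ROOT]
# Hypothetical helper: creates ROOT/sbin/ldconfig.real as a symlink to
# ROOT/sbin/ldconfig, but only if ldconfig.real does not exist yet.
# With no argument, ROOT is empty and the real /sbin paths are used.
ensure_ldconfig_real() {
    root="${1:-}"
    target="$root/sbin/ldconfig"
    link="$root/sbin/ldconfig.real"

    # Leave an existing file or symlink (even a dangling one) untouched.
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "exists: $link"
        return 0
    fi

    # Refuse to create a link to something that is not executable.
    if [ ! -x "$target" ]; then
        echo "missing: $target" >&2
        return 1
    fi

    ln -s "$target" "$link" && echo "created: $link -> $target"
}
```

To apply it to the real system, call ensure_ldconfig_real with no argument as root. Like the plain ln -s command, this only works around the missing path; if the packages were installed from the wrong distribution's repository, switching to the matching one is the cleaner fix.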
