Nvidia-docker: Problems in importing an image.

Created on 8 Jun 2018  ·  1 Comment  ·  Source: NVIDIA/nvidia-docker

1. Issue or feature description

I have the same problem as #174.
It seems to have been resolved in nvidia-docker2, but I can't get it to work.

2. Steps to reproduce the issue

I installed nvidia-docker with
apt install nvidia-docker2

pull the image I need
nvidia-docker pull nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

run it
nvidia-docker run -it <image> /bin/bash
install Python from apt inside the container

commit container to another repository
nvidia-docker commit <container> nvtest

save to tar
nvidia-docker save -o nvtest.tar nvtest

remove the nvtest image from the local repository, and remove the running container

import tar from nvtest.tar
nvidia-docker import nvtest.tar nvtest

run command
docker run --runtime=nvidia --rm a81cce78e47c nvidia-smi

and output is
docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": unknown.

Changing nvidia-smi to /bin/bash gives the same output.

I then tried another method to export the container:

export container
nvidia-docker export -o nvtest.tgz b93807e16849

import from nvtest.tgz
cat nvtest.tgz | nvidia-docker import - nvtest2

Now I can use bash in the container!
However, nvcc is not in $PATH (it is still in /usr/local/cuda/bin),
and when I run ldconfig, the output is

/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libcuda.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libcuda.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libcuda.so is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.384.130 is empty, not checked.

/usr/lib/x86_64-linux-gnu/libcuda.so.384.130 is now a 0-byte file.

3. Information to attach (optional if deemed irrelevant)

  • [x] Kernel version from uname -a

Linux yichiun-ubuntu1604 4.13.0-43-generic #48~16.04.1-Ubuntu SMP Thu May 17 12:56:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • [ ] Any relevant kernel output lines from dmesg
  • [x] Driver information from nvidia-smi -a

==============NVSMI LOG==============

Timestamp : Fri Jun 8 15:34:54 2018
Driver Version : 384.130

Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : GeForce GTX 1060 6GB
Product Brand : GeForce
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 1920
......

  • [x] Docker version from docker version

Client:
Version: 18.03.1-ce
API version: 1.37
Go version: go1.9.5
Git commit: 9ee9f40
Built: Thu Apr 26 07:17:20 2018
OS/Arch: linux/amd64
Experimental: false
Orchestrator: swarm

Server:
Engine:
Version: 18.03.1-ce
API version: 1.37 (minimum version 1.12)
Go version: go1.9.5
Git commit: 9ee9f40
Built: Thu Apr 26 07:15:30 2018
OS/Arch: linux/amd64
Experimental: false

  • [x] NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'

nvidia-container-runtime 2.0.0+docker18.03.1-1 amd64 NVIDIA container runtime
nvidia-container-runtime-hook 1.3.0-1 amd64 NVIDIA container runtime hook
nvidia-docker (no description available)
nvidia-docker2 2.0.3+docker18.03.1-1 all nvidia-docker CLI wrapper

  • [x] NVIDIA container library version from nvidia-container-cli -V

version: 1.0.0
build date: 2018-04-26T22:53+00:00
build revision: 163054a04b21c4455c8cae7e47873d9f2a091f55
build compiler: gcc-5 5.4.0 20160609
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

  • [ ] NVIDIA container library logs (see troubleshooting)
  • [ ] Docker command, image and tag used

Most helpful comment

nvidia-docker save -o nvtest.tar nvtest
nvidia-docker import nvtest.tar nvtest

This is wrong: after a docker save, you need docker load, not docker import.
I'm surprised it works, but it's clearly not doing what you expect.
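
If it is unclear which command produced an archive, listing its contents can help decide how to restore it. The sketch below assumes that a docker save archive has a top-level manifest.json, while a docker export archive is a bare root filesystem without one. restore_cmd_for is a hypothetical helper name, and it only prints the suggested command instead of running it.

```shell
# Hypothetical helper: print the docker command that matches the archive type.
# Assumption: `docker save` archives contain a top-level manifest.json;
# `docker export` archives are a plain root filesystem and do not.
restore_cmd_for() {
    archive=$1   # path to the tar archive
    name=$2      # image name to use if the archive has to be imported
    if tar -tf "$archive" | grep -x 'manifest.json' >/dev/null; then
        echo "docker load -i $archive"
    else
        echo "docker import $archive $name"
    fi
}
```

For the archive from this issue, restore_cmd_for nvtest.tar nvtest should suggest docker load, since nvtest.tar came from docker save.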

For your second problem, doing docker export will drop all the environment variables.
But you need those environment variables to trigger GPU support:
https://github.com/nvidia/nvidia-container-runtime#environment-variables-oci-spec
A fix would be:

cat nvtest.tgz | docker import --change "ENV NVIDIA_VISIBLE_DEVICES=all" - nvtest2
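
To confirm whether an imported image still carries the variable that triggers GPU support, its configured environment can be read with docker inspect. This is a minimal sketch: has_nvidia_env is a hypothetical helper name, and nvtest2 is the image name used in this thread.

```shell
# Hypothetical helper: succeed if the VAR=value listing on stdin
# sets NVIDIA_VISIBLE_DEVICES, fail otherwise.
has_nvidia_env() {
    grep '^NVIDIA_VISIBLE_DEVICES=' >/dev/null
}

# Usage (requires Docker; the template prints one VAR=value per line):
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' nvtest2 \
#       | has_nvidia_env || echo "NVIDIA_VISIBLE_DEVICES is not set in this image"
```

The same listing also shows whether PATH survived the export and import, which would explain nvcc no longer being found.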
