When I import TensorFlow, I get the error below:
"ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory"
but libcublas.so.9.0 is available:
$ ldconfig -p | grep libcublas
	libcublas.so.9.0 (libc6,x86-64) => /usr/local/cuda/lib64/libcublas.so.9.0
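When ldconfig can see the library but Python still fails to load it, one common cause is that the process importing TensorFlow does not have the CUDA library directory on its loader search path. A minimal sketch of the usual check-and-fix, assuming the /usr/local/cuda/lib64 path shown in the ldconfig output above:

```shell
# TensorFlow locates libcublas through the dynamic loader, so the directory
# holding libcublas.so.9.0 must be on LD_LIBRARY_PATH (or in the ld cache).
CUDA_LIB=/usr/local/cuda/lib64
export LD_LIBRARY_PATH="$CUDA_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
```

Put the export in the shell profile of whichever user runs Python, then retry `import tensorflow` in the same shell.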
$ nvcc --version
Cuda compilation tools, release 9.0, V9.0.176
cuDNN version: 7.0.5
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0  On |                  N/A |
| 29%   35C    P8    N/A /  75W |   1047MiB /  4038MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
Hi,
I am currently facing the same issue. When I import tensorflow inside my custom container, I get the error
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
My host machine is a p2.xlarge with the following config:
$ ldconfig -p | grep libcublas
	libcublas.so.9.0 (libc6,x86-64) => /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas.so.9.0
	libcublas.so (libc6,x86-64) => /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas.so
docker run --runtime nvidia --rm nvidia/cuda nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   58C    P0    62W / 149W |      0MiB / 11439MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
My container config, started with: docker run -it --runtime=nvidia MyCustomImage
tensorflow-gpu==1.10.1

Which docker image are you using?
It's likely a problem with the image itself.
@flx42
I use python:3.6.5-jessie:
docker run -it --runtime=nvidia python:3.6.5-jessie
Then, I installed TensorFlow via pip install tensorflow-gpu. After this, import tensorflow returns ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory in the Python shell.
Do I need to add something to the image? (I need to build my image from python:3.6.5-jessie)
libcublas.so.9.0 is on the host, not inside the container. You need to use our CUDA base images on Docker Hub and then install TF. Or just use the official TensorFlow images on DockerHub.
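The advice above can be sketched as a minimal Dockerfile that starts from a CUDA runtime image (which bundles libcublas.so.9.0) and installs TensorFlow on top. The tag, Python packages, and TF version below are assumptions; check Docker Hub for the current CUDA 9.0 tags:

```shell
# Write a minimal Dockerfile: CUDA 9.0 runtime base + tensorflow-gpu.
cat > Dockerfile <<'EOF'
FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*
RUN pip3 install --upgrade pip setuptools && \
    pip3 install tensorflow-gpu==1.10.1
EOF
```

Build with `docker build -t my-tf .`, then verify with `docker run --rm --runtime=nvidia my-tf python3 -c "import tensorflow"`.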
@flx42 Thanks, I understand now. Do I still need to install Cuda on the host?
No, you only need the NVIDIA driver on the host. You don't need the CUDA toolkit.
Even though I have the nvidia/cuda docker image on my host, I get the same error. I build my image using
@flx42
I tried to build the image nvidia/cuda:9.0-base (https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/base/Dockerfile).
However, it seems there is a problem with the checksum step. I opened an issue in the related repository (https://gitlab.com/nvidia/cuda/issues/21).
@flx42
I tried using nvidia/cuda:9.0-base from dockerhub:
docker run -it --runtime=nvidia nvidia/cuda:9.0-base
Then, I installed TensorFlow via pip install tensorflow-gpu. After this, import tensorflow still returns ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory in the Python shell.
Also, ldconfig -p | grep libcublas returns no results.
@flx42
Hi,
Where can I get an nvidia/cuda .sh file?
source2image (s2i) supports only .sh files instead of Dockerfiles.
When using the runtime images, I am unable to build my image. I get the following error:
Step 16/46 : FROM ${repository}:9.0-base-ubuntu16.04
invalid reference format
@gqoew the base image tags do not bundle libcublas, you need to install it. Look at the official TensorFlow dockerfile.
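For the `-base` tags, the missing CUDA libraries have to be installed explicitly, as the official TF Dockerfile does. A sketch, written as a heredoc Dockerfile; the package names follow NVIDIA's apt naming for CUDA 9.0 and are assumptions to verify against that Dockerfile:

```shell
cat > Dockerfile <<'EOF'
FROM nvidia/cuda:9.0-base-ubuntu16.04
# The -base tag ships only the bare CUDA essentials; pull in the runtime
# libraries that tensorflow-gpu links against (including libcublas.so.9.0).
RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-cublas-9-0 \
        cuda-cufft-9-0 \
        cuda-curand-9-0 \
        cuda-cusolver-9-0 \
        cuda-cusparse-9-0 && \
    rm -rf /var/lib/apt/lists/*
EOF
```

After building this image, `ldconfig -p | grep libcublas` inside the container should list the library.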
@sathiez we do not support your tool.
@flx42 I have looked at the official TF Dockerfile (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/dockerfiles/dockerfiles/nvidia.Dockerfile) but I still get errors like
E: Unable to locate package libcudnn7
E: Unable to locate package libnccl2
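`libcudnn7` and `libnccl2` come from NVIDIA's machine-learning apt repository, which is not enabled in every base image, so plain `apt-get install` cannot locate them. A sketch of the apt source entry; the repo URL is the standard NVIDIA one for Ubuntu 16.04 and is an assumption to verify for your distro:

```shell
# Write the apt source entry locally; in a Dockerfile this file would be
# copied to /etc/apt/sources.list.d/nvidia-ml.list.
cat > nvidia-ml.list <<'EOF'
deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 /
EOF
# After installing the file and importing NVIDIA's apt signing key:
#   apt-get update && apt-get install -y libcudnn7 libnccl2
```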