When I import TensorFlow, I get the error below:
"ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory"
but libcublas.so.9.0 is available:
$ ldconfig -p | grep libcublas
	libcublas.so.9.0 (libc6,x86-64) => /usr/local/cuda/lib64/libcublas.so.9.0
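When ldconfig can see the library but Python still fails to load it, one common cause is that the process importing TensorFlow does not have the CUDA library directory on its loader search path. A minimal sketch of the usual check-and-fix, assuming the /usr/local/cuda/lib64 path shown in the ldconfig output above:

```shell
# TensorFlow locates libcublas through the dynamic loader, so the directory
# holding libcublas.so.9.0 must be on LD_LIBRARY_PATH (or in the ld cache).
CUDA_LIB=/usr/local/cuda/lib64
export LD_LIBRARY_PATH="$CUDA_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
```

Put the export in the shell profile of whichever user runs Python, then retry `import tensorflow` in the same shell.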
$ nvcc --version
Cuda compilation tools, release 9.0, V9.0.176
cuDNN version: 7.0.5
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0  On |                  N/A |
| 29%   35C    P8    N/A /  75W |   1047MiB /  4038MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
Hi,
I am currently facing the same issue. When I import tensorflow inside my custom container, I get the error
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
My host machine is a p2.xlarge with the following config:
$ ldconfig -p | grep libcublas
	libcublas.so.9.0 (libc6,x86-64) => /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas.so.9.0
	libcublas.so (libc6,x86-64) => /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas.so
docker run --runtime nvidia --rm nvidia/cuda nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   58C    P0    62W / 149W |      0MiB / 11439MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
My container config, started with: docker run -it --runtime=nvidia MyCustomImage
tensorflow-gpu==1.10.1

Which docker image are you using?
It's likely a problem with the image itself.
@flx42
I use python:3.6.5-jessie:
docker run -it --runtime=nvidia python:3.6.5-jessie
Then, I installed TensorFlow via pip install tensorflow-gpu. After this, import tensorflow returns ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory in the Python shell.
Do I need to add something to the image? (I need to build my image from python:3.6.5-jessie)
libcublas.so.9.0 is on the host, not inside the container. You need to use our CUDA base images on Docker Hub and then install TF. Or just use the official TensorFlow images on DockerHub.
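The advice above can be sketched as a minimal Dockerfile that starts from a CUDA runtime image (which bundles libcublas.so.9.0) and installs TensorFlow on top. The tag, Python packages, and TF version below are assumptions; check Docker Hub for the current CUDA 9.0 tags:

```shell
# Write a minimal Dockerfile: CUDA 9.0 runtime base + tensorflow-gpu.
cat > Dockerfile <<'EOF'
FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*
RUN pip3 install --upgrade pip setuptools && \
    pip3 install tensorflow-gpu==1.10.1
EOF
```

Build with `docker build -t my-tf .`, then verify with `docker run --rm --runtime=nvidia my-tf python3 -c "import tensorflow"`.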
@flx42 Thanks, I understand now. Do I still need to install Cuda on the host?
No, you only need the NVIDIA driver on the host. You don't need the CUDA toolkit.
Even though I have the nvidia/cuda docker image on my host, I get the same error. I build my image using
@flx42
I tried to build the image nvidia/cuda:9.0-base (https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/base/Dockerfile).
However, it seems there is a problem with the checksum step. I opened an issue in the related repository (https://gitlab.com/nvidia/cuda/issues/21).
@flx42
I tried using nvidia/cuda:9.0-base from dockerhub:
docker run -it --runtime=nvidia nvidia/cuda:9.0-base
Then, I installed TensorFlow via pip install tensorflow-gpu. After this, import tensorflow still returns ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory in the Python shell.
Also, ldconfig -p | grep libcublas returns no results.
@flx42
Hi,
Where can I get an nvidia/cuda .sh file?
source2image (s2i) supports only .sh files instead of Dockerfiles.
When using the runtime images, I am unable to build my image. I get the following error:
Step 16/46 : FROM ${repository}:9.0-base-ubuntu16.04
invalid reference format
@gqoew the base image tags do not bundle libcublas, you need to install it. Look at the official TensorFlow dockerfile.
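For the `-base` tags, the missing CUDA libraries have to be installed explicitly, as the official TF Dockerfile does. A sketch, written as a heredoc Dockerfile; the package names follow NVIDIA's apt naming for CUDA 9.0 and are assumptions to verify against that Dockerfile:

```shell
cat > Dockerfile <<'EOF'
FROM nvidia/cuda:9.0-base-ubuntu16.04
# The -base tag ships only the bare CUDA essentials; pull in the runtime
# libraries that tensorflow-gpu links against (including libcublas.so.9.0).
RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-cublas-9-0 \
        cuda-cufft-9-0 \
        cuda-curand-9-0 \
        cuda-cusolver-9-0 \
        cuda-cusparse-9-0 && \
    rm -rf /var/lib/apt/lists/*
EOF
```

After building this image, `ldconfig -p | grep libcublas` inside the container should list the library.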
@sathiez we do not support your tool.
@flx42 I have looked at the official TF Dockerfile (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/dockerfiles/dockerfiles/nvidia.Dockerfile) but I still get errors like
E: Unable to locate package libcudnn7
E: Unable to locate package libnccl2
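`libcudnn7` and `libnccl2` come from NVIDIA's machine-learning apt repository, which is not enabled in every base image, so plain `apt-get install` cannot locate them. A sketch of the apt source entry; the repo URL is the standard NVIDIA one for Ubuntu 16.04 and is an assumption to verify for your distro:

```shell
# Write the apt source entry locally; in a Dockerfile this file would be
# copied to /etc/apt/sources.list.d/nvidia-ml.list.
cat > nvidia-ml.list <<'EOF'
deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 /
EOF
# After installing the file and importing NVIDIA's apt signing key:
#   apt-get update && apt-get install -y libcudnn7 libnccl2
```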