Some ERRORs occured When I try to run tensorflow/serving:latest-gpu.
Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
Ubuntu 16.04.6 LTS
docker version: 19.03
cuda version:9.0
hi, mybe didn't use --gpus flags when docker run. example:
docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
@YelongYin,
Can you please let us know if your issue still persists, or is it resolved, so that we can work towards closure of this issue. Thanks!
Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!
I'm having the same problem. Running tensorflow-gpu locally with cuda10 succeeded. However, when tensorflow serving is run with docker, the following message appears and uses only cpu resources.
2019-12-06 09: 59: 12.917246: W external / org_tensorflow / tensorflow / stream_executor / platform / default / dso_loader.cc: 55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: / usr / local / nvidia / lib: / usr / local / nvidia / lib64
2019-12-06 09: 59: 12.917287: E external / org_tensorflow / tensorflow / stream_executor / cuda / cuda_driver.cc: 318] failed call to cuInit: UNKNOWN ERROR (303)
Is there any list of compatible CUDA CUDNN version?
Could you reopen the issue? I experienced the same thing. When using tensorflow-gpu-2.1.0 and docker 19.03.4.
Error log from notebook:
2020-01-25 04:32:14.300854: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-25 04:32:14.300950: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-25 04:32:14.300959: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Why tensorflow-gpu look for cuda library path /usr/local/nvidia/lib:/usr/local/nvidia/lib64 rather than /usr/local/cuda? Do you on purpose to sabotage the build?
docker run --gpus all --rm nvidia/cuda nvidia-smi
Unable to find image 'nvidia/cuda:latest' locally
latest: Pulling from nvidia/cuda
7ddbc47eeb70: Pull complete
c1bbdc448b72: Pull complete
8c3b70e39044: Pull complete
45d437916d57: Pull complete
d8f1569ddae6: Pull complete
85386706b020: Pull complete
ee9b457b77d0: Pull complete
be4f3343ecd3: Pull complete
30b4effda4fd: Pull complete
Digest: sha256:31e2a1ca7b0e1f678fb1dd0c985b4223273f7c0f3dbde60053b371e2a1aee2cd
Status: Downloaded newer image for nvidia/cuda:latest
Sat Jan 25 04:50:09 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44 Driver Version: 440.44 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2070 Off | 00000000:01:00.0 On | N/A |
| 0% 34C P8 9W / 185W | 894MiB / 7981MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
I can created a symlink in the docker image. But this stinks. Because /usr/local/cuda is a default nvidia library installation location and all NVIDIA place their stuffs there.
Why you mess this up when you build tensorflow-gpu?
Out of the box, tensorflow-gpu 2.0 installed from pypi by pip doesn't work under container pulled from NVIDIA docker image nvidia/cuda:10.0-cudnn7-devel.
Even I created a symlink from /usr/local/cuda to /usr/local/nvidia, I still failed to launch 1D CNN in Keras.
I have to pull image nvcr.io/nvidia/tensorflow:19.12-tf2-py3 from NVIDIA to make it runs.
@luvwinnie
Here is the list
I met the same problem,Is there any compatible CUDA CUDNN version for tensorflow serving?
Running into the same issue
Does the issue still exist with the latest version? There was some mismatches for the library version between Tensorflow and Tensorflow Serving.
Most helpful comment
Out of the box,
tensorflow-gpu 2.0installed from pypi by pip doesn't work under container pulled from NVIDIA docker imagenvidia/cuda:10.0-cudnn7-devel.Even I created a symlink from
/usr/local/cudato/usr/local/nvidia, I still failed to launch 1D CNN in Keras.I have to pull image
nvcr.io/nvidia/tensorflow:19.12-tf2-py3from NVIDIA to make it runs.@luvwinnie
Here is the list