I'm trying to run nvidia-docker on my Fedora 23 laptop.
Both nvidia-docker and nvidia-docker-plugin abort with:
Error: Could not load NVML library
(I do have /usr/lib64/nvidia/libnvidia-ml.so which is part of the xorg-x11-drv-nvidia-cuda-358.16-2.fc23.x86_64 package)
Any idea what I'm missing?
Can you provide the output of
ldconfig -p | grep nvidia-ml
The output is strictly empty.
That's the problem then :) NVML is not in your ldcache.
If your package installed the appropriate conf file in /etc/ld.so.conf.d, then it's just a matter of doing:
sudo ldconfig
Otherwise either create it:
sudo tee /etc/ld.so.conf.d/nvidia-ml.conf <<< /usr/lib64/nvidia
sudo ldconfig
or set LD_LIBRARY_PATH before running nvidia-docker or nvidia-docker-plugin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/usr/lib64/nvidia
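To tell which of the cases above applies, a quick check along these lines may help. This is only a sketch: nvml_in_cache and NVML_DIR are hypothetical names made up for illustration (they are not part of nvidia-docker), and the default directory is just the path mentioned earlier in this thread.

```shell
# Hypothetical helper (not part of nvidia-docker): reads `ldconfig -p`-style
# output on stdin and succeeds if libnvidia-ml is listed in it.
nvml_in_cache() {
    grep -q 'libnvidia-ml\.so'
}

# Assumption: the library lives in the directory mentioned in this thread.
NVML_DIR="${NVML_DIR:-/usr/lib64/nvidia}"

if ldconfig -p 2>/dev/null | nvml_in_cache; then
    echo "NVML is in the ld cache"
elif [ -e "$NVML_DIR/libnvidia-ml.so" ]; then
    echo "NVML is in $NVML_DIR but not in the ld cache"
else
    echo "NVML was not found in $NVML_DIR"
fi
```

If it reports that the library exists but is not cached, one of the fixes above (the conf file plus ldconfig, or LD_LIBRARY_PATH) should apply.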
Ah! So it turns out I was missing a "devel" rpm ("xorg-x11-drv-nvidia-devel"). (Perhaps this is a packaging bug, as I believe that ldconfig entry is needed not just for development.)
With that fixed I now get a different error:
When running nohup nvidia-docker-plugin > /tmp/nvidia-docker.log, I see the following in that log file:
nvidia-docker-plugin | 2016/04/14 19:50:49 Error: nvml: Not Supported
Is this a version mismatch somewhere?
And when running nvidia-docker run -it nvidia/cuda bash, I get:
docker: Error response from daemon: create nvidia_driver_358.16: create nvidia_driver_358.16: Error looking up volume plugin nvidia-docker: plugin not found.
Thanks for your help,
Stefan
This is related to #40, your GPU is not currently supported.
One way to work around it is to use nvidia-docker volume setup instead of the plugin (see here).
Great, that works!
Thanks for your support,
Stefan