I'm using FROM nvidia/cuda:8.0-devel-ubuntu16.04 in my Dockerfile.
After building and running the container I'm getting:
$ nvidia-smi
bash: nvidia-smi: command not found
Do I have to specify the path in my Docker container?
What was the docker command you used?
@flx42 So I have a DOCKER_HOST that points to the Docker daemon on the GPU machine (the nvidia-docker host), like:
export DOCKER_HOST=tcp://x.x.x.x:2376
export DOCKER_TLS_VERIFY=1
export NVIDIA_VER=367.57
and then I connect to the Docker instance, binding the ports while tunneling to the machine:
$ ssh -i "$DOCKER_CERT" docker@$IP -g -R 10250:localhost:10250 -L 0.0.0.0:3000:127.0.0.1:3000 -L 0.0.0.0:8181:127.0.0.1:8181 -L 5858:127.0.0.1:5858 -L 4567:127.0.0.1:4567
My docker instance is started as usual
$ docker run --rm -it --name $CONTAINER_NAME -p 3000:3000 $CONTAINER_IMG:$CONTAINER_VERSION $CMD
From the Docker host I can run nvidia-smi through nvidia-docker:
loreto@nvidia-docker:~$ sudo nvidia-docker run --rm nvidia/cuda nvidia-smi
Fri Mar 17 09:08:03 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520           Off  | 0000:00:03.0     Off |                  N/A |
| N/A   35C    P8    17W / 125W |      0MiB /  4036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
and even directly on the host:
loreto@nvidia-docker:~$ nvidia-smi
Fri Mar 17 09:12:59 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520           Off  | 0000:00:03.0     Off |                  N/A |
| N/A   35C    P8    17W / 125W |      0MiB /  4036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
while from inside my container $CONTAINER_NAME, when it is running, I cannot see it:
$ docker exec -it $CONTAINER_NAME bash
$ nvidia-smi
nvidia-smi: command not found
The $CONTAINER_NAME was built FROM nvidia/cuda:8.0-devel-ubuntu16.04
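A quick sanity check (assuming nvidia-docker v1's usual mount point of /usr/local/nvidia inside the container) is to confirm that no driver volume is mounted into the plain docker-run container:
$ docker exec -it $CONTAINER_NAME ls /usr/local/nvidia/bin      # fails when the driver volume isn't mounted
$ docker inspect --format '{{ json .Mounts }}' $CONTAINER_NAME  # should list an nvidia_driver_* mount, but doesn't here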
[UPDATE]
This issue could be related to https://github.com/NVIDIA/nvidia-docker/issues/105
[UPDATE]
The problem was due to the docker run command: I didn't attach the NVIDIA devices or mount the driver volume into my container:
docker run --rm -it --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0 -v nvidia_driver_367.57:/usr/local/nvidia:ro --name $CONTAINER_NAME -p 3000:3000 $CONTAINER_IMG:$CONTAINER_VERSION $CMD
Note that here you cannot attach the volume with the --volume-driver nvidia-docker option, due to a limitation of the NVIDIA driver, so you have to set up the volume first on the host.
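For reference, a sketch of creating that volume manually on the host (assuming the nvidia-docker-plugin is running there and registers a Docker volume driver named nvidia-docker, and that the suffix matches your installed driver version):
$ docker volume create -d nvidia-docker --name nvidia_driver_367.57
$ docker volume ls | grep nvidia_driver   # verify the volume now exists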
Using --gpus all worked for me; before that, the Docker container wasn't able to attach to or detect the GPU of the host it is running on:
docker run -it --name $CONTAINER_NAME --gpus all -p 3000:3000 $IMAGE_NAME
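Note that --gpus requires Docker 19.03 or newer with the NVIDIA Container Toolkit installed on the host; a quick way to verify the setup (reusing the CUDA base image from above) is:
$ docker run --rm --gpus all nvidia/cuda:8.0-devel-ubuntu16.04 nvidia-smi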