Nvidia-docker wraps three different docker commands and passes the rest through unchanged. It would have saved me a bit of time to know that nvidia-docker was not mounting the GPU devices and was instead silently doing a no-op.
Based on https://github.com/NVIDIA/nvidia-docker/blob/302467b4715b6fa63846617c5073695654da8ffb/tools/src/nvidia-docker/main.go#L52 (the main branch as of this writing), the three commands that nvidia-docker intercepts are create / run / volume.
One way this bit me was via a Dockerfile (which is its own issue).
I was building a container image with nvidia-docker build ., mirroring the common docker build . pattern (afaict), and my build included assertions that ran tests exercising app functionality on the GPU.
Compare running nvidia-docker run nvidia/cuda:7.5-devel nvidia-smi versus starting a Dockerfile with
FROM nvidia/cuda:7.5-devel
RUN nvidia-smi
...
There are two workarounds for the Dockerfile issue: 1) rewrite the Dockerfile as a shell script and pass it to run; 2) build the image via plain docker, and run the tests afterwards.
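The second workaround can be sketched roughly as follows (the image name myimage and the make check test target are illustrative, not from the project):

```shell
# Build the image with plain docker; no GPU is needed at build time.
docker build -t myimage .

# Then run the GPU tests in a container, where nvidia-docker
# mounts the devices and CUDA volumes before handing off to docker run.
nvidia-docker run --rm myimage make check
```

This keeps the image build portable while still letting the tests touch the GPU at runtime.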
Support for Dockerfiles would be useful, and it is orthogonal to being transparent when nvidia-docker performs a no-op.
Thanks very much :)
I'm not really sure I understood correctly. You want us to expose GPUs at build time?
We don't do that because it breaks the Docker philosophy, namely that Docker images should be portable.
You should be able to build your Docker image on any node, including one that doesn't have any GPUs and later on, schedule it on different nodes with potentially different GPU models.
If I may, maybe something along these lines would do. Let's say you want to build myapp based on mylib.
In mylib/Dockerfile:
FROM cuda:tag
RUN install mylib
CMD make check # GPU tests
In myapp/Dockerfile:
FROM mylib
RUN install myapp
CMD ./myapp # overwrite the above
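With that layout, the GPU tests run at container runtime rather than build time. A possible invocation might look like this (the mylib and myapp tags are illustrative):

```shell
# Build both images with plain docker; neither build touches the GPU.
docker build -t mylib mylib/
docker build -t myapp myapp/

# Run mylib's default CMD (the GPU tests) with the devices mounted.
nvidia-docker run --rm mylib

# Run the application itself, which overrides the inherited CMD.
nvidia-docker run --rm myapp
```

Since myapp/Dockerfile starts with FROM mylib, the mylib image must be built (and tagged) first.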
@flx42 that workflow looks particularly elegant, thank you.
@3XX0 that sounds wholly reasonable. Thanks for your reply, and for indulging my request for GPU support in Dockerfiles. I'd seen that building ARM containers on x86 boxes takes some doing ( https://resin.io/blog/building-arm-containers-on-any-x86-machine-even-dockerhub/ ), so I didn't expect cross-GPU portability.
I'll note that the larger point remains: nvidia-docker does not make it visible that it will not set up volumes or attach devices. This may be low-priority.
I don't want to log that when using nvidia-docker because we want it to be a drop-in replacement for docker. This would add noise to the output.
I documented it briefly on the wiki:
https://github.com/NVIDIA/nvidia-docker/wiki/Using-nvidia-docker#description
If you have other suggestions to document this behavior, I would be glad to hear them.