Nvidia-docker: Optix 6.0 not supported in docker

Created on 12 Jun 2019 · 12 comments · Source: NVIDIA/nvidia-docker

1. Issue or feature description

Since OptiX 6.0, part of its libraries has moved into the GPU driver and became inaccessible in Docker. Initialization of the OptiX 6 context fails in Docker with the error "Failed to load OptiX library", while it works correctly on the host. The same procedure works correctly in both the host and Docker with OptiX 5.

This issue was reported on the NVIDIA developer forum and also in the nvidia-docker issues; however, people were directed to libnvidia-container support. I submitted an issue there, but I am not sure whether it does not fit better here.

2. Steps to reproduce the issue

I built an image with one of the OptiX 6 SDK samples (failing) and the same sample with OptiX 5 (running OK):

The image is configured to run the OptiX 6 sample:

docker run --rm --runtime=nvidia rsulej/optix-docker-test

You can run exactly the same image, but with OptiX 5, in interactive mode:

docker run --rm --runtime=nvidia -it rsulej/optix-docker-test sh

and inside the container:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/minimalOptixSample/optix5
/minimalOptixSample/optix5/optixDeviceQuery

The whole setup, including the Dockerfile, is available on GitHub.

Thanks for help!
Robert

enhancement

Most helpful comment

sudo docker run -e  NVIDIA_DRIVER_CAPABILITIES=graphics --gpus all nvidia/cuda:10.0-base

All 12 comments

Sorry this never got answered. From what I understand (though I haven't looked into it very deeply), you will need to mount libnvoptix.so.X and libnvidia-rtcore.so.X from the host into the container.

Unfortunately, extending support for OptiX into containers is a bit further down the roadmap, so it won't be tackled natively for a few months.
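The manual workaround described above can be sketched as a single `docker run` with bind mounts. The library directory and driver version below are assumptions; check them on your host with `ls /usr/lib/x86_64-linux-gnu/libnvoptix*` and `nvidia-smi`:

```shell
# Sketch of the manual workaround: bind-mount the two driver-side OptiX
# libraries from the host into the container. All paths and the driver
# version here are assumptions -- adjust them to match your host.
DRIVER_VERSION=430.26                      # e.g. taken from nvidia-smi output
LIB_DIR=/usr/lib/x86_64-linux-gnu          # Debian/Ubuntu driver library path

docker run --rm --runtime=nvidia \
  -v "${LIB_DIR}/libnvoptix.so.${DRIVER_VERSION}:${LIB_DIR}/libnvoptix.so.1:ro" \
  -v "${LIB_DIR}/libnvidia-rtcore.so.${DRIVER_VERSION}:${LIB_DIR}/libnvidia-rtcore.so.${DRIVER_VERSION}:ro" \
  rsulej/optix-docker-test
```

The first mount maps the versioned host library onto the `libnvoptix.so.1` name that the OptiX runtime loads at initialization; the exact soname layout may differ between driver packages.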

Well... that works!

Someone on the OptiX forum had already tried copying the files into Docker, but missed libnvidia-rtcore.so.X.

I just mounted all the files you mention and the device query sample works fine. I need to point manually to the exact driver version, but for the moment that is perfectly sufficient. If I run into trouble with a more sophisticated app, I'll be back.
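To avoid pointing manually at the exact driver version in those mount paths, a small hypothetical helper can read it from the host. This assumes the `/proc/driver/nvidia/version` file of the proprietary driver; `nvidia-smi --query-gpu=driver_version --format=csv,noheader` would work as well:

```shell
# Hypothetical helper: extract the host NVIDIA driver version, so mount
# paths like libnvoptix.so.${DRIVER_VERSION} survive driver upgrades.
driver_version() {
  # The file contains a line such as:
  #   NVRM version: NVIDIA UNIX x86_64 Kernel Module  430.26  Tue Jun 4 ...
  # Print the first purely numeric dotted field on that line.
  awk '/NVRM version/ { for (i = 1; i <= NF; i++) if ($i ~ /^[0-9]+\.[0-9.]+$/) { print $i; exit } }' \
    /proc/driver/nvidia/version
}

DRIVER_VERSION=$(driver_version)
```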

Thanks!
Robert

I thought the recent major changes in how nvidia-docker interacts with Docker 19.03, nvidia-container-runtime 3.1, the proprietary driver 430, etc. might have addressed this, but it is still an issue.

Things are moving forward. In the new OptiX 7, all the OptiX symbols (and also cuDNN for the AI denoiser) have moved into the driver. I have not yet tried whether @RenaudWasTaken's solution will work there and which driver files need to be mounted. Just letting you know there are major changes.

With libnvidia-container1 version 1.0.4 (or newer), I added experimental support for this.

It's experimental because I really just mounted the two libraries without testing or looking into what more might be required.

Feel free to test and give me feedback :)

Thanks! I'll try and let you know.

Hi @RenaudWasTaken, thanks for your work! I have libnvidia-container1 == 1.0.5, though, and libnvoptix.so and libnvidia-rtcore.so are still not mounted into the container automatically. Do I need to turn the behavior on with any flags?

You can try it with the environment variable NVIDIA_DRIVER_CAPABILITIES set to graphics.

I tried it, but no luck. This was the command I executed: NVIDIA_DRIVER_CAPABILITIES=graphics sudo docker run -d -p 2222:22 --rm --gpus all --name test chenzhekl/test

and the driver version on the host:

Driver Version: 440.33.01    
CUDA Version: 10.2
sudo docker run -e  NVIDIA_DRIVER_CAPABILITIES=graphics --gpus all nvidia/cuda:10.0-base
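For reference, the likely reason the earlier attempt failed: a `VAR=value` prefix only sets the variable for the host-side `docker` client process (and `sudo` typically strips it anyway); Docker does not forward the host environment into the container, so the variable has to be injected explicitly with `-e`, as in the command above. The mechanism, illustrated with plain `sh` standing in for the container runtime:

```shell
# A VAR=value prefix is visible only to that one child process on the host:
NVIDIA_DRIVER_CAPABILITIES=graphics sh -c 'echo "host child sees: $NVIDIA_DRIVER_CAPABILITIES"'
# A container would not see it; the variable must be passed explicitly:
#   docker run -e NVIDIA_DRIVER_CAPABILITIES=graphics ...
```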

My bad. Thanks for your help! Everything works now.

Closing for now as this seems to be resolved.
