nvidia-docker 1 can run OpenGL applications; nvidia-docker 2 can't

Created on 15 Nov 2017  ·  32 Comments  ·  Source: NVIDIA/nvidia-docker

When using nvidia-docker 1, I can run applications that use OpenGL in a guest and they will display in the host environment. When trying to run the same application in a similarly configured container with nvidia-docker 2, I always get this error:

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  154 (GLX)
  Minor opcode of failed request:  3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  35
  Current serial number in output stream:  37

Running nvidia-smi works on the host as well as in containers using either version of nvidia-docker and always produces appropriate output.
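
A working nvidia-smi only shows that the compute and utility side of the driver is visible in the container; it says nothing about which OpenGL implementation the GLX client ends up loading. As a rough sketch (not part of the attached test files), the helper below extracts the "OpenGL vendor string" line from glxinfo output, which mesa-utils provides. The function names gl_vendor and is_nvidia_gl are hypothetical. In the failing case glxinfo may itself abort with the same X error, in which case the helper prints nothing.

```shell
# Hypothetical helpers: read glxinfo output on stdin and report the GL vendor.

# Print the value of the first "OpenGL vendor string:" line, if any.
gl_vendor() {
    sed -n 's/^OpenGL vendor string: *//p' | head -n 1
}

# Succeed only if the vendor string read from stdin mentions NVIDIA.
is_nvidia_gl() {
    case "$(gl_vendor)" in
        *NVIDIA*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

Inside a container this could be used as `glxinfo | gl_vendor`, or as `glxinfo | is_nvidia_gl && echo "NVIDIA GL in use"`.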

I've attached a couple of Dockerfiles and scripts that demonstrate the issue. run-nvidia-docker-1.sh uses Dockerfile.1 to pull nvidia/cuda:8.0-devel-ubuntu16.04, installs mesa-utils, and then uses nvidia-docker to launch a container that maps all of the necessary volumes and then runs glxgears. When I have nvidia-docker 1 installed, it works and glxgears displays as expected. When I completely remove nvidia-docker 1, install 2, purge all existing Docker images and volumes, and try again, I get the above error.

run-nvidia-docker-2.sh is similar, but it uses the stock ubuntu:16.04 image and adds --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all parameters, so I would expect it to work with nvidia-docker 2. It also produces the same error as above.

Test files: x11-test.tar.gz

Host computer specs:
OS: Ubuntu Linux 16.04
CPU: Intel(R) Xeon(R) CPU E5-1650 v3
Output from nvidia-smi (which works on the host and also in containers using either version of nvidia-docker):

| NVIDIA-SMI 387.12                 Driver Version: 387.12                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K2000        Off  | 00000000:02:00.0  On |                  N/A |
| 32%   49C    P0    N/A /  N/A |    899MiB /  1996MiB |      8%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro K2000        Off  | 00000000:03:00.0 Off |                  N/A |
| 30%   37C    P8    N/A /  N/A |     12MiB /  1999MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

All 32 comments

OpenGL is not supported at the moment and there is no plan to support GLX in the near future (same as 1.0). OpenGL+EGL however is on the roadmap and will be supported. We will update #11 once we publish it.

If you are a NGC subscriber and need GLX for your workflow, I suggest you fill out a feature request.

It might be helpful to note this at the beginning of the README for nvidia-docker 2. Even though it's totally fair that it's not implemented yet, it would be good to know, since this is critical for a lot of users (especially the ROS community and those following the ROS community's work in using the host's X server).

I am facing the same problem. It would be helpful to know that such an error can come up with nvidia-docker 2 so we don't update.
(But please do support OpenGL asap too. Thanks.)

Same problem. We rely on OpenGL apps running in containers. That's the only issue stopping us from switching to nvidia-docker 2.

same problem here trying to use ROS in docker. Will revert to nvidia-docker 1.

Please try our new OpenGL beta images based on libglvnd: https://hub.docker.com/r/nvidia/opengl/

Hi @flx42 the OpenGL beta images appear to solve the problem for me in my initial tests.

Will this functionality get rolled into the CUDA docker images?

@oursland We will have a CUDA + OpenGL official image soon (probably called nvidia/cudagl).

@flx42 I found the newly released cudagl images here: https://gitlab.com/nvidia/cudagl

That is awesome! However, it is unclear to me how to customize them so that I can work with CUDA 8, as I do not have access to CUDA 9 at this time.

The descriptions say something like 9.1-devel, 9.1-devel-ubuntu16.04 (9.1/devel/Dockerfile) + (1.0-glvnd/devel/Dockerfile), but it is unclear to me how to "compose" the OpenGL and CUDA images in my own Dockerfile, e.g., to use glvnd and CUDA 8, instead of CUDA 9. (I am not very familiar with Docker.)

Would something like that even be possible?

Thank you!

Hello @AndreiBarsan, we won't publish images with OpenGL and CUDA 8.0. If you need this use case, you can either do FROM nvidia/cuda:8.0-devel and then install libglvnd like we do in our Dockerfile:
https://gitlab.com/nvidia/opengl/tree/ubuntu16.04
But that's probably a bit challenging.

Instead, you should do FROM nvidia/opengl:1.0-glvnd-devel and then add CUDA 8.0 manually:
https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/8.0/runtime/Dockerfile
https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/8.0/devel/Dockerfile

@flx42 Thank you very much for the quick response and the tips. I will try that!

For the sake of anyone else who comes here looking, I've created a ros-indigo-desktop-full-nvidia example at https://hub.docker.com/r/lindwaltz/ros-indigo-desktop-full-nvidia/ that adds the necessary libraries on top of the osrf image (opengl, libglvnd, cuda8). Should be easy to reproduce for other ROS flavors.

Thanks @lindwaltz !

@ruffsl Do you think we could get ROS images based on our recently released (finally!) OpenGL images:
https://hub.docker.com/r/nvidia/opengl/
https://hub.docker.com/r/nvidia/cudagl/
Let me know what you think, or if you need any help.

@flx42

Let me know what you think, or if you need any help.

your images work really great!

i used them to wrap DaVinci Resolve in Docker containers (see: https://gitlab.com/mash-graz/resolve) and the result works surprisingly well!
it's really impressive, that nvidia-docker is already able to handle quite challenging tasks like this.

i only have to remark one significant issue: your opengl and cudagl images use a GLVND installation, but CentOS, which is the base of one of your images, doesn't support GLVND until now! that's a really unpleasant source of all kinds of nasty GLX-related trouble. even the simplest tools installed additionally from CentOS's main repository, like the glxinfo command, will no longer work. in the end, i had to prepare two different images, one for pure Nvidia use and one for mixed CUDA + Mesa DRI Intel iGPU use. but that's nothing to criticize. i just write it down because others may face similar issues.

Thanks for the feedback @mash-graz!
Don't hesitate to do a bug report for the CentOS images, if you want me to take a look.
You can do it here, or on GitLab: https://gitlab.com/nvidia/opengl/

Thanks! I used your images as a starting point and it worked for us as well. Also some observation - as long as you keep the old /usr/local/nvidia/lib , /usr/local/nvidia/lib64 in the LD_LIBRARY_PATH, the image seems to work fine also when starting via nvidia-docker 1 (not sure if keeping the old entries is even needed).
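
The observation about keeping the old nvidia-docker 1 directories in LD_LIBRARY_PATH could be sketched as below. This is only an illustration under assumptions: the helper names append_path_once and with_legacy_nvidia_paths are hypothetical, and, as the comment above says, it is not confirmed that the old entries are needed at all.

```shell
# Hypothetical helper: append a directory to a colon-separated list
# unless it is already present.
#   $1: current list (may be empty), $2: directory to append
append_path_once() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "${1:+$1:}$2" ;;
    esac
}

# Hypothetical helper: make sure both nvidia-docker 1 library directories
# mentioned above are on the given search path.
with_legacy_nvidia_paths() {
    list="$1"
    for dir in /usr/local/nvidia/lib /usr/local/nvidia/lib64; do
        list="$(append_path_once "$list" "$dir")"
    done
    printf '%s\n' "$list"
}
```

An entrypoint script could then run `export LD_LIBRARY_PATH="$(with_legacy_nvidia_paths "$LD_LIBRARY_PATH")"` before starting the application.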

An additional question here. I am able to run glxgears on the host's X server from a Docker image based on nvidia/opengl:1.0-glvnd-devel-ubuntu16.04. Now I'd like to be able to run it (well, actually gazebo) inside of a https://github.com/fcwu/docker-ubuntu-vnc-desktop. But I am getting

X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:   151 (GLX)
  Minor opcode of failed request:   3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  25
  Current serial number in output stream:  26

Is it possible to achieve that or there are some inherent limitations?

@ruffsl probably has an image for this already.

@kuz I'm not sure how you launched your container there, but I posted a minimal GLX-Gears example with nvidia-docker here: https://github.com/NVIDIA/nvidia-docker/issues/136#issuecomment-398593070

With respect to gazebo, we have some examples of using gazebo with nvidia-docker1 at osrf/car_demo , and I have a WIP PR for updating it to use nvidia-docker2 here: https://github.com/osrf/car_demo/pull/40

This is really old and dated, but here is a rabbit hole about getting gazebo server running in docker on a headless server with AWS: https://github.com/ruffsl/gazebo_docker_demos/tree/master/aws

thx @ruffsl. anyone know if there are cudagl images for arm64v8/ubuntu:xenial-20180123? e.g. https://github.com/open-horizon/cogwerx-jetson-tx2/blob/master/Dockerfile.cudabase w/ opengl

critically, i dont see libglvnd0 for arm64 https://launchpad.net/ubuntu/xenial/arm64?text=libglvnd0

i have tried building it to no avail: https://github.com/NVIDIA/nvidia-docker/issues/136#issuecomment-401564867

@ruffsl, yep, I've seen those examples before, but they rely on an X server running on the host machine. I am trying to create a self-contained image where gazebo uses the server's GPUs and there is a web interface to VNC with LXDE, so that a remote user can access VNC via a browser, see the whole LXDE desktop, and run gzclient and rviz from there, even from a tablet. I guess the nvidia VirtualGL examples are a step in the right direction; I just hoped there might be an easier way, or that someone had already done it.

Is it even possible?

To answer my own question: yes, it is possible, see https://github.com/willkessler/nvidia-docker-novnc

Hi,

When run on my local PC, my code opens two rendering windows: "CUDA OpenGL post-processing" and "Super Triangle Calculator". However, under nvidia-docker on a Linux server, these two windows do not appear. The error is:
freeglut (c): ERROR: Internal error <FBConfig with necessary capabilities not found> in function fgOpenWindow
It seems that OpenGL in the container is missing some libraries.
I've been trying to find a fix for weeks without success. I've tried nvidia/opengl, nvidia/cudagl, and remote VNC.

Is there any way to solve this?
Thank you.

tsly

Any plans for cudagl with cudnn support?

Has anyone managed to get ROS working with nvidia-docker? Would be great if you could share steps.

@nvidia Will there also be a stretch example?

@machinekoder , yep. our uni lab uses ROS and nvidia-docker extensively so we can quickly onboard new students joining, share common ML development environments, and isolate simulation cluster resources.

See this ROS wiki for more details:
http://wiki.ros.org/docker/Tutorials/Hardware%20Acceleration#nvidia-docker2

Note that it's a lot easier with Ubuntu 18.04 and up. Related:
https://github.com/NVIDIA/nvidia-docker/issues/136#issuecomment-398593070
https://github.com/osrf/docker_templates/issues/33

@machinekoder: We use the following simple trick to add OpenGL support for nvidia-docker2 to any Docker image.

Create this Dockerfile:

FROM <some_image>
# e.g. FROM osrf/ros:kinetic-desktop-full

# optional, if the default user is not "root", you might need to switch to root here and at the end of the script to the original user again.
# e.g.
# USER root

RUN apt-get update && apt-get install -y --no-install-recommends \
        pkg-config \
        libxau-dev \
        libxdmcp-dev \
        libxcb1-dev \
        libxext-dev \
        libx11-dev && \
    rm -rf /var/lib/apt/lists/*

# replace with other Ubuntu version if desired
# see: https://hub.docker.com/r/nvidia/opengl/
COPY --from=nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 \
  /usr/local/lib/x86_64-linux-gnu \
  /usr/local/lib/x86_64-linux-gnu

# replace with other Ubuntu version if desired
# see: https://hub.docker.com/r/nvidia/opengl/
COPY --from=nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 \
  /usr/local/share/glvnd/egl_vendor.d/10_nvidia.json \
  /usr/local/share/glvnd/egl_vendor.d/10_nvidia.json

RUN echo '/usr/local/lib/x86_64-linux-gnu' >> /etc/ld.so.conf.d/glvnd.conf && \
    ldconfig && \
    echo '/usr/local/$LIB/libGL.so.1' >> /etc/ld.so.preload && \
    echo '/usr/local/$LIB/libEGL.so.1' >> /etc/ld.so.preload

# nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES \
    ${NVIDIA_VISIBLE_DEVICES:-all}
ENV NVIDIA_DRIVER_CAPABILITIES \
    ${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics

# USER original_user
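
The two ENV lines near the end of the Dockerfile above use the `:-` and `:+` substitution modifiers. Dockerfile variable substitution handles these two modifiers the same way a POSIX shell does, so a plain shell sketch can show what values they produce. The function names below are hypothetical and exist only for this illustration.

```shell
# Mirrors ${NVIDIA_VISIBLE_DEVICES:-all}:
# use the existing value, or "all" when it is unset or empty.
devices_or_all() {
    existing="$1"
    printf '%s\n' "${existing:-all}"
}

# Mirrors ${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics:
# keep any existing capabilities and append "graphics" after a comma.
caps_with_graphics() {
    existing="$1"
    printf '%s\n' "${existing:+$existing,}graphics"
}

devices_or_all ""                       # prints: all
devices_or_all "0,1"                    # prints: 0,1
caps_with_graphics ""                   # prints: graphics
caps_with_graphics "compute,utility"    # prints: compute,utility,graphics
```

The effect is that a base image which already sets these variables keeps its settings, while "graphics" is always added to the capability list.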

Then build it:

docker build --tag <image_name>:<img_tag>-nvidia .
# e.g.:
docker build --tag osrf/ros:kinetic-desktop-full-nvidia .

Now run it e.g. with:

# XAUTH is assumed to be set on the host to the path of an X authority file
docker run -it \
    --env="DISPLAY" \
    --env="QT_X11_NO_MITSHM=1" \
    --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
    --env="XAUTHORITY=$XAUTH" \
    --volume="$XAUTH:$XAUTH" \
    --runtime=nvidia \
    osrf/ros:kinetic-desktop-full-nvidia \
    bash

One catch is that this patched image will only work with Nvidia, so not anymore with Intel Graphics acceleration...
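
The run command above references $XAUTH without showing how it is set. One commonly used approach, sketched here as an assumption and not taken from this thread, is to copy the host's X authority entries into a separate file and replace the leading address-family field with ffff, so the cookie is accepted regardless of the container's hostname. The path /tmp/.docker.xauth and the function names are hypothetical choices; `xauth nlist` and `xauth nmerge` are standard xauth subcommands.

```shell
# Assumed location of the X authority file shared with the container.
XAUTH="${XAUTH:-/tmp/.docker.xauth}"

# Replace the first four characters (the address-family field) of each
# numeric xauth entry read on stdin with the wildcard family "ffff".
wildcard_family() {
    sed -e 's/^..../ffff/'
}

# Build the shared authority file from the current display's entries.
# Defined here but not invoked; it needs xauth and a running X server.
prepare_xauth() {
    touch "$XAUTH"
    xauth nlist "$DISPLAY" | wildcard_family | xauth -f "$XAUTH" nmerge -
}
```

On the host, calling `prepare_xauth` once before `docker run` would leave $XAUTH pointing at a file suitable for the two XAUTH-related options.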

@koenlek Thanks.
@ruffsl Thanks for the pointers to the ROS wiki.

I just tried compiling the 16.04 example with debian:stretch base and it works!

We are using this setup with ROS as well. So far I'm the first nvidia user thanks to a recent hardware upgrade.

@koenlek I created a fork of the opengl repo for Debian Stretch and add your usage instructions: https://github.com/machinekoder/nvidia-opengl-docker Works like a charm!

OpenGL is not supported at the moment and there is no plan to support GLX in the near future (same as 1.0). OpenGL+EGL however is on the roadmap and will be supported. We will update #11 once we publish it.

I realize things have changed since then, but I’m still a bit confused about some of the capabilities, terminology and implementation of GPU-based containers in particular with respect to EGL. The most popular one seems to be this one, nvidia-docker.

So a few questions:

  1. In the FAQ (https://github.com/NVIDIA/nvidia-docker/wiki/Frequently-Asked-Questions) ‘Is OpenGL supported?’ Answer is “Yes, EGL is supported.” My question is why are they used interchangeably? From my understanding EGL + glslES is a subset of OpenGL+glsl, is used over a different context type (egl vs opengl context), and though usage is similar, generally demands conscious knowledge of its limitations.
  2. Why would EGL be implemented first? Isn’t OpenGL way more popular?
  3. Why would Cudagl have to be a different docker altogether? Are there any drawbacks to using CudaGL?

Also, if there is a better/more appropriate place to post this question, I'd appreciate the heads-up!

I've tried two Docker images, and both of them show the same error message.

  1. docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi
  2. Followed @koenlek 's Dockerfile sample to build my docker image.

The error message is as follows:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:424: container init caused \"process_linux.go:407: running prestart hook 1 caused \\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig --device=all --graphics --pid=15868 /var/lib/docker/aufs/mnt/6bee740d7fe7096a7bdb52c84b4251fe09f314aab8fff22c720ee04ae8de8be8]\\nnvidia-container-cli: ldcache error: process /sbin/ldconfig terminated with signal 11\\n\\"\"": unknown.

If I just type the command nvidia-smi, it will show below message:
nvidia-smi

Is there any suggestion for me to try?

@hcv1027 At least for CUDA 9.x, I had success with the following setup: https://github.com/Seanmatthews/ros-docker-gazebo
