Supposedly SDL2 already allows off-screen rendering on NVIDIA devices with CUDA enabled, selecting a specific device via:
SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=0 ./CarlaUE4.sh
Please check whether this works.
Yes, this works for me on Linux.
This also works for me on Linux!
In recent experiments I observed some instability with the SDL solution. In some situations CARLA runs much slower, sometimes taking longer than the 10-second timeout to respond to the client.
I can run this without a screen, but it seems I can only run on GPU 0, no matter what value I set for SDL_HINT_CUDA_DEVICE.
Can confirm @mahaoran1997's observation that SDL_HINT_CUDA_DEVICE has no effect. I also observed that SDL_VIDEODRIVER=offscreen isn't necessary if no X server is present; SDL seems to select that method automatically.
@nsubiron do you have any additional information about SDL rendering on CUDA? I couldn't find any information, and the official SDL site doesn't even list it as an option: https://wiki.libsdl.org/FAQUsingSDL
Tried this, but it returns a segmentation fault.
Are you running with nvidia-docker, @bhaprayan? I.e.:
docker run --runtime=nvidia ...
In the SDL source code there is nothing about SDL_HINT_CUDA_DEVICE: https://hg.libsdl.org/SDL/log?rev=SDL_HINT_CUDA_DEVICE
@felipecode @rmst @nsubiron @seken which version of SDL are you using? @crizCraig
@crizCraig Nope, I was running on a native machine. I got it to work, though. I'm now passing -carla-server as a flag; I think that's what solved the issue, but I don't recall exactly.
Like @rmst, running with SDL_HINT_CUDA_DEVICE=2 does nothing on my system; neither does setting CUDA_VISIBLE_DEVICES=2 or NVIDIA_VISIBLE_DEVICES=2.
I also tried placing r.GraphicsAdapter=2 in 'CarlaUE4/Saved/Config/LinuxNoEditor/Engine.ini', as suggested on the UE4 forums, but that didn't work either.
Finally, I tried using vglrun to specify the GPU, but CARLA still runs on GPU 0 instead of GPU 2 (zero-indexed) on the Ubuntu 16.04 system I'm testing on.
I'm using the version of SDL that ships with Ubuntu 16.04: libsdl2 2.0.4
UPDATE: Upon examining the SDL 2.0.x source code, I can't find evidence that SDL_HINT_CUDA_DEVICE is meaningful.
# SDL_VIDEODRIVER exists
grep -lIr SDL_VIDEODRIVER SDL2-2.0.9 | wc -l
6
# SDL_HINT_CUDA_DEVICE does not
grep -lIr SDL_HINT_CUDA_DEVICE SDL2-2.0.9 | wc -l
0
Unless this variable is somehow generated rather than hard-coded, or was removed in SDL 2.0+, it appears to be a myth (one also propagated in other projects' discussions, not just CARLA's). I have asked for clarification on the SDL web forums.
For those interested, here are some other combinations I've tried. The Singularity image is the CARLA-provided Ubuntu 16.04 Docker image converted to Singularity; the host is Ubuntu 18.04:
# host info, truncated some stuff
nvidia-smi -L | wc -l
16
awk -F '=' '/VERSION=/ {print $2}' /etc/os-release
"18.04.2 LTS (Bionic Beaver)"
nvidia-smi | awk '/Driver Version/ {print $3}'
418.67
singularity --version
singularity version 3.0.2-87
# create sif from upstream carla image
singularity pull docker://carlasim/carla:0.9.5
# create a writable home directory with the binary laid out the way the CARLA entry script expects
CARLA_WORKSPACE=`pwd`/workspace/home/carla
install -d $CARLA_WORKSPACE
singularity exec -C -H $CARLA_WORKSPACE images/carla_0.9.5.sif /bin/bash -c 'cp -r /home/carla/* .'
# image has SDL installed
singularity exec images/carla_0.9.5.sif /bin/bash -c 'apt list --installed | grep sdl'
libsdl2-2.0-0/now 2.0.4+dfsg1-2ubuntu2 amd64 [installed,local]
# Run standard way, runs on GPU 0, as expected
singularity run --nv -C -H $CARLA_WORKSPACE images/carla_0.9.5.sif
# still runs on GPU 0
SINGULARITYENV_SDL_VIDEODRIVER=offscreen SINGULARITYENV_SDL_HINT_CUDA_DEVICE=5 SINGULARITYENV_NVIDIA_VISIBLE_DEVICES=5 SINGULARITYENV_CUDA_VISIBLE_DEVICES=5 singularity run --nv -C -H $CARLA_WORKSPACE images/carla_0.9.5.sif
# on the off chance the variables must be set both on the host and in the container, and the entry script is canceling them out... still runs on GPU 0
SINGULARITYENV_SDL_VIDEODRIVER=offscreen SDL_VIDEODRIVER=offscreen SINGULARITYENV_SDL_HINT_CUDA_DEVICE=5 SDL_HINT_CUDA_DEVICE=5 SINGULARITYENV_NVIDIA_VISIBLE_DEVICES=5 NVIDIA_VISIBLE_DEVICES=5 SINGULARITYENV_CUDA_VISIBLE_DEVICES=5 CUDA_VISIBLE_DEVICES=5 singularity exec --nv -C -H $CARLA_WORKSPACE images/carla_0.9.5.sif CarlaUE4/Binaries/Linux/CarlaUE4 CarlaUE4 -carla-server
# run without singularity, still on device 0 (not surprising, SDL isn't installed on the host)
SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=5 NVIDIA_VISIBLE_DEVICES=5 CUDA_VISIBLE_DEVICES=5 workspace/home/carla/CarlaUE4/Binaries/Linux/CarlaUE4 CarlaUE4 -carla-server
I asked over in the nvidia forums for a general solution to selecting which GPU an OpenGL process runs on:
UPDATE: CARLA 0.9.6 now lets you select which GPU to run on; for some reason this wasn't working in CARLA 0.9.5 (possibly an artifact of the way UE4 was built).
Our contacts at NVIDIA found where the mysterious SDL_HINT_CUDA_DEVICE variable lives: it is in the UE4 source, NOT in upstream SDL or the CARLA codebase. The UE4 vendor ships its own flavor of SDL in the UE4 repository, and that copy contains this variable, which explains why I didn't find it while grepping the upstream SDL codebase.
$ SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=0 ./CarlaUE4.sh &> /tmp/carla0.txt &
$ SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=1 ./CarlaUE4.sh -carla-world-port=5010 &> /tmp/carla1.txt &
$ SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=2 ./CarlaUE4.sh -carla-world-port=5020 &> /tmp/carla2.txt &
$ SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=3 ./CarlaUE4.sh -carla-world-port=5030 &> /tmp/carla3.txt &
$ nvidia-smi
Sat Sep 21 08:48:41 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:00:06.0 Off | 0 |
| N/A 35C P0 48W / 300W | 928MiB / 16130MiB | 37% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... On | 00000000:00:07.0 Off | 0 |
| N/A 34C P0 61W / 300W | 928MiB / 16130MiB | 17% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-SXM2... On | 00000000:00:08.0 Off | 0 |
| N/A 33C P0 61W / 300W | 928MiB / 16130MiB | 20% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla V100-SXM2... On | 00000000:00:09.0 Off | 0 |
| N/A 35C P0 60W / 300W | 928MiB / 16130MiB | 18% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 56887 C+G .../Binaries/Linux/CarlaUE4-Linux-Shipping 917MiB |
| 1 57023 C+G .../Binaries/Linux/CarlaUE4-Linux-Shipping 917MiB |
| 2 57147 C+G .../Binaries/Linux/CarlaUE4-Linux-Shipping 917MiB |
| 3 57273 C+G .../Binaries/Linux/CarlaUE4-Linux-Shipping 917MiB |
+-----------------------------------------------------------------------------+
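The four launch commands above can be condensed into a loop; a minimal sketch, mirroring the port scheme used here (GPU 0 on the default port, GPUs 1-3 on 5010/5020/5030; the log paths are illustrative):

```shell
# One CARLA instance per GPU, each logging to its own file
for gpu in 0 1 2 3; do
  port_flag=""
  if [ "$gpu" -ne 0 ]; then
    port_flag="-carla-world-port=$((5000 + gpu * 10))"
  fi
  SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=$gpu \
    ./CarlaUE4.sh $port_flag &> /tmp/carla$gpu.txt &
done
```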
Docker is slightly less intuitive. The environment variable SDL_HINT_CUDA_DEVICE=1 is ignored inside the Docker container for reasons I've yet to determine. Further, docker run --gpus 1 makes only one GPU visible, but it defaults to GPU index 0. You need the 'device=' prefix, e.g. --gpus 'device=1', to expose only GPU 1 to the container. By restricting GPU visibility this way, one can force CARLA to run on a specific GPU.
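A hedged sketch of the contrast described above (the image tag is a placeholder; this assumes Docker 19.03+ with the NVIDIA container toolkit installed):

```shell
# --gpus 1 exposes exactly one GPU, but it is always GPU index 0
docker run --rm --gpus 1 carlasim/carla:0.9.6 ./CarlaUE4.sh -opengl

# --gpus 'device=1' exposes only host GPU 1, forcing CARLA onto it
docker run --rm --gpus 'device=1' carlasim/carla:0.9.6 ./CarlaUE4.sh -opengl
```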
Following up on @qhaas's comment, it is possible to get GPU selection working inside Docker by changing FROM nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 in Release.Dockerfile to FROM nvidia/cudagl:10.0-runtime-ubuntu16.04. With this new base image, SDL_HINT_CUDA_DEVICE=1 achieves the desired effect.
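For reference, that base-image swap can be scripted; a sketch assuming Release.Dockerfile sits in the current directory:

```shell
# Replace the glvnd runtime base with the cudagl base, then verify
sed -i 's|^FROM nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04|FROM nvidia/cudagl:10.0-runtime-ubuntu16.04|' Release.Dockerfile
grep '^FROM' Release.Dockerfile
```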
This doesn't work for me. I'm running CARLA 0.9.6-29 on Ubuntu 18.04, and SDL_HINT_CUDA_DEVICE has literally no effect on my setup.
Any suggestions?
Since the CARLA 0.9.5 release, the GPU can also be selected with the CUDA_VISIBLE_DEVICES environment variable. This works for me when starting CARLA in Docker with multiple GPUs. Slightly older releases possibly support this as well.
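A minimal sketch of such a Docker invocation (the image tag and device index are placeholders, and I'm assuming the legacy --runtime=nvidia flag):

```shell
# All GPUs remain visible to the container; CUDA_VISIBLE_DEVICES then
# restricts CARLA to host GPU 1 (with CUDA_DEVICE_ORDER=PCI_BUS_ID the
# index matches nvidia-smi ordering)
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e CUDA_DEVICE_ORDER=PCI_BUS_ID \
  -e CUDA_VISIBLE_DEVICES=1 \
  carlasim/carla:0.9.5 /bin/bash -c './CarlaUE4.sh'
```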
Using CUDA_VISIBLE_DEVICES makes no difference in my setup :cry:
Wed Nov 27 13:04:17 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro P400 Off | 00000000:17:00.0 On | N/A |
| 34% 39C P0 N/A / N/A | 1172MiB / 2000MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 208... Off | 00000000:B3:00.0 Off | N/A |
| 41% 30C P8 21W / 250W | 0MiB / 10986MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic
This appears to no longer work on Ubuntu. Any suggestions?
I've just downloaded the official CARLA 0.9.6 release and tested with:
DISPLAY= CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 ./CarlaUE4.sh -opengl
And nvidia-smi shows CARLA running on the second card:
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 2085 G /usr/lib/xorg/Xorg 282MiB |
| 1 17473 C+G .../Binaries/Linux/CarlaUE4-Linux-Shipping 669MiB |
CUDA_DEVICE_ORDER=PCI_BUS_ID: great! It still does not work on my machine, but at least now I see the process showing up on the right GPU:
DISPLAY= CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 ./CarlaUE4.sh -opengl
4.22.3-0+++UE4+Release-4.22 517 0
Disabling core dumps.
LowLevelFatalError [File:Unknown] [Line: 102]
Exception thrown: bind: Address already in use
Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=65535 LargeMemoryPoolOffset=131119
Malloc Size=123824 LargeMemoryPoolOffset=254960
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Segmentation fault (core dumped)
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1614 G /usr/lib/xorg/Xorg 278MiB |
| 0 8992 G /usr/bin/compiz 291MiB |
| 0 13095 G ...uest-channel-token=10380395948321160125 245MiB |
| 1 3445 G .../Binaries/Linux/CarlaUE4-Linux-Shipping 115MiB |
+-----------------------------------------------------------------------------+
Do you mind sharing your X11 config file? I guess that might be related.
With DISPLAY set to an empty string, the simulator runs in offscreen mode: it communicates with the NVIDIA GPU directly, skipping the X server. It runs on systems both with and without an X server.
This error occurs because you have already started a simulator that is bound to port 2000:
Exception thrown: bind: Address already in use
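One way to diagnose this is to check what is already listening on CARLA's default RPC port (2000) before launching, or simply to start the second instance on a free port; a sketch (the alternative port is just an example):

```shell
# See whether anything is already listening on port 2000
ss -ltn | grep ':2000' || echo "port 2000 is free"

# Or run the new instance on a different port
./CarlaUE4.sh -carla-world-port=2002
```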
@pawel-ziecina thanks a lot! I had a service running in the background using that port. Now it's working like a charm!
I got this error:
4.22.1-0+++UE4+Release-4.22 517 0
Disabling core dumps.
Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=65535 LargeMemoryPoolOffset=131119
Malloc Size=111328 LargeMemoryPoolOffset=242464
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=65535 LargeMemoryPoolOffset=131119
Malloc Size=111328 LargeMemoryPoolOffset=242464
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=65535 LargeMemoryPoolOffset=131119
Malloc Size=98832 LargeMemoryPoolOffset=229968
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Segmentation fault (core dumped)
How do I run the CARLA training model on a Linux remote server without root access?
What can I do when I face this? Please help me:
[2020.03.06-08.36.02:857][ 0]LogInit: Using OS detected language (en-US).
[2020.03.06-08.36.02:857][ 0]LogInit: Using OS detected locale (en-US).
[2020.03.06-08.36.02:859][ 0]LogTextLocalizationManager: No specific localization for 'en-US' exists, so the 'en' localization will be used.
Signal 11 caught.
Malloc Size=131076 LargeMemoryPoolOffset=131092
CommonLinuxCrashHandler: Signal=11
Malloc Size=65535 LargeMemoryPoolOffset=196655
This conversation is now completely obsolete. The current official way to run CARLA offscreen is using nvidia-docker. Please open a new issue if further discussion is needed.