Detectron2: RuntimeError: CUDA error: no kernel image is available for execution on the device

Created on 4 Nov 2019 · 18 Comments · Source: facebookresearch/detectron2

Hi, I'm using detectron2 on a computing cluster, so the code runs on various GPUs depending on the allocation. Detectron2 was installed successfully and I'm able to import it from Python.

However, I get the following error on certain (most) GPUs:

RuntimeError: CUDA error: no kernel image is available for execution on the device (ROIAlign_forward_cuda at /network/home/guptagun/od/detectron2_repo/detectron2/layers/csrc/ROIAlign/ROIAlign_cuda.cu:361)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f2459bbc687 in /network/home/guptagun/anaconda3/envs/detectron/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: detectron2::ROIAlign_forward_cuda(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xa24 (0x7f23f419189c in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #2: detectron2::ROIAlign_forward(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xb6 (0x7f23f4132f66 in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x4ec8f (0x7f23f4144c8f in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x49750 (0x7f23f413f750 in /network/home/guptagun/od/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
frame #9: THPFunction_apply(_object*, _object*) + 0x8d6 (0x7f245a4abe96 in /network/home/guptagun/anaconda3/envs/detectron/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

On some GPUs (one of them being a GeForce GTX), the code runs as expected.

I was trying to run the demo.py file through:

python detectron2_repo/demo/demo.py --config-file detectron2_repo/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input ./leftImg8bit.png --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

Environment

Output of python -m detectron2.utils.collect_env:

------------------------  --------------------------------------------------
sys.platform              linux
Python                    3.7.5 (default, Oct 25 2019, 15:51:11) [GCC 7.3.0]
Numpy                     1.15.4
Detectron2 Compiler       GCC 7.4
Detectron2 CUDA Compiler  10.0
DETECTRON2_ENV_MODULE     <not set>
PyTorch                   1.3.0
PyTorch Debug Build       False
torchvision               0.4.1a0+d94043a
CUDA available            True
GPU 0                     GeForce GTX TITAN X
CUDA_HOME                 None
Pillow                    5.3.0
cv2                       4.1.0
------------------------  --------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

When I built detectron2 using python setup.py build develop, TORCH_CUDA_ARCH_LIST was empty, so shouldn't it have been compiled for all architectures? (according to https://github.com/facebookresearch/detectron2/issues/62#issuecomment-549432420)
What can I do while compiling so that I'm able to use detectron2 on most GPUs, or is this an issue with the compute node I'm using?

Thanks,
Gunshi

installation / environment

All 18 comments

Detectron2 CUDA Compiler 10.0

  • CUDA Runtime 10.1

Your CUDA versions are mismatched, and that's not allowed.

Hi, I have actually tried loading both the cuda/10.0 and cuda/10.1 modules, one at a time, before running the demo.py command, and I still get the error. Are you saying I should install PyTorch and then detectron2 again, but after loading CUDA 10.0 specifically?

In general, you should look at the output of python -m detectron2.utils.collect_env to check whether the mismatch has been fixed before running detectron2.
You need to install the PyTorch build that matches your CUDA module and recompile detectron2 afterwards.

@ppwwyyxx I am facing the exact same issue, and my PyTorch and detectron2 are compiled with the exact same CUDA version. I also hit this issue when I try to run detectron2 on a different GPU than the one I compiled it on. Here I compiled on a TITAN X GPU, so it doesn't work on a TITAN RTX or other GPUs. Note that I haven't installed using pip, as I am modifying the codebase (only Python files, not touching any CUDA implementation) for my research; I'm not sure whether that has any effect. Here is the output of python -m detectron2.utils.collect_env. Could you try installing on one GPU and testing on another, to see whether this is a general issue or I messed something up?

------------------------  --------------------------------------------------
sys.platform              linux
Python                    3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
Numpy                     1.17.2
Detectron2 Compiler       GCC 7.4
Detectron2 CUDA Compiler  10.0
DETECTRON2_ENV_MODULE     <not set>
PyTorch                   1.3.0
PyTorch Debug Build       False
torchvision               0.4.1a0+d94043a
CUDA available            True
GPU 0                     TITAN V
CUDA_HOME                 /ai/apps/cuda/10.0
NVCC                      Cuda compilation tools, release 10.0, V10.0.130
Pillow                    6.2.0
cv2                       4.1.0
------------------------  --------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.0
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

I am facing this issue when I try to run detectron2 on different GPU than the one I have used to compile it.

answered in https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues

@ppwwyyxx can you tell me if there is any way to set TORCH_CUDA_ARCH_LIST so that it works across all GPUs that support a particular CUDA version? I see this variable set in the Dockerfile for a set of GPUs; is there a way to set it for all of them, like TORCH_CUDA_ARCH_LIST=All, rather than listing architectures as in TORCH_CUDA_ARCH_LIST="Maxwell;Maxwell+Tegra;Pascal;Volta;Turing"?

This is a PyTorch question; you can refer to https://github.com/pytorch/pytorch/issues/18781
It does not seem like "All" works for extension compilation at the moment.

I have the same problem trying to execute the following code:

>>> import torch
>>> device = torch.device("cuda")
>>> torch.rand(10).to(device)

RuntimeError: CUDA error: no kernel image is available for execution on the device

I solved the problem by changing the driver from the open-source one to the NVIDIA proprietary one (Ubuntu 18.04):

before: Using NVIDIA driver metapackage from nvidia-driver-410 (open source)
after: Using NVIDIA driver metapackage from nvidia-driver-390 (proprietary)

>>> torch.rand(10).to(device)
tensor([0.9129, 0.8937, 0.7499, 0.5510, 0.5670, 0.9313, 0.3335, 0.4019, 0.2288,
        0.4771])

Hope it helps.

Hi @ppwwyyxx! I am having the same problem and the solution from Common Installation Issues didn't help.
I am running the code on Tesla V100 and getting an error. Running on Tesla K80 didn't produce such an error.

The error:

RuntimeError: CUDA error: no kernel image is available for execution on the device (ROIAlign_forward_cuda at /home/veronica/detectron2/detectron2/layers/csrc/ROIAlign/ROIAlign_cuda.cu:364)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7fadadba1193 in /home/veronica/dirs/detectron2/env2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: detectron2::ROIAlign_forward_cuda(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0x9f4 (0x7fada3f24f2d in /home/veronica/dirs/detectron2/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #2: detectron2::ROIAlign_forward(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0x9c (0x7fada3ea436c in /home/veronica/dirs/detectron2/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x55465 (0x7fada3eb5465 in /home/veronica/dirs/detectron2/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x555fe (0x7fada3eb55fe in /home/veronica/dirs/detectron2/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x4fe33 (0x7fada3eafe33 in /home/veronica/dirs/detectron2/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)

Output of python -m detectron2.utils.collect_env:

------------------------  ----------------------------------------------------------------------------------
sys.platform              linux
Python                    3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
numpy                     1.18.1
detectron2                0.1 @/home/veronica/dirs/detectron2/detectron2
detectron2 compiler       GCC 6.3
detectron2 CUDA compiler  10.1
detectron2 arch flags     sm_37
DETECTRON2_ENV_MODULE     <not set>
PyTorch                   1.4.0 @/home/veronica/dirs/detectron2/env2/lib/python3.7/site-packages/torch
PyTorch debug build       False
CUDA available            True
GPU 0                     Tesla V100-SXM2-16GB
CUDA_HOME                 /usr/local/cuda
NVCC                      Cuda compilation tools, release 10.1, V10.1.243
TORCH_CUDA_ARCH_LIST      6.0;6.1;6.2;7.0;7.5
Pillow                    7.0.0
torchvision               0.5.0 @/home/veronica/dirs/detectron2/env2/lib/python3.7/site-packages/torchvision
torchvision arch flags    sm_35, sm_50, sm_60, sm_70, sm_75
cv2                       4.2.0
------------------------  ----------------------------------------------------------------------------------
PyTorch built with:
  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1

"Detectron2 CUDA Compiler", "CUDA_HOME", "PyTorch built with - CUDA" all contain cuda libraries of the same version.
I also tried to run export TORCH_CUDA_ARCH_LIST=6.0,7.0 & python train_net.py and it didn't help, I got the same error.

You need to rebuild detectron2 with export TORCH_CUDA_ARCH_LIST="6.0;7.0".
Or build on the machine where you run detectron2.
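
To see why a rebuild is needed, it can help to compare the "detectron2 arch flags" line from collect_env with the compute capability of the GPU. The following is only a rough sketch of that check, not PyTorch's or detectron2's own logic. The helper names parse_flag, covers and has_kernel_image are hypothetical. It assumes the usual CUDA compatibility rule: binary code built for sm_XY runs on devices with the same major version and a minor version of at least Y, while PTX built for compute_XY can be JIT-compiled on any device with capability XY or newer.

```python
import re


def parse_flag(flag):
    """Split a flag such as 'sm_37' or 'compute_50' into (kind, major, minor)."""
    match = re.fullmatch(r"(sm|compute)_(\d+)(\d)", flag.strip())
    if match is None:
        raise ValueError("Unrecognized arch flag: {!r}".format(flag))
    return match.group(1), int(match.group(2)), int(match.group(3))


def covers(flag, capability):
    """Return True if code built for `flag` can run on a device of `capability`."""
    kind, major, minor = parse_flag(flag)
    dev_major, dev_minor = capability
    if kind == "sm":
        # Binary code: same major version, device minor at least the built minor.
        return dev_major == major and dev_minor >= minor
    # PTX: can be JIT-compiled for any device that is at least as new.
    return (dev_major, dev_minor) >= (major, minor)


def has_kernel_image(flags, capability):
    """Return True if any of the compiled flags covers the device."""
    return any(covers(flag, capability) for flag in flags)


# The report above: detectron2 built with sm_37 only.
print(has_kernel_image(["sm_37"], (7, 0)))  # Tesla V100 (7.0): False
print(has_kernel_image(["sm_37"], (3, 7)))  # Tesla K80 (3.7): True
```

On a real machine the capability tuple can be read with torch.cuda.get_device_capability(). Under the stated rule, a build that contains only sm_37 has no kernel image for a V100, which matches the error reported on that GPU and the absence of an error on the K80.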

@ppwwyyxx
Hi, I am new to using Pytorch. I am facing the above error. My system specifications are given below:

OS : Ubuntu 16.04
CUDA version - 10.1
Device count - 1
Device name - GeForce GT 720
Device capability - (3,5)
Pytorch version - 1.4.0

Can someone guide me as to how to resolve this issue?

Thanks in advance.

All information about such issues is given in https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues

What does the error even mean? What is actually going wrong?

RuntimeError: CUDA error: no kernel image is available for execution on the device

Hi @veronikayurchuk, did you solve your problem? I'm facing the same problem now. Could you give me some advice? Thanks!

You need to rebuild detectron2 with export TORCH_CUDA_ARCH_LIST=6.0,7.0.
Or build on the machine where you run detectron2.

Thank you for your answer.

My setup is
(1) GPU 0,1 GeForce GTX TITAN X (arch=5.2)
(2) GPU 0,1,2,3 TITAN RTX (arch=7.5)

(1) and (2) share the same conda env, but the problem occurs only on (1).
I guess the cause is the difference in arch.
torchvision arch flags 3.5, 5.0, 6.0, 7.0, 7.5: there is no arch=5.2 for (1).
So I tried export TORCH_CUDA_ARCH_LIST=5.2,7.5 & python -m pip install -e detectron2

but I got the message below. Could you recommend a solution?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/data2/detectron2/setup.py", line 222, in <module>
    cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/setuptools/__init__.py", line 144, in setup
    return distutils.core.setup(**attrs)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/setuptools/command/develop.py", line 38, in run
    self.install_for_development()
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/setuptools/command/develop.py", line 140, in install_for_development
    self.run_command('build_ext')
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 87, in run
    _build_ext.run(self)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 580, in build_extensions
    build_ext.build_extensions(self)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 208, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 411, in unix_wrap_ninja_compile
    cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 336, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags))
  File "/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1316, in _get_cuda_arch_flags
    raise ValueError("Unknown CUDA arch ({}) or GPU not supported".format(arch))
ValueError: Unknown CUDA arch (5.2,7.5) or GPU not supported


(1) GPU 0,1 GeForce GTX TITAN X (arch=5.2)


sys.platform              linux
Python                    3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54) [GCC 7.3.0]
numpy                     1.19.1
detectron2                0.2.1 @/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/detectron2
Compiler                  GCC 7.5
CUDA compiler             CUDA 10.2
detectron2 arch flags     7.5
DETECTRON2_ENV_MODULE     <not set>
PyTorch                   1.5.0 @/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torch
PyTorch debug build       False
GPU available             True
GPU 0,1                   GeForce GTX TITAN X (arch=5.2)
CUDA_HOME                 /usr/local/cuda-10.1
Pillow                    7.1.2
torchvision               0.6.0a0+82fd1c8 @/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torchvision
torchvision arch flags    3.5, 5.0, 6.0, 7.0, 7.5
fvcore                    0.1.2.post20201013
cv2                       3.4.2


PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

(2) GPU 0,1,2,3 TITAN RTX (arch=7.5)


sys.platform              linux
Python                    3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54) [GCC 7.3.0]
numpy                     1.19.1
detectron2                0.2.1 @/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/detectron2
Compiler                  GCC 7.5
CUDA compiler             CUDA 10.2
detectron2 arch flags     7.5
DETECTRON2_ENV_MODULE     <not set>
PyTorch                   1.5.0 @/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torch
PyTorch debug build       False
GPU available             True
GPU 0,1,2,3               TITAN RTX (arch=7.5)
CUDA_HOME                 /usr/local/cuda-10.1
Pillow                    7.1.2
torchvision               0.6.0a0+82fd1c8 @/home/miruware/anaconda3/envs/blend/lib/python3.6/site-packages/torchvision
torchvision arch flags    3.5, 5.0, 6.0, 7.0, 7.5
fvcore                    0.1.2.post20201013
cv2                       3.4.2


PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TORCH_CUDA_ARCH_LIST should be separated by ";" not ",". It was a typo.

Thank you very much!! You saved my life!!!
The following works:
export TORCH_CUDA_ARCH_LIST=7.5\;5.2
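
For anyone scripting the rebuild, here is a minimal sketch of building that value from a list of compute capabilities, so the separator cannot be mistyped. The function name arch_list and its ptx_for_newest parameter are made up for illustration and are not part of PyTorch or detectron2. The only things taken from PyTorch are the ";" separator discussed in this thread and the optional "+PTX" suffix that TORCH_CUDA_ARCH_LIST entries may carry.

```python
import os


def arch_list(capabilities, ptx_for_newest=False):
    """Format (major, minor) pairs as a ';'-separated TORCH_CUDA_ARCH_LIST value."""
    unique = sorted(set(capabilities))
    entries = ["{}.{}".format(major, minor) for major, minor in unique]
    if ptx_for_newest and entries:
        # Ask for PTX on the newest arch so newer GPUs can JIT-compile it.
        entries[-1] += "+PTX"
    return ";".join(entries)


# The two machines from this thread: TITAN RTX (7.5) and GTX TITAN X (5.2).
value = arch_list([(7, 5), (5, 2)])
print(value)  # 5.2;7.5

# Environment for a rebuild started from Python (e.g. via subprocess.run);
# the current process environment is left untouched.
build_env = dict(os.environ, TORCH_CUDA_ARCH_LIST=value)
```

When typing the value in a shell instead, the ";" has to be quoted or escaped as in the command above, because an unquoted ";" ends the command.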
