Detectron2: AssertionError: cuda is not available after installation

Created on 9 May 2020 · 5 Comments · Source: facebookresearch/detectron2

What did I do?

I followed the instructions to install from a Docker container. The installation completes without errors, but when I run the code, it raises the error: AssertionError: cuda is not available. Please check your installation.

What command did I run?

python tools/train_net.py --config-file configs/FCOS-Detection/Base-FCOS.yaml --num-gpus 2

What did I observe?

The logs are as follows:

Command Line Args: Namespace(config_file='configs/FCOS-Detection/Base-FCOS.yaml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=2, num_machines=1, opts=[], resume=False)
Traceback (most recent call last):
  File "tools/train_net.py", line 235, in <module>
    args=(args,),
  File "/opt/tiger/conda/lib/python3.7/site-packages/detectron2/engine/launch.py", line 54, in launch
    daemon=False,
  File "/opt/tiger/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/tiger/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/opt/tiger/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 119, in join
    raise Exception(msg)
Exception: 

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/opt/tiger/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/opt/tiger/conda/lib/python3.7/site-packages/detectron2/engine/launch.py", line 63, in _distributed_worker
    assert torch.cuda.is_available(), "cuda is not available. Please check your installation."
AssertionError: cuda is not available. Please check your installation.

The code I ran is from AdelaiDet.

Expected behavior:

Running without error.

CUDA is definitely there. When I executed nvcc --version, I got:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
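
A note on the nvcc check: nvcc ships with the CUDA toolkit and runs without a GPU driver, so its output alone does not show that a process inside the container can reach a GPU. The sketch below, which assumes a Linux system, reports the toolkit and the driver side separately. The function name gpu_visibility is hypothetical, and the thread does not confirm that a missing driver or missing GPU access was the cause here.

```python
import glob
import shutil


def gpu_visibility():
    """Separate 'CUDA toolkit installed' from 'driver and devices visible'."""
    return {
        # nvcc belongs to the CUDA toolkit; it needs neither a GPU nor a driver.
        "nvcc_path": shutil.which("nvcc"),
        # nvidia-smi is normally installed together with the NVIDIA driver.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        # Device nodes the driver exposes on Linux, e.g. /dev/nvidia0.
        "device_nodes": sorted(glob.glob("/dev/nvidia*")),
    }


if __name__ == "__main__":
    for key, value in gpu_visibility().items():
        print(f"{key}: {value}")
```

If nvcc_path is set but device_nodes is empty, the toolkit is present while no GPU device is visible to the process, which would be consistent with the "CUDA available False" line in the environment report.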

Environment:

No CUDA runtime is found, using CUDA_HOME='/opt/tiger/cuda'

sys.platform linux
Python 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
numpy 1.18.1
detectron2 0.1.2 @/opt/tiger/conda/lib/python3.7/site-packages/detectron2
detectron2 compiler GCC 8.3
detectron2 CUDA compiler not available
DETECTRON2_ENV_MODULE
PyTorch 1.5.0 @/opt/tiger/conda/lib/python3.7/site-packages/torch
PyTorch debug build False
CUDA available False
Pillow 7.0.0
torchvision 0.6.0a0+82fd1c8 @/opt/tiger/conda/lib/python3.7/site-packages/torchvision
fvcore 0.1

PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CPU capability usage: AVX2
Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

installation / environment

tengerye

Most helpful comment

@byronyi Can you say what you did to fix it? I have the same issue.

aarbelle on 6 Aug 2020

All 5 comments

The Dockerfile is meant to be used like this.

ppwwyyxx on 9 May 2020

The reason I can't use your Dockerfile directly is that I have to use a private Docker image. So I have to install detectron2 inside a Docker container and export the container as an image.

tengerye on 10 May 2020

Thanks for clarifying. I thought you were using the Dockerfile since you mentioned a Docker container.

You need to install PyTorch and its other dependencies correctly so that torch.cuda.is_available() returns True. Since this is a PyTorch function, the problem has nothing to do with detectron2.
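
A minimal sketch of that check, runnable before launching any training script. The helper name cuda_report is hypothetical. It only uses torch.__version__, torch.version.cuda, torch.cuda.is_available() and torch.cuda.device_count(), and it returns defaults if PyTorch is not installed at all.

```python
import importlib.util


def cuda_report():
    """Collect the facts that decide whether PyTorch can use a GPU."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,   # CUDA version PyTorch was compiled against
        "cuda_available": False,   # what detectron2's launch() asserts on
        "device_count": 0,
    }
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None here means a CPU-only PyTorch build was installed.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A non-None built_with_cuda together with cuda_available False (as in the report above, where USE_CUDA=ON but CUDA available is False) suggests the PyTorch build itself supports CUDA and the problem lies in the driver or in the GPUs not being visible to the process.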

ppwwyyxx on 10 May 2020

Fixed internally. Kinda embarrassing...