MMDetection: CUDNN_STATUS_NOT_SUPPORTED

Created on 18 Feb 2020 · 14 Comments · Source: open-mmlab/mmdetection

When I run libra_faster_rcnn_r101_fpn_1x, an error is reported: "RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input."

zuhaoran

All 14 comments

Modify the code at
https://github.com/open-mmlab/mmdetection/blob/c0ac99eff015c108b34a9f80e3ff59b106dbc62e/mmdet/models/plugins/non_local.py#L110 as follows:

y = y.permute(0, 2, 1).contiguous().reshape(n, self.inter_channels, h, w)

shwoo93 on 18 Feb 2020

👍9
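
The effect of the one-line change above can be checked on the CPU with a small sketch. The tensor sizes and the random stand-in for `y` are assumptions chosen for illustration; only the two `permute`/`reshape` expressions come from the thread.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
n, inter_channels, h, w = 2, 3, 4, 5

# Stand-in for the tensor reshaped in non_local.py: shape [N, HxW, C].
y = torch.randn(n, h * w, inter_channels)

# Form without the fix: reshape can return a view here,
# so the result keeps the permuted (non-contiguous) memory layout.
before = y.permute(0, 2, 1).reshape(n, inter_channels, h, w)

# Suggested fix: force a dense copy before reshaping.
after = y.permute(0, 2, 1).contiguous().reshape(n, inter_channels, h, w)

print("before:", before.is_contiguous(), before.stride())
print("after: ", after.is_contiguous(), after.stride())
print("same values:", torch.equal(before, after))
```

Both tensors hold the same values; only the strides differ, which is what the "non-contiguous input" hint in the error message refers to.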

@zuhaoran Hi, can you run python mmdet/utils/collect_env.py to collect your environment information and paste it here? I have not encountered this error before, so we need to find what causes it. Thanks!

OceanPang on 19 Feb 2020
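
If running the script is not possible, a few of the same fields can be read directly from PyTorch. This is a rough sketch, not a replacement for `collect_env.py`; the function name `basic_env_info` is made up for this example, and the MMCV/MMDetection versions are not covered.

```python
import sys

import torch


def basic_env_info():
    """Collect a small subset of the fields reported by collect_env.py."""
    return {
        "sys.platform": sys.platform,
        "Python": sys.version.replace("\n", " "),
        "PyTorch": torch.__version__,
        "CUDA available": torch.cuda.is_available(),
        # CUDA version PyTorch was built against (None for CPU-only builds).
        "CUDA (PyTorch build)": torch.version.cuda,
        # cuDNN version seen by PyTorch (None if cuDNN is unavailable).
        "cuDNN": torch.backends.cudnn.version(),
    }


for key, value in basic_env_info().items():
    print(f"{key}: {value}")
```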

@shwoo93 Thank you for your answer

zuhaoran on 21 Feb 2020

@OceanPang Sorry, I haven't used this code.

zuhaoran on 21 Feb 2020

@zuhaoran We just want to confirm the source of the bug. It would be great if you can run the code and paste your env info here. Thanks!

OceanPang on 26 Feb 2020

@OceanPang I tried to run it, but it didn't work.

zuhaoran on 26 Feb 2020

@zuhaoran We just want to confirm the source of the bug. It would be great if you can run the code and paste your env info here. Thanks!

I ran into the same problem. Here is my env info; I hope it helps.
sys.platform: linux
Python: 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GPU 0,1,2,3,4,5,6,7,8,9: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.4.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CUDA Runtime 10.0
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
CuDNN 7.6.3
Magma 2.5.1
Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.5.0
OpenCV: 4.2.0
MMCV: 0.3.2
MMDetection: 1.1.0+639f934
MMDetection Compiler: GCC 5.4
MMDetection CUDA Compiler: 10.0

Muzijiajian on 29 Feb 2020

I got the same CUDNN_STATUS_NOT_SUPPORTED error when I ran libra_faster_rcnn_r50_fpn_1x.py and libra_retina_rcnn_r50_fpn_1x.py.
sys.platform: ubuntu 18.04
Python: 3.7.6
CUDA: 10.1
cudnn: 7.6.5
PyTorch: 1.4.0
GPU: 0,1,2,3,4,5,6,7 Tesla V100-SXM2
I ran cascade_rcnn_r50_fpn_1x.py successfully with the same environment.

mandylyin on 15 Mar 2020

I built MMCV and MMDetection on Mar 11 with the latest code from the master branch.

mandylyin on 15 Mar 2020

@shwoo93 Thank you for your answer. It works.

mandylyin on 15 Mar 2020

I also ran into this error, but I could not locate the plugins folder in mmdetection/mmdet/models.
Edit: Found it.
It is in the mmdet/ops folder.

onexmaster on 19 Apr 2020

Modify the code in
https://github.com/open-mmlab/mmdetection/blob/c0ac99eff015c108b34a9f80e3ff59b106dbc62e/mmdet/models/plugins/non_local.py#L110

as follows:
y = y.permute(0, 2, 1).contiguous().reshape(n, self.inter_channels, h, w)

@shwoo93 Thank you. It solved my problem. Could you please tell me how you located the error?

abcxs on 23 May 2020

@shwoo93 Thank you for your answer!
@OceanPang I think that's weird, since reshape should automatically make the tensor contiguous. How is it related to the cuDNN error?
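
On the question above: `reshape` copies only when the requested shape cannot be expressed as a view of the existing memory. The sketch below (sizes are hypothetical) suggests that splitting the last dimension after `permute` can be done as a view, so no copy happens and the tensor stays non-contiguous. Whether that strided view is what the cuDNN call in this issue rejects is an assumption; the thread does not confirm it.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
n, c, h, w = 2, 3, 4, 5
y = torch.randn(n, h * w, c)
p = y.permute(0, 2, 1)  # shape [N, C, HxW], non-contiguous

# Splitting HxW into (H, W) can be expressed with strides alone,
# so reshape returns a view that still points at y's memory.
split = p.reshape(n, c, h, w)
split_is_view = split.data_ptr() == y.data_ptr()

# Merging the permuted dimensions cannot be expressed with strides,
# so here reshape falls back to copying into new, contiguous memory.
merged = p.reshape(n, c * h * w)
merged_is_view = merged.data_ptr() == y.data_ptr()

print("split : view =", split_is_view, "contiguous =", split.is_contiguous())
print("merged: view =", merged_is_view, "contiguous =", merged.is_contiguous())
```

Calling `.contiguous()` before `reshape`, as in the suggested fix, removes the dependence on which of these two paths `reshape` takes.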