Faster-rcnn.pytorch: RuntimeError: cuda runtime error (8) : invalid device function at /pytorch/torch/lib/THC/THCTensorCopy.cu:204

Created on 31 Mar 2018 · 7 Comments · Source: jwyang/faster-rcnn.pytorch

Loading pretrained weights from data/pretrained_model/vgg16_caffe.pth
/home/lb/faster-rcnn.pytorch/lib/model/rpn/rpn.py:68: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
rpn_cls_prob_reshape = F.softmax(rpn_cls_score_reshape)
THCudaCheck FAIL file=/pytorch/torch/lib/THC/THCTensorCopy.cu line=204 error=8 : invalid device function
Traceback (most recent call last):
  File "trainval_net.py", line 322, in <module>
    rois_label = fasterRCNN(im_data, im_info, gt_boxes, num_boxes)
  File "/home/lb/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lb/faster-rcnn.pytorch/lib/model/faster_rcnn/faster_rcnn.py", line 50, in forward
    rois, rpn_loss_cls, rpn_loss_bbox = self.RCNN_rpn(base_feat, im_info, gt_boxes, num_boxes)
  File "/home/lb/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lb/faster-rcnn.pytorch/lib/model/rpn/rpn.py", line 78, in forward
    im_info, cfg_key))
  File "/home/lb/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lb/faster-rcnn.pytorch/lib/model/rpn/proposal_layer.py", line 149, in forward
    keep_idx_i = keep_idx_i.long().view(-1)
  File "/home/lb/.local/lib/python2.7/site-packages/torch/tensor.py", line 51, in long
    return self.type(type(self).__module__ + '.LongTensor')
  File "/home/lb/.local/lib/python2.7/site-packages/torch/cuda/__init__.py", line 370, in type
    return super(_CudaBase, self).type(*args, **kwargs)
  File "/home/lb/.local/lib/python2.7/site-packages/torch/_utils.py", line 38, in _type
    return new_type(self.size()).copy_(self, async)
RuntimeError: cuda runtime error (8) : invalid device function at /pytorch/torch/lib/THC/THCTensorCopy.cu:204

Hi author, I get the error above. My setup is Python 2.7, PyTorch 0.3.0 and CUDA 8.0. What should I do to solve it?


lori0726

Most helpful comment

I also ran into this problem. Changing make.sh and recompiling solved it for me as well.
The detailed change is as follows.

Before the fix, my make.sh file was:

#!/usr/bin/env bash

CUDA_PATH=/usr/local/cuda/

export CUDA_PATH=/usr/local/cuda/

python setup.py build_ext --inplace
rm -rf build

CUDA_ARCH="-gencode arch=compute_52,code=sm_52 "

# compile NMS
cd model/nms/src
echo "Compiling nms kernels by nvcc..."
nvcc -c -o nms_cuda_kernel.cu.o nms_cuda_kernel.cu \
    -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH

cd ../
python build.py

# compile roi_pooling
cd ../../
cd model/roi_pooling/src
echo "Compiling roi pooling kernels by nvcc..."
nvcc -c -o roi_pooling.cu.o roi_pooling_kernel.cu \
    -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
cd ../
python build.py

# compile roi_align
cd ../../
cd model/roi_align/src
echo "Compiling roi align kernels by nvcc..."
nvcc -c -o roi_align_kernel.cu.o roi_align_kernel.cu \
    -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
cd ../
python build.py

# compile roi_crop
cd ../../
cd model/roi_crop/src
echo "Compiling roi crop kernels by nvcc..."
nvcc -c -o roi_crop_cuda_kernel.cu.o roi_crop_cuda_kernel.cu \
    -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH
cd ../
python build.py

I use a Titan X (Pascal) GPU. Running the script (sh make.sh) finished without any problem, but I hit "RuntimeError: cuda runtime error (8) : invalid device function at /pytorch/torch/lib/THC/THCTensorCopy.cu:204" as soon as I ran trainval_net.py. So I changed make.sh as follows:

#!/usr/bin/env bash

CUDA_PATH=/usr/local/cuda/

export CUDA_PATH=/usr/local/cuda/

python setup.py build_ext --inplace
rm -rf build

CUDA_ARCH="-gencode arch=compute_30,code=sm_30 \
           -gencode arch=compute_35,code=sm_35 \
           -gencode arch=compute_50,code=sm_50 \
           -gencode arch=compute_52,code=sm_52 \
           -gencode arch=compute_60,code=sm_60 \
           -gencode arch=compute_61,code=sm_61 "

# compile NMS
cd model/nms/src
echo "Compiling nms kernels by nvcc..."
nvcc -c -o nms_cuda_kernel.cu.o nms_cuda_kernel.cu \
    -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CUDA_ARCH

cd ../
python build.py

# compile roi_pooling, roi_align and roi_crop: same commands as in the script above

The problem was solved once I re-ran make.sh to recompile the extensions with the new CUDA_ARCH.
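
Why the longer CUDA_ARCH list helps can be sketched in Python. This is only an illustration and not part of the repository: the helper name arch_flags_cover is hypothetical, and it assumes the usual CUDA compatibility rule that binary code built for sm_XY runs only on devices with the same major version and an equal or higher minor version, while embedded PTX (code=compute_XY) can be JIT-compiled for any equal or newer capability. It also assumes the Titan X (Pascal) reports compute capability 6.1.

```python
import re


def arch_flags_cover(cuda_arch, capability):
    """Return True if any -gencode entry in cuda_arch can run on the device.

    capability is a (major, minor) tuple such as (6, 1).
    """
    major, minor = capability
    # Each entry ends in "code=sm_52" (binary) or "code=compute_52" (PTX).
    for kind, digits in re.findall(r"code=(sm|compute)_(\d+)", cuda_arch):
        code_major, code_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm":
            # Binary code: same major version, device minor >= compiled minor.
            if code_major == major and code_minor <= minor:
                return True
        else:
            # PTX: can be JIT-compiled for any equal or newer capability.
            if (code_major, code_minor) <= (major, minor):
                return True
    return False


# The two CUDA_ARCH values from the make.sh files in this comment.
old_flags = "-gencode arch=compute_52,code=sm_52"
new_flags = " ".join(
    "-gencode arch=compute_{0},code=sm_{0}".format(cc)
    for cc in ("30", "35", "50", "52", "60", "61")
)

print(arch_flags_cover(old_flags, (6, 1)))  # False
print(arch_flags_cover(new_flags, (6, 1)))  # True
```

Under these assumptions the sm_52-only build contains nothing a 6.1 device can execute, which would be consistent with the "invalid device function" error appearing only at run time and not while compiling.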

cltdevelop on 18 Jul 2018

😄2 👍2

All 7 comments

OK, I solved it.

lori0726 on 2 Apr 2018

👍2

I ran into the same problem. Could you please tell me how you fixed it? Thanks @lori0726

JiasiWang on 2 Apr 2018

@JiasiWang I re-ran make.sh to recompile.
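
If it is unclear which architecture to put in CUDA_ARCH before re-running make.sh, something along these lines may help. It is a rough sketch, not a tool from the repository: gencode_flag and local_gencode_flags are made-up names, and it assumes the installed PyTorch provides torch.cuda.get_device_capability and can see the GPU.

```python
def gencode_flag(major, minor):
    """Build the nvcc -gencode entry for one compute capability."""
    arch = "{}{}".format(major, minor)
    return "-gencode arch=compute_{0},code=sm_{0}".format(arch)


def local_gencode_flags():
    """Return one -gencode entry per distinct visible GPU, or [] if none can be queried."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter.
        return []
    if not torch.cuda.is_available():
        return []
    flags = []
    for index in range(torch.cuda.device_count()):
        major, minor = torch.cuda.get_device_capability(index)
        flag = gencode_flag(major, minor)
        if flag not in flags:
            flags.append(flag)
    return flags


if __name__ == "__main__":
    # Prints an empty line when no GPU (or no PyTorch) is available.
    print(" ".join(local_gencode_flags()))
```

The printed entries can then be compared with the CUDA_ARCH value in make.sh.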

lori0726 on 3 Apr 2018

👍4

Yeah, that does solve the problem. Thanks! @lori0726

JiasiWang on 3 Apr 2018

👍3


Hi guys,
I have exactly the same problem.
This is the command I run and the error that I get:

$ CUDA_VISIBLE_DEVICES=0 python2 ./example/main.py --dataset mpii --arch hg --stack 8 --block 1 --features 256 --checkpoint ./checkpoint/mpii/hg-s8-b1 --anno-path "/home/saleh/Documents/terme4_master/semester_proj/pytorch_pose/pytorch-pose/data/mpii/mpii_selected_annotations_debug.json" --image-path /home/saleh/Documents/terme4_master/semester_proj/pytorch_pose/pytorch-pose/short_list --epochs 5 -d
/home/saleh/anaconda3/envs/py27/lib/python2.7/site-packages/torch/cuda/__init__.py:117: UserWarning:
Found GPU0 GeForce GT 710M which is of cuda capability 2.1.
PyTorch no longer supports this GPU because it is too old.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
GeForce GT 710M
==> creating model 'hg', stacks=8, blocks=1
Total params: 25.59M
Mean: 0.4404, 0.4440, 0.4327
Std: 0.2458, 0.2410, 0.2468

Epoch: 1 | LR: 0.00025000
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549614443593/work/aten/src/THC/generic/THCTensorMath.cu line=14 error=8 : invalid device function
Traceback (most recent call last):
  File "./example/main.py", line 434, in <module>
    main(parser.parse_args())
  File "./example/main.py", line 154, in main
    args.debug, args.flip)
  File "./example/main.py", line 200, in train
    output = model(input)
  File "/home/saleh/anaconda3/envs/py27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/saleh/anaconda3/envs/py27/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/saleh/anaconda3/envs/py27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/saleh/Documents/terme4_master/semester_proj/pytorch_pose/pytorch-pose/example/../pose/models/hourglass.py", line 158, in forward
    x = self.conv1(x)
  File "/home/saleh/anaconda3/envs/py27/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/saleh/anaconda3/envs/py27/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (8) : invalid device function at /opt/conda/conda-bld/pytorch_1549614443593/work/aten/src/THC/generic/THCTensorMath.cu:14
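
The UserWarning near the top of this log suggests a different cause from the make.sh issue discussed earlier in the thread: PyTorch itself reports the GPU as too old, and the failure occurs inside PyTorch's own convolution, not in a custom extension. A small sketch for pulling the capability out of such a log line follows. The function name parse_old_gpu_warning and the threshold ASSUMED_MIN_CAPABILITY are assumptions for illustration only; the real minimum depends on how the installed PyTorch binary was built.

```python
import re

# Matches the warning text shown in the log above.
WARNING_PATTERN = re.compile(
    r"Found GPU(\d+) (.+?) which is of cuda capability (\d+)\.(\d+)"
)


def parse_old_gpu_warning(log_text):
    """Extract (gpu_index, name, (major, minor)) from the warning, or None."""
    match = WARNING_PATTERN.search(log_text)
    if match is None:
        return None
    index, name, major, minor = match.groups()
    return int(index), name, (int(major), int(minor))


# Hypothetical threshold, for illustration only.
ASSUMED_MIN_CAPABILITY = (3, 0)

log = "Found GPU0 GeForce GT 710M which is of cuda capability 2.1."
index, name, capability = parse_old_gpu_warning(log)
print(name, capability, capability >= ASSUMED_MIN_CAPABILITY)
```

If the reported capability really is below what the binary supports, changing compile flags in an extension script is unlikely to help on its own.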