Yolov5: thop error: m.total_ops += torch.DoubleTensor([int(total_ops)]) RuntimeError: expected device cuda:0 but got device cpu

Created on 26 Jun 2020 · 7Comments · Source: ultralytics/yolov5

Connected to pydev debugger (build 193.5662.61)
Namespace(augment=False, batch_size=8, conf_thres=0.001, data='data/voc.yaml', device='', img_size=640, iou_thres=0.65, merge=False, save_json=False, single_cls=False, task='val', verbose=False, weights='weights/best.pt')
Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=11019MB)
device1 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=11016MB)

Model Summary: 191 layers, 7.30634e+06 parameters, 0 gradients
Fusing layers...
Model Summary: 140 layers, 7.29776e+06 parameters, 6.61683e+06 gradients
Caching labels ../voc/labels/test0712.npy (4952 found, 0 missing, 0 empty, 0 dup
Class Images Targets P R [email protected]: %s to exist.
NoneType: None
Expected: %s to exist.
NoneType: None
Expected: %s to exist.
NoneType: None
Expected: %s to exist.
NoneType: None
Expected: %s to exist.
NoneType: None
Expected: %s to exist.
NoneType: None
Expected: %s to exist.
NoneType: None
Expected: %s to exist.
NoneType: None
Class Images Targets P R [email protected]
Traceback (most recent call last):
File "/home/zhou/Myself_code/yolov5/yolov5/test.py", line 105, in test
inf_out, train_out = model(img, augment=augment) # inference and training outputs
File "/home/zhou/anaconda3/envs/Better/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(input, *kwargs)
File "/home/zhou/Myself_code/yolov5/yolov5/models/yolo.py", line 90, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "/home/zhou/Myself_code/yolov5/yolov5/models/yolo.py", line 110, in forward_once
x = m(x) # run
File "/home/zhou/anaconda3/envs/Better/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(input, *kwargs)
File "/home/zhou/Myself_code/yolov5/yolov5/models/common.py", line 87, in forward
return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
File "/home/zhou/anaconda3/envs/Better/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(input, *kwargs)
File "/home/zhou/Myself_code/yolov5/yolov5/models/common.py", line 25, in fuseforward
return self.act(self.conv(x))
File "/home/zhou/anaconda3/envs/Better/lib/python3.7/site-packages/torch/nn/modules/module.py", line 534, in __call__
hook_result = hook(self, input, result)
File "/home/zhou/anaconda3/envs/Better/lib/python3.7/site-packages/thop/vision/basic_hooks.py", line 31, in count_convNd
m.total_ops += torch.DoubleTensor([int(total_ops)])
RuntimeError: expected device cuda:0 but got device cpu

bug

Source

zhouhing

Most helpful comment

@glenn-jocher Wow! Thank you very much, I really like your work, I will pull the latest code and try again.

zhouhing on 7 Jul 2020

👍2

All 7 comments

@zhouhing hmm, that's a new error message I've not seen before! It seems to be related to cuda-cpu operations but also shows m.total_ops, which I think is a thop package operations for computing FLOPs which can cause problems. Make sure that thop is not installed on your system and try again.

If that doesn't fix the problem, then you can help us reproduce this by testing to see if the error can appear on coco128.yaml in the colab notebook, and then sending us a link to your problematic notebook.

glenn-jocher on 26 Jun 2020

@zhouhing hmm, that's a new error message I've not seen before! It seems to be related to cuda-cpu operations but also shows m.total_ops, which I think is a thop package operations for computing FLOPs which can cause problems. Make sure that thop is not installed on your system and try again.

If that doesn't fix the problem, then you can help us reproduce this by testing to see if the error can appear on coco128.yaml in the colab notebook, and then sending us a link to your problematic notebook.

Thank you！I uninstall the thop, and now I can run the test.py. Happy！！！！

zhouhing on 27 Jun 2020

😄1

Hmm, I'm glad it's working now!

We use thop for computing FLOPS also, but it seems to have a problem in certain cases, so we've dropped it into a try statement, which tries to compute FLOPS if you have it installed. The problem seems to be when it errors and the try statement excepts, the thop has inserted extra parameters/attributes into the model layers which remain there, causing the errors down the road you see.

TODO: Address thop issues.

glenn-jocher on 27 Jun 2020

@zhouhing we've updated the code to better handle thop. Can you git pull or reclone to get the latest code and see if you are still seeing this error? Thanks!

glenn-jocher on 7 Jul 2020

@glenn-jocher Wow! Thank you very much, I really like your work, I will pull the latest code and try again.

zhouhing on 7 Jul 2020

👍2

@zhouhing can you confirm if this issue has been resolved please? I'd like to remove the TODO tag and close this out to stay organized. Thank you!

glenn-jocher on 11 Jul 2020

@zhouhing I will remove TODO and close this issue under the assumption it has been resolved since I have not heard back from you. Let me know if you have any other problems.

glenn-jocher on 25 Jul 2020

Was this page helpful?

0 / 5 - 0 ratings