Darknet: RTX 3070 doesn't work

Created on 13 Nov 2020 · 6 Comments · Source: AlexeyAB/darknet

I bought a new RTX 3070 and installed CUDA 10.1 (I also tried 10.2) with cuDNN 7 (I also tried cuDNN 8 with CUDA 10.2). After successfully compiling darknet.exe, I wanted to test a picture, but it just gets stuck. Could anybody help me, or is there something wrong with my RTX 3070?

All 6 comments

When you opened the issue, the template asked for the following information. Could you provide all of it?


If you want to report a bug - provide:
* description of a bug
* what command do you use?
* do you use Win/Linux/Mac?
* attach screenshot of a bug with previous messages in terminal
* in what cases a bug occurs, and in which not?
* if possible, specify date/commit of Darknet that works without this bug
* show such screenshot with info

./darknet detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights data/dog.jpg
 CUDA-version: 10000 (10000), cuDNN: 7.4.2, CUDNN_HALF=1, GPU count: 1
 CUDNN_HALF=1
 OpenCV version: 4.2.0
 0 : compute_capability = 750, cudnn_half = 1, GPU: GeForce RTX 2070
net.optimized_memory = 0
mini_batch = 1, batch = 8, time_steps = 1, train = 0
   layer   filters  size/strd(dil)      input                output

Sorry for not giving the correct bug information. I am using a new RTX 3070 GPU on Windows 10 with VS2019. My problem is this: after installing CUDA 10.2 with cuDNN 8 and OpenCV 3.4.12, I successfully compiled Darknet and got darknet.exe. Then I ran the following command:

darknet.exe detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights -i 0 -thresh 0.25 -ext_output dog.jpg

and it ran as follows:

F:\yolo\darknet-master\darknet-master\build\darknet\x64>darknet.exe detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights -i 0 -thresh 0.25 -ext_output dog.jpg
 CUDA-version: 10020 (11010), cuDNN: 8.0.2, CUDNN_HALF=1, GPU count: 1
 CUDNN_HALF=1
 OpenCV version: 3.4.12
 0 : compute_capability = 860, cudnn_half = 1, GPU: GeForce RTX 3070
net.optimized_memory = 0
mini_batch = 1, batch = 8, time_steps = 1, train = 0
   layer   filters  size/strd(dil)      input                output
   0

It just stopped there. After waiting a few minutes, I got the result:

F:\yolo\darknet-master\darknet-master\build\darknet\x64>darknet.exe detector test cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights -i 0 -thresh 0.25 -ext_output dog.jpg
 CUDA-version: 10020 (11010), cuDNN: 8.0.2, CUDNN_HALF=1, GPU count: 1
 CUDNN_HALF=1
 OpenCV version: 3.4.12
 0 : compute_capability = 860, cudnn_half = 1, GPU: GeForce RTX 3070
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0
   layer   filters  size/strd(dil)      input                output
   0 conv     32       3 x 3/ 2    416 x 416 x   3 ->  208 x 208 x  32 0.075 BF
   1 conv     64       3 x 3/ 2    208 x 208 x  32 ->  104 x 104 x  64 0.399 BF
   2 conv     64       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x  64 0.797 BF
   3 route  2                                  1/2 ->  104 x 104 x  32
   4 conv     32       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  32 0.199 BF
   5 conv     32       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  32 0.199 BF
   6 route  5 4                                    ->  104 x 104 x  64
   7 conv     64       1 x 1/ 1    104 x 104 x  64 ->  104 x 104 x  64 0.089 BF
   8 route  2 7                                    ->  104 x 104 x 128
   9 max                2x 2/ 2    104 x 104 x 128 ->   52 x  52 x 128 0.001 BF
  10 conv    128       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 128 0.797 BF
  11 route  10                                 1/2 ->   52 x  52 x  64
  12 conv     64       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x  64 0.199 BF
  13 conv     64       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x  64 0.199 BF
  14 route  13 12                                  ->   52 x  52 x 128
  15 conv    128       1 x 1/ 1     52 x  52 x 128 ->   52 x  52 x 128 0.089 BF
  16 route  10 15                                  ->   52 x  52 x 256
  17 max                2x 2/ 2     52 x  52 x 256 ->   26 x  26 x 256 0.001 BF
  18 conv    256       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 256 0.797 BF
  19 route  18                                 1/2 ->   26 x  26 x 128
  20 conv    128       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 128 0.199 BF
  21 conv    128       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 128 0.199 BF
  22 route  21 20                                  ->   26 x  26 x 256
  23 conv    256       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x 256 0.089 BF
  24 route  18 23                                  ->   26 x  26 x 512
  25 max                2x 2/ 2     26 x  26 x 512 ->   13 x  13 x 512 0.000 BF
  26 conv    512       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x 512 0.797 BF
  27 conv    256       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x 256 0.044 BF
  28 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  29 conv    255       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x 255 0.044 BF
  30 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, cls_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
  31 route  27                                     ->   13 x  13 x 256
  32 conv    128       1 x 1/ 1     13 x  13 x 256 ->   13 x  13 x 128 0.011 BF
  33 upsample                 2x    13 x  13 x 128 ->   26 x  26 x 128
  34 route  33 23                                  ->   26 x  26 x 384
  35 conv    256       3 x 3/ 1     26 x  26 x 384 ->   26 x  26 x 256 1.196 BF
  36 conv    255       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x 255 0.088 BF
  37 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, cls_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
Total BFLOPS 6.910
avg_outputs = 310203
 Allocate additional workspace_size = 0.01 MB
Loading weights from yolov4-tiny.weights...
 seen 64, trained: 32012 K-images (500 Kilo-batches_64)
Done! Loaded 38 layers from weights-file
 Detection layer: 30 - type = 27
 Detection layer: 37 - type = 27
 Detection layer: 161 - type = 27
dog.jpg: Predicted in 23721.728000 milli-seconds.
bicycle: 92%    (left_x:  114   top_y:  128   width:  458   height:  299)
dog: 98%        (left_x:  129   top_y:  225   width:  184   height:  316)
truck: 92%      (left_x:  464   top_y:   77   width:  221   height:   93)
pottedplant: 33%        (left_x:  681   top_y:  109   width:   37   height:   46)

I followed the recommended setup steps, so why does it take so much time? What should I do to resolve this problem? @bizmate
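The header lines of the logs above already carry useful version information. Here is a minimal Python sketch that decodes them, assuming that the first number after "CUDA-version:" is the CUDA runtime the binary was built against, that the number in parentheses is the highest version the installed driver supports, and that compute_capability is printed as major*100 + minor*10. Those readings are assumptions about Darknet's output, and the function names are made up for illustration.

```python
import re


def decode_cuda_version(value):
    """CUDA encodes versions as 1000*major + 10*minor, e.g. 10020 -> (10, 2)."""
    return value // 1000, (value % 1000) // 10


def summarize_header(log_text):
    """Pull the CUDA versions and the compute capability out of a Darknet log header."""
    built, driver = re.search(r"CUDA-version:\s*(\d+)\s*\((\d+)\)", log_text).groups()
    capability = int(re.search(r"compute_capability\s*=\s*(\d+)", log_text).group(1))
    return {
        "runtime": decode_cuda_version(int(built)),
        "driver": decode_cuda_version(int(driver)),
        # Assumption: 860 means compute capability 8.6.
        "compute_capability": (capability // 100, (capability % 100) // 10),
    }


if __name__ == "__main__":
    header = (
        " CUDA-version: 10020 (11010), cuDNN: 8.0.2, CUDNN_HALF=1, GPU count: 1\n"
        " 0 : compute_capability = 860, cudnn_half = 1, GPU: GeForce RTX 3070\n"
    )
    print(summarize_header(header))
```

Read this way, the RTX 3070 log shows a binary built with CUDA 10.2 running on a GPU with compute capability 8.6. That combination may force the driver to compile the GPU code at startup, which is one plausible explanation for the long wait.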

How fast or slow is your F drive? Have you tried running it from another drive?
In one command you are using yolov4 and in the other yolov4-tiny. Which one are you troubleshooting?
Can you time how long it takes? For example, on Linux I would just put time in front of the command.

If you are familiar with Docker, does it run well there? You can see an example in this script:
https://github.com/bizmate/bash-essentials/blob/master/docs/DARKNET_DETECT_AND_TRASH.md
Note that Docker needs to be properly configured to use your GPU (check the official documentation).
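On Windows there is no built-in time prefix like the one suggested above for Linux, so one rough way to time the whole run is a small Python wrapper. This is only a sketch: the helper name time_command is made up for illustration, and it measures wall-clock time for the entire process, including startup and weight loading, not just the "Predicted in" figure Darknet prints.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)
    return completed.returncode, time.perf_counter() - start


if __name__ == "__main__":
    # Harmless stand-in command; replace the list with the darknet.exe
    # command line you want to measure.
    code, seconds = time_command([sys.executable, "-c", "print('hello')"])
    print(f"exit code {code}, wall time {seconds:.2f} s")
```

To measure the run from this thread, pass the command as a list, for example ["darknet.exe", "detector", "test", "cfg/coco.data", "cfg/yolov4.cfg", "yolov4.weights", "-i", "0", "-thresh", "0.25", "-ext_output", "dog.jpg"].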

My F drive is quite fast, and I tried running it from another drive with the same result.
I used yolov4 and yolov4-tiny to compare run times: yolov4-tiny.weights took about 23721 milliseconds, and yolov4.weights took almost three times as long.
I don't know much about Docker, but I heard that the RTX 30 series needs CUDA 11 to match the Ampere architecture, so I tried installing
CUDA 11.0 and 11.1. However, with both of them Darknet would not even compile, so I am still confused.

That is why I don't use Darknet without Docker: there are too many compilation errors to troubleshoot. I have logged several on this project, but they have not been answered. Unless you can fix the CUDA installation, try Docker.

I finally solved my problem, and I hope my solution helps others.
After reinstalling CUDA 11.1 and cuDNN 8.0 and setting up OpenCV correctly, I changed the settings in darknet.vcxproj
to use compute_86,sm_86 (matching my GPU's compute capability) instead of compute_30,sm_30;compute_75,sm_75, and also updated the CUDA version there. It compiled successfully!
If anybody runs into a similar problem, give this a try!
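For anyone adapting this fix to a different card, here is a minimal Python sketch that turns the compute_capability value Darknet prints (860 for the RTX 3070) into the compute_XX,sm_XX pair used in darknet.vcxproj, and checks whether a given CUDA toolkit can target it. The function names are made up for illustration, and the minimum-toolkit table is an assumption based on NVIDIA's release notes as I understand them, so verify it before relying on it. As far as I know, CUDA 11 also dropped sm_30, which may be why the earlier CUDA 11 builds with the default settings failed to compile.

```python
# Earliest CUDA toolkit (major, minor) able to compile for each architecture.
# Assumption: values recalled from NVIDIA release notes; double-check them.
MIN_TOOLKIT = {75: (10, 0), 80: (11, 0), 86: (11, 1)}


def code_generation_entry(compute_capability):
    """Map Darknet's printed value (e.g. 860) to 'compute_86,sm_86'."""
    arch = compute_capability // 10
    return f"compute_{arch},sm_{arch}"


def toolkit_can_target(compute_capability, toolkit):
    """Return True if the toolkit version (major, minor) supports this GPU natively."""
    required = MIN_TOOLKIT.get(compute_capability // 10)
    return required is not None and toolkit >= required


if __name__ == "__main__":
    print(code_generation_entry(860))
    print(toolkit_can_target(860, (10, 2)))
    print(toolkit_can_target(860, (11, 1)))
```

Under these assumptions, CUDA 10.2 cannot build native code for the RTX 3070 while CUDA 11.1 can, which matches the outcome reported above.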
