I bought a new RTX 3070 and installed CUDA 10.1 (I also tried 10.2) with cuDNN 7 (I also tried cuDNN 8 with CUDA 10.2). After successfully compiling darknet.exe, I wanted to test a picture, but it just gets stuck and doesn't continue. Could anybody help me, or is something wrong with my RTX 3070?

When you open a bug report, the template asks for the following information. Could you provide all of it?
If you want to report a bug - provide:
* description of a bug
* what command do you use?
* do you use Win/Linux/Mac?
* attach screenshot of a bug with previous messages in terminal
* in what cases a bug occurs, and in which not?
* if possible, specify date/commit of Darknet that works without this bug
* show such screenshot with info
./darknet detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights data/dog.jpg
CUDA-version: 10000 (10000), cuDNN: 7.4.2, CUDNN_HALF=1, GPU count: 1
CUDNN_HALF=1
OpenCV version: 4.2.0
0 : compute_capability = 750, cudnn_half = 1, GPU: GeForce RTX 2070
net.optimized_memory = 0
mini_batch = 1, batch = 8, time_steps = 1, train = 0
layer filters size/strd(dil) input output
Sorry for not giving the correct bug information. I use a new RTX 3070 GPU on Win10 with VS2019. My problem is this: after installing CUDA 10.2 with cuDNN 8 and OpenCV 3.4.12, I successfully compiled Darknet and got the darknet.exe file. Then I used the following command:
darknet.exe detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights -i 0 -thresh 0.25 -ext_output dog.jpg
and it produced the following output:
F:\yolo\darknet-master\darknet-master\build\darknet\x64>darknet.exe detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights -i 0 -thresh 0.25 -ext_output dog.jpg
CUDA-version: 10020 (11010), cuDNN: 8.0.2, CUDNN_HALF=1, GPU count: 1
CUDNN_HALF=1
OpenCV version: 3.4.12
0 : compute_capability = 860, cudnn_half = 1, GPU: GeForce RTX 3070
net.optimized_memory = 0
mini_batch = 1, batch = 8, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0
It just stopped there. After waiting a few minutes, I got the result:
F:\yolo\darknet-master\darknet-master\build\darknet\x64>darknet.exe detector test cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights -i 0 -thresh 0.25 -ext_output dog.jpg
CUDA-version: 10020 (11010), cuDNN: 8.0.2, CUDNN_HALF=1, GPU count: 1
CUDNN_HALF=1
OpenCV version: 3.4.12
0 : compute_capability = 860, cudnn_half = 1, GPU: GeForce RTX 3070
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 conv 32 3 x 3/ 2 416 x 416 x 3 -> 208 x 208 x 32 0.075 BF
1 conv 64 3 x 3/ 2 208 x 208 x 32 -> 104 x 104 x 64 0.399 BF
2 conv 64 3 x 3/ 1 104 x 104 x 64 -> 104 x 104 x 64 0.797 BF
3 route 2 1/2 -> 104 x 104 x 32
4 conv 32 3 x 3/ 1 104 x 104 x 32 -> 104 x 104 x 32 0.199 BF
5 conv 32 3 x 3/ 1 104 x 104 x 32 -> 104 x 104 x 32 0.199 BF
6 route 5 4 -> 104 x 104 x 64
7 conv 64 1 x 1/ 1 104 x 104 x 64 -> 104 x 104 x 64 0.089 BF
8 route 2 7 -> 104 x 104 x 128
9 max 2x 2/ 2 104 x 104 x 128 -> 52 x 52 x 128 0.001 BF
10 conv 128 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 128 0.797 BF
11 route 10 1/2 -> 52 x 52 x 64
12 conv 64 3 x 3/ 1 52 x 52 x 64 -> 52 x 52 x 64 0.199 BF
13 conv 64 3 x 3/ 1 52 x 52 x 64 -> 52 x 52 x 64 0.199 BF
14 route 13 12 -> 52 x 52 x 128
15 conv 128 1 x 1/ 1 52 x 52 x 128 -> 52 x 52 x 128 0.089 BF
16 route 10 15 -> 52 x 52 x 256
17 max 2x 2/ 2 52 x 52 x 256 -> 26 x 26 x 256 0.001 BF
18 conv 256 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 256 0.797 BF
19 route 18 1/2 -> 26 x 26 x 128
20 conv 128 3 x 3/ 1 26 x 26 x 128 -> 26 x 26 x 128 0.199 BF
21 conv 128 3 x 3/ 1 26 x 26 x 128 -> 26 x 26 x 128 0.199 BF
22 route 21 20 -> 26 x 26 x 256
23 conv 256 1 x 1/ 1 26 x 26 x 256 -> 26 x 26 x 256 0.089 BF
24 route 18 23 -> 26 x 26 x 512
25 max 2x 2/ 2 26 x 26 x 512 -> 13 x 13 x 512 0.000 BF
26 conv 512 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x 512 0.797 BF
27 conv 256 1 x 1/ 1 13 x 13 x 512 -> 13 x 13 x 256 0.044 BF
28 conv 512 3 x 3/ 1 13 x 13 x 256 -> 13 x 13 x 512 0.399 BF
29 conv 255 1 x 1/ 1 13 x 13 x 512 -> 13 x 13 x 255 0.044 BF
30 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, cls_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
31 route 27 -> 13 x 13 x 256
32 conv 128 1 x 1/ 1 13 x 13 x 256 -> 13 x 13 x 128 0.011 BF
33 upsample 2x 13 x 13 x 128 -> 26 x 26 x 128
34 route 33 23 -> 26 x 26 x 384
35 conv 256 3 x 3/ 1 26 x 26 x 384 -> 26 x 26 x 256 1.196 BF
36 conv 255 1 x 1/ 1 26 x 26 x 256 -> 26 x 26 x 255 0.088 BF
37 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, cls_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
Total BFLOPS 6.910
avg_outputs = 310203
Allocate additional workspace_size = 0.01 MB
Loading weights from yolov4-tiny.weights...
seen 64, trained: 32012 K-images (500 Kilo-batches_64)
Done! Loaded 38 layers from weights-file
Detection layer: 30 - type = 27
Detection layer: 37 - type = 27
Detection layer: 161 - type = 27
dog.jpg: Predicted in 23721.728000 milli-seconds.
bicycle: 92% (left_x: 114 top_y: 128 width: 458 height: 299)
dog: 98% (left_x: 129 top_y: 225 width: 184 height: 316)
truck: 92% (left_x: 464 top_y: 77 width: 221 height: 93)
pottedplant: 33% (left_x: 681 top_y: 109 width: 37 height: 46)
I followed the recommended setup steps, so why did it take so much time? What should I do to resolve this problem? @bizmate
How fast is your F drive? Have you tried running it from another drive?
In one command you are using yolov4 and in the other yolov4-tiny. Which one are you troubleshooting?
Can you time how long it takes? For example, on Linux I would just put time in front of the command.
If you are familiar with Docker, does it run well there? You can see an example in this script:
https://github.com/bizmate/bash-essentials/blob/master/docs/DARKNET_DETECT_AND_TRASH.md
but Docker needs to be properly configured to use your GPU (check the official documentation).
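Since the plain Windows command prompt has no time prefix like the Linux one mentioned above, here is a minimal sketch of measuring the wall-clock time of a command from Python. The helper name time_command is hypothetical, and the stand-in command at the bottom is only there so the sketch runs anywhere; it is not part of Darknet.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


if __name__ == "__main__":
    # Stand-in command (assumption): replace it with the real darknet.exe
    # command line, split into a list of arguments.
    code, seconds = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, wall time {seconds:.2f} s")
```

This measures the whole process, including startup and weight loading, so it will be larger than the "Predicted in ..." figure that Darknet prints for the detection itself.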
My F drive is quite fast, and I tried putting it on another drive, with the same problem.
I used yolov4 and yolov4-tiny to compare the time costs. The yolov4-tiny.weights run took about 23721 milliseconds, and the yolov4.weights run took almost three times as long.
I don't know much about Docker, but I heard that the RTX 30 series needs CUDA 11 to match the Ampere architecture, so I tried to install
CUDA 11.0 and 11.1. However, with both of them the project wouldn't even compile. I'm still confused.
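The mismatch described here can be read straight off the startup log. The following is a rough sketch, not part of Darknet, that decodes the header lines pasted earlier in the thread. It assumes that the first number after "CUDA-version:" is the toolkit Darknet was built with and the number in parentheses is the version supported by the installed driver; that reading is my interpretation of the log, and all function names are hypothetical. The small table lists the earliest CUDA toolkit that can target each architecture.

```python
import re

# Earliest CUDA toolkit release whose compiler can target each architecture.
# Only the architectures relevant to this thread (plus sm_80) are listed.
MIN_TOOLKIT = {75: (10, 0), 80: (11, 0), 86: (11, 1)}


def decode_cuda_version(encoded):
    """Decode CUDA's integer encoding (major*1000 + minor*10): 10020 -> (10, 2)."""
    return encoded // 1000, (encoded % 1000) // 10


def parse_header(log_text):
    """Return (toolkit, driver, capability) parsed from Darknet's startup log.

    Assumption: the first CUDA-version number is the build-time toolkit and
    the one in parentheses is the driver's supported version.
    """
    versions = re.search(r"CUDA-version:\s*(\d+)\s*\((\d+)\)", log_text)
    capability = re.search(r"compute_capability\s*=\s*(\d+)", log_text)
    toolkit = decode_cuda_version(int(versions.group(1)))
    driver = decode_cuda_version(int(versions.group(2)))
    # The log prints 860 for compute capability 8.6, so drop the last digit.
    return toolkit, driver, int(capability.group(1)) // 10


def toolkit_supports(toolkit, capability):
    """True if this toolkit version can generate code for the capability."""
    return toolkit >= MIN_TOOLKIT[capability]


if __name__ == "__main__":
    log = (
        "CUDA-version: 10020 (11010), cuDNN: 8.0.2, CUDNN_HALF=1, GPU count: 1\n"
        "0 : compute_capability = 860, cudnn_half = 1, GPU: GeForce RTX 3070\n"
    )
    toolkit, driver, capability = parse_header(log)
    print(toolkit, driver, capability, toolkit_supports(toolkit, capability))
```

For the log in this thread the check reports that a CUDA 10.2 build cannot target compute capability 8.6, which is consistent with the advice that the RTX 30 series needs CUDA 11.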
That is why I don't use Darknet without Docker: there are too many compilation errors to troubleshoot. I have logged several on this project, but they have not been answered. Unless you can fix the CUDA installation, try Docker.
I finally solved my problem, and I hope my solution helps others.
After reinstalling CUDA 11.1 and cuDNN 8.0 and setting up OpenCV correctly, I changed the settings in darknet.vcxproj
to use compute_86,sm_86 (matching my compute capability) instead of compute_30,sm_30;compute_75,sm_75, and also updated the CUDA version there. It compiled successfully!
If somebody meets a similar problem, just try it!
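For anyone applying the same fix to a different GPU, here is a small sketch that turns the compute_capability value printed in Darknet's log (for example 860) into the string format quoted above. The helper names are hypothetical, and the sketch only builds the string; editing darknet.vcxproj is still a manual step. A plausible reason for the long stall, though not confirmed in this thread, is that a binary built without code for the GPU's own architecture forces the driver to compile the kernels at first launch.

```python
def codegen_entry(compute_capability):
    """Map the value printed in Darknet's log (e.g. 860) to 'compute_86,sm_86'."""
    arch = compute_capability // 10  # 860 -> 86, 750 -> 75
    return f"compute_{arch},sm_{arch}"


def codegen_value(capabilities):
    """Join several entries with ';', the separator used in the quoted setting."""
    return ";".join(codegen_entry(c) for c in capabilities)


if __name__ == "__main__":
    # Value for the RTX 3070 from this thread.
    print(codegen_value([860]))
    # The setting quoted in the post as the one that was replaced.
    print(codegen_value([300, 750]))
```

Listing only the architecture you actually run on keeps compile time down; adding more entries makes the binary usable on more GPUs, provided the installed toolkit supports each of them.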