Darknet: Another question after 'How can I decrease the loss of training with TITAN RTX?' -label:question

Created on 8 Feb 2019 · 7 Comments · Source: AlexeyAB/darknet

https://github.com/AlexeyAB/darknet/issues/2170#issue-397788518
Please read the question below, which I wrote after training for 64000 iterations.

I really appreciate your answers, @HagegeR and @AlexeyAB!

Getting more training data is the best solution, but I can't get any more...

More detail about my data
I only need the highest possible accuracy for class_id=1.
So I also labeled class_id=0, 2, 3, which can be confused with class_id=1 during detection.

I followed your suggestions and changed the parameters below.

batch = 64
subdivisions = 16
width = 736
height = 736
learning_rate=0.0001
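
As a rough sanity check of these settings, here is a minimal sketch. It assumes the usual Darknet behavior that `batch` images are consumed per iteration in `subdivisions` chunks, and that the network width/height should be a multiple of 32. The helper names are made up for illustration and are not part of Darknet.

```python
def mini_batch_size(batch: int, subdivisions: int) -> int:
    """Images sent to the GPU at once when batch is split into chunks."""
    if batch % subdivisions != 0:
        # Illustrative restriction chosen for this sketch.
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


def is_valid_input_size(size: int, stride: int = 32) -> bool:
    """Assumption: network width/height should be a multiple of 32."""
    return size > 0 and size % stride == 0


if __name__ == "__main__":
    print(mini_batch_size(64, 16))   # images per forward/backward pass
    print(is_valid_input_size(736))  # 736 = 23 * 32
```

With batch=64 and subdivisions=16 this gives 4 images per pass, which is why raising subdivisions lowers memory use at a larger width/height.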

At most 22 GB of RAM was used.

I had split my data into 2772 train images and 100 test images.
I thought that ratio was the biggest problem in my training.
So I changed the split to 2315 train images and 457 test images, which is roughly an 80%/20% ratio.
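
A minimal sketch of such a split, assuming the image paths are available as a list. `split_dataset`, the fixed seed and the file names are illustrative choices, not part of Darknet. The last line prints the actual ratio of the 2315/457 split, which is closer to 83.5%/16.5%.

```python
import random


def split_dataset(paths, test_fraction=0.2, seed=0):
    """Shuffle paths reproducibly and return (train, test) lists."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Hypothetical file names standing in for the 2772 images.
    images = [f"data/obj/img_{i:04d}.jpg" for i in range(2772)]
    train, test = split_dataset(images)
    print(len(train), len(test))
    # Ratio of the split reported in the post: 2315 train / 457 test.
    print(f"{2315 / 2772:.1%} / {457 / 2772:.1%}")
```

The two lists would then be written to the train and valid files named in obj.data, one image path per line.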

darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 736 -height 736 -show

I didn't change random=0 jitter=0 hue=0 exposure=1 saturation=1.

Training Results (iterations=64000)
Lowest loss = 0.48

[Image: 11000results_736]

I only need the highest possible accuracy for class_id=1.
But the result did not improve compared to the previous training (subdivisions=4, width,height=352).

My questions

1. Can you explain what that cloud of points means?

2. When I run YOLO with my trained weights, there are many false detections. For example, I want to detect a monkey in a forest, but some branches are detected as monkeys.
Will training on the falsely detected monkeys with a non-monkey label help increase accuracy?
Is this a common practice?

3. When I trained YOLO on Ubuntu, I couldn't see the mAP even though I started with ./darknet detector train data/obj.data yolo-obj.cfg darknet53.conv.74 -map. So I switched to Windows 10 to watch the mAP. Is there a bug in showing the graph?
Or should I install matplotlib to see the mAP graph?

4. I was surprised that you used 4 GPUs to train the YOLO model. Was the connection 4-way SLI or a parallel connection?

5. I installed Darknet on Windows 10 and built yolo_cpp_dll.dll. Can I reuse the yolo_cpp_dll.dll on another machine that has a GPU from the same RTX series, or should I rebuild it? I confirmed that I can't reuse it on a 10-series GPU.

Thanks again for spending your precious time answering my questions.

HanSeYeong

All 7 comments

For question 2: you should add some photos with no labels at all. If you don't, you will encourage the neural network to always find something, even when there is nothing there.

HagegeR on 8 Feb 2019

@HagegeR Thanks for your answer!

1 0.683594 0.626389 0.303125 0.597222
0 0.196094 0.371528 0.079687 0.120833
0 0.297266 0.370833 0.080469 0.136111
0 0.117188 0.366667 0.107813 0.208333

Then should I leave the class number blank, like below?

0.117188 0.366667 0.107813 0.208333

Oh, I remembered an earlier answer and found it.

From the previous answer:

Also desirable to have another 8000 images with backgrounds (without objects) with empty txt-files.

Can't I make false objects with empty txt-files?
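
On leaving the class number blank: as far as I know, Darknet expects every label line to hold five fields, `<class_id> <x_center> <y_center> <width> <height>`, with the coordinates relative to the image size, so a four-field line would not be a valid label. A minimal sketch of a checker under that assumption (`parse_label_line` is a hypothetical helper, not a Darknet function):

```python
def parse_label_line(line: str):
    """Parse one label line into (class_id, x, y, w, h) or raise ValueError."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    class_id = int(fields[0])
    x, y, w, h = (float(v) for v in fields[1:])
    for value in (x, y, w, h):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"coordinate out of [0, 1]: {line!r}")
    return class_id, x, y, w, h


if __name__ == "__main__":
    print(parse_label_line("0 0.117188 0.366667 0.107813 0.208333"))
    try:
        # Same box with the class number removed: only four fields remain.
        parse_label_line("0.117188 0.366667 0.107813 0.208333")
    except ValueError as err:
        print("rejected:", err)
```

An image with no objects is described by a completely empty txt-file rather than by lines with a missing class number.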

HanSeYeong on 8 Feb 2019

The most effective way (which could also be considered cheating) would be to take the frames where you get false positives and add them to your database as images without labels.

But you can also simply use images that don't contain any of the objects you want to detect and feed them to your network.

HagegeR on 8 Feb 2019

@HanSeYeong

Can you explain what that cloud of points means?

The coordinates (x, y) of each point are the size (w, h) of one of your objects.
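
To make this concrete, here is a small sketch that turns label lines into such (w, h) points. It assumes the points are the relative box sizes from the label files scaled to the network input size (736x736 here). `box_sizes` is a made-up helper, not part of Darknet.

```python
def box_sizes(label_lines, net_width=736, net_height=736):
    """Return one (w, h) point in network pixels per labeled object."""
    points = []
    for line in label_lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip empty or malformed lines
        rel_w, rel_h = float(fields[3]), float(fields[4])
        points.append((rel_w * net_width, rel_h * net_height))
    return points


if __name__ == "__main__":
    # Two of the label lines quoted earlier in this thread.
    labels = [
        "1 0.683594 0.626389 0.303125 0.597222",
        "0 0.196094 0.371528 0.079687 0.120833",
    ]
    for w, h in box_sizes(labels):
        print(f"{w:.1f} x {h:.1f}")
```

As I understand it, calc_anchors then groups these points into the requested number of clusters (9 in the command above) and uses the cluster centers as anchors.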

When I run YOLO with my trained weights, there are many false detections. For example, I want to detect a monkey in a forest, but some branches are detected as monkeys.
Will training on the falsely detected monkeys with a non-monkey label help increase accuracy?
Is this a common practice?

Add more images (200-2000) with branches but without monkeys to your training dataset, each with an empty label txt-file.
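
A minimal sketch of how the empty label files could be generated, assuming the background images sit in one folder and each label file is expected next to its image with the same base name. `add_empty_labels` and the extension list are illustrative, not part of Darknet.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def add_empty_labels(image_dir):
    """Create an empty .txt next to every image that has no label file yet."""
    created = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        label = image.with_suffix(".txt")
        if not label.exists():
            label.touch()  # empty file: the image contains no objects
            created.append(label)
    return created


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "branch_001.jpg").touch()  # hypothetical background image
        print([p.name for p in add_empty_labels(tmp)])
```

Existing label files are left untouched, so the function is safe to run on a folder that mixes labeled and background images. The new image paths still have to be added to the training list.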

When I trained YOLO on Ubuntu, I couldn't see the mAP even though I started with ./darknet detector train data/obj.data yolo-obj.cfg darknet53.conv.74 -map. So I switched to Windows 10 to watch the mAP. Is there a bug in showing the graph?
Or should I install matplotlib to see the mAP graph?

What error did you get?
Using the latest version of Darknet you will see mAP:

in the window (OpenCV is required)
in the console
in chart.png (OpenCV is required)

So in any case you will see mAP in the console.

I was surprised that you used 4 GPUs to train the YOLO model. Was the connection 4-way SLI or a parallel connection?

GPUs can be connected in 3 ways:

PCI-express: GPU0 -> CPU-PCI-express-root -> GPU1 - is used for CUDA and Darknet
old SLI (very slow): GPU0 -> GPU1 - is used for 3D graphics to get falsely high frame rates
new SLI (very fast), which uses NVLink: GPU0 -> GPU1 - can be used in CUDA and Darknet; it is used in the DGX-1/2

Speed: old SLI < PCI-express < new SLI (NVLink)

If you enable multi-GPU training with -gpus 0,1,2,3, Darknet with CUDA will automatically use PCI-Express (or the much faster new SLI over NVLink, if your system has it).

https://en.wikipedia.org/wiki/NVLink

This article needs to be updated. In particular: NVLink is included on Nvidia's Turing-based RTX GPUs, including GeForce RTX 2080. It's used to link two identical GPUs together using bridge, similar to SLI. (September 2018)

I installed Darknet on Windows 10 and built yolo_cpp_dll.dll. Can I reuse the yolo_cpp_dll.dll on another machine that has a GPU from the same RTX series, or should I rebuild it? I confirmed that I can't reuse it on a 10-series GPU.

If you compiled it with CC7.5 then you can reuse it on any RTX card.
If you compiled it with both CC3.0 and CC7.5 then you can reuse it on any GPU (Kepler, Maxwell, Pascal, Volta, Turing) starting from GTX 740.