Darknet: training yolov3

Created on 27 Mar 2018 · 26 Comments · Source: AlexeyAB/darknet

Bug fixed

RushNuts

All 26 comments

It's not implemented in this fork yet: https://github.com/AlexeyAB/darknet/issues/504
Check the upstream repo in the meanwhile.

fabito on 27 Mar 2018

👍1

@fabito thanks

RushNuts on 28 Mar 2018

@RushNuts It just got added: https://github.com/AlexeyAB/darknet/commit/d9ae3dd681ed1c98e807ff937dbbb9cfc4d19fe0. You should pull the latest commit, though, since there seem to have been multiple fixes for compile issues.
I can see Alexey added two yolov3 scripts in the root directory that show how to run a command with yolov3.

TheMikeyR on 28 Mar 2018

There seems to be an issue with the yolov3 update: https://github.com/AlexeyAB/darknet/issues/522

You can revert to an earlier commit, https://github.com/AlexeyAB/darknet/commit/47c7af1cea5bbdedf1184963355e6418cb8b1b4f, by running this command from inside the repository directory:
git checkout 47c7af1cea5bbdedf1184963355e6418cb8b1b4f

TheMikeyR on 28 Mar 2018

@RushNuts @fabito @TheMikeyR

I fixed it.
Try to update your code from this repo and re-compile from scratch:

make clean
make -j8

I've already tested the detection of Yolo v2 and Yolo v3: https://github.com/AlexeyAB/darknet/issues/522#issuecomment-376865263
But I have not yet tested the training of Yolo v3.
If something goes wrong with the training, let me know.

AlexeyAB on 28 Mar 2018

👍1

@RushNuts

I ran: ./darknet detector test cfg/coco.data cfg/tiny-yolo.cfg yolov2-tiny.weights data/dog.jpg

It was just a bug in the yolov2-tiny.cfg in the original repo, now it is fixed: https://github.com/pjreddie/darknet/blame/master/cfg/yolov2-tiny.cfg#L123

AlexeyAB on 28 Mar 2018

@RushNuts @fabito @TheMikeyR
Now you can try to train Yolo v3. But I have not tested it yet.

AlexeyAB on 29 Mar 2018

@RushNuts Yolo v3 affects approximately 20 files, including these files: demo.c and image.c

AlexeyAB on 29 Mar 2018

@AlexeyAB

Does the repo support training V3? The initial weights for V3 are different.

VanitarNordic on 29 Mar 2018

From @AlexeyAB 's latest commits, it should support training YOLO V3. Please check out the latest repo.

sivagnanamn on 29 Mar 2018

@VanitarNordic I haven't tested training yet, but you can try to train:

1. Update your code from this repo.
2. Create your yolov3_obj.cfg based on yolov3.cfg and change:
   - classes= in each of the 3 [yolo] layers
   - filters=(classes+5)x3 in each of the 3 [convolutional] layers before the [yolo] layers
3. Download the pre-trained weights: https://pjreddie.com/media/files/darknet53.conv.74
4. Run: darknet.exe detector train data/obj.data yolov3_obj.cfg darknet53.conv.74
5. Train for about 5000 iterations.

AlexeyAB on 29 Mar 2018

👍2 ❤1 🎉1
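
The filters rule in the steps above is simple arithmetic, and a quick check can help avoid a cfg mismatch. The following is a minimal Python sketch, not part of darknet: the function name `yolo_filters` is made up for illustration, and the factor of 3 assumes the three anchors per [yolo] layer implied by the "x3" in the formula.

```python
def yolo_filters(classes, anchors_per_layer=3):
    """Return the filters= value for the [convolutional] layer before a [yolo] layer.

    Each anchor predicts 4 box coordinates, 1 objectness score,
    and one score per class, hence (classes + 5) values per anchor.
    anchors_per_layer=3 is an assumption taken from the "x3" in the thread.
    """
    if classes < 1:
        raise ValueError("classes must be at least 1")
    return (classes + 5) * anchors_per_layer


for n in (1, 6, 20, 80):
    print(f"classes={n} -> filters={yolo_filters(n)}")
```

For 6 classes this gives 33, which agrees with the `105 conv 33` line in the training log posted later in this thread.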

@RushNuts Try to use subdivisions=32 or subdivisions=64

AlexeyAB on 29 Mar 2018
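
Some background on the subdivisions suggestion: darknet processes each batch in batch/subdivisions images per forward pass, so a larger subdivisions value reduces GPU memory use while the number of images per iteration stays the same. A rough Python sketch of that arithmetic follows. It assumes batch=64 (the training log later in this thread reports 64 images per iteration), and `images_per_pass` is an illustrative name, not a darknet function.

```python
def images_per_pass(batch, subdivisions):
    """Images held on the GPU at once when a batch is split into subdivisions.

    Assumes, for simplicity, that subdivisions divides batch evenly.
    """
    if subdivisions < 1 or batch % subdivisions != 0:
        raise ValueError("subdivisions must be a positive divisor of batch")
    return batch // subdivisions


BATCH = 64  # assumption: matches "64 images" per iteration in the posted log
for subdivisions in (8, 16, 32, 64):
    per_pass = images_per_pass(BATCH, subdivisions)
    print(f"subdivisions={subdivisions}: {per_pass} image(s) per forward pass")
```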

Yolo v3 can be successfully trained using this repo: https://github.com/AlexeyAB/darknet/issues/504#issuecomment-377290060

AlexeyAB on 29 Mar 2018

@AlexeyAB

So it seems you have now tested it yourself, haven't you?

VanitarNordic on 29 Mar 2018

@VanitarNordic Yes, I tested it myself.

AlexeyAB on 29 Mar 2018

@AlexeyAB

You have trained it on your own custom dataset. Do you see a performance improvement compared to Yolo-V2 on that dataset?

How much has the detection speed in FPS changed?

VanitarNordic on 29 Mar 2018

@VanitarNordic

I trained on my own dataset, about 15 000 images for 6 classes, but only 5000 iterations

Yolo v3 with resolution 224x224 ~= Yolo v2 with resolution 416x416 on 5000 iterations.
With more iterations Yolo v3 should reach a higher precision (mAP), and at the same resolution it should be higher still.

For a correct comparison it is necessary to train for about 20 000 - 40 000 iterations!

Yolo v3 - 224x224 - 31 FPS - mAP = 90.69 % - trained only 5000 iterations
Yolo v2 - 416x416 - 29 FPS - mAP = 90.62 % - trained only 6000 iterations

Yolo v3 - 224x224 - 31 FPS - mAP = 90.69 %

 for thresh = 0.25, precision = 0.97, recall = 0.99, F1-score = 0.98
 for thresh = 0.25, TP = 8490, FP = 271, FN = 88, average IoU = 74.62 %

 mean average precision (mAP) = 0.906889, or 90.69 %
Total Detection Time: 465.000000 Seconds

Yolo v2 - 416x416 - 29 FPS - mAP = 90.62 %

 for thresh = 0.25, precision = 0.97, recall = 1.00, F1-score = 0.98
 for thresh = 0.25, TP = 8575, FP = 278, FN = 3, average IoU = 76.82 %

 mean average precision (mAP) = 0.906237, or 90.62 %
Total Detection Time: 476.000000 Seconds

Official Precision/Speed:

[image: https://hsto.org/webt/pw/zd/0j/pwzd0jb9g7znt_dbsyw9qzbnvti.jpeg]

AlexeyAB on 29 Mar 2018

👍2 🎉1
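
The precision, recall and F1 figures in the comment above follow directly from the reported TP/FP/FN counts. A small Python sketch that recomputes them is given here as a sanity check; `detection_metrics` is an illustrative helper and not part of darknet, and it assumes at least one true positive so that no denominator is zero.

```python
def detection_metrics(tp, fp, fn):
    """Compute precision, recall and F1 from detection counts.

    Assumes tp > 0 so that none of the denominators is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts copied from the two result blocks in the thread (thresh = 0.25).
runs = (
    ("Yolo v3 224x224", 8490, 271, 88),
    ("Yolo v2 416x416", 8575, 278, 3),
)
for name, tp, fp, fn in runs:
    p, r, f1 = detection_metrics(tp, fp, fn)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Rounded to two decimals, both runs reproduce the values printed by darknet (0.97 / 0.99 / 0.98 for Yolo v3 and 0.97 / 1.00 / 0.98 for Yolo v2).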

Hi Alexey,
Thanks for bringing in YOLO v3. I was able to build it and run the tests quite easily.
However, when I try to train the VOC data from scratch (2007 only), I get a really high Loss and the training messages contain a bunch of nans for class.

I was wondering if you can list the steps of training on your own data (other than COCO). It would be a huge help.

P.S. I have been able to train YOLOv2 successfully on my own data using your repo.

sonalambwani on 29 Mar 2018

@sonalambwani
Yolo v3 shows nan up to 1000 iterations - this is the normal behavior of Yolo v3. Just train more.

AlexeyAB on 29 Mar 2018

Phew! Thanks

sonalambwani on 30 Mar 2018

@AlexeyAB

"Yolo v3 shows nan up to 1000 iterations - this is the normal behavior of Yolo v3. Just train more."

Usually -nan is the result of abnormal images in the augmentation phase (if the dataset is healthy), and an iteration that produces -nan adds no value. But you know better. Would you please double-check?

VanitarNordic on 30 Mar 2018

@VanitarNordic
This is the beginning of normal training of Yolo v3, as you can see there are many -nan here:
But final result is good - mAP = 90.69 %

Also check Joseph's answer: https://github.com/pjreddie/darknet/issues/566#issuecomment-376193026
If only some of the values are nan, training is going well; if all of them are nan, training has gone wrong.

D:\Darknet2\darknet\build\darknet\x64>darknet.exe detector train data/obj.data yolov3_obj.cfg darknet53.conv.74
yolov3_obj
layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   224 x 224 x   3   ->   224 x 224 x  32
...
  105 conv     33  1 x 1 / 1    28 x  28 x 256   ->    28 x  28 x  33
  106 detection
Loading weights from darknet53.conv.74...
 seen 64
Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
 If error occurs - run training with flag: -dont_show
Loaded: 2.353000 seconds
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.502225, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.389245, Class: 0.435512, Obj: 0.600462, No Obj: 0.503660, .5R: 0.285714, .75R: 0.000000,  count: 7
Region 106 Avg IOU: 0.239299, Class: 0.475226, Obj: 0.402887, No Obj: 0.535192, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: 0.476170, Class: 0.571396, Obj: 0.444353, No Obj: 0.500995, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: 0.447275, Class: 0.400346, Obj: 0.520463, No Obj: 0.502764, .5R: 0.250000, .75R: 0.000000,  count: 4
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.533342, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: 0.335612, Class: 0.549323, Obj: 0.332219, No Obj: 0.501743, .5R: 0.333333, .75R: 0.000000,  count: 3
Region 94 Avg IOU: 0.427187, Class: 0.675035, Obj: 0.660047, No Obj: 0.503516, .5R: 0.500000, .75R: 0.000000,  count: 2
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.532822, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.500734, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.298235, Class: 0.441220, Obj: 0.485503, No Obj: 0.503760, .5R: 0.142857, .75R: 0.000000,  count: 7
Region 106 Avg IOU: 0.335862, Class: 0.463578, Obj: 0.650037, No Obj: 0.534691, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: 0.598648, Class: 0.375109, Obj: 0.701654, No Obj: 0.499867, .5R: 1.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: 0.454718, Class: 0.392864, Obj: 0.518321, No Obj: 0.503738, .5R: 0.333333, .75R: 0.000000,  count: 3
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.535042, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: 0.372241, Class: 0.425744, Obj: 0.557666, No Obj: 0.500633, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 94 Avg IOU: 0.258128, Class: 0.338691, Obj: 0.472565, No Obj: 0.503493, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.534943, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.501154, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.325241, Class: 0.471191, Obj: 0.614726, No Obj: 0.503628, .5R: 0.000000, .75R: 0.000000,  count: 3
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.535310, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.502855, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.357644, Class: 0.474404, Obj: 0.508615, No Obj: 0.503468, .5R: 0.166667, .75R: 0.000000,  count: 6
Region 106 Avg IOU: 0.286589, Class: 0.459594, Obj: 0.583236, No Obj: 0.534190, .5R: 0.000000, .75R: 0.000000,  count: 1

 1: 302.202576, 302.202576 avg, 0.000000 rate, 4.474000 seconds, 64 images
Loaded: 0.002000 seconds
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.500461, .5R: -nan(ind), .75R: -nan(ind),  count: 0

Good detection:
[image: predictions]

AlexeyAB on 30 Mar 2018

👍1
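
The "some nan is fine, all nan is bad" rule from the comment above can be checked mechanically. In the posted log every -nan record also reports count: 0, which presumably means the averages for that layer were taken over zero objects in that mini-batch. Below is a hedged Python sketch that counts such records. The regular expression and the name `summarize_regions` are illustrative assumptions rather than darknet tooling, and the parser expects each Region record on a single, unwrapped line.

```python
import re

# Matches one "Region" record per line: layer index, Avg IOU value, object count.
REGION_RE = re.compile(r"Region (\d+) Avg IOU: ([^,]+),.*count: (\d+)")


def summarize_regions(log_text):
    """Count Region records, nan records, and nan records with count > 0."""
    total = nan_records = nan_with_objects = 0
    for match in REGION_RE.finditer(log_text):
        avg_iou, count = match.group(2), int(match.group(3))
        total += 1
        if "nan" in avg_iou:
            nan_records += 1
            if count > 0:
                nan_with_objects += 1
    return {"total": total, "nan": nan_records, "nan_with_objects": nan_with_objects}


# Three records copied from the log in the thread, with console wrapping undone.
SAMPLE = """\
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.502225, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.389245, Class: 0.435512, Obj: 0.600462, No Obj: 0.503660, .5R: 0.285714, .75R: 0.000000,  count: 7
Region 106 Avg IOU: 0.239299, Class: 0.475226, Obj: 0.402887, No Obj: 0.535192, .5R: 0.000000, .75R: 0.000000,  count: 2
"""

stats = summarize_regions(SAMPLE)
print(stats)
if stats["total"] and stats["nan"] == stats["total"]:
    print("every record is nan: training has probably gone wrong")
else:
    print("only some records are nan: consistent with normal early training")
```

A nonzero `nan_with_objects` would be worth a closer look, since it would mean nan appeared even though objects were present in that layer.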

@RushNuts Tiny Yolo has low precision.
Try the same with the original repo: do you get the same result for yolov2-tiny.cfg?
./darknet detector test cfg/coco.data cfg/yolov2-tiny.cfg yolov2-tiny.weights data/dog.jpg

AlexeyAB on 30 Mar 2018

I get the same result using my repo. So maybe yolov2-tiny.weights is badly trained, or requires a high threshold. You can ask about this issue in Joseph's repo: https://github.com/pjreddie/darknet

[image: predictions]