Darknet: training yolov3

Created on 27 Mar 2018  ·  26 Comments  ·  Source: AlexeyAB/darknet

Bug fixed

Most helpful comment

@VanitarNordic I haven't tested training yet, but you can try to train:

  • update your code from this repo

  • create your yolov3_obj.cfg based on yolov3.cfg and change:

    • set classes= to your number of classes in each of the 3 [yolo] layers
    • set filters= to (classes+5)x3 in each of the 3 [convolutional] layers immediately before the [yolo] layers
  • download the pre-trained weights: https://pjreddie.com/media/files/darknet53.conv.74

  • run: darknet.exe detector train data/obj.data yolov3_obj.cfg darknet53.conv.74

  • train for about 5000 iterations
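
The two cfg changes in the list above can be sketched in Python. This is only an illustration and not part of the repo: yolo_filters and patch_cfg are hypothetical helper names, and the code assumes that the most recent [convolutional] section before each [yolo] section is the one whose filters= value has to change. For the 6-class dataset mentioned further down in this thread the formula gives 33, which agrees with the "105 conv 33" line in the training log.

```python
def yolo_filters(classes: int, boxes_per_layer: int = 3) -> int:
    """filters= for the [convolutional] layer right before a [yolo] layer."""
    return (classes + 5) * boxes_per_layer


def patch_cfg(cfg_text: str, classes: int) -> str:
    """Return cfg_text with classes= and filters= adjusted (hypothetical helper).

    Assumption: the last [convolutional] section seen before a [yolo]
    section is the one that feeds it.
    """
    lines = cfg_text.splitlines()
    section = None
    last_conv_filters = None  # index of filters= in the latest [convolutional]
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("["):
            section = stripped
            if section == "[yolo]" and last_conv_filters is not None:
                lines[last_conv_filters] = f"filters={yolo_filters(classes)}"
        elif section == "[convolutional]" and stripped.startswith("filters"):
            last_conv_filters = i
        elif section == "[yolo]" and stripped.startswith("classes"):
            lines[i] = f"classes={classes}"
    return "\n".join(lines)


if __name__ == "__main__":
    print(yolo_filters(6))   # 6 classes, as in the dataset discussed below
    print(yolo_filters(80))  # 80 classes, as in COCO
```

Usage would be along the lines of reading yolov3.cfg as text, calling patch_cfg(text, 6), and writing the result to yolov3_obj.cfg; check the output by hand before training.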

All 26 comments

It's not implemented in this fork yet: https://github.com/AlexeyAB/darknet/issues/504
Check the upstream repo in the meantime.

@fabito thanks

@RushNuts It just got added: https://github.com/AlexeyAB/darknet/commit/d9ae3dd681ed1c98e807ff937dbbb9cfc4d19fe0. You should pull the latest commit, though, as there seem to have been multiple fixes for compilation issues.
I can see Alexey added two yolov3 scripts in the root directory which show how to run a command using yolov3.

There seems to be an issue with the yolov3 update: https://github.com/AlexeyAB/darknet/issues/522

You can revert to an earlier commit, https://github.com/AlexeyAB/darknet/commit/47c7af1cea5bbdedf1184963355e6418cb8b1b4f, by running this command from inside the repository directory: git checkout 47c7af1cea5bbdedf1184963355e6418cb8b1b4f

@RushNuts @fabito @TheMikeyR

I fixed it.
Try to update your code from this repo and re-compile from scratch:

make clean
make -j8

I've already tested the detection of Yolo v2 and Yolo v3: https://github.com/AlexeyAB/darknet/issues/522#issuecomment-376865263
But I have not yet tested the training of Yolo v3.
If something goes wrong with the training, let me know.

@RushNuts

i make: ./darknet detector test cfg/coco.data cfg/tiny-yolo.cfg yolov2-tiny.weights data/dog.jpg

It was just a bug in yolov2-tiny.cfg in the original repo; it is fixed now: https://github.com/pjreddie/darknet/blame/master/cfg/yolov2-tiny.cfg#L123

@RushNuts @fabito @TheMikeyR
Now you can try to train Yolo v3. But I have not tested it yet.

@RushNuts Yolo v3 affects approximately 20 files, including these files: demo.c and image.c

@AlexeyAB

Does the repo support training of Yolo v3? The initial weights for v3 are different.

From @AlexeyAB 's latest commits, it should support training YOLO V3. Please check out the latest repo.

@RushNuts Try to use subdivisions=32 or subdivisions=64
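
As a rough sketch of what this setting changes: Darknet loads batch images per iteration and processes them in subdivisions chunks, so a larger value means fewer images per forward/backward pass, which is normally done to lower GPU memory use. The helper name below is hypothetical, and batch=64 is an assumption taken from the "64 images" per iteration reported in the training log in this thread.

```python
def images_per_pass(batch: int, subdivisions: int) -> int:
    """Images sent through the network at once (hypothetical helper)."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


if __name__ == "__main__":
    batch = 64  # assumed from the "64 images" line in the training log
    for subdivisions in (8, 16, 32, 64):
        print(subdivisions, images_per_pass(batch, subdivisions))
```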

Yolo v3 can be successfully trained using this repo: https://github.com/AlexeyAB/darknet/issues/504#issuecomment-377290060

@AlexeyAB

So it seems you have now tested it yourself, haven't you?

@VanitarNordic Yes, I tested it myself.

@AlexeyAB

You have trained it on your own custom dataset. Do you see a performance improvement compared to Yolo v2 on that dataset?

How much has the detection speed in FPS changed?

@VanitarNordic

I trained on my own dataset, about 15 000 images for 6 classes, but only 5000 iterations

Yolo v3 at resolution 224x224 ~= Yolo v2 at resolution 416x416 after 5000 iterations.
With more iterations Yolo v3 should reach higher precision (mAP), and at the same resolution the gap should be larger still.

For correct comparison it is necessary to train about 20 000 - 40 000 iterations!

  • Yolo v3 - 224x224 - 31 FPS - mAP = 90.69 % - trained only 5000 iterations
  • Yolo v2 - 416x416 - 29 FPS - mAP = 90.62 % - trained only 6000 iterations

  • Yolo v3 - 224x224 - 31 FPS - mAP = 90.69 %
 for thresh = 0.25, precision = 0.97, recall = 0.99, F1-score = 0.98
 for thresh = 0.25, TP = 8490, FP = 271, FN = 88, average IoU = 74.62 %

 mean average precision (mAP) = 0.906889, or 90.69 %
Total Detection Time: 465.000000 Seconds

  • Yolo v2 - 416x416 - 29 FPS - mAP = 90.62 %
 for thresh = 0.25, precision = 0.97, recall = 1.00, F1-score = 0.98
 for thresh = 0.25, TP = 8575, FP = 278, FN = 3, average IoU = 76.82 %

 mean average precision (mAP) = 0.906237, or 90.62 %
Total Detection Time: 476.000000 Seconds
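
The precision, recall and F1-score figures above can be reproduced from the TP/FP/FN counts with the standard definitions. The snippet below is a small check, not code from the repo; detection_metrics is a hypothetical helper name, and the counts are copied from the two result blocks above.

```python
def detection_metrics(tp: int, fp: int, fn: int):
    """Return (precision, recall, f1) from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    results = {
        "Yolo v3 224x224": (8490, 271, 88),
        "Yolo v2 416x416": (8575, 278, 3),
    }
    for name, (tp, fp, fn) in results.items():
        p, r, f1 = detection_metrics(tp, fp, fn)
        print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Rounded to two decimals, both models come out at the values reported above (0.97 / 0.99 / 0.98 for v3 and 0.97 / 1.00 / 0.98 for v2).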

Official Precision/Speed:

[image: official precision/speed chart, https://hsto.org/webt/pw/zd/0j/pwzd0jb9g7znt_dbsyw9qzbnvti.jpeg]

Hi Alexey,
Thanks for bringing in YOLO v3. I was able to build it and run the tests quite easily.
However, when I try to train on the VOC data from scratch (2007 only), I get a really high loss and the training messages contain a bunch of nans for class.

I was wondering if you can list the steps of training on your own data (other than COCO). It would be a huge help.

P.S. I have been able to train YOLOv2 successfully on my own data using your repo.

@sonalambwani
Yolo v3 shows nan up to 1000 iterations - this is the normal behavior of Yolo v3. Just train more.

Phew! Thanks

@AlexeyAB

Yolo v3 shows nan up to 1000 iterations - this is the normal behavior of Yolo v3. Just train more.

Usually -nan is a result of abnormal images in the augmentation phase (if the dataset is healthy), and an iteration that shows -nan does not add any value. But you know better. Would you please double-check?

@VanitarNordic
This is the beginning of a normal Yolo v3 training run. As you can see in the log below, there are many -nan values, but the final result is good: mAP = 90.69 %.

Also check Joseph's answer: https://github.com/pjreddie/darknet/issues/566#issuecomment-376193026
If only some of the values are nan, training is going well; if all of them are nan, training has gone wrong.

D:\Darknet2\darknet\build\darknet\x64>darknet.exe detector train data/obj.data yolov3_obj.cfg darknet53.conv.74
yolov3_obj
layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   224 x 224 x   3   ->   224 x 224 x  32
...
  105 conv     33  1 x 1 / 1    28 x  28 x 256   ->    28 x  28 x  33
  106 detection
Loading weights from darknet53.conv.74...
 seen 64
Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
 If error occurs - run training with flag: -dont_show
Loaded: 2.353000 seconds
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.502225, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.389245, Class: 0.435512, Obj: 0.600462, No Obj: 0.503660, .5R: 0.285714, .75R: 0.000000,  count: 7
Region 106 Avg IOU: 0.239299, Class: 0.475226, Obj: 0.402887, No Obj: 0.535192, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: 0.476170, Class: 0.571396, Obj: 0.444353, No Obj: 0.500995, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: 0.447275, Class: 0.400346, Obj: 0.520463, No Obj: 0.502764, .5R: 0.250000, .75R: 0.000000,  count: 4
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.533342, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: 0.335612, Class: 0.549323, Obj: 0.332219, No Obj: 0.501743, .5R: 0.333333, .75R: 0.000000,  count: 3
Region 94 Avg IOU: 0.427187, Class: 0.675035, Obj: 0.660047, No Obj: 0.503516, .5R: 0.500000, .75R: 0.000000,  count: 2
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.532822, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.500734, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.298235, Class: 0.441220, Obj: 0.485503, No Obj: 0.503760, .5R: 0.142857, .75R: 0.000000,  count: 7
Region 106 Avg IOU: 0.335862, Class: 0.463578, Obj: 0.650037, No Obj: 0.534691, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: 0.598648, Class: 0.375109, Obj: 0.701654, No Obj: 0.499867, .5R: 1.000000, .75R: 0.000000,  count: 1
Region 94 Avg IOU: 0.454718, Class: 0.392864, Obj: 0.518321, No Obj: 0.503738, .5R: 0.333333, .75R: 0.000000,  count: 3
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.535042, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: 0.372241, Class: 0.425744, Obj: 0.557666, No Obj: 0.500633, .5R: 0.000000, .75R: 0.000000,  count: 2
Region 94 Avg IOU: 0.258128, Class: 0.338691, Obj: 0.472565, No Obj: 0.503493, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.534943, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.501154, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.325241, Class: 0.471191, Obj: 0.614726, No Obj: 0.503628, .5R: 0.000000, .75R: 0.000000,  count: 3
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.535310, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.502855, .5R: -nan(ind), .75R: -nan(ind),  count: 0
Region 94 Avg IOU: 0.357644, Class: 0.474404, Obj: 0.508615, No Obj: 0.503468, .5R: 0.166667, .75R: 0.000000,  count: 6
Region 106 Avg IOU: 0.286589, Class: 0.459594, Obj: 0.583236, No Obj: 0.534190, .5R: 0.000000, .75R: 0.000000,  count: 1

 1: 302.202576, 302.202576 avg, 0.000000 rate, 4.474000 seconds, 64 images
Loaded: 0.002000 seconds
Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.500461, .5R: -nan(ind), .75R: -nan(ind),  count: 0
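
To apply the "some nan is fine, all nan is not" rule to a saved log, the Region records can be counted with a short script. This is a sketch under assumptions: summarize_regions is a hypothetical helper, and the regular expression is written only against the record format shown in the log above. In that log every -nan record also reports count: 0, which suggests (an inference from the log, not something stated in this thread) that the averages are undefined simply because that layer was assigned no objects in that pass, so the script also reports nan records whose count is above zero.

```python
import re

# Matches one Region record even if the console wrapped it across two lines.
REGION_RE = re.compile(
    r"Region (\d+) Avg IOU: (\S+),.*?count: (\d+)", re.DOTALL
)


def summarize_regions(log_text: str):
    """Return (total records, nan records, nan records with count > 0)."""
    total = nan_records = nan_with_objects = 0
    for match in REGION_RE.finditer(log_text):
        total += 1
        if "nan" in match.group(2):
            nan_records += 1
            if int(match.group(3)) > 0:
                nan_with_objects += 1
    return total, nan_records, nan_with_objects


if __name__ == "__main__":
    sample = (
        "Region 82 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), "
        "No Obj: 0.502225, .5R: -nan(ind), .75R: -nan(ind),  count: 0\n"
        "Region 94 Avg IOU: 0.389245, Class: 0.435512, Obj: 0.600462, "
        "No Obj: 0.503660, .5R: 0.285714, .75R: 0.000000,  count: 7\n"
    )
    total, nan_records, nan_with_objects = summarize_regions(sample)
    print(f"{nan_records} of {total} records are nan; "
          f"{nan_with_objects} of those have count > 0")
```

If nearly every record is nan, or nan shows up alongside a non-zero count, that would be the "training goes wrong" case described above.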

Good detection:
[image: predictions]

@RushNuts Tiny Yolo has low precision.
Try the same with the original repo: do you get the same result for yolov2-tiny.cfg?
./darknet detector test cfg/coco.data cfg/yolov2-tiny.cfg yolov2-tiny.weights data/dog.jpg

I get the same result using my repo. So maybe yolov2-tiny.weights is badly trained, or it requires a higher threshold. You can ask about this issue in Joseph's repo: https://github.com/pjreddie/darknet

[image: predictions]

@RushNuts Yes

@AlexeyAB I wonder when this training will stop. If I don't stop it manually, will it keep training forever?
