Darknet: Training on additional images/labels

Created on 2 Mar 2018 · 14 comments · Source: AlexeyAB/darknet

Hi @AlexeyAB, thank you for your great contribution.
I have a question that has puzzled me a lot these days.
We have about 60000 classes of images with labels, for example:
image: img0-0.jpg  label: 0
image: img0-1.jpg  label: 0
.......................
image: img0-8.jpg  label: 0

image: img1-0.jpg  label: 1
image: img1-1.jpg  label: 1
.......................
image: img1-8.jpg  label: 1
.......................
image: img60000-0.jpg  label: 60000
image: img60000-1.jpg  label: 60000
.......................
image: img60000-8.jpg  label: 60000

My question is: if we add a new class, say img60001-(0-8).jpg with label 60001, do we need to train the model on all 60001 classes again? That would cost a lot of time. As you said, one class needs about 2000 iterations, so the first run would take 60000 x 2000 iterations; that is not a problem. But when we later add a few new classes, we want to train based on the model-60000 weights, not to train from scratch. Is that possible? Thank you.

All 14 comments

By type, do you mean a new class? 60,000 classes is a lot :open_mouth: It would be great if you could share your results if it succeeds.

About not training from scratch, you can check out https://github.com/AlexeyAB/darknet/issues/245#issuecomment-340532626 which I hope answers your question. In case you have any further questions, don't hesitate to comment or open a new issue.

Yes, a new class. We have not trained on 60000 classes yet, but we hope to :)

As @TheMikeyR has pointed out clearly, fine-tuning or transfer learning is the way to go about this problem if you keep adding new classes. But if you have only 8 images per class (as mentioned in your example), that may be too little variety and the model may not converge.

It will be helpful for the community if you could share your experience after training for 60k classes. Good luck :+1:

Hi @sivagnanamn, thank you for your reply. I have 2 additional questions:
1. Are there any samples or tutorials on transfer learning or fine-tuning?
2. What is the relationship between images per class and total classes?

We have about 60k types of objects that need to be detected, and we can take photos of these objects. Yes, the manpower required is huge, and it is not finished yet. I will be sure to share my experience when training.

1. Are there any samples or tutorials on transfer learning or fine-tuning?

  • Transfer learning:
    Ex:
    ./darknet detector train cfg/voc.data cfg/yolo-voc.cfg darknet19_448.conv.23.weights
    In the command above, I'm initializing my network (yolo-voc.cfg) with weights from a pre-trained network, darknet19_448.conv.23.weights.

To explain in simple terms, the neural network defined in the .cfg file is nothing but an empty skeleton. Before training, we have to initialize the network weights to facilitate proper learning. There are many methods available to initialize weights (e.g. random initialization, Xavier initialization, etc.). If we use any such technique, it is commonly referred to as training from scratch. This kind of initialization takes time to converge.

For faster convergence, we can use pre-trained network weights to initialize our empty network before starting to train. This type of initialization usually converges faster than random or Xavier initialization. During training, the weights in all layers will be updated.
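
To make the difference concrete, here is a minimal sketch of the two styles of training command, reusing the voc.data / yolo-voc.cfg files from the example above:

    # Training from scratch: no weights file given, so darknet starts
    # from its own (random) initialization
    ./darknet detector train cfg/voc.data cfg/yolo-voc.cfg

    # Transfer learning: initialize from pre-trained convolutional weights
    ./darknet detector train cfg/voc.data cfg/yolo-voc.cfg darknet19_448.conv.23.weights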

  • Fine-tuning:
    Similar to transfer learning, but only the weights of a few layers will be updated (usually the top layer, i.e. the final convolutional layer in YOLO v2). The weights of the remaining layers are frozen.

You can use the same command as for transfer learning, with just one extra parameter, stopbackward=1, added to the cfg file at this line:

https://github.com/AlexeyAB/darknet/blob/ea09a6e0b38e1ddf43ffcd81d27f0506411eb8e4/cfg/yolo.cfg#L232
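
For illustration, a sketch of how that edit might look inside the cfg file (the surrounding layer parameters here are placeholders, not copied from yolo.cfg):

    [convolutional]
    batch_normalize=1
    size=3
    stride=1
    pad=1
    filters=1024
    activation=leaky
    # freeze everything before this layer; only later layers get updated
    stopbackward=1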

2. What is the relationship between images per class and total classes?

In your case, the number of classes is 60000 and there are 8 images per class. Training with just 8 images per class may not be enough for the model to learn all the features of the objects of interest. You may need more images per class to get good results.

You should add all the new images to the old images (8 images is very little, and it makes no sense to add them on their own; you should add at least 2000 images), so that you have 62 000 images:

  • and continue training - if you keep the same number of classes (increase max_batches= in the cfg-file), as shown in the example after this list

  • do transfer learning or fine-tuning - if you change the number of classes
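
For the first case, a minimal sketch of continuing training (the obj.data / yolo-obj.cfg names and the backup checkpoint path are placeholders for your own files):

    # 1. In yolo-obj.cfg, increase max_batches= so training does not
    #    stop immediately at the old limit
    # 2. Resume from the latest checkpoint in your backup folder:
    ./darknet detector train data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_last.weights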

Hi @sivagnanamn @AlexeyAB, thank you both for your great help!
As for images per class, we may get 50 at most for each class; 2000 is a bit difficult.

Hi @AlexeyAB, sorry for reopening this.
Although we have only about 8 images for each class, the training log shows good performance and the average loss is low; please see these screenshots:
https://imgur.com/a/e6Qxk
https://imgur.com/338qwC0
However, the testing result is not so good.

@anguoyang Yes, that is as it should be.

When you train on a small number of images, the network learns to detect objects well only in those images, but it can't detect objects in other images.

It's exciting to see that YOLO has just been updated to v3! I hope it will improve the accuracy :)

@anguoyang There's very little chance that you'll get acceptable accuracy just by changing from v2 to v3. Because you're training with just 8 images per class, YOLO v3 will also overfit and perform well only during training. The only way to improve performance is to add variety to your training data.

@AlexeyAB - I added about 12,000 additional images and want to "continue" training.

But since my existing weights have already reached the "final" weights stage, I can't seem to simply "continue" training.

All it does when I point it to the existing weights file is save it back to the "final" weights file name and exit training immediately.

Should I do "transfer learning"? If so, what's the "partial" syntax for extracting the proper weights from a YOLOv2 VOC cfg file?

THANK YOU!

@kooscode

  • If you didn't change the number of classes, then just train with the flag -clear at the end of the training command (example below)
  • If you changed the number of classes, then do transfer learning (see the partial example below)
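
Hedged sketches of both options (all file names are placeholders; the -clear flag and the partial command are part of darknet's CLI):

    # Same number of classes: -clear resets the iteration counter, so
    # training continues up to the (increased) max_batches value
    ./darknet detector train data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_final.weights -clear

    # Changed number of classes: extract the first 23 layers of a trained
    # YOLOv2-VOC model with "partial", then initialize the new network with it
    ./darknet partial cfg/yolo-voc.cfg backup/yolo-voc_final.weights yolo-voc.conv.23 23
    ./darknet detector train data/obj-new.data cfg/yolo-obj-new.cfg yolo-voc.conv.23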

Thanks! That will work great! I also just increased max_batches = 120k as per your previous recommendation (I should have read this thread better).

thank you!!!
