Darkflow: Transfer learning

Created on 28 Jun 2017  ·  18 Comments  ·  Source: thtrieu/darkflow

Hi,
Although a new cfg uses pre-trained weights, all of the model's weights get updated during training. Is it possible to update only the newly initialized weights of the last layer and keep the other weights frozen? That would decrease training time drastically.

Most helpful comment

For anybody who wants to understand how to freeze a certain number of layers and train only the remaining last layers: open /darkflow/net/build.py and find the line:

self.ntrain = len(darknet.layers)

Change it to the number of layers you want to train (not the number of layers you want to freeze). So if you want to train only the last layer, change the code to:

self.ntrain = 1

Then, depending on how you installed darkflow, you may need to uninstall it with pip and install it again.
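
To make the counting concrete, here is a minimal sketch in plain Python of how a value like ntrain splits a list of layers, assuming (as the comment above describes) that the last ntrain layers are trained and everything before them stays frozen. The split_layers helper and the layer names are hypothetical and are not part of darkflow.

```python
def split_layers(layers, ntrain):
    """Return (frozen, trainable); the last `ntrain` layers are trainable."""
    if not 0 <= ntrain <= len(layers):
        raise ValueError("ntrain must be between 0 and len(layers)")
    cut = len(layers) - ntrain
    return layers[:cut], layers[cut:]


# Hypothetical layer names, for illustration only.
layers = ["conv1", "conv2", "conv3", "conv4", "detection"]

frozen, trainable = split_layers(layers, ntrain=1)
print("frozen:   ", frozen)
print("trainable:", trainable)
```

Setting ntrain to len(layers) in this sketch leaves nothing frozen, which corresponds to the default line in build.py.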

All 18 comments

Thanks a lot. I can now train just some of the layers by modifying that code. But I wonder why len(darknet.layers) is 53 when I print it out. There should be only 30 layers in the cfg.

Please, can you teach me how to do transfer learning using darkflow?
I changed the number of classes and the filters in the .cfg, but during training this error shows up: AssertionError: labels.txt and cfg/yolo-voc.cfg indicate inconsistent class numbers. Can you give me some suggestions, please?

Have you changed labels.txt to match the number of classes in your cfg? That seems to be your problem.
For transfer learning, you just need to go to build.py at line 59 and change len(darknet.layers) to the number of layers that you want to train. I suggest you print out len(darknet.layers) and try different numbers to get a sense of how it works.

Load | Nope | conv 3x3p1_1 +bnorm leaky | (?, 13, 13, 1024)
Load | Nope | conv 1x1p0_1 +bnorm leaky | (?, 13, 13, 512)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 13, 13, 1024)
Init | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 13, 13, 512)

You should see something similar to the output above.
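
The AssertionError mentioned earlier in the thread comes from a mismatch between the number of labels and the classes value in the cfg. Below is a small sketch of a pre-flight check, assuming one class name per line in labels.txt and a `classes=` entry in the cfg. The helper names are hypothetical, the parsing is deliberately simplistic, and the inputs are inline strings rather than real files. It also assumes a YOLOv2-style cfg, where the last convolutional layer is expected to have num * (classes + coords + 1) filters.

```python
import re


def count_labels(labels_text):
    """Count non-empty lines, assuming one class name per line."""
    return sum(1 for line in labels_text.splitlines() if line.strip())


def read_cfg_int(cfg_text, key):
    """Return the last `key=<int>` value found in the cfg text, or None."""
    pattern = rf"^\s*{re.escape(key)}\s*=\s*(\d+)\s*$"
    matches = re.findall(pattern, cfg_text, flags=re.MULTILINE)
    return int(matches[-1]) if matches else None


def expected_filters(classes, num=5, coords=4):
    """Filters of the last conv layer for a YOLOv2-style region layer."""
    return num * (classes + coords + 1)


# Hypothetical file contents, for illustration only.
labels_text = "cat\ndog\nbird\n"
cfg_text = "[convolutional]\nfilters=40\n\n[region]\nclasses=3\nnum=5\n"

n_labels = count_labels(labels_text)
n_classes = read_cfg_int(cfg_text, "classes")
n_filters = read_cfg_int(cfg_text, "filters")

print("labels match classes:", n_labels == n_classes)
print("filters match:", n_filters == expected_filters(n_classes))
```

For the stock 20-class VOC setup with num=5, the same formula gives 125 filters.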

@borasy I modified the number of training layers as suggested, but the saved checkpoints cannot be loaded properly. It throws errors like

2017-06-30 13:17:23.109268: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key 0-convolutional/biases not found in checkpoint

Any suggestion?

@borasy Thank you! I have changed labels.txt to match the number of classes, and also the filters. But I don't know what the next step is, or whether I should change len(darknet.layers).

@crazylyf I'm not sure about that. But maybe after changing the number of training layers you need to train again, and cannot use the old checkpoints saved before the number of training layers was changed.

@wflijunnan After that it's all up to you. If you want to train only the last layer, change len(darknet.layers); if you want to train the whole network, just train normally as in the example in the README.

Just an update though: after training with transfer learning, my model can't detect any boxes, whereas without transfer learning my model produces boxes quite well. Both runs were trained for a similar number of epochs. I don't know what's wrong.

@borasy Have you succeeded in loading models trained with transfer learning?

@crazylyf I can load the checkpoints from the transfer learning model with no problem.

@borasy Can you give me some suggestions on how to change len(darknet.layers), please? Thank you.

@borasy, any news related to this subject? Did you have to make any modifications to load the checkpoints?

I've implemented this partially in #493 but tbh I have no idea how to disable training on batch normalization layers. If anyone wants to help, please head there.

Another question: I have 4-band images as input and I want to use the pre-trained weights as much as possible. Only the first kernel will now have size ...x4 (number of channels) instead of ...x3. So how can I merge the pre-trained weights into my model, with random initialization of the extra weights, in darkflow?
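
One possible way to think about the 4-band question, sketched with NumPy only: copy the pre-trained 3-channel kernel into a wider array and randomly initialize just the extra input channel. This assumes the TensorFlow kernel layout (height, width, in_channels, out_channels). The expand_input_channels helper, the shapes, and the initialization scale are hypothetical, and the thread does not show a darkflow option for this, so the result would still have to be loaded into the model by hand.

```python
import numpy as np


def expand_input_channels(kernel, new_in_channels, rng, scale=0.01):
    """Widen a (h, w, in, out) kernel to (h, w, new_in, out).

    Existing input channels keep their pre-trained values; the added
    channels are drawn from a small zero-mean normal distribution.
    """
    h, w, old_in, out = kernel.shape
    if new_in_channels < old_in:
        raise ValueError("new_in_channels must be >= existing input channels")
    expanded = rng.normal(0.0, scale, size=(h, w, new_in_channels, out))
    expanded = expanded.astype(kernel.dtype)
    expanded[:, :, :old_in, :] = kernel  # reuse pre-trained weights as-is
    return expanded


rng = np.random.default_rng(0)
# Stand-in for a real pre-trained first-layer kernel (3 input channels).
pretrained = rng.normal(size=(3, 3, 3, 32)).astype(np.float32)

kernel4 = expand_input_channels(pretrained, 4, rng)
print(kernel4.shape)  # (3, 3, 4, 32)
```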

Is there any blog post explaining transfer learning for the YOLO network?

@crazylyf We are getting the same error. Did you manage to load the checkpoint for prediction?
