Darkflow: Transfer learning

Created on 28 Jun 2017 · 18 Comments · Source: thtrieu/darkflow

Hi,
Although a new cfg uses pre-trained weights, all of the model's weights get updated during training. Is it possible to update only the newly initialized weights of the last layer and keep the other weights frozen? It would decrease training time drastically.

borasy

All 18 comments

See issue #163. The relevant line may now be at https://github.com/thtrieu/darkflow/blob/master/darkflow/net/build.py#L59

jcarletgo on 28 Jun 2017

Thanks a lot. I am able to train just some of the layers now by modifying that code. But I wonder why len(darknet.layers) is 53 when I print it out. There should be only 30 layers in the cfg.

borasy on 29 Jun 2017

Please, can you teach me how to do transfer learning with darkflow?
I changed the number of classes and the filters in the .cfg, but during training this error shows up: AssertionError: labels.txt and cfg/yolo-voc.cfg indicate inconsistent class numbers. Can you give me some suggestions, please?

wflijunnan on 30 Jun 2017

Have you changed labels.txt to match the number of classes in your cfg? That seems to be your problem.
For transfer learning, you just need to go to line 59 of build.py and change len(darknet.layers) to the number of layers that you want to train. I suggest you print out len(darknet.layers) and try different numbers to get a sense of how it works.

You should see something similar to this

borasy on 30 Jun 2017
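
The consistency check behind that AssertionError can be reproduced by hand before training. The sketch below is not darkflow code: `read_labels` and `expected_filters` are hypothetical helper names, and the temporary labels file is made up for the demo. It assumes a YOLOv2-style cfg with 5 anchors, where the last convolutional layer needs num * (classes + 5) filters, as described in the darkflow README.

```python
import os
import tempfile


def read_labels(path):
    """Return the non-empty lines of a labels file, one class name per line."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def expected_filters(num_classes, num_anchors=5):
    """Filters for the last convolutional layer in front of a YOLOv2 region layer."""
    return num_anchors * (num_classes + 5)


# Hypothetical three-class labels file, written to a temp directory for the demo.
labels_path = os.path.join(tempfile.mkdtemp(), "labels.txt")
with open(labels_path, "w") as f:
    f.write("car\nbike\nperson\n")

labels = read_labels(labels_path)
print("classes in labels.txt:", len(labels))
print("filters for the last convolutional layer:", expected_filters(len(labels)))
```

The number of lines in labels.txt should equal the `classes` value in the cfg, and the filters value printed here should match the one set in the last convolutional layer.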

@borasy I modified the number of training layers as suggested, but the saved checkpoints cannot be loaded properly. It throws errors like:

2017-06-30 13:17:23.109268: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key 0-convolutional/biases not found in checkpoint

Any suggestions?

crazylyf on 30 Jun 2017

@borasy Thank you! I have changed labels.txt to match the number of classes, and also changed the filters. I don't know what the next step is, or whether I should change len(darknet.layers).

wflijunnan on 30 Jun 2017

@crazylyf I'm not sure about that. But maybe after changing the number of training layers you need to train again, and cannot use the old checkpoints saved before the number of training layers was changed.

borasy on 3 Jul 2017

@wflijunnan After that it's all up to you. If you want to train only the last layer, then change len(darknet.layers); if you want to train the whole network, just train normally as in the README example.

borasy on 3 Jul 2017

Just an update though: after training with transfer learning, my model can't detect any boxes, while without transfer learning my model produces boxes quite well. Both runs were trained for a similar number of epochs. I don't know what's wrong.

borasy on 3 Jul 2017

@borasy Have you succeeded in loading models trained with transfer learning?

crazylyf on 3 Jul 2017

@crazylyf I can load the checkpoints from the transfer learning model with no problem.

borasy on 3 Jul 2017

@borasy Can you give me some suggestions on how to change len(darknet.layers), please? Thank you.

wflijunnan on 6 Jul 2017

@borasy, any news related to this subject? Did you have to make any modifications to load the checkpoints?

fabiocapsouza on 26 Sep 2017

I've implemented this partially in #493 but tbh I have no idea how to disable training on batch normalization layers. If anyone wants to help, please head there.

sheerun on 29 Dec 2017

Another question: I have 4-band images as input and I want to use the pre-trained weights as much as possible. Only the first kernel will now have size ...x4 (noChannels), not ...x3. So how can I combine the pre-trained weights with randomly initialized values for those extra weights in darkflow?

onurbarut on 30 Dec 2017
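
One common way to approach the 4-band question is to widen the first kernel outside the framework: copy the pre-trained values into the first three input channels and fill the extra channel with small random values. The NumPy sketch below only illustrates that idea. It assumes the kernel is stored as (height, width, in_channels, out_channels), uses an all-ones array as a stand-in for real pre-trained weights, and `expand_input_channels` is a hypothetical helper; it does not show how to load the result into darkflow.

```python
import numpy as np


def expand_input_channels(kernel, new_in_channels, scale=0.01, seed=0):
    """Widen a conv kernel along its input-channel axis.

    Pre-trained values are kept for the existing channels; the extra
    channels are filled with small Gaussian noise.
    """
    h, w, in_ch, out_ch = kernel.shape
    if new_in_channels < in_ch:
        raise ValueError("new_in_channels must be >= the existing channel count")
    rng = np.random.default_rng(seed)
    expanded = rng.normal(0.0, scale, size=(h, w, new_in_channels, out_ch))
    expanded = expanded.astype(kernel.dtype)
    expanded[:, :, :in_ch, :] = kernel  # reuse the pre-trained weights
    return expanded


# Stand-in for a pre-trained 3x3 kernel with 3 input and 32 output channels.
pretrained = np.ones((3, 3, 3, 32), dtype=np.float32)
expanded = expand_input_channels(pretrained, 4)
print(expanded.shape)
```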

Is there any blog post explaining transfer learning for the YOLO network?

Ashwanthkumard on 12 Jun 2018

For anybody who wants to understand how to freeze a certain number of layers and train only the remaining last layers, you need to open /darkflow/net/build.py and find the line of code like:

self.ntrain = len(darknet.layers)

and change it to the number of layers you want to train (not the number of layers you want to freeze). So if you want to train only the last layer, you change the code to:

self.ntrain = 1

Then, depending on how you installed darkflow, you may need to uninstall it with pip and install it again.

ashleyjsands on 17 Sep 2018
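
To make the "layers to train, not layers to freeze" point concrete, here is a small standalone sketch of the arithmetic. It is not darkflow's implementation; `trainable_layer_indices` is a hypothetical helper, and it assumes the trained layers are counted from the end of the network, as the comment above describes. The value 53 is the len(darknet.layers) reported earlier in this thread.

```python
def trainable_layer_indices(num_layers, ntrain):
    """Indices of the layers left trainable when only the last `ntrain` are trained."""
    if not 0 <= ntrain <= num_layers:
        raise ValueError("ntrain must be between 0 and num_layers")
    first_trainable = num_layers - ntrain  # everything below this index is frozen
    return list(range(first_trainable, num_layers))


# With 53 layers, ntrain = 1 trains only the last layer.
print(trainable_layer_indices(53, 1))  # [52]
print(trainable_layer_indices(53, 3))  # [50, 51, 52]
```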

@crazylyf We are getting the same error. Did you manage to load the checkpoint for prediction?