Dlib: Problem in fine-tuning face recognition model

Created on 6 Nov 2017 · 14 Comments · Source: davisking/dlib

Hi, I'm trying to fine-tune the face recognition model on my own dataset. However, the layers in dlib_face_recognition_resnet_model_v1.dat differ from the net_type variable in the example script for training metrics with dnn. The batch_normalization layers in the provided model have already been converted into affine layers, which makes them impossible to train (correct me if I'm wrong). When I try to train the provided model with the affine layers, I get a NaN loss. So I think we need the version of dlib_face_recognition_resnet_model_v1.dat with the batch_normalization layers in order to fine-tune it. Can you share the model with the batch_normalization layers? Or maybe show us how to retrain it by manually assigning the weights layer by layer, except for the affine ones? Thanks
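For readers unfamiliar with the conversion mentioned above, here is a minimal sketch of how an inference-time batch normalization collapses into a fixed per-channel affine transform. It is generic arithmetic, not dlib's code; the function names and the numeric values are hypothetical, and `eps` is an assumed default.

```python
import math

def fold_batch_norm(gamma, beta, running_mean, running_var, eps=1e-5):
    """Collapse inference-time batch norm into y = scale * x + shift."""
    scale = gamma / math.sqrt(running_var + eps)
    shift = beta - scale * running_mean
    return scale, shift

def batch_norm_inference(x, gamma, beta, running_mean, running_var, eps=1e-5):
    """Reference formula: normalize with the stored statistics, then scale and shift."""
    return gamma * (x - running_mean) / math.sqrt(running_var + eps) + beta

# Hypothetical values for a single channel.
gamma, beta, mean, var = 1.5, -0.2, 0.8, 4.0
scale, shift = fold_batch_norm(gamma, beta, mean, var)

x = 2.0
folded = scale * x + shift
reference = batch_norm_inference(x, gamma, beta, mean, var)
print(abs(folded - reference) < 1e-9)
```

Once folded, the running mean and variance exist only inside `scale` and `shift`, so the layer no longer renormalizes activations per mini-batch during training.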

inactive

alphinside

All 14 comments

Yeah, that's an issue with the model. I honestly don't have the version
with batch normalization layers because I accidentally overwrote the file
with the affine version :/

Anyway, I don't think fine tuning the network is a great idea. I
would just train something on top of the network.

davisking on 6 Nov 2017

👍3

@davisking In fact, I have the same problem: I would like to fine-tune using the weights from dlib_face_recognition_resnet_model_v1.dat. Do you recommend training something on top of the network, or training from scratch?
Also, what kind of batch normalization method did you use when training dlib_face_recognition_resnet_model_v1.dat?

bikong2 on 25 Nov 2017

Train something on top of the network like a linear SVM.
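A minimal sketch of what "a linear SVM on top" could look like, assuming the network is used only as a frozen feature extractor that yields one 128-D descriptor per face. The descriptors below are synthetic stand-ins, the trainer is a plain SGD hinge-loss solver rather than any dlib API, and every name and hyperparameter is hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(xs, ys, lr=0.1, lam=0.001, epochs=20, seed=0):
    """Fit w, b for a binary linear SVM (labels are -1 or +1) using
    plain SGD on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    b = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = xs[i], ys[i]
            violated = y * (dot(w, x) + b) < 1.0
            # Shrink w (L2 penalty), then step along the hinge subgradient.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if violated:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0.0 else -1

# Synthetic stand-ins for 128-D face descriptors of two identities.
rng = random.Random(42)

def make_samples(center, n, noise=0.02):
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n)]

center_a = [rng.gauss(0.0, 0.1) for _ in range(128)]
center_b = [rng.gauss(0.0, 0.1) for _ in range(128)]
train_x = make_samples(center_a, 20) + make_samples(center_b, 20)
train_y = [1] * 20 + [-1] * 20
test_x = make_samples(center_a, 10) + make_samples(center_b, 10)
test_y = [1] * 10 + [-1] * 10

w, b = train_linear_svm(train_x, train_y)
accuracy = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

With real data the descriptors would come from the pretrained network, and more than two identities would need one classifier per identity (one-vs-rest) or a multiclass solver.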

davisking on 25 Nov 2017

The model "dlib_face_recognition_resnet_model_v1.dat" can't distinguish Asian people very well, so I want to train a new network on top of the model with images of Asian people. Should the loss layer of the new network still be loss_metric? And how can I keep the accuracy on white people while improving the accuracy on Asian people if I only have images of Asian people?

LI-ZONG-HAN on 31 Aug 2018

Should be. You can use only your own data if you can't get the original dataset that Davis used. But you should check from time to time that it still remembers non-Asian features as well as before. This is easier if you have another dataset and prepare a recipe at the beginning (it at least helps), even if only an Asian/non-Asian sample count ratio. You can check by passing the same frame to both models and comparing the results with the ground truth. Would you be willing to publish one of the last snapshots, so we can upload it to the public models repo? That way we can contribute too. Or we can train and share the model on our CI (fast) if you can share your dataset.
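One way to run the check described above, as a rough sketch: feed the same labeled faces to both models and score each one on pairwise verification. The 0.6 distance threshold is the one dlib's face recognition example uses for the stock model; whether it suits a retrained model is an assumption. The function name and the tiny 2-D "descriptors" are hypothetical placeholders for real 128-D outputs.

```python
import math
from itertools import combinations

def verification_accuracy(descriptors, labels, threshold=0.6):
    """Fraction of pairs where 'distance < threshold' agrees with the labels."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(labels)), 2):
        predicted_same = math.dist(descriptors[i], descriptors[j]) < threshold
        actually_same = labels[i] == labels[j]
        correct += predicted_same == actually_same
        total += 1
    return correct / total

# Ground-truth identities for four images, plus made-up outputs of two models.
labels = ["a", "a", "b", "b"]
old_model_out = [(0.0, 0.0), (0.3, 0.0), (0.5, 0.0), (1.2, 0.0)]
new_model_out = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0), (1.3, 0.0)]

old_acc = verification_accuracy(old_model_out, labels)
new_acc = verification_accuracy(new_model_out, labels)
print(old_acc, new_acc)  # 0.5 1.0
```

Running this separately on an Asian test set and a non-Asian test set would show whether a gain on one group comes at the cost of the other.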

isgursoy on 31 Aug 2018

Sure! However, I have only just started training the new network (I just added an fc layer and loss_metric for training). I will share the model once I get a good one. My data was downloaded from the CAS-PEAL face database. You can google it and download it.

LI-ZONG-HAN on 31 Aug 2018

Maybe we can use it, if it was not used in the first training.

isgursoy on 31 Aug 2018

@davisking the current model has not seen that dataset before, right?

isgursoy on 31 Aug 2018

No, it wasn't used.

davisking on 31 Aug 2018

Hello @LI-ZONG-HAN, can you share the improvements and the new network optimized for Asian faces with us?

gustavomr on 12 Sep 2018

After trying different numbers of fc layers with different hyperparameters, such as the learning rate and the margin in the loss_metric layer, there is no obvious improvement on Asian people. Different Asian people are still too close in the 128-feature space. I think there is not enough freedom for training on Asian people after feature extraction. Some good features of Asian faces could be lost. Maybe training the feature-extraction layers is the only way.

LI-ZONG-HAN on 19 Sep 2018

Warning: this issue has been inactive for 20 days and will be automatically closed on 2018-10-26 if there is no further activity.

If you are waiting for a response but haven't received one it's likely your question is somehow inappropriate. E.g. you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's documentation, or a Google search.