Hi, I'm trying to fine-tune the face recognition model on my own dataset, but the layers in dlib_face_recognition_resnet_model_v1.dat differ from the net_type defined in the DNN metric learning training example. The batch normalization layers in the provided model have already been converted to affine layers, which as far as I can tell makes it impossible to train (correct me if I'm wrong). When I try to train the provided model with the affine layers, I get a NaN loss. So I think we need a version of dlib_face_recognition_resnet_model_v1.dat with the batch normalization layers in order to fine-tune it. Can you share that model? Or could you show how to retrain it by manually assigning the weights layer by layer, except for the affine ones? Thanks
Yeah, that's an issue with the model. I honestly don't have the version
with batch normalization layers because I accidentally overwrote the file
with the affine version :/
Anyway, I don't think fine-tuning the network is a great idea. I
would just train something on top of the network.
@davisking I've run into the same problem: I would like to fine-tune starting from the weights in dlib_face_recognition_resnet_model_v1.dat. As you said, you recommend training something on top of the network? Or should I train from scratch?
I am also wondering which batch normalization method was used when you trained dlib_face_recognition_resnet_model_v1.dat.
Train something on top of the network like a linear SVM.
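To make "a linear SVM on top of the network" concrete, here is a minimal sketch in plain Python. It assumes the 128-D descriptors have already been extracted (in dlib's Python API that would be `face_recognition_model_v1.compute_face_descriptor`); the descriptors below are synthetic stand-ins, and `train_linear_svm`, `predict`, and `fake_descriptors` are hypothetical helper names, not dlib functions. The frozen network is never touched, only the small classifier is trained.

```python
import random

def train_linear_svm(samples, labels, epochs=50, lr=0.01, reg=0.01, seed=0):
    """Binary linear SVM (hinge loss + L2 penalty) trained with plain SGD.

    samples: list of equal-length float lists (e.g. 128-D face descriptors)
    labels:  list of +1 / -1
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # L2 shrinkage is applied on every step.
            w = [wj * (1.0 - lr * reg) for wj in w]
            # Hinge loss only contributes a gradient inside the margin.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1

def fake_descriptors(center, count, rng, dim=128, noise=0.05):
    # ASSUMPTION: synthetic stand-ins for real descriptors of one person.
    return [[center + rng.gauss(0.0, noise) for _ in range(dim)]
            for _ in range(count)]

rng = random.Random(1)
train_x = fake_descriptors(0.1, 20, rng) + fake_descriptors(-0.1, 20, rng)
train_y = [1] * 20 + [-1] * 20
w, b = train_linear_svm(train_x, train_y)

test_x = fake_descriptors(0.1, 10, rng) + fake_descriptors(-0.1, 10, rng)
test_y = [1] * 10 + [-1] * 10
accuracy = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice an existing implementation (for example dlib's `svm_c_trainer_linear` or scikit-learn's `LinearSVC`) would likely be a better choice than hand-rolled SGD; the point is only that the classifier consumes fixed descriptors.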
The model dlib_face_recognition_resnet_model_v1.dat doesn't distinguish Asian faces very well, so I want to train a new network on top of the model with images of Asian people. Should the loss layer of the new network still be loss_metric? And how can I keep the accuracy on white faces while improving the accuracy on Asian faces if I only have images of Asian people?
It should be. You can use only your own data if you cannot get the original dataset that Davis used, but you should check from time to time that the model still handles non-Asian faces as well as before. This is easier if you have another dataset and fix a recipe at the beginning, even just a ratio of Asian to non-Asian samples (at least it helps). You can check by passing the same frame to both models and comparing the results with the ground truth. If you are fine with it, would you publish one of your last snapshots so we can upload it to the public models repo? That way we can contribute too. Or we can train and share the model on our CI (fast) if you can share your dataset.
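The check described above (same inputs through both models, compared against ground truth) could be sketched roughly as follows. This is an illustration under assumptions: `verification_accuracy` and `compare_models` are hypothetical helpers, the "images" are toy 2-D points, and the two `embed` callables stand in for the original and retrained networks. The 0.6 Euclidean distance threshold is the one dlib's face recognition example uses for the original model; a retrained model may need a different value.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verification_accuracy(embed, pairs, threshold=0.6):
    """pairs: list of (image_a, image_b, same_person) with same_person a bool."""
    correct = 0
    for img_a, img_b, same in pairs:
        predicted_same = euclidean(embed(img_a), embed(img_b)) < threshold
        correct += predicted_same == same
    return correct / len(pairs)

def compare_models(old_embed, new_embed, pairs_by_group, threshold=0.6):
    """Return {group: (old_accuracy, new_accuracy)} on identical labeled pairs."""
    return {
        group: (verification_accuracy(old_embed, pairs, threshold),
                verification_accuracy(new_embed, pairs, threshold))
        for group, pairs in pairs_by_group.items()
    }

# ASSUMPTION: toy stand-ins. Real code would load images and call each network.
old_embed = lambda img: img
new_embed = lambda img: (img[0] * 2.0, img[1] * 2.0)

pairs_by_group = {
    "group_a": [((0.0, 0.0), (0.1, 0.0), True),
                ((0.0, 0.0), (0.4, 0.0), False)],
    "group_b": [((0.0, 0.0), (0.5, 0.0), True),
                ((0.0, 0.0), (1.0, 0.0), False)],
}

report = compare_models(old_embed, new_embed, pairs_by_group)
for group, (old_acc, new_acc) in report.items():
    print(f"{group}: old={old_acc:.2f} new={new_acc:.2f}")
```

Reporting accuracy per group makes a regression visible: in this toy data the "new" model improves on one group while getting worse on the other, which is the situation the periodic check is meant to catch.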
Sure! However, I have only just started training the new network (I just added an fc layer and loss_metric for training). I will share the model once I get a good one. My data was downloaded from the CAS-PEAL face database; you can find and download it with a Google search.
Maybe we can use it, if it was not used in the first training.
@davisking the current model has not seen that dataset before, right?
No, it wasn't used.
Hello @LI-ZONG-HAN, can you share the improvements and the new network optimized for Asian faces?
After trying different numbers of fc layers with different hyperparameters, such as the learning rate and the margin in the loss_metric layer, there is no obvious improvement on Asian faces. Different Asian people are still too close together in the 128-dimensional feature space. I think there is not enough freedom left to train on Asian faces after feature extraction; some useful features of Asian faces may already be lost. Maybe training the feature-extraction layers is the only way.
Warning: this issue has been inactive for 20 days and will be automatically closed on 2018-10-26 if there is no further activity.
If you are waiting for a response but haven't received one it's likely your question is somehow inappropriate. E.g. you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's documentation, or a Google search.
Warning: this issue has been inactive for 31 days and will be automatically closed on 2018-10-24 if there is no further activity.
Notice: this issue has been closed because it has been inactive for 35 days. You may reopen this issue if it has been closed in error.