Insightface: Triplet loss (FaceNet) for few-shot learning

Created on 15 Aug 2018  ·  6 Comments  ·  Source: deepinsight/insightface

For few-shot learning tasks such as IDCard<->Camera face verification (or identification), we usually have only two face images per person for training. In this situation, metric learning approaches such as triplet loss are worth trying.
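
For reference, the triplet loss as defined in the FaceNet paper (which the title alludes to) can be written as below. Here $f$ is the embedding network, $x_i^a$, $x_i^p$ and $x_i^n$ are an anchor image, a positive image of the same identity and a negative image of a different identity, and $\alpha$ is the margin. The notation follows FaceNet and is not taken from this post.

```latex
\mathcal{L} = \sum_{i=1}^{N} \max\left(0,\ \lVert f(x_i^a) - f(x_i^p) \rVert_2^2 - \lVert f(x_i^a) - f(x_i^n) \rVert_2^2 + \alpha \right)
```

With two images per identity, each person contributes two anchor-positive pairs (each image serves as the anchor once), and the negatives come from other identities.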

STEPS:

  1. Prepare an insightface '.rec' dataset from your IDCard/camera face images.
  2. Fine-tune a pretrained model with triplet loss, for example:

    CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_triplet.py --data-dir $DATA_DIR \
      --network "$NETWORK" --lr 0.005 --pretrained "$PRETRAINED" --per-batch-size 60

Semi-hard mining runs on the GPU, so training is fast.
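
As a rough illustration of what semi-hard mining selects, here is a minimal CPU sketch in plain Python. It assumes the FaceNet definition of a semi-hard negative (farther from the anchor than the positive, but by less than the margin) and picks the closest such negative. `train_triplet.py` may select triplets differently, and it does this work on the GPU. The function names and the margin of 0.3 are illustrative and not taken from the repository.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def mine_semi_hard(embeddings, labels, margin=0.3):
    """Return (anchor, positive, negative) index triplets.

    For every anchor-positive pair, keep the closest negative n with
    d(a, p) < d(a, n) < d(a, p) + margin. Pairs without such a negative
    are skipped. (Choosing the closest candidate is an assumption of
    this sketch.)
    """
    triplets = []
    n = len(labels)
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue
            d_ap = sq_dist(embeddings[a], embeddings[p])
            # Distances from the anchor to every sample of another identity.
            candidates = [
                (sq_dist(embeddings[a], embeddings[k]), k)
                for k in range(n)
                if labels[k] != labels[a]
            ]
            semi_hard = [(d, k) for d, k in candidates if d_ap < d < d_ap + margin]
            if semi_hard:
                _, neg = min(semi_hard)
                triplets.append((a, p, neg))
    return triplets


def triplet_loss(embeddings, triplets, margin=0.3):
    """Mean of max(0, d(a, p) - d(a, n) + margin) over the given triplets."""
    if not triplets:
        return 0.0
    losses = [
        max(0.0, sq_dist(embeddings[a], embeddings[p])
            - sq_dist(embeddings[a], embeddings[n]) + margin)
        for a, p, n in triplets
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy batch: two identities with two images each (hypothetical values).
    emb = [[0.0, 0.0], [1.0, 0.0], [1.1, 0.0], [5.0, 0.0]]
    ids = [0, 0, 1, 1]
    mined = mine_semi_hard(emb, ids)
    print(mined, triplet_loss(emb, mined))
```

Semi-hard negatives always produce a non-zero loss (they violate the margin) without being closer to the anchor than the positive, which is why FaceNet prefers them to the very hardest negatives early in training.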

RESULTS:

We have a private IDCard/Camera face image dataset with 220K identities. Each person has two or more photos: one from the ID card and the others from a camera. We split it 8:2 into training (176K IDs) and testing (45K IDs) sets. We report top-1 accuracy and TAR vs. FAR for the 1:N identification task (N=45K).
(Note that the IDCard training data is not used for the Model-A variants or Model-B.)

| Model | Description | Rank-1 | TAR@FAR=1e-3 | TAR@FAR=1e-4 |
| -------- | ---------------------------------------- | -------------------- | ------------ | ------------ |
| Model-A1 | LResNet100E trained on ms1m-v1 with Softmax loss | 26.9% | 0.3% | 0.06% |
| Model-A2 | LResNet100E trained on ms1m-v1 with ArcFace loss | 70.7% | 17% | 8% |
| Model-A3 | LResNet100E trained on ms1m-v2 with ArcFace loss | 76.8% | 21% | 9% |
| Model-B | LResNet100E trained on (ms1m-v2+Glint-Asia) with ArcFace loss | 82.4% | 33% | 16% |
| Model-C | Triplet-loss finetuning on Model-B | 95.2%(still ongoing) | 78% | 26% |
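
The post does not spell out its evaluation protocol, so the following is only one plausible reading of how Rank-1 and TAR@FAR could be computed. It assumes a gallery with one ID card embedding per test identity, camera embeddings as probes, cosine similarity as the score, and TAR@FAR measured over all probe-gallery pairs. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def rank1_accuracy(probes, probe_ids, gallery, gallery_ids):
    """Fraction of probes whose best-scoring gallery entry has the same ID."""
    hits = 0
    for emb, pid in zip(probes, probe_ids):
        scores = [cosine(emb, g) for g in gallery]
        best = scores.index(max(scores))
        hits += gallery_ids[best] == pid
    return hits / len(probes)


def tar_at_far(probes, probe_ids, gallery, gallery_ids, far):
    """True accept rate at a threshold that admits at most `far` of impostor pairs."""
    genuine, impostor = [], []
    for emb, pid in zip(probes, probe_ids):
        for g_emb, gid in zip(gallery, gallery_ids):
            (genuine if gid == pid else impostor).append(cosine(emb, g_emb))
    impostor.sort(reverse=True)
    # Number of impostor pairs allowed to score above the threshold.
    allowed = int(far * len(impostor))
    threshold = impostor[allowed] if allowed < len(impostor) else -math.inf
    # A pair is accepted only if its score is strictly above the threshold.
    return sum(s > threshold for s in genuine) / len(genuine)


if __name__ == "__main__":
    # Toy data: three identities in a 2-D embedding space (hypothetical values).
    gallery = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    gallery_ids = ["a", "b", "c"]
    probes = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.8]]
    probe_ids = ["a", "b", "c"]
    print(rank1_accuracy(probes, probe_ids, gallery, gallery_ids))
    print(tar_at_far(probes, probe_ids, gallery, gallery_ids, far=0.2))
```

With N=45K gallery entries, every probe contributes roughly 45K impostor scores and a single genuine score, so even a small FAR such as 1e-4 corresponds to several false matches per probe. This is one reason TAR at low FAR can be much lower than Rank-1 accuracy.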

All 6 comments

176K does not seem like a huge number. Have you tried fine-tuning with ArcFace instead of triplet loss? I think that might achieve better results.

Will you release the emore + Glint-Asia data, or show how to combine the two?

Could you upload a triplet-loss fine-tuning log file? @nttstar

ArcFace is a good example of deep learning popularization.
I would like to thank nttstar for working on the ArcFace code as a deep learning researcher.
I developed TensorFlow training code based on your MXNet face recognition training code.
I would like to cooperate with you. If you ask, I can send you my TensorFlow ArcFace code. However, I cannot publish it on GitHub due to some problems.
I use TensorFlow, which may be inconvenient for you since you use MXNet, but I would still like to cooperate with you.
My skype id is kwakjiwon1986.
If you accept my request, please contact me.
Thanks,
Kwak Ji Won

@nttstar Would you mind releasing the training log for Model-B? I am not getting good accuracy when training MobileFaceNet on the combined emore + Glint-Asia dataset.
Thank you very much!

I trained a model similar to Model-B with limited resources (two 2080 Ti GPUs). I am not sure I reached the full potential of the most complicated model. The accuracy on agedb_30 is slightly lower than I would like:

testing verification..
(12000, 512)
infer time 23.104655
[lfw][552000]XNorm: 20.384019
[lfw][552000]Accuracy-Flip: 0.99783+-0.00269
testing verification..
(14000, 512)
infer time 27.093829
[cfp_fp][552000]XNorm: 21.291860
[cfp_fp][552000]Accuracy-Flip: 0.98386+-0.00478
testing verification..
(12000, 512)
infer time 23.229146
[agedb_30][552000]XNorm: 21.327194
[agedb_30][552000]Accuracy-Flip: 0.97550+-0.00624
