Insightface: why does accuracy on the training set improve so slowly?

Created on 8 Jan 2019  ·  5 Comments  ·  Source: deepinsight/insightface

I tested the three models from the instructions on the MS1M training set:
(1) Train ArcFace with LResNet100E-IR.
(2) Train CosineFace with LResNet50E-IR.
(3) Train Softmax with LMobileNetE.
I found that at the same training stage (same dataset, same number of epochs), the models have very different accuracy on the training set. For model (1) in particular, training-set accuracy grows very slowly.

The results are as follows:

lr-batch-epoch: 0.1 11999 0
model | acc on LFW | acc on cfp_fp | acc on agedb_30 | acc on training set
(1)-11999 | 98.76 | 85.73 | 90.57 | 0.015430
(2)-11999 | 98.45 | 80.66 | 89.18 | 0.042383
(3)-11999 | 97.71 | 83.64 | 83.03 | 0.203125

lr-batch-epoch: 0.1 19767 3
model | acc on LFW | acc on cfp_fp | acc on agedb_30 | acc on training set
(1)-19767 | 99.17 | 87.04 | 94.37 | 0.041
(2)-19767 | 99.12 | 86.61 | 93.52 | 0.096
(3)-19767 | 99.12 | 91.86 | 89.95 | 0.56

At batch 11999, model (1) appears to get the best results on the test datasets, yet its training-set acc is only about 0.01. It seems that the LResNet100E-IR architecture is designed so well that nearly random weights and biases already give a good result.
However, at batch 19767 model (1) no longer looks as good, so why does the training-set acc of model (1) grow so slowly? Does anyone have a good method to make it grow faster?

All 5 comments

Another problem:
Take the result (1)-11999 | 98.76 | 85.73 | 90.57 | 0.015430. It suggests that model (1) has learned very little from the training set (its training-set acc is only 0.015), yet it gets good results on the test datasets agedb, cfp and lfw. However, every coin has two sides: if model (1) falls into a local optimum, how can I help it escape when its training-set acc improves so slowly?

Thank you very much for your reply.

If you want, add the margin back to see the REAL softmax training acc.

@nttstar
What does "add the margin back" mean?

@nttstar, what do you mean by the "real" acc? Does a different margin influence the training acc? I'm not sure my understanding is right. Maybe when the margin is larger, training samples are more likely to fall into the margin region between two classes, so samples inside the margin are not counted as correct?

What I mean is to remove the margin when you want to get the real softmax training acc. The margin should only be applied in the loss and in backward propagation.
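
To illustrate the point about the margin, here is a minimal sketch in plain Python, not the insightface training code. It assumes an ArcFace-style additive angular margin on the target class with illustrative values `m=0.5` and `s=64`, and the cosine similarities in `batch` are made up. The helper names (`arcface_logits`, `plain_logits`, `accuracy`) are hypothetical. The sketch also skips the special handling some implementations apply when `theta + m` exceeds pi.

```python
import math


def plain_logits(cosines, s=64.0):
    """Scaled cosine logits with no margin (what 'real' softmax acc uses)."""
    return [s * c for c in cosines]


def arcface_logits(cosines, label, m=0.5, s=64.0):
    """Scaled logits with an additive angular margin m on the target class only."""
    logits = [s * c for c in cosines]
    # Clamp before acos to guard against tiny numerical overshoot.
    theta = math.acos(max(-1.0, min(1.0, cosines[label])))
    logits[label] = s * math.cos(theta + m)
    return logits


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def accuracy(batch, labels, with_margin):
    """Fraction of samples whose highest logit is the labelled class."""
    hits = 0
    for cosines, label in zip(batch, labels):
        if with_margin:
            logits = arcface_logits(cosines, label)
        else:
            logits = plain_logits(cosines)
        hits += argmax(logits) == label
    return hits / len(batch)


# Hypothetical cosine similarities between 4 embeddings and 3 class centres.
batch = [
    [0.60, 0.50, 0.10],  # label 0: correct, but not by a wide angular gap
    [0.20, 0.95, 0.10],  # label 1: correct with a wide gap
    [0.30, 0.20, 0.40],  # label 2: correct, narrow gap
    [0.70, 0.10, 0.20],  # label 1: genuinely misclassified
]
labels = [0, 1, 2, 1]

acc_with_margin = accuracy(batch, labels, with_margin=True)
acc_without_margin = accuracy(batch, labels, with_margin=False)
print("acc with margin:   ", acc_with_margin)
print("acc without margin:", acc_without_margin)
```

On this toy batch the margin-penalised logits give an accuracy of 0.25 while the plain logits give 0.75. Samples that are classified correctly but sit inside the margin are counted as wrong when the margin is left in, which is one way a reported training acc can look very low while test accuracy is already high.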
