Facenet: A new model trained on VGGFace2

Created on 22 Jan 2018 · 65 Comments · Source: davidsandberg/facenet

I trained a model on VGGFace2 using center loss. The embedding is more powerful than the one trained on the subset of MS-Celeb. I can make the model public alongside the two available models. @davidsandberg

All 65 comments

@Shahnawazgrewal can you please share with us how you've trained the checkpoint? and can you also share with us the checkpoint?

I trained with VGGFace2 as well; the model is not as good as the one trained on MS-Celeb. Although the accuracy might be equal to or better than MS-Celeb, the TPR at 0.001 FPR is much lower (98.X compared to 99.X).

I eventually combined these two datasets and reached 99.73% accuracy and 99.63% TPR at 0.001 FPR.

@Shahnawazgrewal could you please share with us how you trained the VGGFace2 model?
When we try to train on the data, we always get the error below. Thanks!
OutOfRangeError (see above for traceback): FIFOQueue '_1_batch_join/fifo_queue' is closed and has insufficient elements (requested 9, current size 0)
[[Node: batch_join = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch_join/fifo_queue, _arg_batch_size_0_0)]]

@Shahnawazgrewal For sure, if you can share your model via Google Drive or similar, that would also be much appreciated. Thanks!

@Shahnawazgrewal I added center loss to Caffe and hit the error "math_functions.cu:155] Check failed: error == cudaSuccess (11 vs. 0) invalid argument" during training. Have you met this problem, or do you know how to fix it? Thanks.

Can you please decrease the maximum number of epochs? Please read issue #105.
I had a similar error with the MS-Celeb-1M dataset. @syy6

For sure, I will share the model with you guys. @syy6

Did you train on a subset of MS-Celeb-1M? @JianbangZ

@Shahnawazgrewal my subset of MS-Celeb-1M is 70k identities, 4.5 million images. I can achieve 99.5% accuracy and 99.3% TPR with it

@Shahnawazgrewal , actually I even tried to reduce the number of epochs, but the issue still exists......

@Shahnawazgrewal it would be very nice if you could share the hyperparameters you used in training. I've recently tried to use the VGGFace2 to train by triplet loss but with no luck. The LFW accuracy and validation rate just levelled off at around 0.96 and 0.7.

@syy6 Did you check this out before? #600

@yipsang @Shahnawazgrewal, I just found the issue: one of the input PNG files on my computer was broken, which caused this error. After removing that file, training seems to be fine now.

@JianbangZ, could you please share with us how you handled the duplicates between the MS-Celeb and VGGFace2 datasets? If you look at the name lists of the two datasets, certain names appear in both.

@Shahnawazgrewal Dude... where have you uploaded your model?

Here is the link to download a pre-trained model trained with inception-ResNet-v1 with center loss function on VGGFace2 dataset.
Please give your general feedback.

@Shahnawazgrewal , Did you perform the pre-training on MS-Celeb-1M and then fine-tune on VGGFace2 dataset?

No, I didn't.
I trained Inception-ResNet-v1 with the center loss function on the VGGFace2 dataset from scratch.
More specifically, I downloaded the loosely cropped faces dataset from VGGFace2 (http://www.robots.ox.ac.uk/~vgg/data/vgg_face2/). The dataset was aligned to a 160×160 image size with a 32-pixel margin using Multi-task CNN (MTCNN). I trained the model on the aligned dataset for 100 epochs with an RMSProp optimizer. @Yeongjae
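The margin step of the alignment described above can be sketched as follows. This is a minimal illustration, assuming the margin is split evenly between the two sides of the detector's bounding box and clipped at the image borders. The function name `crop_box_with_margin` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example; the face detector itself and the final resize to 160×160 are left out.

```python
def crop_box_with_margin(box, image_width, image_height, margin=32):
    """Expand a face bounding box by `margin` pixels in total per axis.

    `box` is (x1, y1, x2, y2) in pixels, as a face detector might return it.
    Half of the margin is added on each side, and the result is clipped so
    that it never leaves the image.
    """
    x1, y1, x2, y2 = box
    half = margin / 2
    left = int(max(x1 - half, 0))
    top = int(max(y1 - half, 0))
    right = int(min(x2 + half, image_width))
    bottom = int(min(y2 + half, image_height))
    return left, top, right, bottom


if __name__ == "__main__":
    # A detection well inside a 250x250 image gets 16 px added on every side.
    print(crop_box_with_margin((75, 60, 175, 180), 250, 250))
    # A detection touching the border is clipped instead of padded.
    print(crop_box_with_margin((0, 10, 90, 240), 250, 250))
```

The cropped region would then be resized to the network's input size (160×160 in this thread) before training or inference.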

@Shahnawazgrewal could you add proper descriptions for the checkpoint files that have been uploaded to Dropbox?

@Shahnawazgrewal based on our evaluation, your model is truly more powerful than both of the provided pretrained models. I wonder what the reason behind this improvement is. Is the training dataset the root cause?

P.S. Thank you for uploading this wonderful pretrained checkpoint.

@tenggyut, the VGGFace2 dataset is considered to be a deep dataset (a higher number of images per identity). In my opinion, this could be the reason. In addition, I observed that the model trained on VGGFace2 produced better representations of previously unseen faces.

@Shahnawazgrewal did you train the model as a classifier or using triplet loss?

I trained the model based on center loss. @tenggyut

@Shahnawazgrewal
I have a few questions about your implementation details:

  1. Did you do 2D alignment, or just crop a 160x160 bounding box after MTCNN?
  2. What learning rate did you use with RMSProp? Did you decrease the learning rate?

Thanks for sharing!

  1. I just cropped a 160x160 bounding box after MTCNN.
  2. I used the default settings available.

Did you train the model with softmax loss combined with center loss, or with center loss only?

combined.

Validation on LFW dataset with the model trained on VGGFace2
Runnning forward pass on LFW images
Model directory: /home/super/datasets/lfw/vggface2-cl
Metagraph file: model-20171216-232945.meta
Checkpoint file: model-20171216-232945.ckpt-100000
Runnning forward pass on LFW images
Accuracy: 0.992+-0.004
Validation rate: 0.96000+-0.01880 @ FAR=0.00067
Area Under Curve (AUC): 0.999
Equal Error Rate (EER): 0.008
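For readers comparing these numbers with the "TPR at 0.001 FPR" figures quoted earlier in the thread: the validation rate is the fraction of same-identity pairs accepted at a given distance threshold, and FAR is the fraction of different-identity pairs wrongly accepted at that same threshold. A minimal sketch, assuming pairs are scored by embedding distance and accepted when the distance is below the threshold. The function name and the toy numbers are illustrative only; this is not the repository's evaluation code.

```python
def val_and_far(distances, is_same, threshold):
    """Return (validation rate, false accept rate) at a distance threshold."""
    true_accept = false_accept = n_same = n_diff = 0
    for dist, same in zip(distances, is_same):
        accepted = dist < threshold  # a pair is accepted if it is close enough
        if same:
            n_same += 1
            true_accept += accepted
        else:
            n_diff += 1
            false_accept += accepted
    val = true_accept / n_same if n_same else 0.0
    far = false_accept / n_diff if n_diff else 0.0
    return val, far


if __name__ == "__main__":
    # Four same-identity pairs followed by four different-identity pairs.
    distances = [0.3, 0.5, 0.6, 1.2, 0.9, 1.4, 0.7, 1.1]
    is_same = [True, True, True, True, False, False, False, False]
    print(val_and_far(distances, is_same, threshold=0.8))
```

In practice the threshold is chosen so that FAR hits a target value (such as 0.001), and the validation rate is then reported at that threshold.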

I trained with cosine Face algorithms. Accuracy is 0.995, validation rate = 0.985.

@JianbangZ what do you mean by cosine face algorithms? Did you replace the L2 norm in center loss with cosine similarity? Or do you mean the paper the author released last year, SphereFace?

@JianbangZ Have you tried ArcFace? Can you share your model?

@Shahnawazgrewal
Q1: Did you modify the learning rate? If yes, can you share the modified value?
Q2: Did you modify any parameters before training, e.g. weight_decay, center_loss factor, center_loss alpha, etc.? If yes, can you share them? Thanks for your help.

@Shahnawazgrewal
How many epochs does the model take to converge?
What are the Loss/RegLoss values after the model converges on the VGG dataset?
Hoping to get your advice. Thank you!

I use the default settings from the facenet implementation. @akimo12345

  1. No.
  2. No.

I trained the model for 100 epochs. @yemenr

@Shahnawazgrewal I am facing this issue while using this pre-trained VGGFace2 model:
/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Loading model...
2018-05-21 12:46:43.698367: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698396: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698400: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698404: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698422: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Model loaded
Loading MTCNN Face detection model
MTCNN Model loaded
[INFO] camera sensor warming up...
Traceback (most recent call last):
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1039, in _do_call
return fn(*args)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _run_fn
status, run_metadata)
File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
next(self.gen)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value InceptionResnetV1/Conv2d_1a_3x3/weights
[[Node: InceptionResnetV1/Conv2d_1a_3x3/weights/read = Identity[T=DT_FLOAT, _class=["loc:@InceptionResnetV1/Conv2d_1a_3x3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"](InceptionResnetV1/Conv2d_1a_3x3/weights)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 153, in <module>
main(args);
File "main.py", line 24, in main
camera_recog()
File "main.py", line 53, in camera_recog
features_arr = extract_feature.get_features(aligns)
File "/home/anju/rashmi_folder/FaceRec_old_before_24Apr_2018/face_feature.py", line 30, in get_features
return self.sess.run(self.embeddings, feed_dict = {self.x : images})
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 778, in run
run_metadata_ptr)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 982, in _run
feed_dict_string, options, run_metadata)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
target_list, options, run_metadata)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value InceptionResnetV1/Conv2d_1a_3x3/weights
[[Node: InceptionResnetV1/Conv2d_1a_3x3/weights/read = Identity[T=DT_FLOAT, _class=["loc:@InceptionResnetV1/Conv2d_1a_3x3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"](InceptionResnetV1/Conv2d_1a_3x3/weights)]]

Caused by op 'InceptionResnetV1/Conv2d_1a_3x3/weights/read', defined at:
File "main.py", line 151, in <module>
extract_feature = FaceFeature(FRGraph)
File "/home/anju/rashmi_folder/FaceRec_old_before_24Apr_2018/face_feature.py", line 21, in __init__
resnet.inference(self.x, 0.6, phase_train=False)[0], 1, 1e-10); #some magic numbers that u dont have to care about
File "/home/anju/rashmi_folder/FaceRec_old_before_24Apr_2018/architecture/inception_resnet_v1.py", line 155, in inference
reuse=reuse)
File "/home/anju/rashmi_folder/FaceRec_old_before_24Apr_2018/architecture/inception_resnet_v1.py", line 185, in inception_resnet_v1
scope='Conv2d_1a_3x3')
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 918, in convolution
outputs = layer.apply(inputs)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 320, in apply
return self.__call__(inputs, **kwargs)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 286, in __call__
self.build(input_shapes[0])
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/layers/convolutional.py", line 138, in build
dtype=self.dtype)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1049, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 948, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 349, in get_variable
validate_shape=validate_shape, use_resource=use_resource)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1389, in wrapped_custom_getter
*args, **kwargs)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 275, in variable_getter
variable_getter=functools.partial(getter, **kwargs))
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 228, in _add_variable
trainable=trainable and self.trainable)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1334, in layer_variable_getter
return _model_variable_getter(getter, *args, **kwargs)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1326, in _model_variable_getter
custom_getter=getter, use_resource=use_resource)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 262, in model_variable
use_resource=use_resource)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 217, in variable
use_resource=use_resource)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 341, in _true_getter
use_resource=use_resource)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 714, in _get_single_variable
validate_shape=validate_shape)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 197, in __init__
expected_shape=expected_shape)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 316, in _init_from_args
self._snapshot = array_ops.identity(self._variable, name="read")
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1338, in identity
result = _op_def_lib.apply_op("Identity", input=input, name=name)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
op_def=op_def)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
self._traceback = _extract_stack()

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value InceptionResnetV1/Conv2d_1a_3x3/weights
[[Node: InceptionResnetV1/Conv2d_1a_3x3/weights/read = Identity[T=DT_FLOAT, _class=["loc:@InceptionResnetV1/Conv2d_1a_3x3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"](InceptionResnetV1/Conv2d_1a_3x3/weights)]]

please help to solve this.

@Shahnawazgrewal Your model works very well for my problem. Can you share the code you used for training the model on VGGFace2? I want to fine-tune your model. Thank you very much!

I used the same code train_softmax.py with default parameters.

@Shahnawazgrewal Thank you very much !

@Shahnawazgrewal When you trained your model on VGGFace2, did you prefilter the dataset?

No. It is a pretty clean dataset.

@yipsang I tried to run train_tripletloss.py to train on the VGG dataset, but the program crashed while saving a checkpoint. How did you train on the VGG dataset using triplet loss? Are there any changes I should make?

@thuoctran I met the same problem. You should modify train_tripletloss.py line 175, replacing

saver.restore(sess, os.path.expanduser(args.pretrained_model))

with

ckpt = tf.train.get_checkpoint_state(os.path.expanduser(args.pretrained_model))
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)
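The change suggested above amounts to accepting a checkpoint directory, rather than only a checkpoint prefix, as the pretrained model argument. A hedged sketch of that idea as a small helper: `resolve_checkpoint` is a hypothetical name, and the directory lookup is passed in as a callable so the helper itself has no TensorFlow dependency. In TensorFlow 1.x, `tf.train.latest_checkpoint` is one function of this shape (it takes a directory and returns a checkpoint prefix, or `None` if none is found).

```python
import os


def resolve_checkpoint(pretrained_model, latest_checkpoint):
    """Return a checkpoint prefix that can be handed to saver.restore().

    `pretrained_model` may be a directory containing checkpoints or a
    checkpoint prefix (a path ending in something like "model.ckpt-100000").
    `latest_checkpoint` is a callable that maps a directory to its newest
    checkpoint prefix, or None if the directory holds no checkpoint.
    """
    path = os.path.expanduser(pretrained_model)
    if os.path.isdir(path):
        found = latest_checkpoint(path)
        if found is None:
            raise FileNotFoundError("no checkpoint found in " + path)
        return found
    # Not a directory: assume the caller already passed a checkpoint prefix.
    return path
```

With TensorFlow 1.x available, a call might look like `saver.restore(sess, resolve_checkpoint(args.pretrained_model, tf.train.latest_checkpoint))`.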

@Shahnawazgrewal Didn't you modify the center_loss_factor parameter when you trained your model? I see that the default value is 0.0. Did you train your model with the value 0.0?

I used 1e-2. @Laviyy
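To make the "softmax combined with center loss" setup concrete: the total training loss adds the center loss, scaled by a small factor (1e-2 here), to the softmax cross-entropy. The sketch below is a simplified plain-Python illustration under that assumption, not the code used for training. The function names and toy values are made up for the example, and both the running update of the class centers and any weight-decay terms are omitted.

```python
def center_loss(features, labels, centers):
    """Mean squared difference between each embedding and its class center."""
    total, count = 0.0, 0
    for feature, label in zip(features, labels):
        for value, center_value in zip(feature, centers[label]):
            total += (value - center_value) ** 2
            count += 1
    return total / count


def total_loss(softmax_loss, features, labels, centers, center_loss_factor=1e-2):
    """Softmax cross-entropy plus the weighted center loss."""
    return softmax_loss + center_loss_factor * center_loss(features, labels, centers)


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0]]  # two 2-D embeddings
    labels = [0, 1]                      # their class indices
    centers = [[0.0, 0.0], [0.0, 0.0]]   # one center per class
    print(center_loss(features, labels, centers))
    print(total_loss(2.0, features, labels, centers))
```

With a factor of 0.0 the second term vanishes and training reduces to plain softmax classification, which is why the value of this parameter matters.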

Thank you.

@JianbangZ You trained a model on the combined VGGFace2 and MS-Celeb datasets. I would like to know how you combined the two datasets. Which algorithm did you use when training your model: triplet loss, softmax loss, center loss or something else? Can you share your model with us? Thanks anyway!

@Shahnawazgrewal First of all, thank you for your help. I trained my model according to your directions, but the result is not good. I would like to know the margin value you used when cropping images with MTCNN. I find that some faces are still slanted. Did you use an affine transformation to rotate the faces upright?

@Laviyy margin is 30, this is good

@rain2008204 ok, thank you!

@syy6 Hello, I also have this problem (OutOfRangeError). What can I do to solve it?

@phoebushe, I just checked whether there were any broken images in the input, and I found one broken image.

@syy6, the VGGFace2 dataset has over 3 million images. Could you tell me how to check the images? Thanks!

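@phoebushe, just use Python. Code snippet below:
```python
from PIL import Image

file_path = "path/to/image.png"  # placeholder: the image file to check

# verify() raises an exception if the file is broken
with Image.open(file_path) as image:
    image.verify()
```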
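To run the check above over a whole dataset tree, something like the following could be used. It is a sketch rather than a tested tool: `find_broken_images` and `verify_with_pillow` are hypothetical names, the extension list is an assumption, and Pillow is imported lazily so that a different checker can be plugged in. Note that `verify()` does not decode the pixel data, so some truncated files may still pass.

```python
import os


def verify_with_pillow(path):
    """Raise an exception if Pillow considers the file broken."""
    from PIL import Image  # imported here so the scanner works with any checker

    with Image.open(path) as image:
        image.verify()


def find_broken_images(root, check=verify_with_pillow,
                       extensions=(".jpg", ".jpeg", ".png")):
    """Walk `root` and return the paths of images for which `check` raises."""
    broken = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(extensions):
                continue  # skip files that do not look like images
            path = os.path.join(dirpath, name)
            try:
                check(path)
            except Exception:
                broken.append(path)
    return broken
```

The returned list can be reviewed by hand before deleting anything; for a dataset of this size the scan is I/O bound and may take a while.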

@syy6 Thanks a lot!!

@syy6 image.verify() is the best way to check up to now.

Hi all,
In this repo, @davidsandberg evaluates on LFW using the LFW dataset and its pairs.txt file.
How do you evaluate on your own dataset (with a pairs.txt file, etc.) after fine-tuning on an Asian-face dataset?
I don't know how to evaluate on a new dataset (I fine-tuned on an Asian-face dataset).
Thank you

Q1: What were your learning rate, learning rate decay epochs and learning rate decay factor in the experiment using center loss? I find that the default settings of these hyperparameters are 0.1, 100 and 1.0, respectively. Did you modify them? If yes, can you share the modified values?

Q2: Have you tried training an Inception-ResNet-v1 from scratch with triplet loss using CASIA-WebFace, MS-Celeb-1M or VGGFace2? I wonder whether triplet loss can pre-train Inception-ResNet-v1 well and obtain better performance on the LFW dataset in both rank-1 accuracy and verification rate.

How can I train on images of size 100x100 rather than 160x160?

Hi @Shahnawazgrewal, we are very much impressed with the performance of your model. Thank you.

@Shahnawazgrewal Hey thanks for releasing your model! However, it looks like the link is broken. Did you change the location of the link by any chance? Thanks in advance!

Can someone tell me what computer specs you are using to train the model on this dataset, or the cloud specs if it was trained in the cloud? Thanks in advance. I am a beginner in computer vision and am trying to train a model of my own from scratch on this dataset; the model is similar to the Inception model used in facenet.

Hi @Shahnawazgrewal
Do you know how I can fine-tune this classifier for my dataset?

@Shahnawazgrewal The link doesn't work.

Link broken
