Facenet: Question: what algorithm to use for clustering with embedding vectors?

Created on 19 Oct 2017 · 12 Comments · Source: davidsandberg/facenet

Just wondering: how do I cluster the embeddings once I have extracted them from a number of pictures?

All 12 comments

I have done it with DBSCAN, using the Euclidean distance computed between the embedding vectors. You can find my code in #441 if you are interested.

This is a different clustering algorithm (Chinese whispers):
https://github.com/zhly0/facenet-face-cluster-chinese-whispers-
You do not need to specify the number of clusters; all you need to specify is the threshold.
https://github.com/davidsandberg/facenet/issues/370
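
For readers who want to see the idea without opening the repository, here is a minimal sketch of Chinese whispers on a threshold graph. It is not the code from the linked repository: the function name `chinese_whispers`, the unweighted edges, the fixed iteration count and the seed are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def chinese_whispers(embeddings, threshold, iterations=20, seed=0):
    """Return one cluster label per embedding (hypothetical helper)."""
    n = len(embeddings)
    # Connect every pair of faces whose Euclidean distance is below the threshold.
    neighbours = defaultdict(list)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(embeddings[i], embeddings[j]) < threshold:
                neighbours[i].append(j)
                neighbours[j].append(i)

    labels = list(range(n))  # every node starts in its own cluster
    rng = random.Random(seed)
    order = list(range(n))
    for _ in range(iterations):
        rng.shuffle(order)  # visit nodes in a fresh random order each round
        for node in order:
            if not neighbours[node]:
                continue  # isolated faces keep their own label
            # Adopt the label that is most common among the neighbours.
            votes = defaultdict(int)
            for other in neighbours[node]:
                votes[labels[other]] += 1
            labels[node] = max(votes, key=votes.get)
    return labels


# Toy 2-D points standing in for real embeddings.
points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
print(chinese_whispers(points, threshold=1.0))
```

The number of clusters falls out of the graph structure, which is why only the threshold has to be chosen.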

@zhly0
DBSCAN doesn't require you to specify the number of clusters either; you just feed it the distance matrix and it clusters based on the chosen threshold as well.
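
A minimal sketch of that approach with scikit-learn, assuming the embeddings are already available as rows of an array. The helper name `cluster_embeddings`, the L2 normalization step, `min_samples=2` and the toy data are assumptions for illustration, not the code from #441.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances


def cluster_embeddings(embeddings, eps=0.5, min_samples=2):
    """Cluster face embeddings with DBSCAN on a precomputed distance matrix."""
    emb = np.asarray(embeddings, dtype=np.float64)
    # L2-normalize each row so distances are comparable across images (assumption).
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    # Pairwise Euclidean distances, shape (n, n).
    dist = pairwise_distances(emb, metric="euclidean")
    # metric="precomputed" tells scikit-learn that `dist` already holds distances.
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="precomputed")
    return model.fit_predict(dist)


# Toy data standing in for real embeddings: two "identities" and one outlier.
toy = [
    [1.0, 0.0, 0.0, 0.0],
    [0.99, 0.05, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.05, 0.99, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
labels = cluster_embeddings(toy)
print(labels)  # the label -1 marks faces that DBSCAN treats as noise
```

Here `eps` plays the role of the threshold: two faces are neighbours when their distance is at most `eps`.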

I have also clustered using DBSCAN, as it doesn't require specifying the number of clusters.

@Shahnawazgrewal @MaartenBloemen I have tried clustering without using MTCNN (the align function). I precomputed and cropped all images with dlib into a directory; the loaded array has shape (4000, 160, 160, 3), and all 4000 instances belong to different classes (images with varying brightness, saturation, etc.).
In cluster.py, I changed main() as follows:

    with tf.Graph().as_default():
        with tf.Session() as sess:
            facenet.load_model(args.model)
            # image_list = load_images_from_folder(args.data_dir)
            # images = align_data(image_list, args.image_size, args.margin, pnet, rnet, onet)
            images = load_images_from_folder(args.data_dir)

Then I performed clustering, but only repeated images fall into the same cluster. I tested thresholds from 1.0 down to 0.5 and the result was the same every time.
My input directory contains images like the one below:
[image: selection_021]

Which pretrained model are you using? @RaviRaaja

@Shahnawazgrewal 20170512-110547 and the VGGFace2 pretrained model uploaded by you.

Can you upload a subset of the images (say 100) to Dropbox so I can test? @RaviRaaja

Some of the clusters with eps = 0.50:
[five screenshots of example clusters, dated 2018-03-24]

You can also use HDBSCAN. @RaviRaaja

@Shahnawazgrewal I will try HDBSCAN. Can you add some more notes about the DBSCAN experiment for the results above? In the Chinese whispers clustering code no normalization is done, while in the DBSCAN code normalization is done. Does prewhiten (the module name) have an impact on clustering?

I have not experimented with the prewhiten module. @RaviRaaja
Apparently, the results are okay.
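
On the prewhiten question, nobody in this thread measured its effect, so the following is only a sketch of what that kind of normalization does, assuming it amounts to per-image standardization (zero mean, unit standard deviation). The name `prewhiten_sketch` is hypothetical; check `facenet.py` for the actual function before relying on it.

```python
import numpy as np


def prewhiten_sketch(image):
    """Shift an image to zero mean and scale it to unit standard deviation."""
    x = np.asarray(image, dtype=np.float64)
    mean = x.mean()
    # The lower bound keeps the division safe for (nearly) constant images.
    std_adj = max(x.std(), 1.0 / np.sqrt(x.size))
    return (x - mean) / std_adj


# Two toy "images" with the same content but different brightness and contrast.
dark = np.arange(12, dtype=np.float64).reshape(2, 2, 3)
bright = dark * 3.0 + 40.0
print(np.allclose(prewhiten_sketch(dark), prewhiten_sketch(bright)))
```

If the pipeline that produced the embeddings applied this step and the clustering script does not (or the other way round), the network sees differently scaled inputs, which is one plausible reason for results to differ between the two setups.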
