Hi,
I'm trying to apply faiss to my face identification approach, but I don't have a very clear idea of how to do it.
For example, I have 5000 feature vectors from 1000 identities for training, and 100 identities for querying. What I am trying to do is use the 5000 feature vectors to build the index and save it; then, at query time, load the index and query it with the 100 identities.
Is that the right idea? Also, if I increase the number of feature vectors for the 1000 identities to 10000, do I need to re-train the index? And if I expand to 2000 identities with 10000 feature vectors, do I need to re-train the index?
For that number of observations, you can probably use the IndexFlatL2, roughly like:
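import faiss  # index_feats and query_feats are float32 arrays of shape (n, dim)
dim = index_feats.shape[1]  # dimension of your features
index = faiss.IndexFlatL2(dim)
index.add(index_feats)  # a flat index stores the vectors as-is
distances, neighbors = index.search(query_feats, 16)  # 16 nearest neighbors per query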
Let's say there are 3 persons, A, B and C, with faces in 100 images. Their embeddings are generated by a neural network that is ideally supposed to project faces of the same person within a 128-D hypersphere of radius 0.6. However, it doesn't necessarily do so. After running the network over these 100 images, we have 100 embeddings clustered around 3 major points, with most of them staying within the radius of 0.6, except for some outliers.
If one trains an SVM on this, it is possible to map these outlier embeddings to one of the 3 face classes.
However, how can one achieve something similar in FAISS? Is there a provision of this sort?
FAISS returns the closest vectors to a given input vector. What if the closest vector is actually an anomaly that belongs to a different face?