Faiss: How to delete some vectors from index and then do a search without re-training.

Created on 7 Feb 2018 · 4 Comments · Source: facebookresearch/faiss

Hi everyone.
I want to exclude some vectors from my query after I've done training.

That is, once I train and add vectors:

index.train(xb) 
index.add(xb)

I'd like to exclude, for instance, xb[10:20] from the search

D, I = index.search(xq, k=10)

and still get 10 nearest neighbors back.

Thanks and
May the faiss be with you!!

question

Most helpful comment

Does vectors include xb? Calling train() does not add the training vectors to the database; the database remains empty after train().

If vectors does include xb, then you have two options. First, if you only want to exclude a few vectors, such as the 10 in your example with k=10, query with k=20 and then drop any of the 10 excluded vectors that occur in the results. At most 10 results can be dropped, so at least 10 valid neighbors remain.
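The over-fetch-and-filter approach described above could be sketched as follows. This is a minimal illustration, not part of the Faiss API: `search_excluding` and `demo` are hypothetical names, and the helper only assumes an object with a Faiss-style `search(xq, k)` method that returns `(D, I)`. Padding uses `inf` distances, which suits an L2 index; an inner-product index would need a different fill value.

```python
import numpy as np


def search_excluding(index, xq, k, excluded_ids):
    """Return the k nearest neighbors per query, skipping excluded_ids.

    Hypothetical helper: `index` is any object with a Faiss-style
    search(xq, k) -> (D, I) method.
    """
    excluded = np.asarray(list(excluded_ids), dtype=np.int64)
    # Over-fetch: at most len(excluded) results per query can be discarded.
    D, I = index.search(xq, k + len(excluded))
    D_out = np.full((len(xq), k), np.inf, dtype=D.dtype)
    I_out = np.full((len(xq), k), -1, dtype=np.int64)
    for row in range(len(xq)):
        # Keep real hits (missing results are reported as -1) that are not excluded.
        keep = (I[row] != -1) & ~np.isin(I[row], excluded)
        n = min(k, int(keep.sum()))
        D_out[row, :n] = D[row][keep][:n]
        I_out[row, :n] = I[row][keep][:n]
    return D_out, I_out


def demo():
    import faiss  # imported here so the helper itself only needs NumPy

    d = 16
    rng = np.random.default_rng(0)
    xb = rng.random((1000, d), dtype=np.float32)
    xq = rng.random((5, d), dtype=np.float32)
    index = faiss.IndexFlatL2(d)
    index.add(xb)
    # Exclude xb[10:20] and still ask for 10 neighbors per query.
    D, I = search_excluding(index, xq, k=10, excluded_ids=range(10, 20))
    print(I)
```

With an approximate index, a search may return fewer than k + len(excluded) hits, in which case some output slots stay at -1.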

If you want to exclude a substantial number of vectors (hundreds, thousands, or millions), then you can remove them from the index, but that is very slow, and some index types don't support removal. Alternatively, you can build multiple indices, one for each subset you wish to search, trained on the original vectors.
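For the removal option, a minimal sketch using `remove_ids()` might look like this. `remove_id_range` and `demo` are hypothetical names chosen for illustration. The demo wraps a flat index in `IndexIDMap` so that the surviving vectors keep their ids; a bare `IndexFlatL2` renumbers the remaining vectors after a removal.

```python
import numpy as np


def remove_id_range(index, start, stop):
    """Remove ids start..stop-1 from a Faiss index; return how many were removed.

    Hypothetical helper. It relies on remove_ids(), which not every
    index type implements.
    """
    ids = np.arange(start, stop, dtype=np.int64)
    return index.remove_ids(ids)


def demo():
    import faiss  # imported here so the helper itself only needs NumPy

    d = 16
    rng = np.random.default_rng(0)
    xb = rng.random((1000, d), dtype=np.float32)
    flat = faiss.IndexFlatL2(d)
    # IndexIDMap stores an explicit id per vector, so ids stay stable after removal.
    index = faiss.IndexIDMap(flat)
    index.add_with_ids(xb, np.arange(len(xb), dtype=np.int64))
    n_removed = remove_id_range(index, 10, 20)
    # Query with one of the removed vectors: ids 10..19 should no longer appear.
    D, I = index.search(xb[10:11], 10)
    print(n_removed, index.ntotal, I)
```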

All 4 comments

I meant to say

index.train(xb) 
index.add(xb)

My apologies for the confusion. I corrected it.


So I'd like to search recursively: find the nearest neighbors of a query vector xq, say 10 of them, then search from each of those vectors in turn until I have gathered, say, a thousand vectors. Ideally, I'd like to be able to remove certain vectors from the index.

Right now, I am invoking the index.search() function a few times. It returns duplicate vectors which I weed out. I was only wondering if there was a more clever way built into the index or quantizer object.
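The repeated-search-with-deduplication loop described here could be written roughly as below. This is only a sketch of the manual approach, not something built into Faiss: `expand_neighbors` is a hypothetical helper, and it assumes the ids returned by `index.search` are row positions in `xb`, which holds when the vectors were added with plain `index.add(xb)`.

```python
from collections import deque

import numpy as np


def expand_neighbors(index, xb, xq, k=10, target=1000):
    """Collect up to `target` distinct ids by searching outward from xq.

    Hypothetical helper: assumes ids from index.search are row positions in xb.
    """
    seen = set()   # ids already collected, used to drop duplicates
    order = []     # collected ids in discovery order
    frontier = deque([np.asarray(xq, dtype=np.float32).reshape(1, -1)])
    while frontier and len(order) < target:
        query = frontier.popleft()
        _, I = index.search(query, k)
        for i in I[0]:
            i = int(i)
            if i == -1 or i in seen:
                continue  # skip padding and anything already gathered
            seen.add(i)
            order.append(i)
            # Each newly found vector becomes a query in a later round.
            frontier.append(xb[i].reshape(1, -1))
            if len(order) >= target:
                break
    return order
```

The loop stops when the target is reached or when no search yields a new id. Batching the frontier vectors into a single `index.search` call per round would reduce the number of calls.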

Thanks

Hi
Faiss can only search neighbors in a fixed dataset. It is not a generic database engine, so you cannot build composite queries like in SQL.

No activity. Closing.
