Hi everyone.
I want to exclude some vectors from my query after I've done training.
That is, once I train and add vectors:
index.train(xb)
index.add(xb)
I'd like to exclude, for instance, xb[10:20] from the results of the query
D, I = index.search(xq, k=10)
and still get 10 nearest neighbors back.
Thanks and
May the faiss be with you!!
Does vectors include xb? The train vectors are not added to the database by calling train(); the database remains empty after train.
If vectors does include xb, then you have two options. First, if you only want to exclude 10 vectors as in your example with k=10, just query with k=20 and then remove any of the 10 vectors you wanted excluded if they happen to occur in the results.
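To make the first option concrete, here is a minimal sketch of the post-filtering step. It assumes D and I come from an over-fetched call such as index.search(xq, k + len(excluded)); the helper name drop_excluded is made up for illustration and is not part of Faiss. It also skips ids equal to -1, which Faiss uses to pad rows when it finds fewer results than requested.

```python
def drop_excluded(D, I, excluded, k):
    """Keep the first k hits per query whose id is not in `excluded`.

    D and I are the distance and id arrays from an over-fetched search.
    A row can still end up shorter than k if the search itself returned
    fewer valid hits (possible with approximate indexes).
    """
    excluded = set(int(i) for i in excluded)
    out_D, out_I = [], []
    for dists, ids in zip(D, I):
        # Results are already sorted by distance, so filtering keeps the order.
        kept = [(float(d), int(i)) for d, i in zip(dists, ids)
                if int(i) != -1 and int(i) not in excluded]  # -1 means "no result"
        kept = kept[:k]
        out_D.append([d for d, _ in kept])
        out_I.append([i for _, i in kept])
    return out_D, out_I
```

With the example from the question this would be called as D, I = index.search(xq, 20) followed by drop_excluded(D, I, range(10, 20), 10).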
If you want to exclude a substantial number of vectors (e.g., hundreds, thousands, or millions), then you can remove them from the index with remove_ids, but that is very slow, and some index types don't support removal. Otherwise, you can build multiple indices that cover the different subsets that you wish to capture, trained on the original vectors.
I meant to say
index.train(xb)
index.add(xb)
My apologies for the confusion. I corrected it.
So I'd like to recursively search for nearest neighbors of a query vector xq: get maybe 10 nearest neighbor vectors, then run a search on each of those vectors separately, and repeat until I gather, say, a thousand vectors. Ideally, I'd like to be able to remove certain ids from the index.
Right now, I am invoking the index.search() function a few times. It returns duplicate vectors which I weed out. I was only wondering if there was a more clever way built into the index or quantizer object.
Thanks
Hi
Faiss can only search neighbors in a fixed dataset. It is not a generic database engine, so you cannot build composite queries like in SQL.
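Since this has to live on the caller's side, here is a rough sketch of how the repeated searches could be organized so duplicates are skipped as they appear rather than weeded out afterwards. Everything here is illustrative: expand_neighbors and the neighbors_of callable are hypothetical names, not Faiss API. With an index built as above, neighbors_of could be something like lambda i: index.search(xb[i:i+1], 10)[1][0].

```python
from collections import deque

def expand_neighbors(seed_ids, neighbors_of, target=1000, excluded=()):
    """Breadth-first expansion over nearest-neighbor results.

    seed_ids:     ids returned by the first search on xq
    neighbors_of: hypothetical callable mapping one id to its neighbor ids
    target:       stop once this many distinct ids are collected
    excluded:     ids that must never be collected or expanded
    """
    excluded = set(int(i) for i in excluded)
    seen = set()    # ids already collected, used to skip duplicates
    order = []      # collected ids in discovery order
    queue = deque() # ids whose neighbors still need to be searched

    def visit(i):
        i = int(i)
        if i == -1 or i in excluded or i in seen:  # -1 means "no result"
            return
        seen.add(i)
        order.append(i)
        queue.append(i)

    for i in seed_ids:
        visit(i)
    while queue and len(order) < target:
        current = queue.popleft()
        for j in neighbors_of(current):
            visit(j)
            if len(order) >= target:
                break
    return order
```

This still issues one search per expanded vector; it only avoids re-searching ids that were already seen and keeps excluded ids out of the result.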
No activity. Closing.