Faiss: How to search by ID range?

Created on 29 Jan 2018 · 4 Comments · Source: facebookresearch/faiss

Suppose I create an index (IndexFlatL2, IndexIDMap2) over 100,000 vectors, each with a distinct timestamp as its ID. I now want to search only within a time range, which covers just 10,000 of those vectors (or IDs).

Currently I do this by using "numpy.where(cond)" to find the IDs in the time range, using "index.reconstruct(ID)" to retrieve each vector, and then creating a new index (faiss.index_factory(128, 'IDMap,Flat')) and searching it. This is inefficient.

How can I run a query restricted to an ID range without creating a new index?
What is the correct way to do it?

question


hipitt

👍2

All 4 comments

Hi,
Faiss is not a DBMS where you can query by any field; only similarity queries are supported.
If you need to filter by id range, you can either:

- filter the output of Faiss, or
- not use Faiss at all: make a linear array of ids, and filter that array sequentially.
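
Both options can be sketched in a few lines of NumPy. This is only an illustration under assumptions, not code from the Faiss project: `filter_search_output` and `brute_force_in_range` are hypothetical helper names, and the timestamps are made up. The first function post-filters the `(D, I)` arrays that a search call returns; the second scans a linear array of ids and searches only the matching vectors.

```python
import numpy as np

def filter_search_output(D, I, id_min, id_max, k):
    """Keep, per query, the k best hits whose id lies in [id_min, id_max].

    D and I are (nq, k_over) distance/id arrays sorted by increasing
    distance, as returned by a search call; -1 marks an empty slot.
    """
    nq = D.shape[0]
    D_out = np.full((nq, k), np.inf, dtype=D.dtype)
    I_out = np.full((nq, k), -1, dtype=I.dtype)
    for q in range(nq):
        keep = (I[q] != -1) & (I[q] >= id_min) & (I[q] <= id_max)
        hits = np.flatnonzero(keep)[:k]      # already in distance order
        D_out[q, :len(hits)] = D[q, hits]
        I_out[q, :len(hits)] = I[q, hits]
    return D_out, I_out

def brute_force_in_range(xb, ids, xq, id_min, id_max, k):
    """Exact squared-L2 search over the vectors whose id is in the range.

    Returns fewer than k columns if fewer than k ids fall in the range.
    """
    sel = np.flatnonzero((ids >= id_min) & (ids <= id_max))
    sub = xb[sel]
    # (nq, len(sel)) matrix of squared distances; fine for small subsets
    dist = ((xq[:, None, :] - sub[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(dist, axis=1)[:, :k]
    return np.take_along_axis(dist, order, axis=1), ids[sel][order]

# Made-up data: 1000 vectors with consecutive fake timestamps as ids.
rng = np.random.default_rng(0)
xb = rng.random((1000, 16), dtype=np.float32)
ids = np.arange(1000, dtype=np.int64) + 1_500_000_000
xq = rng.random((3, 16), dtype=np.float32)
lo, hi = ids[200], ids[299]

# Stand-in for an over-fetching search call such as index.search(xq, k_over).
D_all, I_all = brute_force_in_range(xb, ids, xq, ids.min(), ids.max(), k=len(ids))

D_f, I_f = filter_search_output(D_all, I_all, lo, hi, k=5)   # option 1
D_b, I_b = brute_force_in_range(xb, ids, xq, lo, hi, k=5)    # option 2
```

With post-filtering, the search has to return more than k results per query, since hits outside the range are discarded afterwards; the narrower the range, the larger the over-fetch needs to be.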

mdouze on 29 Jan 2018

👍1

Will ID range search be supported in the future? This is very useful for search. Thanks!

AmierCheng on 19 Jun 2019

👍1

It is unlikely that it will be supported out-of-the-box. What you can do is use an inverted list scanner object to control which parts of the inverted lists you access (for example, based on a predicate on the ids):
https://github.com/facebookresearch/faiss/wiki/Inverted-list-objects-and-scanners#the-invertedlistscanner-object
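
To make the idea concrete, here is a toy NumPy emulation of inverted lists where a predicate on the ids decides which entries are scanned. It does not use the Faiss scanner API from the wiki page; `build_toy_lists`, `scan_with_predicate`, and `keep_id` are hypothetical names, and the centroids are simply sampled from the data rather than trained.

```python
import numpy as np

def build_toy_lists(xb, ids, centroids):
    """Assign each vector to its nearest centroid; one (ids, vectors) pair per list."""
    d2 = ((xb[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    return [(ids[assign == c], xb[assign == c]) for c in range(len(centroids))]

def scan_with_predicate(lists, centroids, q, k, nprobe, keep_id):
    """Visit the nprobe closest lists, computing distances only where keep_id(ids) is True."""
    probe = np.argsort(((centroids - q) ** 2).sum(axis=1))[:nprobe]
    cand_d, cand_i = [], []
    for list_no in probe:
        list_ids, list_vecs = lists[list_no]
        mask = keep_id(list_ids)          # predicate runs on ids, before any distance
        if not mask.any():
            continue                      # nothing to scan in this list
        cand_d.append(((list_vecs[mask] - q) ** 2).sum(axis=1))
        cand_i.append(list_ids[mask])
    if not cand_d:
        return np.empty(0, dtype=np.float32), np.empty(0, dtype=np.int64)
    d = np.concatenate(cand_d)
    i = np.concatenate(cand_i)
    order = np.argsort(d)[:k]
    return d[order], i[order]

rng = np.random.default_rng(1)
xb = rng.random((2000, 8), dtype=np.float32)
ids = np.arange(2000, dtype=np.int64)
centroids = xb[rng.choice(2000, size=16, replace=False)]
lists = build_toy_lists(xb, ids, centroids)
q = rng.random(8, dtype=np.float32)

# Probe all 16 lists, but only score entries with 500 <= id < 700.
found_d, found_i = scan_with_predicate(
    lists, centroids, q, k=5, nprobe=16,
    keep_id=lambda a: (a >= 500) & (a < 700),
)
```

Because the predicate is applied inside the scan, filtered-out entries cost no distance computation, and the k slots are filled only with ids that pass the filter.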

mdouze on 19 Jun 2019

👍1

Can I pass in a bitmap for filtering? This would avoid some score computations during the recall stage.
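
What such a bitmap filter could look like is sketched below in plain NumPy. This is an illustration of the idea only and makes no claim about what Faiss accepts: `make_bitmap` and `search_with_bitmap` are hypothetical names, and the bitmap is assumed to be indexed by row position (one bit per stored vector).

```python
import numpy as np

def make_bitmap(n, allowed_rows):
    """Pack a set of row numbers into a bitmap of ceil(n / 8) bytes."""
    flags = np.zeros(n, dtype=bool)
    flags[allowed_rows] = True
    return np.packbits(flags, bitorder="little")

def search_with_bitmap(xb, xq, bitmap, k):
    """Exact squared-L2 search over the rows whose bit is set.

    Rows with a cleared bit are skipped before any distance is computed.
    """
    flags = np.unpackbits(bitmap, count=len(xb), bitorder="little").astype(bool)
    rows = np.flatnonzero(flags)
    dist = ((xq[:, None, :] - xb[rows][None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(dist, axis=1)[:, :k]
    return np.take_along_axis(dist, order, axis=1), rows[order]

rng = np.random.default_rng(2)
xb = rng.random((500, 8), dtype=np.float32)
xq = rng.random((2, 8), dtype=np.float32)

# Allow every third row; any other predicate could set the bits instead.
bitmap = make_bitmap(len(xb), np.arange(0, 500, 3))
dist_found, rows_found = search_with_bitmap(xb, xq, bitmap, k=4)
```

A bitmap costs one bit per stored vector regardless of how many are allowed, so it suits arbitrary predicates, whereas a plain range check on the ids is cheaper when the filter is a contiguous interval.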