Faiss: How to search by ID range?

Created on 29 Jan 2018 · 4 Comments · Source: facebookresearch/faiss

I have an index over 100,000 vectors (IndexFlatL2 wrapped in IndexIDMap2), where each vector's ID is its timestamp. I now want to search within a time range that covers only about 10,000 of those vectors (IDs).

At the moment I use "numpy.where(cond)" to find the IDs in the time range, call "index.reconstruct(ID)" to pull out each vector, build a new index from them (faiss.index_factory(128, 'IDMap,Flat')), and search that. This is inefficient.

How can I run a range-restricted query without creating a new index?
What is the correct way to do it?

question

All 4 comments

Hi,
Faiss is not a DBMS where you can query by any field; only similarity queries are supported.
If you need to filter by ID range, you can either:

  • filter the output of Faiss
  • not use Faiss at all, make a linear array of ids, and filter the output of that array sequentially.
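The first option can be sketched roughly as follows. This is only an illustration: `filter_by_id_range` is a hypothetical helper, not part of Faiss, and it assumes IDs are non-negative timestamps and that `D` and `I` are the arrays returned by `index.search(xq, k_fetch)` with `k_fetch` chosen larger than the `k` you actually want.

```python
import numpy as np

def filter_by_id_range(D, I, lo, hi, k):
    """Per query, keep the first k hits whose ID lies in [lo, hi).

    D, I are (nq, k_fetch) arrays sorted by increasing distance.
    Missing slots are returned as distance inf and ID -1.
    """
    nq = I.shape[0]
    out_D = np.full((nq, k), np.inf, dtype=D.dtype)
    out_I = np.full((nq, k), -1, dtype=np.int64)
    for q in range(nq):
        # Drop -1 padding and anything outside the requested ID range.
        keep = (I[q] != -1) & (I[q] >= lo) & (I[q] < hi)
        n = min(k, int(keep.sum()))
        out_D[q, :n] = D[q, keep][:n]
        out_I[q, :n] = I[q, keep][:n]
    return out_D, out_I

# Intended use (assumes an existing Faiss index and query matrix xq):
#   D, I = index.search(xq, k_fetch)
#   D, I = filter_by_id_range(D, I, t_start, t_end, k)
```

If the range is narrow relative to the whole index, many of the fetched hits will be discarded, so `k_fetch` may need to be much larger than `k` to end up with `k` results.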

Will ID range search be supported in the future? This is very useful for search. Thanks!

It is unlikely that it will be supported out of the box. What you can do is use an inverted list scanner object to control which parts of the inverted lists you access (for example, based on a predicate on the IDs):
https://github.com/facebookresearch/faiss/wiki/Inverted-list-objects-and-scanners#the-invertedlistscanner-object
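To make the idea concrete, here is a toy NumPy model of scanning inverted lists while applying a predicate on the IDs. It does not use the Faiss scanner API described at the link above; `build_lists`, `scan_with_predicate` and `accept` are hypothetical names, and the centroids are assumed to be given.

```python
import numpy as np

def build_lists(xb, ids, centroids):
    """Assign each vector to its nearest centroid; return one (ids, vectors) pair per list."""
    d2 = ((xb[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(1)
    return [(ids[assign == c], xb[assign == c]) for c in range(len(centroids))]

def scan_with_predicate(q, lists, centroids, nprobe, k, accept):
    """Visit the nprobe closest lists and score only entries whose ID passes accept()."""
    order = ((centroids - q) ** 2).sum(1).argsort()[:nprobe]
    cand_d, cand_i = [], []
    for c in order:
        list_ids, list_vecs = lists[c]
        keep = accept(list_ids)  # boolean mask over this list's IDs
        if keep.any():
            # Distances are computed only for the accepted entries.
            cand_d.append(((list_vecs[keep] - q) ** 2).sum(1))
            cand_i.append(list_ids[keep])
    if not cand_d:
        return np.empty(0, dtype=np.float32), np.empty(0, dtype=np.int64)
    d = np.concatenate(cand_d)
    i = np.concatenate(cand_i)
    top = d.argsort()[:k]
    return d[top], i[top]

# Example predicate for a time range [t0, t1):
#   accept = lambda a: (a >= t0) & (a < t1)
```

The point of the sketch is only that rejected IDs never reach the distance computation; how that maps onto the real scanner objects is described in the wiki page.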

Can I supply a bitmap for filtering? That would avoid some distance computations during retrieval.
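Outside of Faiss, the bitmap idea can be sketched as an exact search over a plain array with a boolean mask, which also avoids rebuilding an index per query. This is a minimal sketch under assumptions: `masked_knn` is a hypothetical helper, the vectors and IDs are kept in NumPy arrays `xb` and `ids`, and the distance is squared L2.

```python
import numpy as np

def masked_knn(xq, xb, ids, allowed, k):
    """Exact squared-L2 k-NN restricted to rows where allowed is True."""
    sub_x = xb[allowed]      # distances are computed only for these rows
    sub_ids = ids[allowed]
    k = min(k, len(sub_ids))
    d2 = ((xq[:, None, :] - sub_x[None, :, :]) ** 2).sum(-1)
    order = np.argsort(d2, axis=1)[:, :k]
    return np.take_along_axis(d2, order, axis=1), sub_ids[order]

# Example mask for a timestamp range [t0, t1):
#   allowed = (ids >= t0) & (ids < t1)
#   D, I = masked_knn(xq, xb, ids, allowed, k)
```

With 10,000 allowed vectors out of 100,000, this computes roughly a tenth of the distances a full flat scan would.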
