Faiss: searching an index while adding vectors incrementally

Created on 14 Jun 2018 · 4 Comments · Source: facebookresearch/faiss

Hi,
There are 100 million 128-d vectors in my database, already trained and added,
and I'm going to add around half a million vectors every 4 hours.
Can I search the index while adding new vectors at the same time?
Thanks for your help!

question

All 4 comments

Hi,
No, you can't; see

https://github.com/facebookresearch/faiss/wiki/Threads-and-asynchronous-calls

Your options are, from simplest to most complex:

  • Lock the index so that searches wait while an add is running.

  • Perform the add in a separate temp index (trained in the same way as the main index), then merge the temp index into the main index (see https://github.com/facebookresearch/faiss/wiki/Special-operations-on-indexes#splitting-and-merging-indexes). The index will still be unavailable for search during the merge, but the downtime will be shorter.

  • Keep serving searches from the current index during the 4 hours, copy it to an offline index, add the new vectors to that copy, and swap the two indexes every 4 hours. No downtime, but the index is stored twice.
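The first and third options can be sketched with the standard library alone. This is a minimal illustration, not Faiss code: ToyIndex is a hypothetical stand-in that only mimics the add/search shape of an index, and LockedIndex and SwappableIndex are names invented here. With a real Faiss index you would wrap the index object itself, and the copy step would more likely use faiss.clone_index than copy.deepcopy.

```python
import copy
import threading


class ToyIndex:
    """Hypothetical stand-in for a Faiss index: brute-force L2 search on lists."""

    def __init__(self):
        self.vectors = []

    def add(self, xs):
        self.vectors.extend(list(x) for x in xs)

    def search(self, q, k):
        # Return the positions of the k stored vectors nearest to q.
        def dist(v):
            return sum((a - b) ** 2 for a, b in zip(v, q))

        order = sorted(range(len(self.vectors)),
                       key=lambda i: dist(self.vectors[i]))
        return order[:k]


class LockedIndex:
    """Option 1: a search waits while an add is in progress."""

    def __init__(self, index):
        self._index = index
        # A plain mutex also serializes searches against each other;
        # a reader-writer lock would let searches run concurrently.
        self._lock = threading.Lock()

    def add(self, xs):
        with self._lock:
            self._index.add(xs)

    def search(self, q, k):
        with self._lock:
            return self._index.search(q, k)


class SwappableIndex:
    """Option 3: serve from one index, update an offline copy, then swap."""

    def __init__(self, index):
        self._live = index

    def search(self, q, k):
        live = self._live  # grab one reference so a swap mid-call is harmless
        return live.search(q, k)

    def update(self, xs):
        offline = copy.deepcopy(self._live)  # the index is stored twice here
        offline.add(xs)                      # slow part; searches still use _live
        self._live = offline                 # swap: a single reference rebind


base = ToyIndex()
base.add([[0.0, 0.0], [10.0, 10.0]])
locked = LockedIndex(base)
locked.add([[1.0, 1.0]])
nearest_locked = locked.search([0.9, 0.9], 1)

serving = SwappableIndex(ToyIndex())
serving.update([[0.0, 0.0], [5.0, 5.0]])
before = serving.search([4.0, 4.0], 1)
serving.update([[4.0, 4.0]])
after = serving.search([4.0, 4.0], 1)
```

The swap variant trades memory for availability: searches never block, but two full copies exist while the offline index is being filled, which matters at the 100-million-vector scale discussed here.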

Thanks for the valuable options, @mdouze!
As far as I understand, Faiss numbers the vectors in the order they were added.
E.g., if vector_a was the 1000th vector added, it gets No. 1000 in the Faiss index.
So for the second option, "perform the add in a separate temp index", should I worry about the index numbers?
E.g., if the main index has 1000 vectors, will the 1st vector of the temp index get No. 1001 in the final index?
Thanks!

No activity, closing.

@KangRinpoche
Just pass the IDs you need to add_with_ids.
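The ID bookkeeping behind that advice might look like the sketch below. It is an illustration under stated assumptions: ToyIdIndex is a hypothetical stand-in that only exposes ntotal and add_with_ids the way a Faiss index does, and add_batch_with_sequential_ids is a helper name invented here. Faiss's default sequential IDs start at 0, so with 1000 vectors already stored the next one gets ID 1000.

```python
class ToyIdIndex:
    """Hypothetical stand-in exposing ntotal and add_with_ids like a Faiss index."""

    def __init__(self):
        self.ids = []
        self.vectors = []

    @property
    def ntotal(self):
        return len(self.ids)

    def add_with_ids(self, vectors, ids):
        if len(vectors) != len(ids):
            raise ValueError("need exactly one id per vector")
        self.vectors.extend(vectors)
        self.ids.extend(ids)


def add_batch_with_sequential_ids(index, vectors):
    """Give a new batch IDs that continue from the index's current size.

    Assumes nothing was ever removed, so ntotal is also the next free ID.
    """
    start = index.ntotal
    ids = list(range(start, start + len(vectors)))
    # With a real Faiss index, ids would have to be a NumPy int64 array.
    index.add_with_ids(vectors, ids)
    return ids


main = ToyIdIndex()
first_ids = add_batch_with_sequential_ids(main, [[0.0]] * 1000)
next_ids = add_batch_with_sequential_ids(main, [[1.0]] * 3)
```

If you merge a temp index instead, the same offset idea applies: as far as I know, merge_from on IVF indexes takes an add_id argument that is added to the IDs of the merged-in vectors, so passing the main index's ntotal keeps the numbering contiguous.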
