Faiss: params for IndexIVFPQ

Created on 16 Nov 2018 · 4 comments · Source: facebookresearch/faiss

Greetings!
I currently have 3 million 512-d vectors in my database, running on a machine with 4 CPUs and 16 GB of RAM.
The index was trained on 200,000 vectors, with the parameters below:
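```
import faiss

d = 512  # vector dimensionality
m = 8    # number of PQ sub-quantizers
k = 120  # not used by the index itself; presumably the number of neighbours requested per query
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFPQ(quantizer, d, 32, m, 8)  # nlist = 32 inverted lists, 8 bits per sub-quantizer code
index.nprobe = 32  # inverted lists visited per query
```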

Searching for one vector takes around 50 ms (pretty good), but the accuracy is not satisfactory.
Is there any adjustment to the index parameters that would improve accuracy while keeping the search time under 100 ms?
Thanks!

KangRinpoche

Most helpful comment

Also, it seems you could use a simple IndexFlatL2 instead of an IndexIVFPQ, since (1) your data would fit in memory uncompressed and (2) you are already performing an exhaustive (approximate) search. An IndexFlatL2 performs an exact exhaustive search, so accuracy would not be an issue. See the wiki page for choosing an index.
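
The "fits in memory" claim can be checked with a little arithmetic. The following is a minimal sketch, not part of the original thread: it assumes float32 storage and ignores any per-index overhead. The helper names `flat_index_bytes` and `exact_search` are hypothetical, and `exact_search` only runs if `faiss` and NumPy are installed and `xb`/`xq` are float32 arrays.

```python
def flat_index_bytes(n_vectors: int, d: int, bytes_per_float: int = 4) -> int:
    """Raw storage needed for n_vectors uncompressed d-dimensional float vectors."""
    return n_vectors * d * bytes_per_float


# Figures taken from the question: 3 million 512-d vectors, 16 GB of RAM.
n, d = 3_000_000, 512
raw = flat_index_bytes(n, d)
print(f"{raw} bytes, about {raw / 2**30:.1f} GiB")  # 6144000000 bytes, about 5.7 GiB


def exact_search(xb, xq, k):
    """Exact L2 search with IndexFlatL2 (hypothetical helper).

    xb: (n, d) float32 database vectors, xq: (nq, d) float32 queries.
    Returns (distances, ids), each of shape (nq, k).
    """
    import faiss  # imported lazily so the arithmetic above runs without faiss

    index = faiss.IndexFlatL2(xb.shape[1])  # no training step needed
    index.add(xb)
    return index.search(xq, k)
```

Whether an exact search over 3 million vectors stays under the 100 ms budget on 4 CPUs is not something this sketch can tell you; it would need to be timed on the actual machine.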

beauby on 16 Nov 2018

All 4 comments

Currently, you seem to have nlist = nprobe, which means you are effectively performing an exhaustive search, so your accuracy issue probably comes from the PQ encoding: you are compressing each vector from 512 32-bit floats (2048 bytes) to an 8-byte code, which is a compression factor of 256. You could try increasing the m parameter. Thoughts @mdouze?
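
To make the trade-off concrete, here is a small sketch (not from the original thread) that reproduces the compression factor of 256 and shows how it changes for a few larger values of m. The helper names are hypothetical, the candidate m values are only examples, and the code size ignores the vector ids that an IVF index also stores.

```python
def pq_code_bytes(m: int, nbits: int) -> int:
    """Bytes per PQ code: m sub-quantizer indices of nbits each, rounded up."""
    return (m * nbits + 7) // 8


def compression_factor(d: int, m: int, nbits: int, bytes_per_float: int = 4) -> float:
    """Ratio between the raw float vector size and the PQ code size."""
    return d * bytes_per_float / pq_code_bytes(m, nbits)


def lists_visited_fraction(nlist: int, nprobe: int) -> float:
    """Fraction of inverted lists visited per query; 1.0 means every list is scanned."""
    return min(nprobe, nlist) / nlist


d, nlist, nprobe, nbits = 512, 32, 32, 8  # values from the question

print(lists_visited_fraction(nlist, nprobe))  # 1.0

for m in (8, 16, 32, 64):  # example values; d must be divisible by m
    assert d % m == 0
    print(m, pq_code_bytes(m, nbits), compression_factor(d, m, nbits))
# 8 8 256.0
# 16 16 128.0
# 32 32 64.0
# 64 64 32.0
```

Larger m means longer codes and less compression, which usually improves accuracy at the cost of memory and search time; the actual effect on recall has to be measured on your own data.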

beauby on 16 Nov 2018

@beauby Thanks a lot.
I am getting to know IndexIVFPQ in preparation for a larger database -_-. As you suggested, should I also increase the last parameter, the number of encoding bits (8), while increasing m? And is my training set xt (200,000 vectors) large enough for a database of 3 million vectors?

KangRinpoche on 16 Nov 2018

No activity, closing.