Faiss: Can an index be sharded on CPU?

Created on 28 Feb 2018 · 3 Comments · Source: facebookresearch/faiss

Summary

Platform

OS:

Running on :

  • [X] CPU
  • [ ] GPU

Reproduction instructions

Hi, I am wondering whether it is possible to use a sharded index on CPU and, if so, how to do it.

I ask because I want to parallelize the search on CPU to speed it up: each shard index would hold 1/n of the whole index, and the results would be merged at the end.

question

All 3 comments

Hi,
Yes, it is possible; see https://rawgit.com/facebookresearch/faiss/master/docs/html/structfaiss_1_1IndexShards.html
Note that batched queries are already parallelized, so sharding will help only if you run one query at a time.

@mdouze Thanks. I read the code but still don't really understand how to do it. Can you give a short example, especially of how to split the index (as index_cpu_to_gpu_multiple does)?
In my case, I have a trained IVFPQ index with 10 million vectors added. I want to split this index into n (say 10) sub-indexes, query each one separately with the same vector, and finally merge all the results.

@0DF0Arc Here is an example:

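import faiss
d = 64 # vector dimension (example value, not given in the original snippet; must be divisible by m)
nbshards = 10
nlist = 100
m = 8 # number of bytes per vector
k = 4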

# First build the quantization indexes:
quantizers = [faiss.IndexFlatL2(d) for _ in range(nbshards)]
# Then build the shards:
shards = [faiss.IndexIVFPQ(q, d, nlist, m, 8) for q in quantizers]
# Finally build the sharded index:
index = faiss.IndexShards(d, True)
# and add the shards to it:
for s in shards:
    index.add_shard(s)

# Then use as a standard index:
index.train(...)

index.search(...)
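
To make the shard-and-merge idea from the question concrete, here is a minimal pure-Python sketch that does not use Faiss at all. It is only an illustration of the principle, not how IndexShards is implemented: the helper names (l2_sq, search_shard, sharded_search), the contiguous id split, and the exact brute-force search inside each shard are all assumptions made for the example. Each shard holds 1/n of the vectors, every shard is searched for its own top k, and the partial results are merged into a global top k.

```python
import heapq
import random
from concurrent.futures import ThreadPoolExecutor


def l2_sq(a, b):
    """Squared L2 distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_shard(shard, query, k):
    """Exact top-k inside one shard; shard is a list of (global_id, vector)."""
    return heapq.nsmallest(k, ((l2_sq(vec, query), gid) for gid, vec in shard))


def sharded_search(shards, query, k):
    """Search every shard in its own thread, then merge the partial top-k lists."""
    # Note: pure-Python threads are limited by the GIL, so this shows the
    # structure of the computation rather than a real speedup.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial = pool.map(lambda s: search_shard(s, query, k), shards)
        # The global top-k is always contained in the union of per-shard top-k.
        return heapq.nsmallest(k, (hit for hits in partial for hit in hits))


random.seed(0)
d, n, nbshards, k = 8, 1000, 10, 4  # illustrative sizes only
vectors = [[random.random() for _ in range(d)] for _ in range(n)]

# Contiguous split: shard s holds global ids s*size .. (s+1)*size - 1.
size = n // nbshards
shards = [
    [(i, vectors[i]) for i in range(s * size, (s + 1) * size)]
    for s in range(nbshards)
]

query = vectors[123]
result = sharded_search(shards, query, k)

# Reference answer: brute-force search over the unsplit data.
brute = heapq.nsmallest(k, ((l2_sq(v, query), i) for i, v in enumerate(vectors)))
print(result == brute)
```

Because each shard returns its own k best candidates, merging them loses nothing relative to searching the whole collection at once, provided the per-shard search is exact. With approximate shards such as IVFPQ, the merged result is only as accurate as the individual shards.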