Hello guys. First of all, thank you for your brilliant library.
We have millions of pictures, and we are trying to design a distributed system based on Faiss. Is there a way to partition the index across multiple machines and spread our search requests over them?
If so, I would be grateful if you could share a best-practice architecture with me.
There is a very simple implementation of a distributed index here: https://github.com/facebookresearch/faiss/tree/master/benchs/distributed_ondisk
Note that the distribution is over inverted lists, the dataset is not sharded.
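For contrast, sharding the dataset itself means each machine holds a disjoint subset of the vectors, answers the query on its own subset, and a coordinator merges the per-shard top-k lists. Below is a minimal pure-Python sketch of that scatter-gather pattern. It is not Faiss code and not what the benchmark above does: `Shard`, `sharded_search`, and `l2_sq` are hypothetical names, the brute-force scan stands in for whatever index each machine would really run, and the loop over shards stands in for network calls. Faiss's `IndexShards` applies the same merge idea inside a single process.

```python
import heapq


def l2_sq(a, b):
    """Squared L2 distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class Shard:
    """Hypothetical stand-in for one machine holding a slice of the dataset."""

    def __init__(self, ids, vectors):
        self.ids = list(ids)          # global ids of the vectors in this shard
        self.vectors = list(vectors)  # the vectors themselves

    def search(self, query, k):
        # Brute-force scan; a real shard would query its local index instead.
        scored = ((l2_sq(query, v), i) for i, v in zip(self.ids, self.vectors))
        return heapq.nsmallest(k, scored)  # sorted list of (distance, id)


def sharded_search(shards, query, k):
    """Ask every shard for its local top-k, then keep the global top-k."""
    partial = []
    for shard in shards:  # in a real system these calls would run in parallel
        partial.extend(shard.search(query, k))
    return heapq.nsmallest(k, partial)


# Usage: six vectors split across two shards.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [0.5, 0.5], [5.0, 1.0]]
shards = [Shard(range(0, 3), vectors[0:3]), Shard(range(3, 6), vectors[3:6])]
result = sharded_search(shards, [0.0, 0.0], 3)
print([vec_id for _, vec_id in result])
```

Because every shard returns its own k best candidates, the merged list is exact as long as each shard's local search is exact; with approximate per-shard indexes the merged result is only as good as the per-shard recall.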
@mdouze Thanks for your answer. Is there a way to shard the data store as well? I mean, is there a way to design a distributed, sharded system with Faiss, something like ElasticSearch? Is there a best practice for this case?
@alirahm2 Hey, we are developing a vector search engine, Milvus (https://github.com/milvus-io/milvus), which has a distributed solution called mishards. You may want to check it out.
@JinHai-CN Thanks for your reply. Unfortunately, I could not find any resource about mishards in your repo or docs. Could you please give me more context on it? Thank you.
No activity, closing.
here:
https://github.com/milvus-io/milvus/blob/master/shards/README.md