Hi,
I have been using Faiss in my PyTorch model (a custom module).
Although both Faiss and PyTorch have GPU versions, it seems that communication between them has to go back through the CPU:
I have to copy the tensor to the CPU (as a numpy ndarray) before passing it to the Faiss API:
xq = xq_tensor.data.cpu().numpy()
D, I = index.search(xq, k)
This becomes the bottleneck of the model right now.
I wonder if there is a way to update the index (xb) and search with the query (xq) directly using PyTorch GPU tensors. Or do you have any plans to build such a feature?
Thank you so much!
Best,
Hao
Hi
That is a very good question. With a small modification to Faiss, it is possible to avoid transferring the data to the CPU.
Patch for faiss:
https://gist.github.com/mdouze/d82fa89eb47e2ea8fd841e5454d71b88
Test script:
https://gist.github.com/mdouze/e393931abc9f8ed93e2f63516db5e4f4
With similar tricks, it is possible to avoid copying D and I to CPU.
The underlying GPU Faiss C++ index APIs accept either GPU or CPU pointers as input; it's just a matter of getting the right pointer passed. No copy is required if the allocation is on the same GPU as the index.
However, a warning: if you are passing GPU data, the computation is ordered with respect to StandardGpuResources::getDefaultStream(device), which is deliberately not the default (null) stream. If you need the results ordered with respect to a different stream (for instance, the default stream), you'll have to synchronize with that stream.
https://github.com/facebookresearch/faiss/blob/master/gpu/StandardGpuResources.h#L51
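To illustrate the ordering caveat above: one coarse way to stay safe is a device-wide synchronization before and after the Faiss call, since that waits on every stream of the current device, including the one Faiss uses. This is only a minimal sketch, not Faiss's own mechanism. `run_ordered` is a hypothetical helper name, and the callable passed to it stands in for whatever GPU Faiss call you make.

```python
import torch


def run_ordered(faiss_gpu_call, *args, **kwargs):
    """Run a callable with a device-wide sync before and after it.

    Hypothetical helper: `faiss_gpu_call` stands in for any function that
    launches GPU Faiss work on tensors produced by PyTorch.
    """
    if torch.cuda.is_available():
        # Make sure PyTorch kernels that produce the inputs have finished.
        torch.cuda.synchronize()
    out = faiss_gpu_call(*args, **kwargs)
    if torch.cuda.is_available():
        # Make sure work queued on Faiss's own stream has finished
        # before PyTorch reads the results.
        torch.cuda.synchronize()
    return out


# Usage with a stand-in callable (no GPU needed for this line to run).
result = run_ordered(lambda a, b: a + b, 1, 2)
```

A device-wide sync is heavier than synchronizing only the two streams involved, so treat it as a correctness baseline rather than a tuned solution.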
Hi, many thanks for the patch and test script. I am now trying to avoid transferring D and I as well, and I am having difficulty getting GPU pointers returned when calling index.search(xq, k). @mdouze, do I need to make another patch, and could you help a bit with that? Thank you so much!
Working on a proper patch....
The latest version makes interop between Faiss and PyTorch easier. See
https://github.com/facebookresearch/faiss/blob/master/gpu/test/test_pytorch_faiss.py
This is great, @mdouze!
Hi,
May I ask if this is already part of the repo's master branch?
Yes, see test above.