index = faiss.IndexFlatL2(d)
index.add(xb)
and
index = faiss.IndexIVFPQ(coarse_quantizer, d, nlist, m, faiss.METRIC_L2)
The above are all based on Euclidean distance. How can I build an index and search based on cosine similarity using the faiss Python package?
index = faiss.IndexFlatIP(d)
IP stands for "inner product". If you have normalized vectors, the inner product becomes cosine similarity.
Here is an overview of the available indices:
https://github.com/facebookresearch/faiss/wiki/Faiss-indexes
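To see why normalization is enough, here is a minimal NumPy-only sketch (no faiss involved) comparing the textbook cosine formula with a plain inner product of unit-length vectors. The two helper functions and the sample vectors are made up for illustration; both prints should give approximately 0.9746.

```python
import numpy as np

def cosine_similarity(a, b):
    """Textbook cosine similarity: dot(a, b) / (|a| * |b|)."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def normalized_inner_product(a, b):
    """Scale both vectors to unit L2 norm, then take a plain inner product."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    a_unit = a / np.linalg.norm(a)
    b_unit = b / np.linalg.norm(b)
    return float(np.dot(a_unit, b_unit))

# Illustrative vectors only
a = [0.1, 0.2, 0.3]
b = [0.4, 0.5, 0.6]
print(cosine_similarity(a, b))
print(normalized_inner_product(a, b))
```

The division by the norms is simply moved from search time to indexing time, which is why an inner-product index over pre-normalized vectors ranks results by cosine similarity.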
Note that this searches by cosine similarity, not cosine distance. Cosine distance is 1 - cosine similarity.
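If you need distances rather than similarities, the scores can be converted after the search. A small sketch, assuming the scores come from an inner-product index over L2-normalized vectors; the helper name and the sample scores below are hypothetical, not part of faiss.

```python
import numpy as np

def similarity_to_cosine_distance(similarities):
    """Convert cosine similarities to cosine distances (1 - similarity)."""
    sims = np.asarray(similarities, dtype=np.float32)
    # Float rounding can push a similarity slightly outside [-1, 1];
    # clipping keeps the resulting distance inside [0, 2].
    return 1.0 - np.clip(sims, -1.0, 1.0)

# Hypothetical scores shaped like the first array returned by index.search
scores = np.array([[1.0000001, 0.9746318, 0.0, -1.0]], dtype=np.float32)
print(similarity_to_cosine_distance(scores))
```

Keep in mind that the ordering flips: the inner-product search returns the largest similarities first, which correspond to the smallest distances.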
Adding an example in case a beginner like me lands here looking for how to compute cosine similarity from scratch:
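import faiss
import numpy as np
from scipy import spatial
dataSetI = [.1, .2, .3]
dataSetII = [.4, .5, .6]
# faiss expects 2-D float32 arrays of shape (n, d)
x = np.array([dataSetI]).astype(np.float32)
q = np.array([dataSetII]).astype(np.float32)
index = faiss.index_factory(3, "Flat", faiss.METRIC_INNER_PRODUCT)
# normalize_L2 works in place; normalize both the stored vectors and the query
faiss.normalize_L2(x)
index.add(x)
faiss.normalize_L2(q)
# only one vector is indexed, so ask for a single neighbor;
# use new names so the index object is not overwritten
similarity, ids = index.search(q, 1)
print('Similarity by FAISS:{}'.format(similarity))
# cross-check: scipy returns cosine distance, so similarity = 1 - distance
result = 1 - spatial.distance.cosine(dataSetI, dataSetII)
print('Similarity by scipy:{}'.format(result))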