Faiss: how to search with cosine similarity

Created on 1 Apr 2017 · 8 Comments · Source: facebookresearch/faiss

Hi, I only see two choices for the search metric: METRIC_INNER_PRODUCT and METRIC_L2. How can I search with cosine similarity?


All 8 comments

Hi

Please L2-normalize the vectors before adding and searching, then search with METRIC_INNER_PRODUCT.
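The advice above works because the inner product of two unit-length vectors equals their cosine similarity. A minimal NumPy-only sketch of that identity follows; the helper names `cosine_similarity` and `normalized_inner_product` are illustrative and not part of faiss.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity from its definition: a.b / (|a| * |b|)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def normalized_inner_product(a, b):
    """Inner product after scaling each vector to unit L2 norm."""
    a_unit = a / np.linalg.norm(a)
    b_unit = b / np.linalg.norm(b)
    return float(np.dot(a_unit, b_unit))

a = np.array([0.1, 0.2, 0.3], dtype=np.float32)
b = np.array([0.4, 0.5, 0.6], dtype=np.float32)

cos = cosine_similarity(a, b)
ip = normalized_inner_product(a, b)
print(cos, ip)  # the two values agree up to floating-point rounding
```

An index built with METRIC_INNER_PRODUCT computes the second quantity, so it ranks by cosine similarity as long as both the stored vectors and the queries were normalized first.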

@mdouze, I have the same question. How do I search with METRIC_INNER_PRODUCT? Can you give an example with code? Thanks.

@yhpku Something like this:

from faiss import normalize_L2

# ...

index.train(normalize_L2(training_vectors))
index.add(normalize_L2(index_vectors))
index.search(normalize_L2(search_vectors), 5)

The METRIC_INNER_PRODUCT flag is set when the index is built.

@billkle1n
It seems that faiss.normalize_L2() doesn't have a return value. It normalizes the matrix in place. So instead of
index.train(normalize_L2(training_vectors)),
it should be
normalize_L2(training_vectors)
index.train(training_vectors)
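To illustrate why passing the return value fails, here is a small sketch using a NumPy stand-in. `normalize_rows_inplace` is a hypothetical helper written for this example. It mimics the in-place, no-return-value behaviour described above and is not the faiss implementation.

```python
import numpy as np

def normalize_rows_inplace(x):
    """Hypothetical stand-in: scale each row of x to unit L2 norm, in place.

    Like the behaviour described in the thread, it returns None.
    """
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged
    x /= norms               # modifies the caller's array

training_vectors = np.array([[3.0, 4.0], [0.0, 2.0]], dtype=np.float32)

result = normalize_rows_inplace(training_vectors)
print(result)            # None
print(training_vectors)  # rows now have unit length
```

Because the call returns None, writing index.train(normalize_L2(training_vectors)) would hand None to train(). Normalizing on one line and passing the array on the next avoids that, and the same pattern applies to add() and search().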

I have a question: when I try normalize_L2(dest_array_one), I get this error:
File "", line 1, in
File "/root/anaconda3/envs/faiss/lib/python2.7/site-packages/faiss/__init__.py", line 523, in normalize_L2
fvec_renorm_L2(x.shape[1], x.shape[0], swig_ptr(x))
TypeError: in method 'fvec_renorm_L2', argument 3 of type 'float *'

@13293824182 make sure your array is of type float32
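A minimal sketch of that conversion, assuming the array was created with NumPy's default float64 dtype (a common cause of this TypeError). Requesting a C-contiguous copy as well is an extra precaution added here, not something stated in the thread.

```python
import numpy as np

# NumPy creates float64 arrays from Python floats by default
raw = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(raw.dtype)

# Convert to float32 and make sure the memory layout is C-contiguous
vectors = np.ascontiguousarray(raw, dtype=np.float32)
print(vectors.dtype, vectors.flags.c_contiguous)

# `vectors` is now in the form to pass to faiss.normalize_L2(vectors)
```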

Just adding an example in case another beginner like me comes here looking for how to calculate cosine similarity from scratch:

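import faiss
import numpy as np

dataSetI = [.1, .2, .3]
dataSetII = [.4, .5, .6]
# dataSetII = [.1, .2, .3]  # alternative: identical vector, similarity 1.0

# faiss expects 2-D float32 arrays
x = np.array([dataSetI]).astype(np.float32)
q = np.array([dataSetII]).astype(np.float32)

index = faiss.index_factory(3, "Flat", faiss.METRIC_INNER_PRODUCT)
faiss.normalize_L2(x)  # normalizes in place, returns nothing
index.add(x)
faiss.normalize_L2(q)
# k=1 because the index holds a single vector; separate names keep `index` intact
distances, ids = index.search(q, 1)
print('Similarity by FAISS:{}'.format(distances))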

To check the result, compute the cosine similarity of the same two vectors with SciPy:

from scipy import spatial

# spatial.distance.cosine returns the cosine distance, so subtract it from 1
result = 1 - spatial.distance.cosine(dataSetI, dataSetII)
print('Similarity by SciPy:{}'.format(result))
