Sample row from my training features:
[0.42307538 2.3160305 0.24694297 ... 1.0617049 0.09734876 1.0471938 ]
Shape of my training feature array:
3122708 x 2048
The error occurs in this part:
import faiss # make faiss available
index_sim = faiss.IndexFlatL2(2048) # build the index
print(index_sim.is_trained)
index_sim.add(tr_features_np) # add vectors to the index
print(index_sim.ntotal)
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-7-418d31a6035c> in <module>()
2 index_sim = faiss.IndexFlatL2(2048) # build the index
3 print(index_sim.is_trained)
----> 4 index_sim.add(tr_features_np) # add vectors to the index
5 print(index_sim.ntotal)
~/anaconda3/lib/python3.6/site-packages/faiss/__init__.py in replacement_add(self, x)
95
96 def replacement_add(self, x):
---> 97 assert x.flags.contiguous
98 n, d = x.shape
99 assert d == self.d
Any help would be appreciated!
Adding a numpy array of vectors to the index only works with contiguous arrays. Yours has probably been manipulated in a way that left it no longer C-contiguous (for example, by slicing or transposing).
You can try np.ascontiguousarray:
index_sim.add(np.ascontiguousarray(tr_features_np))
I also noticed that maybe those asserts in faiss.py should be removed, because swig_ptr already checks for C-contiguous arrays and yields a better error message when this happens: ValueError: array is not C-contiguous.
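To illustrate how an array can lose contiguity, here is a minimal numpy-only sketch. The `features` array is a small hypothetical stand-in for `tr_features_np`, and the column slicing is only one assumed way the original array might have become non-contiguous. Passing `dtype=np.float32` is an extra precaution that was not part of the suggestion above: faiss works with float32 vectors, so converting in the same call avoids a second copy.

```python
import numpy as np

# Hypothetical stand-in for tr_features_np (the real array is 3122708 x 2048).
features = np.random.rand(6, 8).astype(np.float32)

# Taking every other column returns a strided view, not a C-contiguous array.
view = features[:, ::2]
print(view.flags.c_contiguous)

# np.ascontiguousarray copies the data into a fresh C-contiguous buffer.
fixed = np.ascontiguousarray(view, dtype=np.float32)
print(fixed.flags.c_contiguous)

# The values are unchanged; only the memory layout differs.
print(np.array_equal(view, fixed))
```

If the array is already C-contiguous and float32, `np.ascontiguousarray` returns it without copying, so calling it defensively before `add` costs essentially nothing.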
Thanks, it worked!
One more question: could you tell me how to _save_ the _index_sim_ variable to a file after adding the training features?
Every time I run _.add_, it takes 5 to 10 minutes, so I would like to save the index somewhere and load it later when I need it.
The type of _index_sim_ is _faiss.swigfaiss.IndexFlatL2_.
Thanks!
Admittedly, it was a bit hard for me to find the subject on the wiki pages, but here it is. You are looking for faiss.write_index and faiss.read_index:
faiss.write_index(index_sim, "sim.index") # save
index_sim = faiss.read_index("sim.index") # load
Thank you so much! Have a great day.
@Enet4: changed the chapter name in the wiki to make it easier to find.
@mdouze Yes, it seems to be much better now. :+1: