I am currently experimenting with all the index types to compare their memory usage and search speed. Is there a recommended way to measure how much CPU or GPU memory a given index uses? I understand that nvidia-smi and htop are one way to go about it, but they do not give very accurate results.
OS: Ubuntu
Faiss version: 1.6.1
Faiss compilation options:
Running on:
Interface:
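One coarse way to get a per-index number from inside the process is to read the resident set size (RSS) just before and just after building the index. The sketch below assumes Linux (it reads `/proc/self/status`), and the helper names `rss_mib` and `measure` are made up for illustration. It only covers host memory, not GPU memory, and the difference is only meaningful if nothing else allocates or frees memory in between.

```python
import resource


def rss_mib():
    """Return the resident set size of this process in MiB.

    Reads VmRSS from /proc/self/status (Linux). If that is unavailable,
    falls back to the peak RSS from getrusage, assuming ru_maxrss is
    reported in KiB as it is on Linux.
    """
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) / 1024.0  # value is in kB
    except OSError:
        pass
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def measure(build):
    """Call build() and return (result, RSS growth in MiB)."""
    before = rss_mib()
    obj = build()
    after = rss_mib()
    return obj, after - before


if __name__ == "__main__":
    # Stand-in workload: allocate and fill about 64 MiB.
    # With Faiss, build would create the index and call train/add.
    buf, grew = measure(lambda: bytearray(b"x") * (64 * 1024 * 1024))
    print(f"RSS grew by about {grew:.1f} MiB")
```

Because the allocator can satisfy new requests from memory that was freed earlier, the growth can be smaller than the true size of the object, so measuring in a fresh process gives the cleanest reading.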
I used memory_profiler to profile the memory used by the code and am getting inconsistent results. I did not see a significant increase in memory usage even after doubling the number of bits in LSH from 64 to 128. I would also like to understand how LSH indexes the vectors, since the reconstruct function is not implemented for IndexLSH.
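As a rough illustration of how an LSH index of this kind can encode vectors, the sketch below projects each vector onto random directions and keeps only the sign of each projection, packed into bits. This is a pure-Python toy, not Faiss's implementation (which, as far as I know, can also apply a random rotation and trained thresholds); the function names are hypothetical. It does show why reconstruction is not possible: only one bit per projection is stored, so the original coordinates are lost.

```python
import random


def make_projections(d, nbits, seed=0):
    """Draw nbits random Gaussian directions in d dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(nbits)]


def encode(x, projections):
    """Pack the sign bits of the projections of x into a bytes object."""
    code = bytearray((len(projections) + 7) // 8)
    for i, p in enumerate(projections):
        if sum(a * b for a, b in zip(x, p)) >= 0:
            code[i // 8] |= 1 << (i % 8)
    return bytes(code)


def hamming(a, b):
    """Number of differing bits between two equal-length codes."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    d, nbits = 256, 64
    proj = make_projections(d, nbits)
    rng = random.Random(1)
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = [v + 0.01 * rng.gauss(0.0, 1.0) for v in x]  # slightly perturbed copy of x
    print("bytes per vector:", len(encode(x, proj)))
    print("Hamming distance to perturbed copy:", hamming(encode(x, proj), encode(y, proj)))
```

Search then compares codes by Hamming distance, so the stored size per vector depends only on the number of bits, not on the input dimension.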
memory_profiler output with IndexLSH(256, 64):
Line #        Mem usage        Increment    Line Contents
=========================================================
    49      106.211 MiB      106.211 MiB    @profile
    50                                      def return_index():
    51      127.641 MiB       10.238 MiB        features = read_file("features.txt")
    52      256.266 MiB      128.625 MiB        features_1 = read_file("features_1.txt")
    53    16000.984 MiB    15744.719 MiB        features_2 = read_files("features_2.txt")
    54    16001.945 MiB        0.961 MiB        index = faiss.IndexLSH(256, 64)
    55    23709.297 MiB     7707.352 MiB        concat_features = np.concatenate((features_1, features_2), axis=0)
    56    23643.062 MiB        0.000 MiB        features_1 = []
    57    16001.445 MiB        0.000 MiB        features_2 = []
    58    16001.445 MiB        0.000 MiB        index.train(concat_features)
    59    16001.684 MiB        0.238 MiB        index.add(concat_features)
    60     8294.129 MiB        0.000 MiB        concat_features = []
    61     8294.129 MiB        0.000 MiB        return index
memory_profiler output with IndexLSH(256, 128):
Line #        Mem usage        Increment    Line Contents
=========================================================
    49       94.363 MiB       94.363 MiB    @profile
    50                                      def return_index():
    51      233.695 MiB      139.332 MiB        features_1 = read_files("features_1.txt")
    52    15978.453 MiB    15744.758 MiB        features_2 = read_files("features_2.txt")
    53    15979.125 MiB        0.672 MiB        index = faiss.IndexLSH(256, 128)
    54    23686.680 MiB     7707.555 MiB        concat_features = np.concatenate((features_1, features_2), axis=0)
    55    23620.242 MiB        0.000 MiB        faces_features = []
    56    15978.641 MiB        0.000 MiB        features = []
    57    15978.641 MiB        0.000 MiB        index.train(concat_features)
    58    15979.059 MiB        0.418 MiB        index.add(concat_features)
    59     8271.504 MiB        0.000 MiB        concat_features = []
    60     8271.504 MiB        0.000 MiB        return index
It seems that the memory increment of index.add does roughly double between the two runs:
64 bit: +0.238 MiB
128 bit: +0.418 MiB
Obviously this is dwarfed by the size of the input array.
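For a sanity check on those increments, the expected size of the stored codes can be estimated by hand as the number of vectors times the bytes per code. The sketch below assumes the concatenated array is float32 with 256 dimensions and that the 7707.352 MiB increment reported for np.concatenate is exactly its size; both are guesses from the profile, and the function names are hypothetical.

```python
def n_vectors_from_array(array_mib, d, itemsize=4):
    """Estimate the row count of a dense array from its size in MiB.

    itemsize=4 assumes float32, which is what Faiss expects as input.
    """
    return int(array_mib * 2**20 / (d * itemsize))


def lsh_code_mib(n_vectors, nbits):
    """Size in MiB of n_vectors binary codes of nbits each, rounded up to whole bytes."""
    bytes_per_vec = (nbits + 7) // 8
    return n_vectors * bytes_per_vec / 2**20


if __name__ == "__main__":
    n = n_vectors_from_array(7707.352, d=256)  # increment taken from the profile above
    for nbits in (64, 128):
        print(f"{nbits} bits: about {lsh_code_mib(n, nbits):.1f} MiB for {n} vectors")
```

Under these assumptions the codes would take on the order of 60 MiB and 120 MiB, far more than the 0.238 MiB and 0.418 MiB the profiler reports. One plausible explanation is that index.add reused memory released when the input lists were cleared, so the process footprint barely moved even though the index did allocate its codes.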
Hi @mdouze, I am confused by the fact that even after deleting the input arrays, the memory occupied is still 8.2 GB. Ideally, the increase should be just 0.238 MiB and 0.418 MiB for the 64-bit and 128-bit representations respectively, right?
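See the FAQ entry on why RAM usage does not go down when an index is deleted:
https://github.com/facebookresearch/faiss/wiki/FAQ#why-does-the-ram-usage-not-go-down-when-i-delete-an-index
In short, the reported memory usage is not guaranteed to drop after the arrays are deleted.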
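One way to tell memory that is retained but reusable from a real leak is to repeat the build-and-delete cycle several times and watch whether the peak resident set size keeps growing. The sketch below is a generic check, assuming Linux (where `ru_maxrss` is reported in KiB); the helper names are hypothetical and the workload is a stand-in for creating, training and filling an index.

```python
import resource


def peak_rss_mib():
    """Peak resident set size so far, in MiB (assumes ru_maxrss is in KiB, as on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def peaks_over_rounds(build, rounds=5):
    """Build and drop an object several times, recording the peak RSS after each round."""
    peaks = []
    for _ in range(rounds):
        obj = build()
        del obj  # drop the only reference so the memory can be reused
        peaks.append(peak_rss_mib())
    return peaks


if __name__ == "__main__":
    # Stand-in workload: allocate and fill about 32 MiB per round.
    workload = lambda: bytearray(b"x") * (32 * 1024 * 1024)
    for i, peak in enumerate(peaks_over_rounds(workload)):
        print(f"round {i}: peak RSS {peak:.1f} MiB")
```

If the peak levels off after the first round or two, the freed memory is being reused, even if tools such as htop never show the footprint going back down. A peak that climbs by about the same amount every round would instead point to memory that is never released.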