I have a Python FAISS GPU application in which I have to load an index to the GPU multiple times, each time overwriting the old one. The problem is that the GPU memory is not released after the Python variable has been overwritten. Is there a way to release the index from GPU memory?
Also, is there a way to block less than 18 percent of the GPU memory in the Python API?
OS: Ubuntu 18.04 LTS
Running on: GPU
Interface: Python
main_index is already loaded to the GPU, and I want to release its memory and replace it with my_index.
I tried loading using either:
my_index = faiss.index_cpu_to_all_gpus(my_index)
main_index = my_index
Or:
co = faiss.GpuClonerOptions()
res = faiss.StandardGpuResources()
my_index = faiss.index_cpu_to_gpu(res, 0, my_index, co)
main_index = my_index
Both methods block an additional 18 percent of the GPU memory, so instead of overwriting the old index I only allocate more memory on the GPU.
I worked around it by calling main_index.reset() before the assignment. It seems to work fine; please let me know if there is a better option.
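For anyone landing here, the reset() workaround can be sketched roughly as below. This is only a sketch: replace_gpu_index, load_to_gpu and the holder dict are hypothetical names, not part of FAISS. It also assumes that FAISS frees a GPU index once its last Python reference is gone, which is how SWIG-wrapped objects usually behave but is not confirmed in this thread.

```python
import gc


def replace_gpu_index(holder, cpu_index, to_gpu):
    """Free the GPU index stored in holder["index"], then load cpu_index.

    holder -- a dict (hypothetical container) owning the only reference
    to_gpu -- a callable that copies a CPU index to the GPU
    """
    old = holder.pop("index", None)
    if old is not None:
        old.reset()   # drop the stored vectors (the workaround from this thread)
        del old       # drop this function's reference to the old GPU index
        gc.collect()  # collect any reference cycles that still hold it
    # Only now allocate the new index, so old and new do not coexist.
    holder["index"] = to_gpu(cpu_index)
    return holder["index"]


def load_to_gpu(cpu_index, res, device=0):
    """Copy cpu_index to one GPU, reusing an existing StandardGpuResources."""
    import faiss  # imported lazily so replace_gpu_index has no hard dependency

    co = faiss.GpuClonerOptions()
    return faiss.index_cpu_to_gpu(res, device, cpu_index, co)
```

With res = faiss.StandardGpuResources() created once and kept around, a call would look like replace_gpu_index(holder, my_index, lambda idx: load_to_gpu(idx, res)). The important part is the order: the old index is released before the new one is copied over. If any other variable still refers to the old index, its memory will most likely stay allocated.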
Still looking for the syntax for lowering the 18 percent GPU memory allocation.
Check res.noTempMemory() and res.setTempMemory(), both methods of faiss.StandardGpuResources.
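A minimal sketch of how those two calls might be used, assuming the 18 percent reported above is the temporary (scratch) buffer that StandardGpuResources reserves. configure_temp_memory and make_resources are hypothetical helper names; setTempMemory takes a size in bytes. It is probably safest to call either method before the resources object is first used to create an index.

```python
def configure_temp_memory(res, temp_mib):
    """Set the scratch-memory size on a StandardGpuResources-like object.

    temp_mib == 0 disables the scratch buffer entirely;
    any other value is interpreted as a size in MiB.
    """
    if temp_mib == 0:
        res.noTempMemory()
    else:
        res.setTempMemory(int(temp_mib * 1024 * 1024))  # convert MiB to bytes
    return res


def make_resources(temp_mib):
    """Create a StandardGpuResources with a custom scratch-memory size."""
    import faiss  # imported lazily; requires a GPU build of FAISS

    return configure_temp_memory(faiss.StandardGpuResources(), temp_mib)
```

For example, res = make_resources(64) would ask for a 64 MiB scratch buffer, and res can then be passed to faiss.index_cpu_to_gpu(res, 0, my_index, co). Note that faiss.index_cpu_to_all_gpus creates its own resources internally, so this setting only applies when you construct the resources object yourself. A smaller scratch buffer may slow down searches.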