I have built an index from my dataset and stored it on disk. But the dataset changes frequently as vectors are added or deleted.
So must I rebuild the index every time, or can I just add/delete vectors in the index I already built?
You can use the add() and remove_ids() methods.
@beauby, thanks. Suppose I use IVFx and then delete a lot of vectors from the database; the clusters built earlier may no longer be correct.
My plan: if num_vectors < 1000, use IndexFlatL2; otherwise use IVFx with x = num_vectors / 100. Is that right?
As long as the distribution of the vectors in your training set is close to that of your dataset, the clustering should be ok.
Regarding the number of clusters, the right number depends on the structure of your data.
Closing as the issue is resolved. Feel free to keep commenting should you need further help.
> You can use the add() and remove_ids() methods.
Why is there no documentation on adding and removing vectors? For example, someone may have saved an index and later want to remove vectors from it.
If anyone knows how to use these methods or has a link to look at, please share it with us.