import spacy
nlp = spacy.load('en_core_web_lg')
nlp.vocab.vectors.most_similar(nlp('cats').vector.reshape(1,300))
returns the following:
(array([], dtype=uint64), array([0], dtype=int32), array([nan], dtype=float32))
While computing the results, I get the following warnings:
/anaconda3/envs/deep/lib/python3.6/site-packages/ipykernel_launcher.py:1: RuntimeWarning: divide by zero encountered in true_divide
"""Entry point for launching an IPython kernel.
/anaconda3/envs/deep/lib/python3.6/site-packages/ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in true_divide
"""Entry point for launching an IPython kernel.
/anaconda3/envs/deep/lib/python3.6/site-packages/numpy/core/_methods.py:28: RuntimeWarning: invalid value encountered in reduce
return umr_maximum(a, axis, None, out, keepdims, initial)
Hi @ines, any idea what I should do to fix it? Is there a version you recommend I roll back to in which this feature works?
@emiguevara But I am not using a GPU; it's a plain MacBook.
@ines Any kind of response would be appreciated.
The problem seems to be with nlp.vocab.vectors.data. The first row is all zeros, so normalizing the vectors for cosine distance divides by zero.
In [1]: import spacy
In [2]: nlp = spacy.load('en_core_web_lg')
In [3]: nlp.vocab.vectors.data
Out[3]:
array([[ 0. , 0. , 0. , ..., 0. , 0. ,
0. ],
[ 0.012001 , 0.20751 , -0.12578 , ..., 0.13871 , -0.36049 ,
-0.035 ],
[-0.082752 , 0.67204 , -0.14987 , ..., -0.1918 , -0.37846 ,
-0.06589 ],
...,
[ 0.42247 , -0.28522 , -0.38661 , ..., 0.27521 , 0.23623 ,
-0.72113 ],
[ 0.47918 , -0.32734 , -0.23593 , ..., -0.19494 , -0.065226 ,
-0.36282 ],
[-0.63354 , -0.1503 , -0.36161 , ..., 0.26216 , -0.12094 ,
0.0038262]], dtype=float32)
I think it might be the OOV vector?
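To illustrate the divide by zero described above, here is a minimal NumPy sketch. It is not spaCy's actual implementation: `zero_rows` and `cosine_scores` are hypothetical helper names, and `toy` is a made-up 3x3 stand-in for nlp.vocab.vectors.data whose first row is all zeros. It shows how a zero row turns into nan under naive normalization, and one possible way to guard against it.

```python
import numpy as np


def zero_rows(table):
    """Return the indices of rows that are entirely zero."""
    return np.flatnonzero(~table.any(axis=1))


def cosine_scores(query, table):
    """Cosine similarity of `query` against every row of `table`.

    Rows with zero norm get -inf instead of nan, so they can never
    be picked as the best match by argmax.
    """
    query = np.asarray(query, dtype=np.float32).ravel()
    table = np.asarray(table, dtype=np.float32)
    row_norms = np.linalg.norm(table, axis=1)
    query_norm = np.linalg.norm(query)
    scores = np.full(table.shape[0], -np.inf, dtype=np.float32)
    valid = row_norms > 0
    if query_norm > 0:
        scores[valid] = table[valid] @ query / (row_norms[valid] * query_norm)
    return scores


# Toy stand-in for the vectors table (assumption): row 0 is all zeros.
toy = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.6, 0.8, 0.0]], dtype=np.float32)
query = np.array([1.0, 0.0, 0.0], dtype=np.float32)

# Naive normalization: the zero row gives 0/0, which is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    naive = (toy @ query) / (np.linalg.norm(toy, axis=1) * np.linalg.norm(query))

# Guarded version: the zero row is masked out before dividing.
guarded = cosine_scores(query, toy)

print(zero_rows(toy))
print(naive)
print(guarded, int(np.argmax(guarded)))
```

With the real model, the same check could presumably be run as zero_rows(nlp.vocab.vectors.data) to see which rows have no vector, and the query could then be scored against the table with cosine_scores instead of most_similar. I have not verified that against en_core_web_lg.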
Just a side note: it works fine with the en_core_web_md model.