I'm working with spaCy 1.8.2. I run this example:
https://spacy.io/docs/usage/lightning-tour#examples-word-vectors
and get an assertion error. I investigated and found that:
apples.has_vector returns True but oranges.has_vector returns False. The same happens with boots and hippos: one has a vector and the other doesn't. I'm using the default model, the one you get when you run:
python -m spacy download en
Note: I found something similar in the 2.0 alpha release and commented on it in https://github.com/explosion/spaCy/issues/1105
The small model only has vectors for 5000 words. You can get the full vectors data with:
python -m spacy download en_core_web_md
python -m spacy link en_core_web_md en --force
This will download the larger model and link it to the name en.
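To see which words in a text are missing vectors under whichever model is currently linked as en, a small check along these lines may help. This is only a sketch: the helper names tokens_without_vectors and report_missing_vectors are made up for illustration, and it assumes spaCy is installed and a model is linked under the name "en". It relies only on the token.text and token.has_vector attributes already mentioned in this thread.

```python
def tokens_without_vectors(doc):
    """Return the text of every token in `doc` that has no word vector.

    `doc` can be any iterable of objects exposing `.text` and `.has_vector`,
    such as a spaCy Doc.
    """
    return [token.text for token in doc if not token.has_vector]


def report_missing_vectors(text, model_name="en"):
    """Hypothetical helper: load a model and list the tokens lacking vectors.

    Assumes spaCy is installed and `model_name` is a downloaded or linked model.
    """
    import spacy  # imported here so the pure helper above works without spaCy

    nlp = spacy.load(model_name)
    return tokens_without_vectors(nlp(text))
```

Calling report_missing_vectors(u"Apples and oranges are similar. Boots and hippos aren't.") before and after linking en_core_web_md should show whether the missing words have gained vectors.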
@honnibal, thanks for the answer. I think this should be clarified in the "lightning tour" example that I referred to. Some (most?) new users (me, for example!) run those examples to understand the library and check that the installation is correct. When one of those examples fails on the default installation, it's not a good sign for the new user :)
I ran into the same confusion. A clarification would help a lot!
Also, on my OS X installation, I needed to make the string a unicode literal for the example to work:
doc = nlp(u"Apples and oranges are similar. Boots and hippos aren't.")
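If the text comes from somewhere other than a literal, the same fix can be applied at runtime. The sketch below is an assumption about the cause, not something confirmed in this thread: it guesses that the failure comes from passing a byte string (the default str type on Python 2) where unicode text is expected. The helper name ensure_unicode is hypothetical.

```python
def ensure_unicode(text, encoding="utf-8"):
    """Return `text` as unicode, decoding it first if it is a byte string.

    On Python 2 a plain "..." literal is a byte string, so it gets decoded;
    on Python 3 a plain "..." literal is already unicode and is returned as-is.
    """
    if isinstance(text, bytes):
        return text.decode(encoding)
    return text
```

The result can then be passed to the pipeline, for example nlp(ensure_unicode(raw_text)), where raw_text stands for whatever string you read in.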
Thanks for the feedback – I agree! For the new docs, I've made this example a bit more clear and added a note about the word vectors.