I copied and pasted the updated files to ...\spacy\lang\el and got the error below. Does anyone have any idea why that happened? Thank you in advance.
import spacy
from spacy.lang.el import Greek
nlp = Greek()
doc = nlp('Χθες')
doc[0].lemma_
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "token.pyx", line 8, in spacy.tokens.token.Token.lemma_.__get__
TypeError: lookup() got an unexpected keyword argument 'orth'
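A TypeError like this usually means the copied lemmatizer code is calling a function with a keyword argument that the installed version of that function does not accept, i.e. a version mismatch between the edited files and the rest of the installation. A minimal, purely illustrative sketch of that failure mode (not actual spaCy code):

```python
# Illustrative only: an "older" lookup() without an 'orth' parameter,
# called the way newer lemmatizer code would call it.
def lookup(string):
    return string

try:
    lookup("Χθες", orth=123)
except TypeError as e:
    print(e)  # lookup() got an unexpected keyword argument 'orth'
```

The same mismatch in the other direction (old caller, new function) is what updating only some of the files can produce.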
Hi,
I copied and pasted the updated files to ...\spacy\lang\el
What exactly do you mean? Which files did you update, and how did you update them?
It looks like something's gone wrong with your spaCy installation, probably connected to the files you were copying. Did you install from source? Did it compile properly? Or did you install it via pip?
Hello,
I installed spaCy via pip. However, I needed to modify the lemmatizer for the Greek language, so I edited the following files as this user did: https://github.com/giannisdaras/spaCy/commit/fe94e696d3dc5abdfb846d152ebf489518419513
and then I addressed the following issue that occurred: https://github.com/explosion/spaCy/issues/4272. I tried all of the above in two versions of spaCy, 2.2.3 and 2.1.8. I'm also using a virtual environment.
Right, so spaCy 2.1.8 won't work for you, but in 2.2.3 this bug should be fixed. I just tried it out myself and didn't get any errors, and there's even a unit test ensuring that it works.
It's probably best to create an entirely clean environment, install 2.2.3, run your code WITHOUT making any changes to the files, and check whether that works. Please let me know whether it does. Then afterwards, you could try changing files and making sure nothing breaks in between.
Do note that the changes made by @giannisdaras should be included in the 2.2.3 release already.
Hello, I followed all of the above steps. There is no error, but lemmatization still doesn't work for the Greek language as it should. Instead, .lemma_ returns the word itself:
import spacy
from spacy.lang.el import Greek
nlp = Greek()
doc = nlp('αγόρασες')
doc[0].lemma_
'αγόρασες'
Are you installing spaCy with the lookups data, as described here? https://spacy.io/usage#pip
For example, pip install spacy[lookups]. Otherwise, it won't have the lemma rules and tables available.
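Conceptually, a lookup lemmatizer is just a large table mapping surface forms to lemmas, with a fallback to the word itself when a form is missing, which is exactly the behaviour seen above when the tables aren't installed. A sketch of that idea with a made-up one-entry table (the table, entry, and function name are illustrative, not spaCy's internals):

```python
# Hypothetical one-entry lemma table; spacy-lookups-data ships a much
# larger table per language.
LEMMA_LOOKUP = {"αγόρασες": "αγοράζω"}

def lookup_lemma(word: str) -> str:
    # Words missing from the table fall back to the surface form,
    # which is why .lemma_ returned the word itself above.
    return LEMMA_LOOKUP.get(word, word)

print(lookup_lemma("αγόρασες"))  # αγοράζω
print(lookup_lemma("Χθες"))      # Χθες (not in the table)
```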
Thank you very much for your help. It turns out that I had to install the lookups data as well as the 'el' model. Most words are lemmatized correctly, so I'm assuming the ones that are not are simply missing from the lookup table.