spaCy: spaCy does not extract all entities.

Created on 2 Mar 2017 · 4 Comments · Source: explosion/spaCy

Can you explain why spaCy does not extract an entity from such a simple sentence?

from spacy.en import English
parser = English()
t = parser('Facebook is the best.')
t.ents
()

All 4 comments

The entities are recognised by the tagger, not the parser.

I think this is more likely a limitation of the NER model. If the model doesn't lean too heavily on gazetteers, this kind of sentence is actually deceptively tricky to classify.

For example, token capitalization is a weaker feature at the start of a sentence, because the first token is capitalized whether or not it is a name. If we rearrange the sentence a little, you can see the model correctly catches the Facebook entity:

>>> nlp(u'Facebook is the best.').ents
()
>>> nlp(u'Some say Facebook is the best.').ents
(Facebook,)
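
To illustrate the gazetteer point, here is a minimal sketch of a lookup that could back up a statistical model when it misses a well-known name. This is not spaCy's API or how its NER model works: `GAZETTEER` and `gazetteer_entities` are hypothetical names made up for this example, and the list of organisations is just a placeholder.

```python
import re

# Hypothetical gazetteer: a hand-maintained set of known organisation names.
# Nothing here comes from spaCy; it is plain Python for illustration only.
GAZETTEER = {"Facebook", "Google"}


def gazetteer_entities(text, model_entities=()):
    """Return the model's entities plus any gazetteer names it missed."""
    found = list(model_entities)
    for name in sorted(GAZETTEER):
        # Whole-word match, so "Facebooks" would not count as "Facebook".
        if re.search(r"\b" + re.escape(name) + r"\b", text) and name not in found:
            found.append(name)
    return tuple(found)


# The model returned no entities for this sentence, so the lookup fills the gap.
print(gazetteer_entities("Facebook is the best."))
```

A lookup like this is position-independent, which is exactly why it helps with sentence-initial names, but it only knows the names you list and cannot tell different uses of the same word apart.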

The new version 1.8.0 comes with bug fixes to the NER training procedure and a new save_to_directory() method. We've also updated the docs with more information on training and NER training in particular:

I hope this helps!

