Spacy: use of NER after update fails

Created on 13 Mar 2016 · 5 Comments · Source: explosion/spaCy

I'm trying to train the NER, and training succeeds with the following code:

The code for the NER update is taken from https://github.com/spacy-io/spaCy/issues/187

import plac

from spacy.en import English
from spacy.gold import GoldParse
import os

fname = 'mmmm'

nlp = English(parser=False) # Avoid loading the parser, for quick load times

doc = nlp.tokenizer(u'Lions and tigers and grizzly bears!')
nlp.tagger(doc)

nlp.entity.add_label('ANIMAL') # <-- New in v0.100

indices = tuple(range(len(doc)))
words = [w.text for w in doc]
tags = [w.tag_ for w in doc]
heads = [0 for _ in doc]
deps = ['' for _ in doc]

ner = ['U-ANIMAL', 'O', 'U-ANIMAL', 'O', 'B-ANIMAL', 'L-ANIMAL', 'O']

annot = GoldParse(doc, (indices, words, tags, heads, deps, ner))

loss = nlp.entity.train(doc, annot)
i = 0
while loss != 0 and i < 1000:
    loss = nlp.entity.train(doc, annot)
    i += 1
print("Used %d iterations" % i)

nlp.entity(doc)
for ent in doc.ents:
    print(ent.text, ent.label_)
nlp.entity.model.dump(os.getcwd())

Then I load the saved model, which also succeeds. After loading, I try to apply the model to the sentence, and this part fails.

Here is the code:

from spacy.en import English
import os

path = os.getcwd() + '/dic/mmmm' # path to the model
path = path.decode('utf-8')

nlp = English(parser=False)
nlp.entity.model.load(path)

doc = nlp(u'Lions and tigers and grizzly bears!')
ents = list(doc.ents)

print ents

bug

All 5 comments

michael135, did you manage to get this solved? I'm having the same difficulties and would love to know if it's a bug or if I'm missing something somewhere.

Just as a test, I dumped out the entity model without making any changes to it and then trained my new entities and dumped out the new entity model. I then used cmp to verify that the two dumps were different and they in fact were. So it appears as though my newly trained entities are in fact getting dumped but don't seem to be available after loading.
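
The same byte-for-byte check can be done from Python instead of cmp. This is only a minimal sketch using the standard library's filecmp module; the helper name dumps_differ is hypothetical, and it assumes each dump is a single file on disk.

```python
import filecmp


def dumps_differ(path_a, path_b):
    """Return True if the two dump files differ in content (hypothetical helper)."""
    # shallow=False forces a comparison of the file contents instead of
    # only comparing the os.stat() signatures (type, size, mtime).
    return not filecmp.cmp(path_a, path_b, shallow=False)
```

As in the comment above, a True result only shows that training changed what was written to disk. It says nothing about whether the label names themselves were saved.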

michael135, I dug into the source code a bit and may have figured out a solution to our problem. This may not be a correct solution as I'm far from experienced with spacy, but it seems as though you have to tell spacy about your custom entity labels along with loading the model you dumped. So before you call:

nlp.entity.model.load(path)

call:

nlp.entity.add_label('ANIMAL')

This does the trick for me but again, I have no idea if it's the correct solution, hannibal will probably have to weigh in on that one.

Hey,

Sorry for the delay getting to this.

It seems that spaCy isn't saving the labelling properly. Re-adding the label before loading is the correct workaround for now, until we fix this.
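
To illustrate why re-adding the label can matter, here is a toy sketch. It is not spaCy's code: ToyTagger and all of its methods are hypothetical, and it simply assumes a model whose dump contains the learned weights but not the table of label names, which is one way to read the behaviour described above.

```python
import json
import os
import tempfile


class ToyTagger(object):
    """Hypothetical stand-in: the label table lives outside the dumped file."""

    def __init__(self):
        self.labels = []   # label names; NOT written by dump()
        self.weights = {}  # label index (as a string) -> score; written by dump()

    def add_label(self, name):
        if name not in self.labels:
            self.labels.append(name)

    def train(self, name, score):
        self.weights[str(self.labels.index(name))] = score

    def dump(self, path):
        with open(path, "w") as f:
            json.dump(self.weights, f)

    def load(self, path):
        with open(path) as f:
            self.weights = json.load(f)

    def predict(self):
        # Only weight indices that map to a registered label can be decoded.
        known = {i: s for i, s in self.weights.items() if int(i) < len(self.labels)}
        if not known:
            return None
        best = max(known, key=known.get)
        return self.labels[int(best)]


path = os.path.join(tempfile.mkdtemp(), "model.json")

trained = ToyTagger()
trained.add_label("ANIMAL")
trained.train("ANIMAL", 1.0)
trained.dump(path)

# Loading into a fresh object without the label: the weights cannot be decoded.
broken = ToyTagger()
broken.load(path)

# Workaround: register the label again, then load the weights.
fixed = ToyTagger()
fixed.add_label("ANIMAL")
fixed.load(path)

print(broken.predict(), fixed.predict())
```

In this toy version the weights survive the round trip in both cases; only the object that had the label registered again can turn them back into an ANIMAL prediction.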

Does that mean the label should be added twice: once before saving the pickle and again after loading it?

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
