Rasa: multiple entities for example ner_crf

Created on 7 Jun 2017 · 8 Comments · Source: RasaHQ/rasa

rasa NLU version 0.8.6

"pipeline": ["nlp_spacy", "ner_crf", "ner_synonyms", "intent_featurizer_spacy", "intent_classifier_sklearn"],

Operating system osx

Issue:
In text that has more than one entity, every entity except the last one is labeled "-" instead of its actual entity type. Any ideas?

    {u'entities': [{u'end': 14,
                    u'entity': '-',
                    u'extractor': u'ner_crf',
                    u'start': 11,
                    u'value': u'747'},
                   {u'end': 21,
                    u'entity': '-',
                    u'extractor': u'ner_crf',
                    u'start': 15,
                    u'value': u'kansas'},
                   {u'end': 24,
                    u'entity': '-',
                    u'extractor': u'ner_crf',
                    u'start': 22,
                    u'value': u'st'},
                   {u'end': 34,
                    u'entity': 'time',
                    u'extractor': u'ner_crf',
                    u'start': 28,
                    u'value': u'4:10pm'}],
Thank you

Content of configuration file (if used & relevant):

type

All 8 comments

That seems odd. I think to reproduce this, it would be very helpful to access your training data. Please send me the configuration file and the data to [email protected] if possible.

@karnigili can you please also tell me the sentence of the above example?

sure @tmbo
"see you at 747 kansas st at 4:10pm".

also, I sent the mail with the files

OK, the issue is caused by whitespace in your entity annotations. Have a look at your training data: some of the value fields of your annotations contain trailing whitespace. You need to remove it, otherwise the annotations don't align with the tokenization.

We can't really fix this - but I will make sure we show a warning about non-aligned tokens instead of the dashes.
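For anyone hitting the same problem, here is a rough sketch of how such annotations could be located before training. It is not part of rasa NLU; the helper name `find_whitespace_entities` and the sample sentences are made up for illustration, and it assumes each training example is a dict with a `text` field and an `entities` list carrying `start`, `end`, `value` and `entity`, as in the annotations quoted in this thread.

```python
def find_whitespace_entities(examples):
    """Return (text, entity) pairs whose annotation starts or ends with whitespace.

    Hypothetical helper, not a rasa NLU function.
    """
    problems = []
    for example in examples:
        text = example["text"]
        for ent in example.get("entities", []):
            # The characters the annotation actually covers in the sentence.
            span = text[ent["start"]:ent["end"]]
            if span != span.strip() or ent["value"] != ent["value"].strip():
                problems.append((text, ent))
    return problems


# Made-up training examples: the first has a trailing space in its annotation.
examples = [
    {
        "text": "send me the documents please",
        "entities": [
            {"start": 12, "end": 22, "value": "documents ", "entity": "docs"}
        ],
    },
    {
        "text": "see you at 747 kansas st at 4:10pm",
        "entities": [
            {"start": 28, "end": 34, "value": "4:10pm", "entity": "time"}
        ],
    },
]

problems = find_whitespace_entities(examples)
for text, ent in problems:
    print("misaligned annotation %r in %r" % (ent["value"], text))
```

Checking both the covered span and the `value` field catches the case where only one of the two was cleaned up.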

Hi @tmbo , thank you!

Is there anything else that would cause that?
I have applied strip() to the values and adjusted the range for all entities, yet it is still not resolved.

thank you

If you send me the modified training data, I can take another look.

@karnigili There are still entities in there that have a value ending with a space, e.g.:

         {
            "start": 13,
            "end": 23,
            "value": "documents ",
            "entity": "docs"
          }

I'd suggest installing the latest master; it will print warnings for every non-aligned entity value.
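A minimal sketch of one way to repair an annotation like the one above: trim the whitespace and move `start`/`end` to match. The helper name `trim_entity` and the example sentence are hypothetical (the original sentence is not shown in this thread); only the field names come from the annotation quoted above.

```python
def trim_entity(text, ent):
    """Return a copy of the annotation with surrounding whitespace removed
    and the start/end offsets shifted accordingly.

    Hypothetical helper, not a rasa NLU function. Assumes the span contains
    at least one non-whitespace character.
    """
    span = text[ent["start"]:ent["end"]]
    leading = len(span) - len(span.lstrip())
    trailing = len(span) - len(span.rstrip())
    fixed = dict(ent)
    fixed["start"] = ent["start"] + leading
    fixed["end"] = ent["end"] - trailing
    # Strip the value separately so a value that differs from the text
    # (for example a synonym) is preserved.
    fixed["value"] = ent["value"].strip()
    return fixed


# Made-up sentence in which offsets 13-23 cover "documents " with a trailing space.
text = "please share documents now"
entity = {"start": 13, "end": 23, "value": "documents ", "entity": "docs"}

fixed = trim_entity(text, entity)
print(fixed)
```

Deriving the offsets from the sentence itself, rather than only calling `strip()` on `value`, keeps the range and the value consistent with each other.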

Great ! thank you so much :):)
