rasa NLU version: 0.10.0a0 and also 0.9
Used backend / pipeline: spacy_sklearn
Operating system : macOS Sierra
Issue:
Hey guys,
I am training a model with the pipeline "spacy_sklearn". The training set is about 1000 utterances with five different entity types.
I followed the whole installation process described in the documentation. However, when I test the model, it detects the entities but does not return the entity type (it returns '-' instead). I tried a lot of different configurations, but none of them worked for me.
When I add "entity_crf_BILOU_flag" and set it to true, no entities are returned at all. The same happens when I remove "entity_crf_features". It seems to me that I have misconfigured something, because the model does detect the correct value.
Thanks guys!
Content of reply file:
{
  "text": "show information for cloud computing",
  "rasa_reply": {
    "entities": [
      {
        "start": 21,
        "entity": "-",
        "end": 26,
        "value": "cloud",
        "extractor": "ner_crf"
      }
    ]
  }
}
Content of configuration file:
{
  "pipeline": "spacy_sklearn",
  "path": "../models/spacy_sklearn_1000_19-07-2017",
  "data": "../set/trainandtest/rasa/train/1000/",
  "entity_crf_features": [
    ["low", "title", "upper", "pos", "pos2"],
    ["bias", "low", "word3", "word2", "upper", "title", "digit", "pos", "pos2", "pattern"],
    ["low", "title", "upper", "pos", "pos2"]
  ]
}
Can you provide a sample of your training data?
That's one of the utterances (training data):
{
  "text": "book conference room close to MTY20 next monday at 7 am all day",
  "intent": "BookRoom",
  "entities": [
    {
      "end": 34,
      "value": "MTY20",
      "entity": "building",
      "start": 30
    }
  ]
}
I don't know for sure that this is the problem, but I believe it could be: the end position isn't correct. "MTY20" starts at 30 and is five characters long, so the end should be 35, not 34.
You may have seen a warning about this in the console or log output, though I'm not in a position to confirm that at the moment.
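A quick way to double-check the offsets for this utterance, as a rough sketch. It assumes that "end" is exclusive, so that text[start:end] should reproduce "value" (this matches the reply above, where "cloud" spans 21 to 26):

```python
text = "book conference room close to MTY20 next monday at 7 am all day"
value = "MTY20"

start = text.find(value)   # index of the first character of the entity
end = start + len(value)   # exclusive end: one past the last character

print(start, end)             # 30 35
print(repr(text[30:34]))      # 'MTY2'  (what the annotated offsets actually cover)
print(repr(text[start:end]))  # 'MTY20'
```

With end set to 34 the annotated span cuts off the last character, so it no longer matches the value.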
@foldingparrot Do you mind sending me the whole training data at [email protected]? I think I saw this before and looking at the dataset might quickly resolve it (I think it is an issue in the training data)
I think @wrathagom might be right here. The cause could be wrong entity offsets (that is, the start and end values of the entity). I am not sure we warn about that, but we definitely should.
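Until such a warning exists, something like the following could be used to scan the training examples before training. This is only a sketch: find_bad_offsets is a hypothetical helper, not part of rasa NLU, and it assumes the examples have already been loaded into a list of dicts shaped like the sample above, with "end" being exclusive.

```python
def find_bad_offsets(examples):
    """Collect every entity whose start/end slice does not equal its value."""
    problems = []
    for example in examples:
        text = example["text"]
        for entity in example.get("entities", []):
            # Slice the text with the annotated offsets (end is treated as exclusive).
            actual = text[entity["start"]:entity["end"]]
            if actual != entity["value"]:
                problems.append((text, entity, actual))
    return problems


# The utterance from this thread, with the original (wrong) end offset.
examples = [
    {
        "text": "book conference room close to MTY20 next monday at 7 am all day",
        "intent": "BookRoom",
        "entities": [
            {"start": 30, "end": 34, "value": "MTY20", "entity": "building"}
        ],
    }
]

for text, entity, actual in find_bad_offsets(examples):
    print("offset mismatch in %r: expected %r, offsets cover %r"
          % (text, entity["value"], actual))
```

Any example it reports has offsets that cover a different substring than the annotated value, which is the situation described in this thread.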
You guys are right. I fixed that and it works as expected now!
Thanks a lot!