Rasa NLU version: 0.12.3
Operating system (windows, osx, ...): Windows 10
Content of model configuration file:
{
  "pipeline": "spacy_sklearn",
  "path": "./models/nlu",
  "data": "./data/nlu.json"
}
Issue:
Suppose date is an entity that stores a date value.
Take the text "I want to go to Bangladesh on 12/10/2015". Here the value of the date entity is 12/10/2015. I have heard that spaCy and Duckling have features that can extract this easily.
Can anyone please help me with how to do this? What do I need to write or include, and where?
Any help will be appreciated.
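dimensions: ["amount-of-money", "distance", "duration", "email", "number", "ordinal", "phone-number", "quantity", "temperature", "time", "unit", "unit-of-duration", "url", "volume"]
Is what I use.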
@sipvoip provides a snippet of a pipeline. To use spacy or duckling you will need to change your pipeline from
"pipeline": "spacy_sklearn"
to
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_spacy"
- name: "ner_crf"
- name: "ner_spacy"
- name: "ner_duckling"
  dimensions: ["amount-of-money", "distance", "duration", "email", "number", "ordinal", "phone-number", "quantity", "temperature", "time", "unit", "unit-of-duration", "url", "volume"]
- name: "ner_synonyms"
- name: "intent_classifier_sklearn"
Let me know if that doesn't get things working for you.
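Once a pipeline like this is trained, the date would be expected to show up in the entity list of the parse result, typically under Duckling's "time" dimension rather than under a custom "date" name. Here is a minimal sketch of picking it out. The helper `entities_named`, the intent name `book_trip`, and the sample dict are all hypothetical; the dict is hand-written in the general shape of a parse result and is not real model output.

```python
def entities_named(parse_result, entity_name, extractor=None):
    """Return entities of one type, optionally limited to one extractor."""
    return [
        ent
        for ent in parse_result.get("entities", [])
        if ent.get("entity") == entity_name
        and (extractor is None or ent.get("extractor") == extractor)
    ]


# Hand-written sample (an assumption, not captured from a real model).
sample_result = {
    "text": "I want to go to Bangladesh on 12/10/2015",
    "intent": {"name": "book_trip", "confidence": 0.9},  # hypothetical intent
    "entities": [
        {"start": 16, "end": 26, "text": "Bangladesh", "value": "Bangladesh",
         "entity": "GPE", "extractor": "ner_spacy"},
        {"start": 30, "end": 40, "text": "12/10/2015",
         "value": "2015-12-10T00:00:00.000Z",
         "entity": "time", "extractor": "ner_duckling"},
    ],
}

# Keep only the entities that Duckling tagged as "time".
dates = entities_named(sample_result, "time", extractor="ner_duckling")
print([d["text"] for d in dates])
```

Filtering on the extractor as well as the entity name helps when ner_crf and ner_duckling both return entities for the same span.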
@jahid-ict can we close this issue, or do you have more questions?
@sipvoip @wrathagom Thanks for your help. Date extraction is now working. With "ner_spacy", when I look for "GPE" entities, spaCy recognizes some country names only when they start with a capital letter, while for others the case does not matter.
spaCy recognizes both "India" and "india".
But it recognizes "Bangladesh" and not "bangladesh".
Is there any way to make it fully case-insensitive? Can you please share your experience with that?
I might recommend moving to a larger spaCy model (if you're currently just trying the medium model), but for the most part, no, there is no easy way to improve spaCy's results. If spaCy isn't working for you, I would suggest training your own entity model using ner_crf.
I am actually doing this with ner_crf now. I didn't know about the larger spaCy models. Thank you for sharing. I'll try this.
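If you go the ner_crf route, one way to make it less sensitive to case is to include lowercased copies of your training examples. This is only a sketch of that idea, not something suggested in this thread: the helper `add_lowercase_variants` is hypothetical, the intent name is made up, and the data layout assumes the Rasa NLU JSON training format ("rasa_nlu_data" with "common_examples").

```python
import copy


def add_lowercase_variants(training_data):
    """Return a copy of the training data with lowercased duplicates added."""
    augmented = copy.deepcopy(training_data)
    examples = augmented["rasa_nlu_data"]["common_examples"]
    seen = {ex["text"] for ex in examples}
    extra = []
    for ex in examples:
        lowered = ex["text"].lower()
        # Skip texts that are already present, and the rare case where
        # lowercasing changes the length and would shift entity offsets.
        if lowered in seen or len(lowered) != len(ex["text"]):
            continue
        variant = copy.deepcopy(ex)
        variant["text"] = lowered  # offsets and entity values stay unchanged
        extra.append(variant)
        seen.add(lowered)
    examples.extend(extra)
    return augmented


# Tiny hand-written example; "book_trip" is a hypothetical intent name.
sample_data = {
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "I want to go to Bangladesh on 12/10/2015",
                "intent": "book_trip",
                "entities": [
                    {"start": 16, "end": 26, "value": "Bangladesh",
                     "entity": "country"}
                ],
            }
        ]
    }
}

augmented_data = add_lowercase_variants(sample_data)
for example in augmented_data["rasa_nlu_data"]["common_examples"]:
    print(example["text"])
```

The entity value is kept as "Bangladesh" in the lowercased copy, so both spellings map to the same value. Whether this actually helps depends on your data, so it is worth checking against a held-out set.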
I'll close this issue for now then. Let us know if there are any more issues or questions.