Rasa NLU version (0.10.6):
Used backend / pipeline (spacy_sklearn):
Operating system (ubuntu):
Issue: Hi, coming from LUIS I found Rasa very useful for our use case, but I am facing an issue similar to #688. I have more than 2000 synonyms for some entities, and adding them to the training data increases its size significantly. Is there any plan to support LUIS-like functionality for synonyms in the future? Thank you.
Content of configuration file (if used & relevant):
Thanks for reaching out. I don't know all the other tools' features by heart and haven't used LUIS for a while. Do you mind explaining what that entity synonyms feature does and how it is different from http://nlu.rasa.ai/dataformat.html#entity-synonyms?
Hi @tmbo. Sorry, I should have explained my question in more detail. As I understand it, Rasa's entity synonyms only take effect after the model has identified the entity: the extracted value is then replaced with its synonym. That means pre-defining entity synonyms in the training data doesn't affect entity recognition. For example, if I train my Rasa model to recognize "laptop" as an entity in the sentence "I want to buy a laptop" and define "lappy" as a synonym for "laptop" in the training data, Rasa won't be able to recognize "lappy" as an entity. (This is better explained here). In LUIS, "lappy" will be recognized as an entity if we define it as a synonym for "laptop" and train the model on sentences with "laptop" as an entity. (This is also better explained in the link above). Thanks.
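To make the behaviour described in this comment concrete, here is a minimal sketch. It is not Rasa's code: `extract_entities` is a hypothetical stand-in for a trained extractor that only knows the values it saw in training, and `map_synonyms` stands in for the synonym replacement step that runs afterwards.

```python
# Toy illustration only; none of these names come from Rasa NLU.
KNOWN_ENTITY_VALUES = {"laptop"}   # what the (hypothetical) model learned to recognise
SYNONYMS = {"lappy": "laptop"}     # synonym -> canonical value


def extract_entities(text):
    """Stand-in extractor: returns only tokens it saw as entities in training."""
    return [tok for tok in text.lower().split() if tok in KNOWN_ENTITY_VALUES]


def map_synonyms(entities):
    """Post-processing step: rewrite already-extracted values to their canonical form."""
    return [SYNONYMS.get(value, value) for value in entities]


def parse(text):
    # The synonym map is consulted only after extraction, so it cannot
    # help the extractor find a value it has never seen.
    return map_synonyms(extract_entities(text))


print(parse("I want to buy a laptop"))
print(parse("I want to buy a lappy"))
```

In this sketch the second call finds nothing, because "lappy" never reaches the synonym step.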
@amn41 @twerkmeister this feels very similar to #773 or at least some of the same problems you all are discussing there for templated training data.
Certainly I've answered a decent number of questions here and on SO about how entity synonyms work.
Hey, has anyone got a solution for using synonyms as the entity value?
@tapas100 the only way to do this for now is to substitute the synonyms into the training data and generate more training examples.
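A rough sketch of that workaround, assuming examples shaped like the JSON training format (`text`, `intent`, and `entities` with `start`, `end`, `value`, `entity`) and, for simplicity, one entity per example. The helper name `expand_with_synonyms` and the sample data are made up for illustration.

```python
def expand_with_synonyms(example, synonyms):
    """Return new examples with the entity's surface text swapped for each synonym.

    Assumes a single entity per example; with several entities the offsets of
    the later ones would also need shifting.
    """
    expanded = []
    text = example["text"]
    for ent in example["entities"]:
        for alt in synonyms.get(ent["value"], []):
            new_text = text[:ent["start"]] + alt + text[ent["end"]:]
            # Keep the canonical value, but point the span at the synonym.
            new_ent = dict(ent, end=ent["start"] + len(alt))
            expanded.append(
                {"text": new_text, "intent": example["intent"], "entities": [new_ent]}
            )
    return expanded


text = "I want to buy a laptop"
start = text.index("laptop")
example = {
    "text": text,
    "intent": "buy_product",  # hypothetical intent name
    "entities": [
        {"start": start, "end": start + len("laptop"), "value": "laptop", "entity": "product"}
    ],
}
synonyms = {"laptop": ["lappy", "notebook"]}  # canonical value -> synonyms

for generated in expand_with_synonyms(example, synonyms):
    print(generated["text"])
```

With 2000 synonyms this multiplies the data quickly, which is the size problem raised at the top of the thread.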
@tmbo @bedapudi6788 As I understand it, Rasa NLU takes a lookup table and converts it into a compound regex pattern, and this is only useful when ner_crf is used as the entity extractor. My idea for supporting synonyms in the training phase is to insert them into the lookup table and also define them as entity synonyms. That way they will be taken into account when the regex pattern is created, and after extraction they can be mapped to the standard form by entity_synonym_mapper. Does this approach sound good?
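The idea in this comment could be sketched roughly as follows. This is an assumption-laden illustration, not Rasa NLU's implementation: `build_lookup_pattern` and `find_and_normalise` are hypothetical names, and the pattern is applied directly to the text here, whereas in the setup described above it would feed the ner_crf extractor instead.

```python
import re

# Synonym -> canonical value; the same entries are also added to the lookup table.
SYNONYMS = {"lappy": "laptop", "notebook": "laptop"}
LOOKUP_TABLE = ["laptop"] + list(SYNONYMS)


def build_lookup_pattern(phrases):
    """Combine lookup entries into one case-insensitive alternation."""
    # Longest entries first so a longer phrase wins over a shorter prefix.
    ordered = sorted(phrases, key=len, reverse=True)
    body = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


PATTERN = build_lookup_pattern(LOOKUP_TABLE)


def find_and_normalise(text):
    """Match lookup entries, then map each match to its canonical value."""
    matches = [m.group(0).lower() for m in PATTERN.finditer(text)]
    return [SYNONYMS.get(value, value) for value in matches]


print(find_and_normalise("Is this Lappy better than that notebook?"))
```

Keeping the synonyms in both places means one list drives recognition and normalisation, at the cost of maintaining the synonym-to-canonical mapping alongside the lookup entries.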
We have lookup tables now, so we can close this. @ufukhurriyetoglu can you create a new issue for your enhancement suggestion, please?