Rasa: Log a warning if NLU picks up intents/entities that aren't in the domain

Created on 29 Jul 2019 · 28 Comments · Source: RasaHQ/rasa

If we log a warning, this could be helpful in debugging, e.g. if you misspelled an entity while using the regex format, because it will end up featurized differently (or not at all? really not sure). The idea is something like

WARNING: Encountered an entity 'num_peolpe' that isn't defined in the domain. This can lead to unpredictable behavior.

erohmensing

All 28 comments

I can work on this @erohmensing

RanaMostafaAbdElMohsen on 30 Jul 2019

Awesome @RanaMostafaAbdElMohsen, thanks!

erohmensing on 30 Jul 2019

Hey @RanaMostafaAbdElMohsen how is this going? Anything I can help you with?

erohmensing on 13 Aug 2019

Thanks @erohmensing. I am investigating the issue and will let you know if there are any updates.
Many Thanks

RanaMostafaAbdElMohsen on 17 Aug 2019

Hello @erohmensing ,

I believe the issue is that when RegexInterpreter parses entities, it should check that each entity is defined in the domain. I believe we need to load the domain to check the entities in the extract_intent_and_entities function in interpreter.py.

Am I on the right track?

RanaMostafaAbdElMohsen on 17 Aug 2019

Hi @RanaMostafaAbdElMohsen , sorry for the late response. I checked in with my team, and we don't want to pass the domain to the NLU interpreter.

I think the best place to do this would be in the processor where the received message is:
https://github.com/RasaHQ/rasa/blob/ab010785f5b5b7b11025a845c1135b8c3b0b4d43/rasa/core/processor.py#L302

Since we're in the processor, you can access the domain with self.domain. So what I would do is create a method like log_unseen_features that checks whether any unknown entities and intents were picked up and logs a warning

erohmensing on 29 Aug 2019

Hi guys,
newbie here. Is there a way I can help with this issue? I had a quick look at the code, and it seems we can check whether the intents/entities exist in self.domain.entities or self.domain.intent_properties, but I am not entirely sure yet. I have been wanting to start contributing to open source, and contributing to RASA would be awesome.

Best

chrisbangun on 19 Sep 2019

@RanaMostafaAbdElMohsen do you still want to contribute a fix here? @chrisbangun let's give her a few days to respond, and then if not, you can take it over :) Sounds like you're on the right track!

erohmensing on 19 Sep 2019

sure @erohmensing, sounds good to me

chrisbangun on 19 Sep 2019

Hello @erohmensing
I am busy these days, so it would take me a while to implement it. So sure @chrisbangun, it is okay if you want to work on this issue. Sorry it took me so long to reply.

RanaMostafaAbdElMohsen on 22 Sep 2019

That's okay! Thanks for the response. Go for it @chrisbangun 🚀

erohmensing on 22 Sep 2019

thanks @RanaMostafaAbdElMohsen @erohmensing

Btw, I am new to RASA, so could you give me more context @erohmensing ?

As you suggested above, I could implement a method called log_unseen_features(parse_data["intent"], parse_data["entities"]). The main logic in this function is to check whether the intent and entities are missing from the domain. The logging itself would then be done in the caller function, _parse_message. Am I on track?

Thanks and I look forward to this.

chrisbangun on 22 Sep 2019

Yep, that sounds like the right track. I'd add a call to log_unseen_features(parse_data) after the debug statement and before the return statement, and then extract the entities and intents inside the method. Because self is passed to the method, you can access self.domain, where you'll find the intents at self.domain.intent_properties (you'll need a little more parsing to get the intent names) and the entities at self.domain.entities.
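
The approach described above might look roughly like the following sketch. StubDomain and StubProcessor are hypothetical stand-ins, not the real Rasa classes. The sketch assumes that intent_properties is a dict keyed by intent name, that entities is a list of entity names, and that the parse data carries the intent under intent["name"] and each entity name under an "entity" key.

```python
import logging
from typing import Any, Dict, List

# Logger name chosen for illustration only.
logger = logging.getLogger("rasa.core.processor")


class StubDomain:
    """Hypothetical stand-in for the domain; not the real Rasa class."""

    def __init__(
        self, intent_properties: Dict[str, Dict[str, Any]], entities: List[str]
    ) -> None:
        self.intent_properties = intent_properties  # assumed: keyed by intent name
        self.entities = entities  # assumed: list of entity names


class StubProcessor:
    """Hypothetical stand-in for the message processor."""

    def __init__(self, domain: StubDomain) -> None:
        self.domain = domain

    def log_unseen_features(self, parse_data: Dict[str, Any]) -> None:
        """Warn about a parsed intent or entities that the domain does not define."""
        intent = (parse_data.get("intent") or {}).get("name")
        if intent and intent not in self.domain.intent_properties:
            logger.warning(
                "Interpreter parsed an intent '%s' that is not defined in the domain.",
                intent,
            )

        # Check every entity that was picked up, not just the first one.
        for element in parse_data.get("entities") or []:
            entity = element.get("entity")
            if entity and entity not in self.domain.entities:
                logger.warning(
                    "Interpreter parsed an entity '%s' that is not defined in the domain.",
                    entity,
                )


# Usage: both the misspelled intent and the misspelled entity trigger a warning.
processor = StubProcessor(StubDomain({"greet": {"use_entities": True}}, ["num_people"]))
processor.log_unseen_features(
    {"intent": {"name": "greeet"}, "entities": [{"entity": "num_peolpe"}]}
)
```

The method only logs and returns nothing, so the caller does not need to handle any result.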

erohmensing on 23 Sep 2019

cool, thanks @erohmensing

chrisbangun on 23 Sep 2019

hi @erohmensing
could you give me some guidance on how to run the test suite? Specifically the test.core.test_processor one. Thanks!

chrisbangun on 26 Sep 2019

You can try with these instructions: https://github.com/RasaHQ/rasa#running-the-tests

Although I'm partial to just running the tests specifically (you'll still need the requirements and the preparation), e.g.

pytest tests/core/test_processor.py

erohmensing on 26 Sep 2019

hi @erohmensing
thanks for the pointer, I can now run the tests!

btw, I noticed you suggested that we log a warning for both intents and entities. Does that mean we need two separate warnings? I am wondering if you have a "design" recommendation in mind.

thanks !

chrisbangun on 27 Sep 2019

I think two separate warnings is fine since they'll be two loops of checking intents and then entities. As for design, i think something simple like a

logger.warning("Interpreter parsed an intent '{}' that is not defined in the domain.".format(intent))

which would render like

2019-09-27 06:39:34 WARNING  rasa.core.processor Interpreter parsed an intent 'greeet' that is not defined in the domain.

and then the same for entities!

Also, make sure you check for each entity that is picked up :)
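
To see how such a call turns into a line like the one quoted above, here is a small self-contained sketch using only the standard logging module. The format string and date format are assumptions chosen to resemble that line; Rasa's actual logging configuration may differ.

```python
import io
import logging

# Capture the output in memory so the rendered line can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter(
        # Assumed format: timestamp, level padded to 8 characters, logger name, message.
        fmt="%(asctime)s %(levelname)-8s %(name)s %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
)

logger = logging.getLogger("rasa.core.processor")
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

intent = "greeet"
logger.warning(
    "Interpreter parsed an intent '{}' that is not defined in the domain.".format(intent)
)

rendered = stream.getvalue()
print(rendered, end="")
```

The two spaces after WARNING come from padding the seven-character level name to eight characters.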

erohmensing on 27 Sep 2019

hi @erohmensing
I've tested the new implementation and it works fine. But since this is my first time contributing to open source, I'm still a bit confused about how I should proceed before I can submit a PR. I've pushed my changes to my fork.

Btw, after reading the documentation https://github.com/RasaHQ/rasa#how-to-contribute, I have two questions:

1. Which branch should I submit my PR to?
2. Since this is a minor release, can you advise me on the versioning? Should I change rasa/version.py?

thanks!

chrisbangun on 27 Sep 2019

Great question! Since this is an addition (albeit a small logging one) and not a bug fix, open it against master. For future reference if you push a bug fix, please open it against the .x branch -- e.g. in this case 1.3.x.

Don't change the version.py, that only gets updated when we push out the releases. However you should add a changelog entry to the Added portion of the Unreleased 1.4.0 changelog.

Also for what it's worth, i took a peek at your code -- I reckon the logging methods shouldn't return anything, just log. Instead of creating lists and return codes, if you just log the entities as you see ones that aren't in the domain, you don't have to save or return anything.

erohmensing on 27 Sep 2019

thanks for the input @erohmensing. I modified the logic and ran the tests. I think I am ready to submit the PR now.

chrisbangun on 30 Sep 2019

hi @erohmensing
I've submitted the PR, but it seems the code still doesn't pass the CI tests, specifically the code format check. I read the details of the test run, and apparently the CI runs the tests on Ubuntu instead of macOS. Is this expected?

btw, I followed the steps for code formatting and code style. The results on my machine looked good (but maybe I missed something). Could you please help check this?

Thank you!

chrisbangun on 30 Sep 2019

Yes, I think our tests run on ubuntu vms, that shouldn't be the issue. Your pull request will be assigned to someone (maybe me, maybe someone else), and then we can sort it out. Looks like you have linter errors that you can check with make lint, check the test output:

rasa/core/processor.py:302:16: F632 use ==/!= to compare str, bytes, and int literals
rasa/core/processor.py:332:12: F632 use ==/!= to compare str, bytes, and int literals
Makefile:40: recipe for target 'lint' failed

which i reckon is on this line
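
For context, F632 is the pyflakes check that flags `is` / `is not` comparisons against string, bytes, or int literals. The snippet below is an illustrative sketch of the difference, not the code from the PR; the variable names are made up.

```python
# Build the string at runtime so it is a distinct object from the literal below.
intent_name = "".join(["gre", "et"])
expected = "greet"

# `==` compares values, which is what a name check needs.
same_value = intent_name == expected

# `is` compares object identity. The result depends on interpreter details
# such as string interning, so it should not be used to compare values.
same_object = intent_name is expected

print(same_value)
```

Replacing `is` with `==` (and `is not` with `!=`) on the flagged lines is the usual fix for this linter error.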

erohmensing on 30 Sep 2019

hi @erohmensing
thanks for the pointer. The code formatting check finally passes, but Test 3.6 and Test 3.7 still fail. Below is the error message:
ERROR sanic.error:app.py:966 Exception occurred in one of response middleware handlers
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic/app.py", line 958, in handle_request
    request, response
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/spf/framework.py", line 579, in _run_response_middleware
    _response = await _response
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic_cors/extension.py", line 266, in unapplied_cors_response_middleware
    set_cors_headers(req, resp, context, res_options)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic_cors/core.py", line 251, in set_cors_headers
    headers_to_set = get_cors_headers(options, req.headers, req.method)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic_cors/core.py", line 171, in get_cors_headers
    origins_to_set = get_cors_origins(options, request_headers.get('Origin'))
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic_cors/core.py", line 146, in get_cors_origins
    return sorted([o for o in origins if not probably_regex(o)])
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic_cors/core.py", line 146, in <listcomp>
    return sorted([o for o in origins if not probably_regex(o)])
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic_cors/core.py", line 275, in probably_regex
    return any((c in maybe_regex for c in common_regex_chars))
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sanic_cors/core.py", line 275, in <genexpr>
    return any((c in maybe_regex for c in common_regex_chars))
TypeError: argument of type 'NoneType' is not iterable

is it due to my virtual env?

Thank you

chrisbangun on 30 Sep 2019

btw @erohmensing
I encountered this error while trying to do pip3 install -r requirements-dev.txt:

do you think this might be the root cause?

chrisbangun on 1 Oct 2019

hi @erohmensing
I keep failing this test: https://travis-ci.com/RasaHQ/rasa/jobs/241008215, would it be possible if you give me some directions?

thanks a lot.

chrisbangun on 2 Oct 2019

@chrisbangun that looks like a valid failed test, because there is no domain passed to the model in the test -- i believe it should be something we can get around but best to bring that up on the PR.
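
One possible way around a missing domain, sketched under assumptions rather than taken from the PR, is to skip the check when there is nothing to compare against. The helper name should_warn_about_intent is hypothetical, and representing a missing domain as None is an assumption about how the test sets things up.

```python
from typing import Any, Dict, Optional


def should_warn_about_intent(
    intent_name: Optional[str], intent_properties: Optional[Dict[str, Any]]
) -> bool:
    """Return True only if a domain is available and the intent is missing from it."""
    if intent_properties is None:
        # Assumed case: no domain was passed, e.g. in a test without one.
        return False
    if not intent_name:
        # Nothing was parsed, so there is nothing to warn about.
        return False
    return intent_name not in intent_properties


# Usage: no warning without a domain, a warning for an unknown intent otherwise.
print(should_warn_about_intent("greeet", None))
print(should_warn_about_intent("greeet", {"greet": {}}))
```

Whether silently skipping the check is the right behaviour is a design question better settled on the PR.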

erohmensing on 2 Oct 2019

i see. could you please raise the issue @erohmensing ? I can do it as well.

chrisbangun on 2 Oct 2019
