Rasa: Improvements to FallbackClassifier

Created on 28 Jul 2020 · 15 Comments · Source: RasaHQ/rasa

Description of Problem:

When moving the NLU fallback to the NLU side, we now overwrite the intent with nlu_fallback and assign it a confidence of 1.0.
I'm not sure about overwriting the intent and assigning it a confidence of 1.0 in the first place, but the main issue for me right now is that the debug logs don't record what the originally predicted intent was.
We should be logging:

  • that a fallback happened
  • what intent was predicted originally and with what confidence (maybe even the full intent ranking? idk)

Bot loaded. Type a message and press enter (use '/stop' to exit):
Your input ->  chitchat
2020-07-28 15:29:34 DEBUG    rasa.core.tracker_store  - Creating a new tracker for id 'f0ffa4263bde49d5ba54eeb81d14640a'.
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Starting a new session for conversation ID 'f0ffa4263bde49d5ba54eeb81d14640a'.
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Action 'action_session_start' ended with events '[<rasa.core.events.SessionStarted object at 0x156491810>, <rasa.core.events.ActionExecuted object at 0x156700d10>]'.
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Current slot values: 
    concerts: None
    venues: None
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Received user message 'chitchat' with intent '{'name': 'nlu_fallback', 'confidence': 1.0}' and entities '[]'
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Logged UserUtterance - tracker now has 4 events.
2020-07-28 15:29:34 DEBUG    rasa.core.policies.memoization  - Current tracker state [None, None, None, {}, {'prev_action_listen': 1.0, 'intent_nlu_fallback': 1.0}]
2020-07-28 15:29:34 DEBUG    rasa.core.policies.memoization  - There is no memorised next action
2020-07-28 15:29:34 DEBUG    rasa.core.policies.rule_policy  - Current tracker state: [{}, {'prev_action_listen': 1.0, 'intent_nlu_fallback': 1.0}]
2020-07-28 15:29:34 DEBUG    rasa.core.policies.rule_policy  - There is a rule for next action 'utter_default'.
2020-07-28 15:29:34 DEBUG    rasa.core.policies.ensemble  - Predicted next action using policy_2_RulePolicy
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Predicted next action 'utter_default' with confidence 1.00.
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Action 'utter_default' ended with events '[BotUttered('default message', {"elements": null, "quick_replies": null, "buttons": null, "attachment": null, "image": null, "custom": null}, {"template_name": "utter_default"}, 1595942974.897126)]'.
2020-07-28 15:29:34 DEBUG    rasa.core.policies.memoization  - Current tracker state [None, None, {}, {'prev_action_listen': 1.0, 'intent_nlu_fallback': 1.0}, {'prev_utter_default': 1.0, 'intent_nlu_fallback': 1.0}]
2020-07-28 15:29:34 DEBUG    rasa.core.policies.memoization  - There is no memorised next action
2020-07-28 15:29:34 DEBUG    rasa.core.policies.rule_policy  - Current tracker state: [{}, {'prev_action_listen': 1.0, 'intent_nlu_fallback': 1.0}, {'prev_utter_default': 1.0, 'intent_nlu_fallback': 1.0}]
2020-07-28 15:29:34 DEBUG    rasa.core.policies.rule_policy  - There is a rule for next action 'action_listen'.
2020-07-28 15:29:34 DEBUG    rasa.core.policies.ensemble  - Predicted next action using policy_2_RulePolicy
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Predicted next action 'action_listen' with confidence 1.00.
2020-07-28 15:29:34 DEBUG    rasa.core.processor  - Action 'action_listen' ended with events '[]'.
2020-07-28 15:29:34 DEBUG    rasa.core.lock_store  - Deleted lock for conversation 'f0ffa4263bde49d5ba54eeb81d14640a'.
default message
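
To make the request concrete, here is a minimal sketch of the kind of logging asked for above. It is not Rasa's actual FallbackClassifier code: the function name `apply_fallback`, the `threshold` parameter, and the shape of the parse data dictionary (`intent`, `intent_ranking`) are assumptions made for illustration. Only the `nlu_fallback` intent name and the confidence of 1.0 come from the log.

```python
import copy
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)

# Intent name as it appears in the debug log above.
FALLBACK_INTENT = "nlu_fallback"


def apply_fallback(parse_data: Dict[str, Any], threshold: float) -> Dict[str, Any]:
    """Hypothetical helper: swap in the fallback intent and log what was replaced.

    `parse_data` is assumed to look like
    {"intent": {"name": ..., "confidence": ...}, "intent_ranking": [...]}.
    """
    intent = parse_data.get("intent", {})
    confidence = intent.get("confidence", 0.0)

    # Confident enough: leave the prediction untouched.
    if confidence >= threshold:
        return parse_data

    # Log that a fallback happened, plus the original intent and ranking.
    logger.debug(
        "NLU fallback triggered: original intent '%s' had confidence %.2f "
        "(threshold %.2f). Original ranking: %s",
        intent.get("name"),
        confidence,
        threshold,
        parse_data.get("intent_ranking", []),
    )

    # Work on a copy so the caller's data is not modified.
    result = copy.deepcopy(parse_data)
    fallback = {"name": FALLBACK_INTENT, "confidence": 1.0}
    result["intent"] = fallback
    # Keep the original ranking behind the fallback entry instead of dropping it.
    result["intent_ranking"] = [fallback] + result.get("intent_ranking", [])
    return result
```

Keeping the original ranking in the result, rather than only logging it, would also let later steps see which intent was replaced.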

All 15 comments

@wochinge for when you're back. cc @Ghostvv in case you want to mention anything else

oh also, this format won't work anymore for triggering a fallback: /[email protected]. We should consider what we want to do with that

another improvement could be to allow using the difference in confidence between the first and second intent rather than only a hard threshold
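
A rough sketch of that idea, assuming the ranking is a list of `{"name", "confidence"}` dictionaries. The function name `needs_fallback` and the parameters `threshold` and `min_margin` are hypothetical and not taken from Rasa.

```python
from typing import Any, Dict, List


def needs_fallback(
    ranking: List[Dict[str, Any]], threshold: float, min_margin: float
) -> bool:
    """Hypothetical check: fall back on low confidence or on a close second intent."""
    if not ranking:
        # Nothing was predicted at all.
        return True

    ordered = sorted(ranking, key=lambda i: i["confidence"], reverse=True)
    top = ordered[0]["confidence"]

    # Hard threshold: the best intent is simply not confident enough.
    if top < threshold:
        return True

    # Margin check: the two best intents are too close to tell apart.
    if len(ordered) > 1:
        return top - ordered[1]["confidence"] < min_margin

    return False
```

With `threshold=0.3` and `min_margin=0.1`, a ranking of 0.5 and 0.45 would trigger a fallback even though the top intent clears the hard threshold.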

@Ghostvv That's already handled by this issue: https://github.com/RasaHQ/rasa/issues/6244

> oh also, this format won't work anymore for triggering a fallback: /[email protected]. We should consider what we want to do with that

@akelad I didn't even know of this option, but I think we should keep it and add it to the todos from the PR description 👍

yeah i mean now you can just do /nlu_fallback, but prob makes sense for that way to work as well

The logging part is done as part of https://github.com/RasaHQ/rasa/pull/6355/files
/[email protected] could be a bit difficult as this bypasses the NLU classifier and we have nothing after it which now predicts an NLU fallback. Is it okay if we dump it for now? @akelad

hm, should we just leave the issue open for now then and address later potentially?

Sounds good 👍

From Sara's NLU tests, it looks like nlu_fallback is getting predicted as an intent (https://github.com/RasaHQ/rasa-demo/pull/566#issuecomment-707865855). @wochinge is this supposed to happen during evaluation when the confidence score falls below the NLU threshold?

Good question 🤔 I think we should fix this for the evaluation. What do you think @Ghostvv?

not sure, I think we should say in evaluation that nlu_fallback was predicted, however I'd put original intent there as well

So maybe showing the original intent and putting the fallback intent in brackets behind it? (Thinking of the confusion matrix right now)

in the confusion matrix, I think it should be nlu_fallback, because this is the real intent for core

i disagree vova: you want to be debugging your NLU model, and for that it's important to see which intent it's confused with, so you can fix those issues in your training data
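
The two positions in this exchange can be compared side by side. The sketch below counts (true intent, predicted intent) pairs either with `nlu_fallback` kept as the prediction or with the fallback resolved back to the best-ranked underlying intent. It assumes the original predictions are still present in `intent_ranking` after the fallback, which this thread does not confirm; `underlying_intent` and `confusion_counts` are hypothetical names.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple

FALLBACK_INTENT = "nlu_fallback"


def underlying_intent(parse_data: Dict[str, Any]) -> str:
    """Return the best-ranked intent other than the fallback intent."""
    predicted = parse_data["intent"]["name"]
    if predicted != FALLBACK_INTENT:
        return predicted

    # Assumption: the ranking still lists the classifier's original predictions.
    candidates = [
        i for i in parse_data.get("intent_ranking", []) if i["name"] != FALLBACK_INTENT
    ]
    if not candidates:
        return FALLBACK_INTENT
    return max(candidates, key=lambda i: i["confidence"])["name"]


def confusion_counts(
    examples: Iterable[Tuple[str, Dict[str, Any]]], resolve_fallback: bool
) -> Counter:
    """Count (true intent, predicted intent) pairs for a confusion matrix."""
    counts: Counter = Counter()
    for true_intent, parse_data in examples:
        if resolve_fallback:
            predicted = underlying_intent(parse_data)
        else:
            predicted = parse_data["intent"]["name"]
        counts[(true_intent, predicted)] += 1
    return counts
```

Calling `confusion_counts(examples, resolve_fallback=False)` gives the view Core sees, while `resolve_fallback=True` shows which intents the NLU model confuses, which is the view needed for fixing training data.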

I changed the priority to high as it's related to testing / evaluation efforts which are most likely part of our OKRs for next cycle.
