Rasa Core version: 0.13.0a1
Python version: 3.6.6
Operating system (windows, osx, ...): osx
Issue:
Right now I'm using FormAction for my stories, and I'm having trouble writing e2e_stories.md for e2e evaluation.
Is there an appropriate way to write e2e_stories.md when using FormAction?
Hey, so the thing is that if you explicitly write out all the steps within a form, they have to be written with a `form:` prefix (see https://rasa.com/docs/core/interactive_learning/#the-form-prefix) so that Rasa Core doesn't get confused. This also means those lines get ignored during training, though. So I don't think e2e evaluation with forms is working right now.
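For reference, here's roughly what a happy path looks like with the in-form steps written out explicitly (a sketch loosely based on the formbot example, so the slots and entities may differ; the `form:`-prefixed lines are exactly the ones that get ignored during training):

```
## happy path with explicit form steps
* request_restaurant
    - restaurant_form
    - form{"name": "restaurant_form"}
* form: inform{"cuisine": "italian"}
    - form: restaurant_form
* form: inform{"number": "2"}
    - form: restaurant_form
    - form{"name": null}
    - utter_slots_values
```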
@Ghostvv do you have any additional thoughts on this?
I tried using core evaluation instead, but it turned out that core evaluation didn't work either.
I used the "formbot" in examples for testing, simply using all training stories as test stories. Out of 10 stories, only 1 passed (the happy path). A typical failed story is:
```
## chitchat stop but continue path
* request_restaurant
    - restaurant_form
    - form{"name": "restaurant_form"}
* form: chitchat
    - form: utter_chitchat <!-- predicted: restaurant_form -->
    - form: restaurant_form
* form: stop
    - form: utter_ask_continue <!-- predicted: restaurant_form -->
    - form: action_listen <!-- predicted: restaurant_form -->
* form: affirm
    - form: restaurant_form
    - form{"name": null}
    - utter_slots_values
* thankyou
    - utter_noworries
```
Ok, so that story looks slightly wrong: the chitchat intent shouldn't have a `form:` prefix. Everything that isn't the happy path shouldn't have a form prefix. I think it might be easiest to generate these stories via interactive learning, because this can be a bit confusing.
Let me know if that works
The story I pasted above was generated by the core evaluation (in the file failed_stories.md); it's not the one I used for testing. The test stories are exactly the same as the training stories, in this case:
```
## chitchat stop but continue path
* request_restaurant
    - restaurant_form
    - form{"name": "restaurant_form"}
* chitchat
    - utter_chitchat
    - restaurant_form
* stop
    - utter_ask_continue
* affirm
    - restaurant_form
    - form{"name": null}
    - utter_slots_values
* thankyou
    - utter_noworries
```
The form prefix is added during core evaluation. I tracked it down to `StoryStep.as_story_string` in `rasa_core.training.structures` (line 169):
```python
if isinstance(s, UserUttered):
    if self.story_string_helper.active_form is None:
        result += self._user_string(s, e2e)
    else:
        # form is active
        # it is not known whether the form will be
        # successfully executed, so store this
        # story string for later
        self._store_user_strings(s, e2e, FORM_PREFIX)
```
When the tracker exports stories, as long as a form is active, the user utterance gets the FORM_PREFIX prepended. However, sometimes the user utterance is not part of the form. In this example, chitchat is not part of restaurant_form and therefore should not get the FORM_PREFIX.
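Conceptually, the exporter would need to know whether the active form actually handled the message before deciding on the prefix. Something like this hypothetical helper (names are mine, not rasa_core's; `handled_by_form` is exactly the information the exporter doesn't have during evaluation):

```python
FORM_PREFIX = "form: "

def prefixed_user_line(user_line, form_is_active, handled_by_form):
    # Hypothetical: only prefix user messages that the active form
    # actually consumed; deviations like chitchat stay unprefixed.
    if form_is_active and handled_by_form:
        return FORM_PREFIX + user_line
    return user_line
```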
I tried interactive learning, and it generated correct stories. I haven't figured out why the evaluation renders the stories differently.
Maybe it's because in interactive learning the endpoints are connected, so the bot knows which intents the form should ignore and which it shouldn't (as specified in actions.py).
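For context, the formbot's actions.py defines the form roughly like this (abridged and simplified from the example, so details may differ; the `not_intent` mappings are what tell the form which intents to ignore):

```python
from rasa_core_sdk.forms import FormAction

class RestaurantForm(FormAction):
    """Simplified sketch of the formbot example's form."""

    def name(self):
        # identifier used as `restaurant_form` in the stories above
        return "restaurant_form"

    @staticmethod
    def required_slots(tracker):
        # slots the form asks for, in order (abridged)
        return ["cuisine", "num_people"]

    def slot_mappings(self):
        # not_intent="chitchat" means a chitchat message is never
        # consumed as a slot value, so the form doesn't handle it
        return {"cuisine": self.from_entity(entity="cuisine",
                                            not_intent="chitchat"),
                "num_people": self.from_entity(entity="number",
                                               not_intent="chitchat")}

    def submit(self, dispatcher, tracker, domain):
        # called once all required slots are filled
        dispatcher.utter_template("utter_slots_values", tracker)
        return []
```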
The guy I tagged in the comment above (@Ghostvv) probably has some ideas, but he's currently away, so he may not get back to you until next week.
@yobekiko thank you for spotting it. Yes, during interactive training the FormAction actually runs and can fail to validate, so core knows when to add the `form:` prefix and when not to. Since the FormAction doesn't run during e2e evaluation, core doesn't know when it fails.
The solution is to ignore FormPolicy predictions during evaluation.
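Roughly: when FormPolicy is the policy behind a prediction made while a form is active, treat the story's expected action as the prediction instead of counting a mismatch, since the real FormAction never runs during evaluation and so can never signal that it was rejected. A sketch with hypothetical names, not the actual rasa_core evaluation code:

```python
def action_for_comparison(predicted_action, predicting_policy,
                          gold_action, form_is_active):
    # Hypothetical: FormPolicy blindly re-predicts the active form
    # during evaluation, so don't hold that against the story.
    if form_is_active and predicting_policy == "FormPolicy":
        return gold_action
    return predicted_action
```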
@Ghostvv can you please elaborate on how to do that? I'm having the same problem: I can't use e2e evaluation for our bot, since it consists almost entirely of forms (with unhappy paths steering the conversation back to re-ask the form questions).
@tahamr83 I just took a look, and there is a bug there.