Rasa: Any intent without an extracted entity fails to be classified correctly after upgrading to 1.8.1

Created on 19 Mar 2020 · 22 Comments · Source: RasaHQ/rasa

Rasa version: 1.8.1

Rasa SDK version (if used & relevant): 1.8.1

Python version: 3.7

Operating system (windows, osx, ...): Ubuntu

Issue:
I am trying to upgrade Rasa from 1.6 to 1.8 while keeping the configuration as similar as possible. I'm trying to use the EmbeddingIntentClassifier.

All intents that do not have any entity extracted have extremely low confidence scores and most fail to be classified correctly. It even happens when the text is exactly the same as some of the training examples. In many cases, these intents don't even appear in the intent_ranking.

With the previous version, I would get accuracies for these intents of 80% or more.

I do not know if it is a bug with the Spacy featurization or if I need to do something else in the configuration.

Test metrics are all 0.0:

"interested": {
    "precision": 0.0,
    "recall": 0.0,
    "f1-score": 0.0,
    "support": 18,
    "confused_with": {
      "faq_apartment_available": 6,
      "ask_viewing": 3,
      "answer_phone_number": 3
    }
  }
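As context for the zeroed metrics above: the f1-score is the harmonic mean of precision and recall, so an intent with zero precision and zero recall necessarily reports an f1 of 0.0. A minimal sketch (the helper name is mine, not part of `rasa test`):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# The "interested" intent above: precision 0.0, recall 0.0 -> f1 0.0
print(f1_score(0.0, 0.0))
```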

Sample intent error

{
    "text": "I'm super interested",
    "intent": "interested",
    "intent_prediction": {
      "name": "answer_schedule",
      "confidence": 0.13512466847896576
    }
  }

Error (including full traceback):

rasa shell nlu
NLU model loaded. Type a message and press enter to parse it.
Next message:
interested
{
  "intent": {
    "name": "faq_apartment_location",
    "confidence": 0.2963590621948242
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "faq_apartment_location",
      "confidence": 0.2963590621948242
    },
    {
      "name": "faq_apartment_available",
      "confidence": 0.19111353158950806
    },
    {
      "name": "ask_viewing",
      "confidence": 0.09869010746479034
    },
    {
      "name": "faq_apartment_move_in_date",
      "confidence": 0.0869259387254715
    },
    {
      "name": "faq_generic_identity",
      "confidence": 0.06372860819101334
    },
    {
      "name": "no_preferences",
      "confidence": 0.0610114187002182
    },
    {
      "name": "ask_search_apartments",
      "confidence": 0.05694292113184929
    },
    {
      "name": "answer_phone_number",
      "confidence": 0.05568307265639305
    },
    {
      "name": "interested",
      "confidence": 0.04589391499757767
    },
    {
      "name": "ask_talk_to",
      "confidence": 0.04365147277712822
    }
  ],
  "text": "interested"
}

Content of configuration file (config.yml) (if relevant):
I tried different ones. Below is the most basic one that still fails, for demo purposes.

language: "en"
pipeline:
  - name: "DucklingHTTPExtractor"
    url: "http://*************"
    dimensions: ["time", "duration", "amount-of-money", "number", "email", "phone-number", "ordinal", "url"]
    timezone: "America/New_York"
  - name: "SpacyNLP"
    case_sensitive: true
  - name: "SpacyTokenizer"
  - name: "SpacyEntityExtractor"
    dimensions: ["PERSON", "MONEY"]
  - name: "RegexFeaturizer"
  - name: "SpacyFeaturizer"
  - name: "EntitySynonymMapper"
  - name: "EmbeddingIntentClassifier"

Content of domain file (domain.yml) (if relevant):
All of the intents are in there

intents:
- affirm
- deny
- reset:
    triggers: action_reset_full
...
area

Most helpful comment

I think I found what we changed that made it worse. I'll prepare a PR soon to fix it; we'll make it part of the 1.9 release. Meanwhile, I would recommend trying:

  - name: "DIETClassifier"
    epochs: 100
    entity_recognition: False

instead of EmbeddingIntentClassifier. It is a sequential model, so it takes longer to train.

All 22 comments

It's probably not this ... but I want to check just in case. What version of spaCy are you running? Did you upgrade that as well when you upgraded Rasa?

What is the training accuracy, and did you use the same config in the 1.6 version?

Thanks both for your quick answers!

@koaning I have spacy==2.1.9 which I did not upgrade when upgrading Rasa. At least the Spacy entity extraction works fine so I doubt it affects the classification.

@Ghostvv not exactly, we had a couple of custom components as well as the CRFEntityExtractor. I just reran it with Rasa 1.6 and the exact same config as above. Here is the comparison:

Rasa 1.8.1
_300 epochs, 44s training time_
t_loss=3.026
i_loss=2.319
i_acc=0.652

Rasa 1.6.1
_300 epochs, 1m10s training time_
train loss=1.550
train accuracy=0.990

It didn't train. Could you please add the option weight_sparsity: 0 to EmbeddingIntentClassifier, or update to 1.8.2?
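For reference, the suggested workaround on 1.8.1 would presumably look like this in config.yml (parameter name taken from the comment above; I have not verified it against the 1.8.1 defaults):

```yaml
  - name: "EmbeddingIntentClassifier"
    weight_sparsity: 0
```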

I will try this, thanks.

Just as an FYI, I tried with 3000 epochs (12 min) with Rasa 1.8.1
t_loss=2.231
i_loss=1.209
i_acc=0.843

@Ghostvv I upgraded to 1.8.2, it is much better! Still not as good as before though. After 300 epochs, I get to:
t_loss=1.877
i_loss=0.869
i_acc=0.910

With 1.6.1, I got 99% accuracy after less than 100 epochs.

99% train accuracy is not necessarily a good thing. What about test accuracies in 1.8.2?

@Ghostvv I'm not sure where I can see the test accuracy after running rasa train nlu, I wrote down all the numbers I got in my previous answers. I ran rasa test but it tests on all the intents.

What I noticed is that most intents that have very few training samples are completely ignored now. For example, something like this:

## intent:no_more_questions
- I don't have any questions
- that's all thanks
- no more questions
- No, that's it
- That鈥檚 it, thank you.
- I鈥檒l come back to you if I have any questions.
- That鈥檚 all for now.

I know I could add way more examples but it used to work perfectly. Now, none of the samples are correctly classified, even after training for 3000 epochs.

EDIT: this is for Rasa 1.8.2

do they get misclassified in 1.8.1 or 1.8.2?

do they get misclassified in 1.8.1 or 1.8.2?

In both. My previous comment was for 1.8.2. In 1.8.1, it also happened for intents that had a lot of samples.

but how do the overall numbers from rasa test, compare between 1.6 and 1.8.2?

So, running rasa test nlu on all the intents (67 intents, 2278 samples) gives:

  • Rasa 1.8.2: 90.5% (216 errors) with 8 intents with a precision of 0.0
  • Rasa 1.6.1: 98.5% (34 errors) without any intent with a precision of 0.0
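The reported percentages line up with the error counts; a quick sanity check:

```python
# Sanity-check the reported accuracies from the error counts above.
total = 2278  # test samples across 67 intents

for version, errors in {"1.8.2": 216, "1.6.1": 34}.items():
    accuracy = (total - errors) / total
    print(f"Rasa {version}: {accuracy:.1%}")  # 90.5% and 98.5%
```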

it's hard to say what's going on. Is it possible for you to share your training data?

I can't share it publicly but I'd be happy to send it to you privately! Could you give me an email so that I can send it to you?

Could you please send it to support@rasa.com with the link to this issue. Thank you

@Ghostvv Just sent it to the address. Thanks a lot for your help, very much appreciated :+1:

got it. Do you test on the same train set?

That's right, not the best practice ever but well ^^

Looking at the data, I'm quite surprised that the old model worked, these intents are confusing for me as well

Any specific intents? In my tests, I see the precision being 0.0 for rasa 1.8.2 for these intents:

  • dont_know
  • show_me
  • no_more_questions
  • has_question_about_apartment
  • faq_apartment_description
  • faq_apartment_unit_type (this one is fine to fail, see below)
  • i_love_you (this one should be really easy to classify)
  • deny_answer_ordinal

Just as an FYI, we normally have other pipeline components that add some entities which help classification (mostly for the ones that have unit_type). But for this comparison, I completely removed them.

Is there any way we can use the old model exactly as it was?

I think I found what we changed that made it worse. I'll prepare a PR soon to fix it; we'll make it part of the 1.9 release. Meanwhile, I would recommend trying:

  - name: "DIETClassifier"
    epochs: 100
    entity_recognition: False

instead of EmbeddingIntentClassifier. It is a sequential model, so it takes longer to train.
