Currently, a warning saying that NLU can't find the OOV token is shown whenever you train a model with an OOV token configured:
2019-11-04 11:04:43 WARNING rasa.nlu.featurizers.count_vectors_featurizer - OOV_token='oov' was given, but it is not present in the training data. All unseen words will be ignored during prediction.
2019-11-04 11:04:43 WARNING rasa.nlu.featurizers.count_vectors_featurizer - OOV_token='oov' was given, but it is not present in the training data. All unseen words will be ignored during prediction.
I suspect these warnings come from intent featurization and response selector featurization. The warning shouldn't be shown during the intent stage at all, and for the response selector only when it's not empty. We should maybe also specify which part the warning is thrown for.
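To make the proposal concrete, here is a minimal sketch of a check that names the part being featurized and stays quiet when there is nothing to featurize. This is not Rasa's actual implementation: the function `warn_if_oov_missing` and its `attribute` parameter are hypothetical names used only for illustration.

```python
import warnings


def warn_if_oov_missing(vocabulary, oov_token, attribute):
    """Warn if `oov_token` is configured but absent from `vocabulary`.

    Hypothetical helper, not part of Rasa. `attribute` names the part
    being featurized (e.g. "text" or "response") so that the warning
    says where it comes from. Returns True if a warning was emitted.
    """
    # No OOV token configured, or nothing to featurize: stay quiet.
    if oov_token is None or not vocabulary:
        return False
    if oov_token in vocabulary:
        return False
    warnings.warn(
        f"OOV_token='{oov_token}' was given, but it is not present in the "
        f"training data for attribute '{attribute}'. All unseen words will "
        f"be ignored during prediction."
    )
    return True


# Intent examples that contain the OOV token: no warning.
warn_if_oov_missing({"hello", "oov"}, "oov", "text")
# Empty response selector data: no warning either.
warn_if_oov_missing(set(), "oov", "response")
```

With a check like this, the caller would simply skip it for intent labels, where an OOV token is never expected.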
@akelad I'm getting the same error. Is it fixed?
No, but it's just a warning, so you can ignore it for now.
@akelad Why should this warning not be shown for the intent stage? It's important for the featurization, isn't it?
I should have phrased that better: I meant during intent label featurization (so e.g. mood_great has two tokens if the split flag is set to true). Of course it should be thrown during intent example featurization :D Does that make sense?
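A small sketch of why the label stage can never satisfy the check: splitting intent labels yields a vocabulary made only of label fragments, so the OOV token is always missing from it. The helper `tokenize_intent_label` and its `split_symbol` and `split` parameters are hypothetical stand-ins for the split flag mentioned above, not Rasa's real API.

```python
def tokenize_intent_label(label, split_symbol="_", split=True):
    """Split an intent label into tokens (hypothetical helper)."""
    return label.split(split_symbol) if split else [label]


# Example intent labels; only mood_great comes from the discussion above.
labels = ["mood_great", "mood_unhappy", "greet"]

# Vocabulary built from the labels alone, as label featurization would see it.
label_vocab = {
    token for label in labels for token in tokenize_intent_label(label)
}

print(tokenize_intent_label("mood_great"))  # ['mood', 'great']
print("oov" in label_vocab)                 # False
```

Since no intent label contains the OOV token, a presence check run against this vocabulary fires on every training run, whatever the intent examples contain.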
Ok, so the problem is that it's currently thrown twice? What was your pipeline like?
It's thrown for things it shouldn't be thrown for: intent label featurization and the response selector. My pipeline was the standard supervised embeddings + response selector, I think.
Sorry, I'm a bit slow here 🙈 How do you know what it's thrown for? The warning only tells you that the CountVectorizer component is throwing it, doesn't it?
I mean, I don't know with 100% certainty because I didn't check. But I know it can't be from the intent example featurization, because I have an OOV token in there and it works as expected. I also talked about this with someone, and they mentioned that some changes to the featurizer are causing this. It's been happening since version 1.3 or so.
Ok cool, thanks, getting it now 🎉
Actually, I think we can close this; it shows a different warning now.