Rasa: TrainingData.merge() doesn't check for duplicate examples per intent

Created on 4 Oct 2018 · 10 Comments · Source: RasaHQ/rasa

e.g. by merging

## intent:greet
- hi

with

## intent:greet
- hi

we get

## intent:greet
- hi
- hi

All 10 comments

Just when merging? Do we even check for duplicates within an intent?

You're right, there's nothing implemented. The only data processing done for the intents upon building TrainingData() is in sanitize_examples(), which removes whitespace.
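To make the missing check concrete, here is a minimal sketch of deduplicating examples while merging. It does not use the Rasa API: `Example` and `merge_examples` are hypothetical names, and an example is modelled as a plain `(intent, text)` tuple rather than a real training example object. Treating two examples as duplicates when intent and whitespace-stripped text match is an assumption, not something the thread specifies.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-in for a training example: (intent, text).
Example = Tuple[str, str]


def merge_examples(*datasets: Iterable[Example]) -> List[Example]:
    """Merge datasets, keeping only the first occurrence of each (intent, text) pair."""
    seen = set()
    merged: List[Example] = []
    for dataset in datasets:
        for intent, text in dataset:
            # Strip surrounding whitespace so "hi" and "hi " count as the same example.
            key = (intent, text.strip())
            if key in seen:
                continue
            seen.add(key)
            merged.append(key)
    return merged


first = [("greet", "hi")]
second = [("greet", "hi"), ("greet", "hello")]
merged = merge_examples(first, second)
print(merged)  # [('greet', 'hi'), ('greet', 'hello')]
```

Order is preserved on purpose, so dumping the merged data back to a file would not shuffle existing examples.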

@MetcalfeTom is this still relevant?

It is still relevant for features like interactive learning, where we dump new NLU data. I think it functions well as a community issue.

I want to work on this.

Awesome! Let us know if we can help with anything.

@akelad what would the unit test using @MetcalfeTom's example look like? Should I create a markdown file containing the given examples, or can each intent be represented as an instance of TrainingData?

@hsm207 you can take a look at the tests written here: https://github.com/RasaHQ/rasa/blob/master/tests/nlu/base/test_training_data.py
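For illustration only, a test along these lines could build the two inputs in memory instead of reading markdown files from disk. The sketch below is self-contained and does not touch the Rasa code base: `parse_intents` and `merge_without_duplicates` are hypothetical helpers standing in for the real loading and merging code, and the tests in the linked file may be structured quite differently.

```python
from typing import Dict, List


def parse_intents(markdown: str) -> Dict[str, List[str]]:
    """Tiny hypothetical parser for the '## intent:<name>' / '- <text>' format."""
    intents: Dict[str, List[str]] = {}
    current = None
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("## intent:"):
            current = line[len("## intent:"):].strip()
            intents.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            intents[current].append(line[2:].strip())
    return intents


def merge_without_duplicates(*parsed: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Hypothetical merge that skips examples already present for an intent."""
    merged: Dict[str, List[str]] = {}
    for intents in parsed:
        for intent, texts in intents.items():
            bucket = merged.setdefault(intent, [])
            for text in texts:
                if text not in bucket:
                    bucket.append(text)
    return merged


def test_merge_drops_duplicate_examples():
    first = parse_intents("## intent:greet\n- hi\n")
    second = parse_intents("## intent:greet\n- hi\n")
    assert merge_without_duplicates(first, second) == {"greet": ["hi"]}


def test_merge_keeps_distinct_examples():
    first = parse_intents("## intent:greet\n- hi\n")
    second = parse_intents("## intent:greet\n- hello\n## intent:bye\n- hi\n")
    assert merge_without_duplicates(first, second) == {
        "greet": ["hi", "hello"],
        "bye": ["hi"],
    }
```

The second test guards against over-eager deduplication: the same text under a different intent should survive the merge.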

@akelad I've made PR #4414 for your review.

@akelad This issue can be closed now.
