Description of Problem:
The user cannot set a seed when using the CLI to split NLU data into training and test sets (the _rasa data split nlu_ command). Supporting a seed would make the split reproducible when required.
Overview of the Solution:
Set a random seed (e.g. with Python's random.seed()) in split_nlu_data.py or train_test_split.py
Definition of Done:
Thanks for submitting this feature request 🚀 @Ghostvv will get back to you about it soon! ✨
Great idea 👍 We could add an argument like --random-seed to rasa data split nlu to set the seed. It needs to be added here: https://github.com/RasaHQ/rasa/blob/master/rasa/cli/data.py. The seed should then be forwarded to https://github.com/RasaHQ/rasa/blob/master/rasa/nlu/training_data/training_data.py#L400, where it would actually be set.
@neelkes Do you want to work on this feature yourself and submit a PR?
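For anyone picking this up, here is a rough, self-contained sketch of the idea: a CLI argument carries the seed, and the split function uses it to seed its shuffling. This is not Rasa's actual code. `split_examples`, `build_parser`, and the `--training-fraction` option are hypothetical names used only for illustration, and `--random-seed` is the argument name proposed above.

```python
import argparse
import random
from typing import List, Optional, Sequence, Tuple


def split_examples(
    examples: Sequence[str],
    train_frac: float = 0.8,
    random_seed: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    """Shuffle and split examples (hypothetical helper, not Rasa's API).

    Passing the same seed yields the same split; None keeps it random.
    """
    # A local Random instance avoids changing the global random state.
    rng = random.Random(random_seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser for the split command (hypothetical)."""
    parser = argparse.ArgumentParser(description="Split NLU data (sketch)")
    parser.add_argument("--training-fraction", type=float, default=0.8)
    parser.add_argument(
        "--random-seed",
        type=int,
        default=None,
        help="Seed for a reproducible split (default: no fixed seed).",
    )
    return parser


# Parse an explicit argument list here instead of sys.argv for the demo.
args = build_parser().parse_args(["--random-seed", "42"])
data = ["hello", "bye", "thanks", "yes", "no"]

# Forward the seed from the CLI layer to the function that does the split.
train, test = split_examples(data, args.training_fraction, args.random_seed)
print(len(train), len(test))
```

Using a local `random.Random` instance rather than seeding the global generator is one possible design choice; it keeps the seed from affecting other code that relies on the `random` module.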
I found this enhancement pretty useful. Can I be assigned to this issue? (Supposing that @neelkes is not working with this feature anymore)
@joaorobson Sure, feel free to work on it. Thanks! Let me know if you need help/if you have questions.