Hi Team
Thanks for the wonderful HuggingFace library!
I am now working with T5 on my own dataset. Is there a helper script that can automatically take raw text, mask a random set of tokens, and also generate the expected output sequence for the unsupervised language modeling pre-training task?
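For reference, T5's unsupervised objective is span corruption: random contiguous spans are replaced with sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...) in the input, and the target lists each sentinel followed by the tokens it replaced. There is no official helper in the library for this yet, but here is a minimal sketch of the idea working on pre-split tokens; the function name, masking rate, and span-length heuristic are my own choices, not anything from the library:

```python
import random

def span_corrupt(tokens, mask_prob=0.15, mean_span_len=3, seed=None):
    """Mask random contiguous spans of `tokens` with T5-style sentinels.

    Returns (input_tokens, target_tokens): each masked span in the input
    is replaced by one sentinel (<extra_id_0>, <extra_id_1>, ...), and the
    target lists each sentinel followed by the tokens it hides.
    """
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * mask_prob))
    masked = [False] * len(tokens)
    n_masked = 0
    while n_masked < n_to_mask:
        # Draw a span length (roughly geometric), capped at what remains.
        span_len = min(max(1, int(rng.expovariate(1 / mean_span_len))),
                       n_to_mask - n_masked)
        start = rng.randrange(0, len(tokens) - span_len + 1)
        if any(masked[start:start + span_len]):
            continue  # avoid overlapping an already-masked span
        for i in range(start, start + span_len):
            masked[i] = True
        n_masked += span_len

    inputs, targets, sentinel = [], [], 0
    i = 0
    while i < len(tokens):
        if masked[i]:
            tok = f"<extra_id_{sentinel}>"
            inputs.append(tok)
            targets.append(tok)
            while i < len(tokens) and masked[i]:
                targets.append(tokens[i])
                i += 1
            sentinel += 1
        else:
            inputs.append(tokens[i])
            i += 1
    targets.append(f"<extra_id_{sentinel}>")  # final sentinel, as in the T5 paper
    return inputs, targets
```

In practice you would run this over token IDs from `T5Tokenizer` (which already reserves the `<extra_id_*>` sentinels) rather than over whitespace-split words, but the span logic is the same.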
Not yet, sadly - it's on my to-do list. I hope to be able to work on it soon.
I am working on a script for T5 based on the current run_language_modeling.py. Maybe I can share it once I am done, and someone can confirm whether it works as expected?
Hi, I'm working on the same task. Here you can see my code, if it helps!