It looks like there used to be a script run_lm_finetuning.py that has been replaced by run_language_modeling.py. It's unclear to me how to use this script to run finetuning. I want to finetune GPT-2 on a variety of downstream tasks, and would love some help!
You can use run_language_modeling.py for this purpose. Just set --model_type to gpt2, set --model_name_or_path to the gpt2 model checkpoint you want (gpt2) and set --train_data_file to your dataset and you should be ready to go.
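For reference, a minimal invocation might look like the following sketch. The paths are placeholders, and the flags shown (--model_type, --model_name_or_path, --train_data_file, --do_train, --output_dir) follow the run_language_modeling.py example script at the time of this thread; check the script's --help output for your installed version.

```shell
python run_language_modeling.py \
    --model_type gpt2 \
    --model_name_or_path gpt2 \
    --train_data_file path/to/train.txt \
    --do_train \
    --output_dir output/
```

The finetuned checkpoint and config end up in the directory given by --output_dir.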
Thanks! The issue is that I want to use the pre-trained version of GPT-2. I remember run_lm_finetuning had a few lines of code where it would download and load that pre-trained model.
You can specify gpt2 in --model_name_or_path. That corresponds to one of the pre-trained checkpoints that it'll download and use. The other pre-trained models you can specify there are gpt2-medium, gpt2-large, gpt2-xl and distilgpt2.
If specifying gpt2 there downloads the checkpoints then how do you train from scratch? I've been specifying that parameter and it seems like it is training from scratch (starting perplexity ~1000).
I believe that if you want to train from scratch, you'll have to point --model_name_or_path to a folder containing a config file (with the parameters of the model) and no pytorch_model.bin checkpoint file in that folder.
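To illustrate the difference, instantiating a model directly from a config (rather than from_pretrained) gives randomly initialized weights, which is what training from scratch amounts to. This is a minimal sketch; the dimensions below are deliberately tiny and illustrative, not GPT-2's real sizes (gpt2 uses n_embd=768, n_layer=12, n_head=12).

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny illustrative config; no checkpoint file is involved
config = GPT2Config(vocab_size=1000, n_embd=64, n_layer=2, n_head=2)

# Constructing the model from the config alone gives random weights,
# i.e. a from-scratch starting point (hence the ~1000 starting perplexity)
model = GPT2LMHeadModel(config)
print(model.config.n_layer)
```

Saving such a config to a folder (config.save_pretrained) and pointing --model_name_or_path at that folder, with no pytorch_model.bin present, should have the same effect in the script.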
@Genius1237 Hi there,
I am finetuning the 124M model on my dataset (almost 2 MB) using the Colab notebook by Max Woolf. I am wondering whether there is a way to generate text based on my trained model plus internet context (not only metadata). I want to generate some notes regarding current issues (such as COVID-19) based on my trained model.
Could you help me with that, please? Thanks!