Transformers: Finetune GPT2

Created on 5 Aug 2019 · 14 comments · Source: huggingface/transformers

Hi,
According to pytorch-transformers/docs/source/index.rst, there used to be a run_gpt2.py example which also showed how to finetune GPT2 on training data.
I was wondering if you could add this example back and provide a sample script to finetune GPT2.
Thanks.
Best regards,
Rabeeh


All 14 comments

Hi Rabeeh,

We are currently working on an updated example on fine-tuning generative models, especially GPT-2. The example should be up later this week, so keep an eye out!

Any update on when this example will be available? Thanks!

Hope this issue won't be closed until the example is done.

The script is being worked on over at https://github.com/huggingface/pytorch-transformers/pull/987 (see relevant file here). It works for GPT/GPT-2 but it isn't ready for BERT/RoBERTa so we're not releasing it yet.

It shows how to fine-tune GPT-2 using causal language modeling on WikiText-2.
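For anyone unsure what "causal language modeling" means here in practice: the model is trained to predict the next token, and with this library you pass the input ids as the labels and the shift happens inside the model. A minimal sketch, assuming the small "gpt2" checkpoint and a made-up sentence (not what the script itself literally does):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Illustrative input; any text works.
input_ids = tokenizer.encode("WikiText-2 is a language modeling benchmark.", return_tensors="pt")

# Causal LM objective: labels are the inputs themselves; the model shifts them
# internally so that each position is trained to predict the next token.
outputs = model(input_ids, labels=input_ids)
loss = outputs[0]  # first output is the LM loss when labels are provided
print(loss.item())
```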

Any update on when this example will be available? Thanks!
The "see relevant file here" link is a 404.

Oh yes, the script is out.

It was renamed run_lm_finetuning.py; you can find it in the examples folder: https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_lm_finetuning.py

You can use it to fine-tune GPT, GPT-2, BERT or RoBERTa on your dataset.

Here is an example of how to run it: https://huggingface.co/pytorch-transformers/examples.html#causal-lm-fine-tuning-on-gpt-gpt-2-masked-lm-fine-tuning-on-bert-roberta
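In case it helps, here is a rough, hand-rolled sketch of what the script does with a training file: tokenize it, cut it into fixed-length blocks, and minimize the causal LM loss over those blocks. The file name, block size, and learning rate below are made up for illustration; the real script adds batching, evaluation, checkpointing, and so on:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.train()

# Hypothetical training file; the script's --train_data_file plays this role.
text = open("train.txt", encoding="utf-8").read()
ids = tokenizer.encode(text)

block_size = 128  # illustrative block length
blocks = [ids[i:i + block_size] for i in range(0, len(ids) - block_size + 1, block_size)]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

for block in blocks:
    input_ids = torch.tensor([block])
    loss = model(input_ids, labels=input_ids)[0]  # labels = inputs, as with any causal LM
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```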

Silly question, but how do you know which GPT-2 model is being trained? Does it default to the largest one available? I couldn't find any indication of which size model is being used in the fine-tuning script.

Hi Henry,
It defaults to the small one.
You can select the size with the model_name_or_path argument: just pass the relevant shortcut name for the model, as listed here.
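To make the size question concrete: each GPT-2 shortcut name maps to a different model size, and the same names work when you load a model directly. Parameter counts below are approximate and listed from memory, so double-check them against the pretrained-models page:

```python
from transformers import GPT2LMHeadModel

# Shortcut names and (approximate) sizes:
#   "gpt2"        ~124M parameters  (the default)
#   "gpt2-medium" ~355M parameters
#   "gpt2-large"  ~774M parameters
#   "gpt2-xl"     ~1.5B parameters
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
print(model.config.n_layer, model.config.n_embd)  # 24 layers, 1024 hidden size for gpt2-medium
```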


Ah got it, thanks!

run_lm_finetuning.py is no longer available in the examples folder when you clone the transformers repo. Is there a reason for this? It was available a couple of months ago.

It’s named run_language_modeling.py now

Great, thanks!

This may also sound silly, but will run_lm_finetuning.py be able to fine-tune the microsoft/DialoGPT model on a custom dataset? Thank you

Yes, but it's named run_language_modeling.py now.
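For what it's worth, the DialoGPT checkpoints use the GPT-2 architecture, so they load through the same classes and work with the same causal LM fine-tuning flow. The checkpoint name below is one of the published DialoGPT models on the model hub, used purely as an illustration:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# DialoGPT is GPT-2 under the hood, so the GPT-2 classes load it directly.
tokenizer = GPT2Tokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = GPT2LMHeadModel.from_pretrained("microsoft/DialoGPT-medium")
print(model.config.model_type)  # "gpt2"
```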
