Hi,
Is there any option to fine-tune and use OpenAI GPT-2 for a language other than English?
Hello, if you want to try fine-tuning GPT-2 on another language, you can simply give the run_lm_finetuning script the text in the target language on which you want to fine-tune your model.
However, please be aware that depending on the language and its distance from English (the language on which GPT-2 was pre-trained), you may find it hard to obtain good results.
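For example, a minimal invocation might look like the sketch below. The paths are placeholders and the exact flag names depend on the version of the examples script you have, so check `python run_lm_finetuning.py --help` against your install:

```bash
# Fine-tune GPT-2 on a plain-text corpus in the target language.
# Paths are placeholders; flag names may differ across transformers versions.
python run_lm_finetuning.py \
    --model_type gpt2 \
    --model_name_or_path gpt2 \
    --do_train \
    --train_data_file /path/to/corpus.txt \
    --output_dir /path/to/output \
    --per_gpu_train_batch_size 2 \
    --num_train_epochs 1
```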
@0x01h
GPT-2 can produce great results given a proper vocabulary. If you just run run_lm_finetuning on a dataset in your language, it will give you poor results regardless of the language's distance from English, because of the vocabulary.
I'd suggest training your own tokenizer first and then fine-tuning GPT-2 with it. I'm doing that here:
https://github.com/mgrankin/ru_transformers
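A minimal sketch of that idea using the Hugging Face tokenizers library is below. The corpus path, output directory, and vocabulary size are placeholder assumptions, the save method has changed across tokenizers versions, and this is not necessarily the exact recipe used in ru_transformers:

```python
from tokenizers import ByteLevelBPETokenizer
from transformers import GPT2LMHeadModel

# Train a byte-level BPE vocabulary on your own corpus.
# "corpus.txt" and the vocab size are placeholder assumptions.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus.txt"],
    vocab_size=50257,                 # GPT-2's original vocabulary size
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)
tokenizer.save_model("my_tokenizer")  # writes vocab.json and merges.txt

# Load pre-trained GPT-2 and resize its embedding matrix to match the new
# vocabulary before fine-tuning. The old embeddings no longer line up with
# the new token ids, so expect fine-tuning to retrain them.
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.resize_token_embeddings(tokenizer.get_vocab_size())
```

The trade-off is that resizing discards the alignment between the pre-trained embeddings and the new token ids, which is why approaches like ru_transformers pair the new vocabulary with a careful fine-tuning schedule rather than training from the resized model as-is.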