Transformers: Fine-tuning OpenAI GPT-2 for another language

Created on 18 Oct 2019 · 3 comments · Source: huggingface/transformers

โ“ Questions & Help

Hi,

Is there any option to fine-tune and use OpenAI GPT-2 for a language other than English?

wontfix

All 3 comments

Hello, if you want to try fine-tuning GPT-2 on another language, you can simply give the run_lm_finetuning script your text in the language on which you want to fine-tune the model.

However, please be aware that depending on the language and its distance from English (the language GPT-2 was pre-trained on), you may find it hard to obtain good results.
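
For reference, the script boils down to standard causal-LM training. Below is a minimal sketch of that loop in plain transformers/PyTorch, assuming a plain-text corpus at corpus.txt (a placeholder path); the actual run_lm_finetuning script adds proper batching, checkpointing, and evaluation on top of this.

```python
# Minimal sketch of causal-LM fine-tuning on a plain-text corpus,
# roughly what run_lm_finetuning does, without its batching,
# checkpointing, and evaluation logic. `corpus.txt` is a placeholder.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.train()

# Tokenize the whole corpus and cut it into fixed-length blocks.
text = open("corpus.txt", encoding="utf-8").read()
ids = tokenizer.encode(text)
block_size = 512
blocks = [ids[i:i + block_size]
          for i in range(0, len(ids) - block_size, block_size)]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
for block in blocks:
    input_ids = torch.tensor([block])
    # For causal LM training the labels are the inputs;
    # the model shifts them internally to predict the next token.
    outputs = model(input_ids, labels=input_ids)
    loss = outputs[0]
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.save_pretrained("gpt2-finetuned")
tokenizer.save_pretrained("gpt2-finetuned")
```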

@0x01h

GPT-2 can produce great results given a proper vocabulary. If you just run run_lm_finetuning on a dataset in your language, it will give you poor results, regardless of that language's distance from English, because of the vocabulary.

I'd suggest that you train your tokenizer model first and then fine-tune GPT-2 with it. I'm doing that here:
https://github.com/mgrankin/ru_transformers
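
To make that advice concrete, here is one way the "train your tokenizer first" step could look, using the tokenizers library. corpus.txt, the output directory, and the vocabulary size are placeholders, and this is a sketch rather than the exact pipeline used in the linked repo.

```python
# Sketch of training a target-language vocabulary and attaching it
# to GPT-2. Paths and vocab size are placeholders, not the linked
# repo's actual configuration.
from tokenizers import ByteLevelBPETokenizer
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Train a byte-level BPE vocabulary on the target-language corpus
# (GPT-2 itself uses byte-level BPE, so this keeps the same scheme).
bpe = ByteLevelBPETokenizer()
bpe.train(files=["corpus.txt"], vocab_size=50257, min_frequency=2,
          special_tokens=["<|endoftext|>"])
bpe.save_model("my-lang-tokenizer")  # writes vocab.json and merges.txt

# Load the new vocabulary with the GPT-2 tokenizer class and resize
# the model's embedding matrix to match. The new embeddings start
# randomly initialized, so expect a longer warm-up during fine-tuning.
tokenizer = GPT2Tokenizer.from_pretrained("my-lang-tokenizer")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.resize_token_embeddings(len(tokenizer))
```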

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

