Transformers: GPT2 774M weights released!

Created on 20 Aug 2019 · 5 comments · Source: huggingface/transformers

🚀 Feature

Hi! OpenAI released the 774M weights for GPT-2. Is it possible to integrate them into pytorch-transformers?

https://twitter.com/OpenAI/status/1163843803884601344

Also, sorry for the obnoxiously quick ask! Thanks for all the great work you do for the community.

Thanks!


All 5 comments

I did the following:

  1. Run download_model.py 774M from the openai/gpt-2 repository
  2. Create a file named config.json with the following contents (might be correct, but I am not super sure):
{
    "vocab_size": 50257,
    "n_ctx": 1024,
    "n_embd": 1280,
    "n_head": 20,
    "n_layer": 36,
    "n_positions": 1024,
    "embd_pdrop":0.1,
    "attn_pdrop": 0.1,
    "resid_pdrop": 0.1,
    "layer_norm_epsilon": 1e-5,
    "initializer_range": 0.02
}
  3. Clone this repo

  4. Run python .\pytorch-transformers\pytorch_transformers\convert_gpt2_checkpoint_to_pytorch.py --gpt2_checkpoint_path models/774M --pytorch_dump_folder_path ./ --gpt2_config_file config.json

  5. Use it with the following (a quick generation sanity check is sketched after this list):

from pytorch_transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config.from_pretrained("config.json")
model = GPT2LMHeadModel.from_pretrained("pytorch_model.bin", config=config)

  6. Realize there's no way you can fine-tune this on your PC's GPU; you need to rent something with more memory.
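
For anyone reproducing these steps, here is a minimal sanity-check sketch (assuming the conversion above wrote pytorch_model.bin next to config.json in the current directory) that loads the converted 774M checkpoint and greedily generates a few tokens. It reuses the standard gpt2 tokenizer files, since the byte-pair vocabulary is shared across GPT-2 sizes.

import torch
from pytorch_transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

# Load the converted 774M checkpoint produced by the conversion script above.
config = GPT2Config.from_pretrained("config.json")
model = GPT2LMHeadModel.from_pretrained("pytorch_model.bin", config=config)
model.eval()

# The BPE vocabulary is the same for all GPT-2 sizes, so the "gpt2" tokenizer works here.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

input_ids = torch.tensor([tokenizer.encode("OpenAI released the 774M weights")])
with torch.no_grad():
    for _ in range(20):  # greedy decoding of 20 new tokens
        logits = model(input_ids)[0]               # (batch, seq_len, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1)  # most likely next token
        input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)

print(tokenizer.decode(input_ids[0].tolist()))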

We've added it on master.
You can install from source and use the shortcut name gpt2-large to load it (but beware, it's big!)
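
For reference, once pytorch-transformers is installed from source (master), loading the 774M model should then just be the following (a minimal sketch; the roughly 3 GB of fp32 weights are downloaded and cached on the first call, and the gpt2-large shortcut is assumed to cover the tokenizer files as well):

from pytorch_transformers import GPT2LMHeadModel, GPT2Tokenizer

# gpt2-large is the shortcut name for the 774M checkpoint added on master.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-large")
model = GPT2LMHeadModel.from_pretrained("gpt2-large")  # big download, ~3 GB of weights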

Question: Will gpt2-large be added to Write With Transformer? I've been eagerly looking forward to that since the moment the 774M model was released!

@zacharymacleod Glad you asked! We're definitely planning on adding it in the near future :)

Seems to me as if this has been addressed via #1064. Closing the feature request now!
