OpenAI just released the next, larger version of their language model (345M). I think that to add the new model, one needs to use the TF-to-PyTorch conversion script and then save the model as another option in PRETRAINED_MODEL_ARCHIVE_MAP.
For the convenience of others, here's the config file for 345M:
{
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"n_ctx": 1024,
"n_embd": 1024,
"n_head": 16,
"n_layer": 24,
"n_positions": 1024,
"vocab_size": 50257
}
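If it helps, here's a minimal sketch of loading that config with pytorch_pretrained_bert; the file name config_345M.json is just a placeholder, and I'm assuming GPT2Config.from_json_file is available in your version:

import json
from pytorch_pretrained_bert import GPT2Config

# the 345M hyperparameters from the JSON above
config_dict = {
    "initializer_range": 0.02,
    "layer_norm_epsilon": 1e-05,
    "n_ctx": 1024,
    "n_embd": 1024,
    "n_head": 16,
    "n_layer": 24,
    "n_positions": 1024,
    "vocab_size": 50257,
}

# write it to disk so it can be passed as --gpt2_config_file
with open("config_345M.json", "w") as f:
    json.dump(config_dict, f, indent=2)

config = GPT2Config.from_json_file("config_345M.json")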
Here are the concrete steps if you'd like to run the 345M model.
Grab OpenAI's download script from https://github.com/openai/gpt-2/blob/master/download_model.py and run python download_model.py 345M to get the model checkpoint.
Then use the conversion script at https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/convert_gpt2_checkpoint_to_pytorch.py by running python convert_gpt2_checkpoint_to_pytorch.py --gpt2_checkpoint_path gpt2_checkpoint_folder --gpt2_config_file config_file --pytorch_dump_folder_path output_dir, where config_file is the JSON posted by @daemon above.
Then, inside https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/modeling_gpt2.py, modify PRETRAINED_MODEL_ARCHIVE_MAP and PRETRAINED_CONFIG_ARCHIVE_MAP to point to the converted PyTorch files, roughly as sketched below.
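For illustration, the edit could look roughly like this; the "gpt2" URLs mirror what I believe the library already ships, and the "gpt2-medium" key and local paths are hypothetical placeholders for your converted files:

PRETRAINED_MODEL_ARCHIVE_MAP = {
    "gpt2": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin",
    "gpt2-medium": "/path/to/output_dir/pytorch_model.bin",  # placeholder: converted 345M weights
}
PRETRAINED_CONFIG_ARCHIVE_MAP = {
    "gpt2": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-config.json",
    "gpt2-medium": "/path/to/output_dir/config.json",  # placeholder: the 345M config above
}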
Thanks!
Or can you just call GPT2LMHeadModel.from_pretrained(pytorch_dump_folder_path) without changing modeling_gpt2.py?
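Something like this should work as a minimal sketch, assuming the conversion script wrote pytorch_model.bin and config.json into output_dir (the directory name is a placeholder):

import torch
from pytorch_pretrained_bert import GPT2LMHeadModel, GPT2Tokenizer

# from_pretrained accepts a local directory containing pytorch_model.bin
# and config.json, so no edit to modeling_gpt2.py is required
model = GPT2LMHeadModel.from_pretrained("output_dir")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # 345M reuses the same 50257-token BPE vocab
model.eval()

# quick smoke test: run a forward pass and get next-token logits
input_ids = torch.tensor([tokenizer.encode("Hello, my dog is")])
logits, past = model(input_ids)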
Why not add this to the module?
Thanks for the instructions, I will likely try this if it's not integrated soon.
When running convert_gpt2_checkpoint_to_pytorch.py --gpt2_checkpoint_path gpt2_checkpoint_folder --gpt2_config_file config_file --pytorch_dump_folder_path output_dir, I get the following error:
runfile('C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py', wdir='C:/Users/nietop1/Desktop/anaconda/trying to generate text')
Converting TensorFlow checkpoint from C:\Users\nietop1\Desktop\anaconda\models\345M
Traceback (most recent call last):
  File "
    runfile('C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py', wdir='C:/Users/nietop1/Desktop/anaconda/trying to generate text')
  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py", line 81, in <module>
    'C:/Users/nietop1/Desktop/anaconda/models/345M')
  File "C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py", line 47, in convert_gpt2_checkpoint_to_pytorch
    load_tf_weights_in_gpt2(model, gpt2_checkpoint_path)
  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\pytorch_pretrained_bert\modeling_gpt2.py", line 60, in load_tf_weights_in_gpt2
    init_vars = tf.train.list_variables(tf_path)
AttributeError: module 'tensorflow.python.training.training' has no attribute 'list_variables'
How can this be solved?
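For what it's worth, load_tf_weights_in_gpt2 relies on tf.train.list_variables, so this looks like an older TensorFlow build in that conda env; a quick check, assuming TensorFlow imports at all:

import tensorflow as tf

# if this prints False, the installed TensorFlow predates
# tf.train.list_variables and likely needs upgrading
print(tf.__version__)
print(hasattr(tf.train, "list_variables"))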
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Closing this, since the change has been merged.