Transformers: Add GPT-2 Bigger Model

Created on 4 May 2019 · 7 comments · Source: huggingface/transformers

OpenAI just released the next-largest version of their language model. I think that to add the new model, one needs to use the conversion script from TF to PyTorch and then register the result as another option in PRETRAINED_MODEL_ARCHIVE_MAP.

wontfix

All 7 comments

For convenience to others, here's the config file for 345M:

{
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_layer": 24,
  "n_positions": 1024,
  "vocab_size": 50257
}
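
If it helps, here is a minimal sketch for writing that config to disk so the conversion script below can consume it (the config.json filename is just a placeholder, not something the script requires):

import json

# Hyperparameters for the 345M checkpoint, as posted above.
config = {
    "initializer_range": 0.02,
    "layer_norm_epsilon": 1e-05,
    "n_ctx": 1024,
    "n_embd": 1024,
    "n_head": 16,
    "n_layer": 24,
    "n_positions": 1024,
    "vocab_size": 50257,
}

# Save it; pass this path as --gpt2_config_file when converting.
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)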

Here are the concrete steps if you'd like to run the 345M model.

Grab OpenAI's download script from https://github.com/openai/gpt-2/blob/master/download_model.py and run python download_model.py 345M to fetch the model checkpoint.

Then use the conversion script at https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/convert_gpt2_checkpoint_to_pytorch.py:

python convert_gpt2_checkpoint_to_pytorch.py --gpt2_checkpoint_path gpt2_checkpoint_folder --gpt2_config_file config_file --pytorch_dump_folder_path output_dir

where config_file is the JSON posted by @daemon above.

Then, inside https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/modeling_gpt2.py, modify PRETRAINED_MODEL_ARCHIVE_MAP and PRETRAINED_CONFIG_ARCHIVE_MAP to point to the converted PyTorch files.
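
For illustration, a rough sketch of what those edits might look like. The "gpt2-345M" key and the local paths are placeholders I made up for wherever the conversion script wrote its output, not names the library defines:

# In modeling_gpt2.py (hypothetical new entries; adjust paths to your output_dir).
PRETRAINED_MODEL_ARCHIVE_MAP = {
    "gpt2": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin",
    "gpt2-345M": "/path/to/output_dir/pytorch_model.bin",  # converted checkpoint
}
PRETRAINED_CONFIG_ARCHIVE_MAP = {
    "gpt2": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-config.json",
    "gpt2-345M": "/path/to/output_dir/config.json",  # the 345M config posted above
}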

Thanks!

Then, inside https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/modeling_gpt2.py, modify PRETRAINED_MODEL_ARCHIVE_MAP and PRETRAINED_CONFIG_ARCHIVE_MAP to point to the converted PyTorch files.

Or could one just call GPT2LMHeadModel.from_pretrained(pytorch_dump_folder_path) without changing modeling_gpt2.py?

Why not add this in the module?
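
For what it's worth, a minimal sketch of that from_pretrained route, assuming the conversion script wrote pytorch_model.bin and config.json into output_dir:

from pytorch_pretrained_bert import GPT2LMHeadModel, GPT2Tokenizer

# from_pretrained accepts a local directory as well as a shortcut name,
# so the converted 345M weights can load without touching modeling_gpt2.py.
model = GPT2LMHeadModel.from_pretrained("output_dir")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # the BPE vocab is shared across GPT-2 sizes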

Thanks for the instructions, I will likely try this if it's not integrated soon.

When running "convert_gpt2_checkpoint_to_pytorch.py --gpt2_checkpoint_path gpt2_checkpoint_folder --gpt2_config_file config_file --pytorch_dump_folder_path output_dir" I get the following error:

runfile('C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py', wdir='C:/Users/nietop1/Desktop/anaconda/trying to generate text')
Converting TensorFlow checkpoint from C:\Users\nietop1\Desktop\anaconda\models\345M
Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py', wdir='C:/Users/nietop1/Desktop/anaconda/trying to generate text')

  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py", line 81, in <module>
    'C:/Users/nietop1/Desktop/anaconda/models/345M')

  File "C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py", line 47, in convert_gpt2_checkpoint_to_pytorch
    load_tf_weights_in_gpt2(model, gpt2_checkpoint_path)

  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\pytorch_pretrained_bert\modeling_gpt2.py", line 60, in load_tf_weights_in_gpt2
    init_vars = tf.train.list_variables(tf_path)

AttributeError: module 'tensorflow.python.training.training' has no attribute 'list_variables'

How can this be solved?
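
That AttributeError usually points to an outdated TensorFlow install: the traceback shows the conversion calling tf.train.list_variables, which older TF releases don't provide. A quick diagnostic sketch, not a confirmed fix:

import tensorflow as tf

# If this prints False, the installed TensorFlow predates tf.train.list_variables
# and likely needs upgrading within the 1.x line before the conversion can run.
print(tf.__version__)
print(hasattr(tf.train, "list_variables"))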

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Closing this, as it has been merged.
