OpenAI just released the next, larger version of their language model (345M). I think that to add the new model, one needs to use the TF-to-PyTorch conversion script and then save the model as another option in PRETRAINED_MODEL_ARCHIVE_MAP.
For the convenience of others, here's the config file for 345M:
{
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"n_ctx": 1024,
"n_embd": 1024,
"n_head": 16,
"n_layer": 24,
"n_positions": 1024,
"vocab_size": 50257
}
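If it helps, here's a minimal sketch of loading that config with pytorch_pretrained_bert; the file name config_345M.json is just a placeholder, and I'm assuming GPT2Config.from_json_file is available in your version:

import json
from pytorch_pretrained_bert import GPT2Config

# the 345M hyperparameters from the JSON above
config_dict = {
    "initializer_range": 0.02,
    "layer_norm_epsilon": 1e-05,
    "n_ctx": 1024,
    "n_embd": 1024,
    "n_head": 16,
    "n_layer": 24,
    "n_positions": 1024,
    "vocab_size": 50257,
}

# write it to disk so it can be passed as --gpt2_config_file
with open("config_345M.json", "w") as f:
    json.dump(config_dict, f, indent=2)

config = GPT2Config.from_json_file("config_345M.json")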
Here are the concrete steps if you'd like to run the 345M model.
Grab OpenAI's download script from https://github.com/openai/gpt-2/blob/master/download_model.py and run python download_model.py 345M to get the model checkpoint.
Then use the conversion script at https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/convert_gpt2_checkpoint_to_pytorch.py by running python convert_gpt2_checkpoint_to_pytorch.py --gpt2_checkpoint_path gpt2_checkpoint_folder --gpt2_config_file config_file --pytorch_dump_folder_path output_dir, where config_file is the JSON posted by @daemon above.
Then, inside https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/modeling_gpt2.py, modify PRETRAINED_MODEL_ARCHIVE_MAP and PRETRAINED_CONFIG_ARCHIVE_MAP to point to the converted PyTorch files, roughly as sketched below.
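For illustration, the edit could look roughly like this; the "gpt2" URLs mirror what I believe the library already ships, and the "gpt2-medium" key and local paths are hypothetical placeholders for your converted files:

PRETRAINED_MODEL_ARCHIVE_MAP = {
    "gpt2": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin",
    "gpt2-medium": "/path/to/output_dir/pytorch_model.bin",  # placeholder: converted 345M weights
}
PRETRAINED_CONFIG_ARCHIVE_MAP = {
    "gpt2": "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-config.json",
    "gpt2-medium": "/path/to/output_dir/config.json",  # placeholder: the 345M config above
}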
Thanks!
Or can you just call GPT2LMHeadModel.from_pretrained(pytorch_dump_folder_path) without changing modeling_gpt2.py?
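Something like this should work as a minimal sketch, assuming the conversion script wrote pytorch_model.bin and config.json into output_dir (the directory name is a placeholder):

import torch
from pytorch_pretrained_bert import GPT2LMHeadModel, GPT2Tokenizer

# from_pretrained accepts a local directory containing pytorch_model.bin
# and config.json, so no edit to modeling_gpt2.py is required
model = GPT2LMHeadModel.from_pretrained("output_dir")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # 345M reuses the same 50257-token BPE vocab
model.eval()

# quick smoke test: run a forward pass and get next-token logits
input_ids = torch.tensor([tokenizer.encode("Hello, my dog is")])
logits, past = model(input_ids)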
Why not add this to the module?
Thanks for the instructions, I will likely try this if it's not integrated soon.
When running convert_gpt2_checkpoint_to_pytorch.py --gpt2_checkpoint_path gpt2_checkpoint_folder --gpt2_config_file config_file --pytorch_dump_folder_path output_dir, I get the following error:
runfile('C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py', wdir='C:/Users/nietop1/Desktop/anaconda/trying to generate text')
Converting TensorFlow checkpoint from C:\Users\nietop1\Desktop\anaconda\models\345M
Traceback (most recent call last):
  File "
    runfile('C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py', wdir='C:/Users/nietop1/Desktop/anaconda/trying to generate text')
  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py", line 81, in <module>
    'C:/Users/nietop1/Desktop/anaconda/models/345M')
  File "C:/Users/nietop1/Desktop/anaconda/trying to generate text/convert_checkpoint_gtp2.py", line 47, in convert_gpt2_checkpoint_to_pytorch
    load_tf_weights_in_gpt2(model, gpt2_checkpoint_path)
  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\pytorch_pretrained_bert\modeling_gpt2.py", line 60, in load_tf_weights_in_gpt2
    init_vars = tf.train.list_variables(tf_path)
AttributeError: module 'tensorflow.python.training.training' has no attribute 'list_variables'
How can this be solved?
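For what it's worth, load_tf_weights_in_gpt2 relies on tf.train.list_variables, so this looks like an older TensorFlow build in that conda env; a quick check, assuming TensorFlow imports at all:

import tensorflow as tf

# if this prints False, the installed TensorFlow predates
# tf.train.list_variables and likely needs upgrading
print(tf.__version__)
print(hasattr(tf.train, "list_variables"))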
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Closing this, since the change has been merged.