I am trying to convert the recently released BioBERT checkpoint: https://github.com/naver/biobert-pretrained
The conversion script loads the checkpoint, but appears to balk at BERTAdam when building the PyTorch model.
...
Building PyTorch model from configuration: {
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": 2,
"vocab_size": 28996
}
Initialize PyTorch weight ['bert', 'embeddings', 'LayerNorm', 'beta']
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/venvs/dev3.6/lib/python3.6/site-packages/pytorch_pretrained_bert/__main__.py", line 19, in <module>
convert_tf_checkpoint_to_pytorch(TF_CHECKPOINT, TF_CONFIG, PYTORCH_DUMP_OUTPUT)
File "/venvs/dev3.6/lib/python3.6/site-packages/pytorch_pretrained_bert/convert_tf_checkpoint_to_pytorch.py", line 69, in convert_tf_checkpoint_to_pytorch
pointer = getattr(pointer, l[0])
AttributeError: 'Parameter' object has no attribute 'BERTAdam'
I see. This is because they didn't use the same names for the Adam optimizer variables as the Google team did. I'll see if I can find a simple way around this for future cases.
In the meantime, you can install pytorch-pretrained-bert from master (git clone ... and pip install -e .) and add the names of these variables (BERTAdam) to the blacklist at line 53 of the conversion script: https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/convert_tf_checkpoint_to_pytorch.py#L53
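For reference, a minimal sketch of the kind of skip check meant here. The exact code at that line differs in detail; only the BERTAdam name comes from this thread, so treat the rest as illustrative:

```python
# Sketch of the optimizer-variable blacklist in convert_tf_checkpoint_to_pytorch.py.
# "adam_v"/"adam_m" are names the script already skips; "BERTAdam" is the extra
# name the BioBERT checkpoint uses for its optimizer state.
SKIP_NAMES = ["adam_v", "adam_m", "BERTAdam"]

def should_skip(tf_variable_name):
    """Return True if a TF variable is optimizer state and should not be converted."""
    return any(part in SKIP_NAMES for part in tf_variable_name.split("/"))

# e.g. should_skip("bert/embeddings/word_embeddings/BERTAdam") -> True
```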
Hmm, loading BioBERT's parameters works for me. Maybe as a future feature we could add the option to load BioBERT parameters directly in the package?
I ran it like this:
convert_tf_checkpoint_to_pytorch("AI/data/biobert/biobert_model.ckpt.index",
                                 "AI/data/biobert/bert_config.json",
                                 "AI/data/biobert/pytorch_model.bin")
it also loads afterwards.
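For completeness, a sketch of the same run with the import made explicit and the result loaded back in. The paths are just the ones used above, and the from_pretrained() behaviour (a directory containing bert_config.json and pytorch_model.bin) is assumed from pytorch_pretrained_bert; adjust for your installed version.

```python
from pytorch_pretrained_bert.convert_tf_checkpoint_to_pytorch import convert_tf_checkpoint_to_pytorch
from pytorch_pretrained_bert import BertModel

# Convert the TF checkpoint (index file + config) into pytorch_model.bin.
convert_tf_checkpoint_to_pytorch("AI/data/biobert/biobert_model.ckpt.index",
                                 "AI/data/biobert/bert_config.json",
                                 "AI/data/biobert/pytorch_model.bin")

# Load the converted weights; from_pretrained() looks for bert_config.json
# and pytorch_model.bin inside the given directory.
model = BertModel.from_pretrained("AI/data/biobert/")
model.eval()
```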
After converting the TensorFlow checkpoint to a PyTorch model by excluding some variables as mentioned by @MeRajat, I get the following warnings when I try to load the model.
02/21/2019 17:33:06 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForQuestionAnswering not initialized from pretrained model: ['qa_outputs.bias', 'qa_outputs.weight']
02/21/2019 17:33:06 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForQuestionAnswering: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
This is normal: the QA head (qa_outputs) does not exist in the checkpoint and is therefore freshly initialized, while the pretraining heads (cls.*) in the checkpoint are simply not used by BertForQuestionAnswering. Closing the issue now.
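As a sketch of what loading into the QA model looks like (same assumed directory layout as in the conversion example above; class names per pytorch_pretrained_bert):

```python
from pytorch_pretrained_bert import BertForQuestionAnswering

# qa_outputs.* is not in the checkpoint, so it is randomly initialized
# (first warning); the cls.* pretraining heads in the checkpoint are dropped
# because BertForQuestionAnswering has no use for them (second warning).
model = BertForQuestionAnswering.from_pretrained("AI/data/biobert/")
```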
This can help. https://github.com/MeRajat/SolvingAlmostAnythingWithBert/blob/ner_medical/convert_to_pytorch_wt.ipynb