I am trying to convert the recently released BioBERT checkpoint: https://github.com/naver/biobert-pretrained
The conversion script loads the checkpoint, but appears to balk at BERTAdam when building the PyTorch model.
...
Building PyTorch model from configuration: {
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": 2,
"vocab_size": 28996
}
Initialize PyTorch weight ['bert', 'embeddings', 'LayerNorm', 'beta']
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/venvs/dev3.6/lib/python3.6/site-packages/pytorch_pretrained_bert/__main__.py", line 19, in <module>
convert_tf_checkpoint_to_pytorch(TF_CHECKPOINT, TF_CONFIG, PYTORCH_DUMP_OUTPUT)
File "/venvs/dev3.6/lib/python3.6/site-packages/pytorch_pretrained_bert/convert_tf_checkpoint_to_pytorch.py", line 69, in convert_tf_checkpoint_to_pytorch
pointer = getattr(pointer, l[0])
AttributeError: 'Parameter' object has no attribute 'BERTAdam'
I see. This is because they didn't use the same names for the Adam optimizer variables as the Google team did. I'll see if I can find a simple way around this for future cases.
In the meantime, you can install pytorch-pretrained-bert from master (git clone ... and pip install -e .) and add the names of these variables (BERTAdam) to the blacklist at line 53 of the conversion script: https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/convert_tf_checkpoint_to_pytorch.py#L53
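For reference, a minimal sketch of the kind of skip check meant here. The exact code at that line differs in detail; only the BERTAdam name comes from this thread, so treat the rest as illustrative:

```python
# Sketch of the optimizer-variable blacklist in convert_tf_checkpoint_to_pytorch.py.
# "adam_v"/"adam_m" are names the script already skips; "BERTAdam" is the extra
# name the BioBERT checkpoint uses for its optimizer state.
SKIP_NAMES = ["adam_v", "adam_m", "BERTAdam"]

def should_skip(tf_variable_name):
    """Return True if a TF variable is optimizer state and should not be converted."""
    return any(part in SKIP_NAMES for part in tf_variable_name.split("/"))

# e.g. should_skip("bert/embeddings/word_embeddings/BERTAdam") -> True
```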
Hmm, loading BioBERT's parameters works for me. Maybe as a future feature we could add the option to load BioBERT parameters directly in the package?
I ran it like this:
convert_tf_checkpoint_to_pytorch("AI/data/biobert/biobert_model.ckpt.index",
                                 "AI/data/biobert/bert_config.json",
                                 "AI/data/biobert/pytorch_model.bin")
it also loads afterwards.
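For completeness, a sketch of the same run with the import made explicit and the result loaded back in. The paths are just the ones used above, and the from_pretrained() behaviour (a directory containing bert_config.json and pytorch_model.bin) is assumed from pytorch_pretrained_bert; adjust for your installed version.

```python
from pytorch_pretrained_bert.convert_tf_checkpoint_to_pytorch import convert_tf_checkpoint_to_pytorch
from pytorch_pretrained_bert import BertModel

# Convert the TF checkpoint (index file + config) into pytorch_model.bin.
convert_tf_checkpoint_to_pytorch("AI/data/biobert/biobert_model.ckpt.index",
                                 "AI/data/biobert/bert_config.json",
                                 "AI/data/biobert/pytorch_model.bin")

# Load the converted weights; from_pretrained() looks for bert_config.json
# and pytorch_model.bin inside the given directory.
model = BertModel.from_pretrained("AI/data/biobert/")
model.eval()
```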
After converting the TensorFlow checkpoint to a PyTorch model by excluding some variables as mentioned by @MeRajat, I get the following warnings when I try to load the model.
02/21/2019 17:33:06 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForQuestionAnswering not initialized from pretrained model: ['qa_outputs.bias', 'qa_outputs.weight']
02/21/2019 17:33:06 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForQuestionAnswering: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
This is normal: the QA head (qa_outputs) does not exist in the checkpoint and is therefore freshly initialized, while the pretraining heads (cls.*) in the checkpoint are simply not used by BertForQuestionAnswering. Closing the issue now.
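As a sketch of what loading into the QA model looks like (same assumed directory layout as in the conversion example above; class names per pytorch_pretrained_bert):

```python
from pytorch_pretrained_bert import BertForQuestionAnswering

# qa_outputs.* is not in the checkpoint, so it is randomly initialized
# (first warning); the cls.* pretraining heads in the checkpoint are dropped
# because BertForQuestionAnswering has no use for them (second warning).
model = BertForQuestionAnswering.from_pretrained("AI/data/biobert/")
```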
This can help. https://github.com/MeRajat/SolvingAlmostAnythingWithBert/blob/ner_medical/convert_to_pytorch_wt.ipynb