Transformers: AttributeError: 'BertForPreTraining' object has no attribute 'shape'

Created on 21 Mar 2019 · 13 Comments · Source: huggingface/transformers

Is there any suggestion for fixing the following? I was trying "convert_tf_checkpoint_to_pytorch.py" to convert a model trained from scratch but the conversion didn't work out....

Skipping cls/seq_relationship/output_weights/adam_v
Traceback (most recent call last):
  File "pytorch_pretrained_bert/convert_tf_checkpoint_to_pytorch.py", line 66, in <module>
    args.pytorch_dump_path)
  File "pytorch_pretrained_bert/convert_tf_checkpoint_to_pytorch.py", line 37, in convert_tf_checkpoint_to_pytorch
    load_tf_weights_in_bert(model, tf_checkpoint_path)
  File "/content/my_pytorch-pretrained-BERT/pytorch_pretrained_bert/modeling.py", line 117, in load_tf_weights_in_bert
    assert pointer.shape == array.shape
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 535, in __getattr__
    type(self).__name__, name))
AttributeError: 'BertForPreTraining' object has no attribute 'shape'
Need more information wontfix

Most helpful comment

I'm getting a similar error when trying to convert the newer BERT models released at
tensorflow/models/tree/master/official/nlp/.

These models are either BERT models trained with Keras or else checkpoints converted from
the original google-research/bert repository. I also get the same error when I convert the TF1 to TF2 checkpoints myself using the tf2_encoder_checkpoint_converter.py script:

What I have tried:

First, I have downloaded a model:

wget https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/cased_L-12_H-768_A-12.tar.gz
# or
wget https://storage.googleapis.com/cloud-tpu-checkpoints/bert/tf_20/cased_L-12_H-768_A-12.tar.gz 

After unpacking:

export BERT_BASE_DIR=cased_L-12_H-768_A-12

transformers-cli convert --model_type bert \
    --tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
    --config $BERT_BASE_DIR/bert_config.json \
    --pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin

The command prints the configuration but throws the following error:

INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_attention_layer/_value_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE with shape [768, 12, 64]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_attention_layer_norm/beta/.ATTRIBUTES/VARIABLE_VALUE with shape [768]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_attention_layer_norm/gamma/.ATTRIBUTES/VARIABLE_VALUE with shape [768]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_attention_output_dense/bias/.ATTRIBUTES/VARIABLE_VALUE with shape [768]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_attention_output_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE with shape [12, 64, 768]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_intermediate_dense/bias/.ATTRIBUTES/VARIABLE_VALUE with shape [3072]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_intermediate_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE with shape [768, 3072]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_output_dense/bias/.ATTRIBUTES/VARIABLE_VALUE with shape [768]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_output_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE with shape [3072, 768]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_output_layer_norm/beta/.ATTRIBUTES/VARIABLE_VALUE with shape [768]
INFO:transformers.modeling_bert:Loading TF weight model/layer_with_weights-9/_output_layer_norm/gamma/.ATTRIBUTES/VARIABLE_VALUE with shape [768]
INFO:transformers.modeling_bert:Loading TF weight save_counter/.ATTRIBUTES/VARIABLE_VALUE with shape []
INFO:transformers.modeling_bert:Skipping _CHECKPOINTABLE_OBJECT_GRAPH
Traceback (most recent call last):
  File "/home/jbarry/anaconda3/envs/transformers/bin/transformers-cli", line 30, in <module>
    service.run()
  File "/home/jbarry/anaconda3/envs/transformers/lib/python3.6/site-packages/transformers/commands/convert.py", line 62, in run
    convert_tf_checkpoint_to_pytorch(self._tf_checkpoint, self._config, self._pytorch_dump_output)
  File "/home/jbarry/anaconda3/envs/transformers/lib/python3.6/site-packages/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py", line 36, in convert_tf_checkpoint_to_pytorch
    load_tf_weights_in_bert(model, config, tf_checkpoint_path)
  File "/home/jbarry/anaconda3/envs/transformers/lib/python3.6/site-packages/transformers/modeling_bert.py", line 118, in load_tf_weights_in_bert
    assert pointer.shape == array.shape
  File "/home/jbarry/anaconda3/envs/transformers/lib/python3.6/site-packages/torch/nn/modules/module.py", line 585, in __getattr__
    type(self).__name__, name))
AttributeError: 'BertForPreTraining' object has no attribute 'shape'

This is happening in a fresh environment with PyTorch 1.3 installed in Anaconda (Linux), as well as pip-installing tf-nightly and transformers (2.3.0).

Has anyone else been able to successfully convert the TF 2.0 version models to PyTorch or know where I'm going wrong? Thanks!
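The log above hints at why the lookup can fail: the TF2 checkpoints use object-based names such as `model/layer_with_weights-9/...` with a `.ATTRIBUTES/VARIABLE_VALUE` tail, while the loader discussed in this thread walks each path component as an attribute of the PyTorch model. A minimal sketch in plain Python, where the helper name `strip_tf2_suffix` is hypothetical and not part of transformers, shows what components remain once that tail is dropped:

```python
def strip_tf2_suffix(name):
    """Split a checkpoint variable name and drop the TF2 '.ATTRIBUTES/...' tail."""
    parts = name.split("/")
    if ".ATTRIBUTES" in parts:
        parts = parts[: parts.index(".ATTRIBUTES")]
    return parts


# Name copied from the log above.
tf2_name = (
    "model/layer_with_weights-9/_attention_layer/_value_dense/kernel"
    "/.ATTRIBUTES/VARIABLE_VALUE"
)
components = strip_tf2_suffix(tf2_name)
print(components)
```

Components like `model` or `layer_with_weights-9` do not look like the TF1-style names (for example `cls/predictions/transform/dense/kernel`) seen elsewhere in this thread, so an attribute lookup on them may fail and leave the pointer on the top-level model, which has no `shape`.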

All 13 comments

Hi,
Is it a model trained from the original Google BERT Tensorflow implementation?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

I get the same error as @leejason. I used the pre-trained BERT base uncased model from the original TF implementation and fine-tuned it on my own training set. I am now trying to use the fine-tuned model to do masked LM.

I also have a similar issue. I pretrained a BERT model from scratch using the NVIDIA implementation with a customized config file and vocab.
Then I used

convert_tf_checkpoint_to_pytorch.convert_tf_checkpoint_to_pytorch(BERT_MODEL_PATH + 'model.ckpt',
                                                                 BERT_MODEL_PATH  + 'bert_config.json',
                                                                 BERT_MODEL_PATH + 'pytorch_model.bin')

```
...
Loading TF weight cls/predictions/transform/dense/bias with shape [512]
Loading TF weight cls/predictions/transform/dense/bias/adam_m with shape [512]
Loading TF weight cls/predictions/transform/dense/bias/adam_v with shape [512]
Loading TF weight cls/predictions/transform/dense/kernel with shape [512, 512]
Loading TF weight cls/predictions/transform/dense/kernel/adam_m with shape [512, 512]
Loading TF weight cls/predictions/transform/dense/kernel/adam_v with shape [512, 512]
Loading TF weight cls/seq_relationship/output_bias with shape [2]
Loading TF weight cls/seq_relationship/output_weights with shape [2, 512]
Loading TF weight global_step with shape []
Loading TF weight good_steps with shape []
Loading TF weight loss_scale with shape []

Skipping bad_steps

AttributeError Traceback (most recent call last)
in
----> 1 bert = BertModel.from_pretrained(BERT_MODEL_PATH, from_tf=True).bert()

~/InEx/input/huggingface/pytorch-pretrained-BERT-master/pytorch_pretrained_bert/modeling.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
611 # Directly load from a TensorFlow checkpoint
612 weights_path = os.path.join(serialization_dir, TF_WEIGHTS_NAME)
--> 613 return load_tf_weights_in_bert(model, weights_path)
614 # Load from a PyTorch state_dict
615 old_keys = []

~/InEx/input/huggingface/pytorch-pretrained-BERT-master/pytorch_pretrained_bert/modeling.py in load_tf_weights_in_bert(model, tf_checkpoint_path)
107 array = np.transpose(array)
108 try:
--> 109 assert pointer.shape == array.shape
110 except AssertionError as e:
111 e.args += (pointer.shape, array.shape)

~/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
537 return modules[name]
538 raise AttributeError("'{}' object has no attribute '{}'".format(
--> 539 type(self).__name__, name))
540
541 def __setattr__(self, name, value):

AttributeError: 'BertModel' object has no attribute 'shape'
```

I solved this problem by bypassing some variables in the checkpoint, such as "bad_steps", "global_step", "good_steps" and "loss_scale". They don't have the attribute 'shape' and I don't need them when fine-tuning the model.

In modeling.py, replace line 121 with
    if any(n in ["adam_v", "adam_m", "global_step", "bad_steps", "good_steps", "loss_scale"] for n in name):
and delete lines 151-156.
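The skip test in that workaround can be tried on its own, without TensorFlow or PyTorch. The sketch below uses the names listed in the comment; the helper `should_skip` and the constant `SKIPPED_COMPONENTS` are hypothetical names introduced for illustration, and the set will likely need extending for other checkpoints:

```python
# Component names taken from the workaround above (assumed, not exhaustive).
SKIPPED_COMPONENTS = {
    "adam_v", "adam_m", "global_step", "bad_steps", "good_steps", "loss_scale",
}


def should_skip(variable_name):
    """Return True if any path component marks optimizer or bookkeeping state."""
    return any(part in SKIPPED_COMPONENTS for part in variable_name.split("/"))


for name in [
    "cls/predictions/transform/dense/bias",
    "cls/predictions/transform/dense/bias/adam_m",
    "loss_scale",
]:
    print(name, "->", "skip" if should_skip(name) else "load")
```

Matching on whole path components rather than substrings keeps a weight such as `cls/predictions/transform/dense/bias` from being dropped by accident.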

It works. Thanks very much !

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Ignore the lines causing errors by changing
https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_bert.py#L123-L127 to

    try:
        assert pointer.shape == array.shape
    except (AssertionError, AttributeError):
        pass

Do the same thing for https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_bert.py#L129

I solved this problem by bypassing some variables in the checkpoint, such as "bad_steps", "global_step", "good_steps" and "loss_scale". They don't have the attribute 'shape' and I don't need them when fine-tuning the model.

In modeling.py, replace line 121 with
    if any(n in ["adam_v", "adam_m", "global_step", "bad_steps", "good_steps", "loss_scale"] for n in name):
and delete lines 151-156.

It helps! Thx u so much <3

I'm running into the same problem. I tried the solution proposed by @yzhang123, but it only leads to another error:

Traceback (most recent call last):
  File "C:\Users\lrizzello\AppData\Local\JetBrains\PyCharm 2019.3.4\plugins\python\helpers\pydev\pydevd.py", line 1434, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Users\lrizzello\AppData\Local\JetBrains\PyCharm 2019.3.4\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/source/repos/DeduplicationSiamese/PythonDeduper/eval/eval_matcher_tf.py", line 91, in <module>
    load_tf_weights_in_bert(model, config, tf_path)
  File "C:\Users\lrizzello\Anaconda3\envs\dedupe_transformer\lib\site-packages\transformers\modeling_bert.py", line 129, in load_tf_weights_in_bert
    pointer.data = torch.from_numpy(array)
TypeError: expected np.ndarray (got bytes)

Process finished with exit code -1

I got those checkpoints by training an existing huggingface model (namely 'google/bert_uncased_L-12_H-256_A-4') via the TFTrainer/TFTrainingArguments method

I have tried many other hacks to get this working, such as this one or this one, but nothing worked. I keep running into error after error.

Has anyone managed to get this working any other way?
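One plausible reading of the `TypeError` above, not confirmed anywhere in this thread, is that some checkpoint entry (for instance `_CHECKPOINTABLE_OBJECT_GRAPH`) holds serialized bytes rather than a numeric array, so silencing the shape check just moves the failure to the assignment. A minimal sketch of filtering such entries before assignment follows; `split_loadable` is a hypothetical helper, and plain lists stand in for NumPy arrays:

```python
def split_loadable(entries):
    """Separate (name, value) pairs into array-like values and raw byte/string blobs."""
    loadable, rejected = [], []
    for name, value in entries:
        if isinstance(value, (bytes, str)):
            rejected.append(name)  # nothing here can become a tensor
        else:
            loadable.append((name, value))
    return loadable, rejected


# Stand-in data (assumed): the second entry mimics a serialized object graph.
entries = [
    ("bert/pooler/dense/bias", [0.0, 0.1]),
    ("_CHECKPOINTABLE_OBJECT_GRAPH", b"\n\x0b\x08\x01"),
]
loadable, rejected = split_loadable(entries)
print("load:", [name for name, _ in loadable])
print("skip:", rejected)
```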

I managed to get it working by going through the pointers in debug mode and checking what variable name corresponded to what. This is the function I ended up using.

import logging
import os
import re

import numpy as np
import tensorflow as tf
import torch
from transformers import BertConfig, BertForSequenceClassification

logger = logging.getLogger(__name__)


def convert_tf_checkpoint_to_pytorch(tf_checkpoint_path, bert_config_file, pytorch_dump_path):
    config_path = os.path.abspath(bert_config_file)
    tf_path = os.path.abspath(tf_checkpoint_path)
    print("Converting TensorFlow checkpoint from {} with config at {}".format(tf_path, config_path))
    # Load weights from TF model, leaving out optimizer and bookkeeping entries
    init_vars = tf.train.list_variables(tf_path)
    excluded = ["BERTAdam", "_power", "global_step", "_CHECKPOINTABLE_OBJECT_GRAPH"]
    init_vars = [(name, shape) for name, shape in init_vars if not any(e in name for e in excluded)]
    names = []
    arrays = []
    for name, shape in init_vars:
        print("Loading TF weight {} with shape {}".format(name, shape))
        array = tf.train.load_variable(tf_path, name)
        names.append(name)
        arrays.append(array)

    config = BertConfig.from_json_file(bert_config_file)
    print("Building PyTorch model from configuration: {}".format(str(config)))
    # Initialise PyTorch model
    model = BertForSequenceClassification(config)

    # Path components that mark optimizer state (e.g. adam_v and adam_m) or training
    # bookkeeping, none of which are required for using the pretrained model
    skipped = {"adam_v", "adam_m", "global_step", "bad_steps", "good_steps", "loss_scale",
               "AdamWeightDecayOptimizer", "AdamWeightDecayOptimizer_1", "save_counter", ".OPTIMIZER_SLOT"}

    for name, array in zip(names, arrays):
        name = name.split("/")
        if any(n in skipped for n in name) or name[0] == "optimizer":
            print("Skipping {}".format("/".join(name)))
            continue
        # Drop the ".ATTRIBUTES/VARIABLE_VALUE" tail seen on TF2 checkpoint names
        if ".ATTRIBUTES" in name:
            name = name[:name.index(".ATTRIBUTES")]
        print(name)
        pointer = model
        for m_name in name:
            if re.fullmatch(r"[A-Za-z]+_\d+", m_name):
                scope_names = re.split(r"_(\d+)", m_name)
            else:
                scope_names = [m_name]
            if scope_names[0] == "kernel" or scope_names[0] == "gamma":
                pointer = getattr(pointer, "weight")
            elif scope_names[0] == "output_bias" or scope_names[0] == "beta":
                pointer = getattr(pointer, "bias")
            elif scope_names[0] == "output_weights":
                pointer = getattr(pointer, "weight")
            elif scope_names[0] == "squad":
                pointer = getattr(pointer, "classifier")
            elif scope_names[0] == "dense_output" or scope_names[0] == "bert_output":
                pointer = getattr(pointer, "output")
            elif scope_names[0] == "self_attention":
                pointer = getattr(pointer, "self")
            else:
                try:
                    pointer = getattr(pointer, scope_names[0])
                except AttributeError:
                    logger.info("Skipping {}".format("/".join(name)))
                    # Note: this only skips the current path component,
                    # not the whole variable
                    continue
            if len(scope_names) >= 2:
                num = int(scope_names[1])
                pointer = pointer[num]
        if m_name.endswith("_embeddings"):
            pointer = getattr(pointer, "weight")
        elif m_name == "kernel" or m_name == "gamma" or m_name == "output_weights":
            array = np.transpose(array)
        # print("Initialize PyTorch weight {}".format(name))
        pointer.data = torch.from_numpy(array)

    # Save pytorch-model
    print("Save PyTorch model to {}".format(pytorch_dump_path))
    torch.save(model.state_dict(), pytorch_dump_path)


# tf_path, config_path and pytorch_dump_path must be defined by the caller
convert_tf_checkpoint_to_pytorch(tf_path, config_path, pytorch_dump_path)

Hi, this is an actual programming error in modeling_bert.py. If you look at line 145 it's pretty obvious that the code should be continuing to the next iteration of the outer loop (over name, array) rather than the inner one (over the path components of name) - otherwise why would the error messages say "skipping {name}":

https://github.com/huggingface/transformers/blob/master/src/transformers/models/bert/modeling_bert.py#L145

To fix this, simply extract the try/except block so that it wraps the entire loop (lines 127-148). I would supply a patch but I have to work with transformers 3.5.1 for the moment since I'm using sentence-transformers which hasn't been updated to the latest version.
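The control-flow change described above can be sketched without PyTorch. In the toy below, `resolve` and the `SimpleNamespace` model are hypothetical stand-ins rather than the library's code; the point is only that a failed lookup abandons the whole variable, so the caller's `continue` applies to the outer loop:

```python
from types import SimpleNamespace


def resolve(root, path):
    """Walk `path` from `root`; return None as soon as any component is missing."""
    pointer = root
    for part in path.split("/"):
        try:
            pointer = getattr(pointer, part)
        except AttributeError:
            return None  # give up on the whole variable, not just this component
    return pointer


# Toy stand-in for a model: only `dense.weight` exists.
model = SimpleNamespace(dense=SimpleNamespace(weight=[[1.0, 2.0]]))

for name in ["dense/weight", "save_counter", "_CHECKPOINTABLE_OBJECT_GRAPH"]:
    target = resolve(model, name)
    if target is None:
        print("Skipping", name)
        continue  # outer loop: move on to the next variable
    print("Would assign", name)
```

With the inner-loop `continue` instead, the pointer would stay on `model` for the last two names and the later shape check would run against the model object itself, which matches the `AttributeError` reported in this issue.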
