Transformers: Error in conversion to tensorflow

Created on 15 Jul 2020 · 6 Comments · Source: huggingface/transformers

🐛 Bug

Information

Model I am using (Bert, XLNet ...): DistilBERT

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

[ ] the official example scripts: (give details below)
[x] my own modified scripts: (give details below)

The task I am working on is:

[ ] an official GLUE/SQuAD task: (give the name)
[x] my own task or dataset: (give details below)

To reproduce

from transformers import TFAutoModel, AutoModel
import os

# Download the PyTorch checkpoint and save it to a local directory
model = AutoModel.from_pretrained('distilbert-base-uncased')
os.makedirs('distilbert', exist_ok=True)
model.save_pretrained('distilbert')
# Reload the saved PyTorch weights as a TensorFlow model
model = TFAutoModel.from_pretrained('distilbert', from_pt=True) # crashes

Expected behavior

The model is converted from PyTorch to TensorFlow.

Environment info

transformers version: 3.0.2
Platform: Linux-5.3.0-62-generic-x86_64-with-Ubuntu-18.04-bionic
Python version: 3.6.9
PyTorch version (GPU?): 1.5.1 (True)
Tensorflow version (GPU?): 2.0.0-beta1 (False)
Using GPU in script?: no
Using distributed or parallel set-up in script?: no

Actual Behaviour

Traceback (most recent call last):
  File "pt2tf.py", line 8, in <module>
    model = TFAutoModel.from_pretrained('distilbert', from_pt=True)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py", line 423, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_utils.py", line 482, in from_pretrained
    return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_pytorch_utils.py", line 93, in load_pytorch_checkpoint_in_tf2_model
    tf_model, pt_state_dict, tf_inputs=tf_inputs, allow_missing_keys=allow_missing_keys
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_pytorch_utils.py", line 125, in load_pytorch_weights_in_tf2_model
    tf_model(tf_inputs, training=False)  # Make sure model is built
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_distilbert.py", line 603, in call
    outputs = self.distilbert(inputs, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_distilbert.py", line 493, in call
    embedding_output = self.embeddings(input_ids, inputs_embeds=inputs_embeds)  # (bs, seq_length, dim)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 709, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1966, in _maybe_build
    self.build(input_shapes)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_distilbert.py", line 112, in build
    "weight", shape=[self.vocab_size, self.dim], initializer=get_initializer(self.initializer_range)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 389, in add_weight
    aggregation=aggregation)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py", line 713, in _add_variable_with_custom_getter
    **kwargs_for_getter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 154, in make_variable
    shape=variable_shape if variable_shape else None)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 260, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 221, in _variable_v1_call
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 199, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 2502, in default_variable_creator
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 264, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 464, in __init__
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 608, in _init_from_args
    initial_value() if init_from_fn else initial_value,
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 134, in <lambda>
    init_val = lambda: initializer(shape, dtype=dtype)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops_v2.py", line 341, in __call__
    dtype = _assert_float_dtype(dtype)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops_v2.py", line 769, in _assert_float_dtype
    raise ValueError("Expected floating point type, got %s." % dtype)
ValueError: Expected floating point type, got <dtype: 'int32'>.

Source

AIshutin

All 6 comments

Hey @AIshutin,

I am not able to reproduce the error. It might be because PyTorch uses a GPU and TensorFlow does not.
Could you try running your code with the GPU disabled (export CUDA_VISIBLE_DEVICES="") and see whether the
error persists?

patrickvonplaten on 15 Jul 2020
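
The shell export suggested above can also be applied from inside the script. The following is a minimal sketch, assuming the variable is set before torch or tensorflow is imported, since both typically read it when they first initialise CUDA. The helper name hide_gpus is hypothetical and not part of any library.

```python
import os


def hide_gpus():
    """Hide all CUDA devices from frameworks imported after this call.

    hide_gpus is a hypothetical helper name used only for illustration.
    """
    # An empty value means "no visible devices" for CUDA-based libraries.
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


hide_gpus()

# Import torch / tensorflow / transformers only after this point, so that
# they see the modified environment when they initialise.
```

Setting the variable in the shell, as in the original suggestion, has the same effect and avoids any dependence on import order.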

Hi! I just tried it with another version of TensorFlow. With 2.2.0 it just works.

AIshutin on 15 Jul 2020

With 2.0.0-beta1 and CUDA_VISIBLE_DEVICES="" the error persists.

AIshutin on 15 Jul 2020
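
The two reports above suggest that the failure depends on the TensorFlow version rather than on GPU visibility. A script could therefore check the installed version before attempting the conversion. The sketch below is only an illustration: the 2.2.0 threshold is an assumption taken from the single working version reported in this thread, not an official requirement, and the names MIN_TF, parse_release and tf_version_ok are hypothetical.

```python
import re

# Assumption: 2.2.0 is the only version reported as working in this thread.
MIN_TF = (2, 2, 0)


def parse_release(version):
    """Split a version string such as '2.0.0-beta1' into numbers and a suffix flag."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(.*)", version)
    if match is None:
        raise ValueError("Unrecognised version string: %r" % version)
    numbers = tuple(int(part) for part in match.groups()[:3])
    # Any trailing text ('-beta1', '-rc0', ...) is treated as a pre-release marker.
    has_suffix = bool(match.group(4))
    return numbers, has_suffix


def tf_version_ok(version, minimum=MIN_TF):
    """Return True if `version` is at least `minimum`.

    A suffixed build of exactly the minimum version is rejected.
    """
    numbers, has_suffix = parse_release(version)
    if numbers > minimum:
        return True
    return numbers == minimum and not has_suffix


# Versions mentioned in this thread:
for candidate in ("2.0.0-beta1", "2.2.0"):
    print(candidate, tf_version_ok(candidate))
```

In a real script the argument would be tf.__version__ after importing tensorflow as tf, with an error raised before calling from_pretrained(..., from_pt=True) when the check fails.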

Interesting - thanks for checking!

Does it crash as well for bert-base-uncased and TF 2.0.0?

Could you run these lines to verify?

from transformers import TFAutoModel, AutoModel
import os

# Same steps as the original report, with BERT instead of DistilBERT
model = AutoModel.from_pretrained('bert-base-uncased')
os.makedirs('bert', exist_ok=True)
model.save_pretrained('bert')
model = TFAutoModel.from_pretrained('bert', from_pt=True) # does this crash as well?

@thomwolf @jplu - are we gonna force TF 2.2 in transformers?

patrickvonplaten on 15 Jul 2020

Can you try with the 2.0.0 release and not the beta? The beta was known to have a lot of issues, and a lot of fixes have been applied since.

@patrickvonplaten I did indeed propose pinning the TensorFlow version to 2.2 because of some welcome features it brings, but nothing has been decided yet.