Model I am using (Bert, XLNet ...): DistilBERT
Language I am using the model on (English, Chinese ...): English
To reproduce:
from transformers import TFAutoModel, AutoTokenizer, AutoModel
import os
model = AutoModel.from_pretrained('distilbert-base-uncased')
os.makedirs('distilbert', exist_ok=True)  # portable, and safe to re-run
model.save_pretrained('distilbert')
model = TFAutoModel.from_pretrained('distilbert', from_pt=True) # crashes
Expected behavior: the model is converted from PyTorch to TensorFlow.
transformers version: 3.0.2
Traceback (most recent call last):
File "pt2tf.py", line 8, in <module>
model = TFAutoModel.from_pretrained('distilbert', from_pt=True)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py", line 423, in from_pretrained
return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_utils.py", line 482, in from_pretrained
return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_pytorch_utils.py", line 93, in load_pytorch_checkpoint_in_tf2_model
tf_model, pt_state_dict, tf_inputs=tf_inputs, allow_missing_keys=allow_missing_keys
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_pytorch_utils.py", line 125, in load_pytorch_weights_in_tf2_model
tf_model(tf_inputs, training=False) # Make sure model is built
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_distilbert.py", line 603, in call
outputs = self.distilbert(inputs, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_distilbert.py", line 493, in call
embedding_output = self.embeddings(input_ids, inputs_embeds=inputs_embeds) # (bs, seq_length, dim)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 709, in __call__
self._maybe_build(inputs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1966, in _maybe_build
self.build(input_shapes)
File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_distilbert.py", line 112, in build
"weight", shape=[self.vocab_size, self.dim], initializer=get_initializer(self.initializer_range)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 389, in add_weight
aggregation=aggregation)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py", line 713, in _add_variable_with_custom_getter
**kwargs_for_getter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 154, in make_variable
shape=variable_shape if variable_shape else None)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 260, in __call__
return cls._variable_v1_call(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 221, in _variable_v1_call
shape=shape)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 199, in <lambda>
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 2502, in default_variable_creator
shape=shape)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 264, in __call__
return super(VariableMetaclass, cls).__call__(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 464, in __init__
shape=shape)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 608, in _init_from_args
initial_value() if init_from_fn else initial_value,
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 134, in <lambda>
init_val = lambda: initializer(shape, dtype=dtype)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops_v2.py", line 341, in __call__
dtype = _assert_float_dtype(dtype)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops_v2.py", line 769, in _assert_float_dtype
raise ValueError("Expected floating point type, got %s." % dtype)
ValueError: Expected floating point type, got <dtype: 'int32'>.
Hey @Alshutin,
I am not able to reproduce the error. It might be because PyTorch uses a GPU and TensorFlow does not.
Could you try running your code with the GPU disabled (export CUDA_VISIBLE_DEVICES="") and see whether the
error persists?
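For anyone who prefers to do this from inside the script rather than the shell, a minimal sketch is below. It assumes the variable is set before torch or tensorflow is imported, since a framework that has already initialized CUDA will not pick up the change.

```python
import os

# Hide all CUDA devices from this process. This must run before
# torch or tensorflow is imported; otherwise they may already have seen the GPUs.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Confirm the value the frameworks will see when they are imported later.
print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))
```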
Hi! I just tried it with another version of TensorFlow. With 2.2.0 it just works.
With 2.0.0-beta1 and CUDA_VISIBLE_DEVICES="" the error persists.
Interesting - thanks for checking!
Does it crash as well for bert-base-uncased and TF 2.0.0?
Could you run these lines to verify?
from transformers import TFAutoModel, AutoTokenizer, AutoModel
import os
model = AutoModel.from_pretrained('bert-base-uncased')
os.makedirs('bert', exist_ok=True)
model.save_pretrained('bert')
model = TFAutoModel.from_pretrained('bert', from_pt=True) # does this crash as well?
@thomwolf @jplu - are we going to require TF 2.2 in transformers?
Can you try with the 2.0.0 release and not the beta? The beta was known to have a lot of issues, and a lot of fixes have been applied since.
@patrickvonplaten I did indeed propose pinning the TensorFlow version to 2.2, because of some welcome features in it. But nothing has been decided yet.
It works with the stable TensorFlow 2.0.0 release.
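Since the failure in this thread only showed up on a pre-release build (2.0.0-beta1) and not on 2.0.0 or 2.2.0, a small guard before the conversion can save some debugging. The sketch below is only an illustration: `parse_version` and `is_prerelease` are hypothetical helpers, not part of transformers or TensorFlow, and the suffix list is an assumption about how pre-release builds are usually tagged.

```python
import re


def parse_version(version):
    """Split a string such as '2.0.0-beta1' into ((2, 0, 0), 'beta1')."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)[-.]?(.*)$", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch, suffix = match.groups()
    return (int(major), int(minor), int(patch)), suffix


def is_prerelease(version):
    """Return True if the version carries an alpha/beta/rc/dev style suffix.

    The list of suffixes is an assumption, not an official TensorFlow rule.
    """
    _, suffix = parse_version(version)
    return re.match(r"^(a|b|rc|alpha|beta|dev)", suffix) is not None


# The three versions reported in this thread.
for reported in ("2.0.0-beta1", "2.0.0", "2.2.0"):
    print(reported, "pre-release" if is_prerelease(reported) else "stable")
```

In a real script, the string to check would come from `tf.__version__` after `import tensorflow as tf`, and the check would run before calling `TFAutoModel.from_pretrained(..., from_pt=True)`.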