Transformers: TF BERT not FP16 compatible?

Created on 18 Mar 2020 · 10 Comments · Source: huggingface/transformers

🐛 Bug

Information

Model I am using (Bert, XLNet ...): TFBertForQuestionAnswering

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

  • [x] my own modified scripts:

The task I am working on is:

  • [x] an official GLUE/SQuAD task: SQuAD

To reproduce

Simple example to reproduce the error:

```python3
import tensorflow as tf
from transformers import TFBertForQuestionAnswering

# turn on mixed precision (fp16 operations)
tf.keras.mixed_precision.experimental.set_policy('mixed_float16')

model = TFBertForQuestionAnswering.from_pretrained('bert-base-uncased')
```

The error occurs here:

```
transformers/modeling_tf_bert.py", line 174, in _embedding
    embeddings = inputs_embeds + position_embeddings + token_type_embeddings
```

And this is the error:

```
tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute AddV2 as input #1(zero-based) was expected to be a half tensor but is a float tensor [Op:AddV2] name: tf_bert_for_question_answering/bert/embeddings/add/
```
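
For what it's worth, the same `AddV2` error can be reproduced with two plain tensors of mismatched dtypes, so my reading of the traceback (an assumption, not a confirmed root cause) is that one of the three embedding terms is still produced in float32 under the mixed policy:

```python3
import tensorflow as tf

# minimal sketch of the same dtype clash, outside of transformers:
# adding a half tensor to a float tensor raises the identical AddV2 error
a = tf.ones((2, 3), dtype=tf.float16)  # stands in for inputs_embeds under mixed_float16
b = tf.ones((2, 3), dtype=tf.float32)  # stands in for an embedding term left in float32
a + b  # InvalidArgumentError: ... was expected to be a half tensor but is a float tensor
```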

Expected behavior

I want to use TF BERT with mixed precision (for faster inference on Tensor Core GPUs). I know that full fp16 does not work out of the box, because the model weights would need to be in fp16 as well. Mixed precision, however, should work, because only the operations are performed in fp16 while the weights stay in fp32 (illustrated in the sketch below).

Instead I get a dtype error. It seems the model is not fp16 compatible yet? Will this be fixed in the future?
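
For reference, a minimal sketch (plain Keras, no transformers involved, using the same experimental API as above) of what I expect `mixed_float16` to do:

```python3
import tensorflow as tf
from tensorflow.keras.mixed_precision import experimental as mixed_precision

# enable mixed precision: operations run in fp16, variables are kept in fp32
mixed_precision.set_policy('mixed_float16')

policy = mixed_precision.global_policy()
print(policy.compute_dtype)   # float16 -> math is done in half precision
print(policy.variable_dtype)  # float32 -> weights stay in full precision

# a plain Keras layer picks the policy up automatically
layer = tf.keras.layers.Dense(8)
y = layer(tf.random.uniform((2, 4)))
print(y.dtype)                # float16
```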

Environment info

  • transformers version: 2.5.0
  • Platform: Ubuntu 16.04
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.4.0 (GPU)
  • Tensorflow version (GPU?): 2.1.0 (GPU)
  • Using GPU in script?: sort of
  • Using distributed or parallel set-up in script?: nope

All 10 comments

I've faced the same issue. Maybe the data type is hard-coded somewhere? Have you found a solution?

Tried this on Colab TPU, same error.

Same here, would be convenient as hell :)

Having the same error with transformers version 2.11.0 as well.
Here is some code to easily reproduce the error:

```python3
#!/usr/bin/env python3
from transformers import TFBertModel, BertTokenizer
from tensorflow.keras.mixed_precision import experimental as mixed_precision

policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_policy(policy)

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertModel.from_pretrained("bert-base-uncased")
input_ids = tok("The dog is cute", return_tensors="tf").input_ids
model(input_ids)  # throws error on GPU
```

Encountering the same issue here:
```python3
import tensorflow as tf
from transformers.modeling_tf_distilbert import TFDistilBertModel

tf.keras.mixed_precision.experimental.set_policy('mixed_float16')
model = TFDistilBertModel.from_pretrained('distilbert-base-uncased')
```

Put this issue on my TF ToDo-List :-)

+1

Hi @patrickvonplaten, is this problem fixed?
I got the same error recently with version 3.0.2

This is still an open problem... I haven't found the time yet to take a look! I will link this issue to the TF projects.

This is already solved in the new version:

```python3
position_embeddings = tf.cast(self.position_embeddings(position_ids), inputs_embeds.dtype)
token_type_embeddings = tf.cast(self.token_type_embeddings(token_type_ids), inputs_embeds.dtype)
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
```
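
A quick way to check whether your installed version includes the cast above, as a sketch: it assumes TF >= 2.4, where the policy API lost the `experimental` prefix, and the exact output object depends on your transformers version.

```python3
import tensorflow as tf
from transformers import BertTokenizer, TFBertModel

# newer TF releases expose the policy API without the "experimental" namespace
tf.keras.mixed_precision.set_global_policy('mixed_float16')

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertModel.from_pretrained("bert-base-uncased")

inputs = tok("The dog is cute", return_tensors="tf")
outputs = model(**inputs)

# with the fix in place this should run without the AddV2 error
# and the hidden states should come back in half precision
print(outputs[0].dtype)  # expected: float16
```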

