Model I am using: RoBERTa (roberta-base)
Language I am using the model on: English
The problem arises when using: a conversion script based on https://github.com/huggingface/tflite-android-transformers/blob/master/models_generation/distilbert.py
The task I am working on: irrelevant at this step.
Conversion script
import tensorflow as tf
from transformers import TFRobertaForSequenceClassification

model = TFRobertaForSequenceClassification.from_pretrained('roberta-base')

# Fix the input signature to a single sequence of 384 token ids
input_spec = tf.TensorSpec([1, 384], tf.int32)
model._set_inputs(input_spec, training=False)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Allow Flex (TF) ops as a fallback for anything the TFLite builtins don't cover
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
# Hybrid quantization to shrink the model
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
converter.experimental_new_converter = True

tflite_model = converter.convert()
open("test.tflite", "wb").write(tflite_model)
Error: tf.Cumsum op is neither a custom op nor a flex op and needs a custom implementation
Expected behavior: no errors.
I believe this is because tf.Cumsum is not a supported operation, and not an issue relating to this repo. Here is a link to the TensorFlow documentation on supported ops: https://www.tensorflow.org/lite/guide/ops_compatibility
In the past, I've been able to get around unsupported ops by reimplementing the operator with supported ops or replacing the unsupported portion with another op, e.g. ReLU in place of GELU.
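To make that concrete, here is a minimal sketch (my own illustration, not code from the repo) of both styles of substitution: rebuilding GELU from TFLite-supported builtins, or crudely swapping in ReLU:

import tensorflow as tf

def gelu_tanh_approx(x):
    # GELU rebuilt from ops TFLite supports natively (mul, add, pow, tanh)
    return 0.5 * x * (1.0 + tf.tanh(0.7978845608 * (x + 0.044715 * tf.pow(x, 3.0))))

def gelu_as_relu(x):
    # Cruder substitution: ReLU in place of GELU, at some accuracy cost
    return tf.nn.relu(x)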
Hey @will-rice, thank you for giving me an idea of how to handle this issue. I managed to overcome this problem by using a custom _cumsum_ function implemented in pure Python by @ibab here: https://github.com/tensorflow/tensorflow/issues/813.
I just changed it to sum over rows instead of columns, to match the way it is done in the RoBERTa model.
Here is the cumsum function:
def cumsum(xs):
    # Unstack along the sequence axis and accumulate step by step,
    # avoiding tf.math.cumsum, which the TFLite converter cannot handle
    values = tf.unstack(xs, axis=1)
    out = []
    prev = tf.zeros_like(values[0])
    for val in values:
        s = prev + val
        out.append(s)
        prev = s
    return tf.stack(out, axis=1)
and it is used in the _modeling_tf_roberta.py_ file on line 69:
# Original code / non-TFLite-compatible way
incremental_indicies = tf.math.cumsum(mask, axis=1) * mask
# My custom code / TFLite-compatible way
incremental_indicies = cumsum(mask) * mask
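As a quick sanity check (my own sketch), the unrolled version matches tf.math.cumsum on a toy attention mask:

mask = tf.constant([[1, 1, 1, 0], [1, 1, 0, 0]], dtype=tf.int32)
print(cumsum(mask).numpy())                  # [[1 2 3 3] [1 2 2 2]]
print(tf.math.cumsum(mask, axis=1).numpy())  # same values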
Hope it will help someone else as well!
Also cc'ing @Pierrci
@julien-c any updates on this feature? I was browsing through the latest releases but could not find any reference.
Thanks!
@dshahrokhian As mentioned by @will-rice, the issue is due to TFLite's lack of support for the tf.Cumsum operator, and is thus not related to transformers. If you encounter the same problem you can apply the workaround posted by @kubux1 earlier, or implement a similar one if you're hitting this with a different operator.
@Pierrci thanks! It also seems to have been solved in the latest release of tf-nightly: https://github.com/tensorflow/tensorflow/issues/42382#issuecomment-675000451
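For anyone verifying the result once conversion succeeds, here is a minimal sketch (assuming the script above produced test.tflite with a [1, 384] int32 input) of running a dummy input through the TFLite interpreter:

import numpy as np
import tensorflow as tf

# Load the converted model and run one dummy sequence through it
interpreter = tf.lite.Interpreter(model_path="test.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

dummy_ids = np.zeros((1, 384), dtype=np.int32)
interpreter.set_tensor(input_details[0]["index"], dummy_ids)
interpreter.invoke()
logits = interpreter.get_tensor(output_details[0]["index"])
print(logits.shape)  # e.g. (1, 2) for a two-class head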