Transformers: State of ONNX

Created on 15 Oct 2020 · 6 comments · Source: huggingface/transformers

Hi, love the work that's going on with ONNX. I wanted to share the current state of ONNX support in case others were wondering about it (sorry for another table).

Pipeline | Supported
--- | ---
feature-extraction | ✓
sentiment-analysis | ✓
ner | ✓
question-answering | ✓
fill-mask | ✓
text-generation | ✓
translation | ✗ (broken, see https://github.com/huggingface/transformers/issues/5948#issuecomment-701699251)
summarization | ✗
zero-shot-classification | ✗
conversational | ✗
text2text-generation | ✗

I was able to export models for both summarization and zero-shot-classification, but both fail at inference time unless the token length is fixed at export time, due to a reshape baked into the ONNX model (code in https://github.com/huggingface/transformers/issues/7404#issuecomment-703966076). If you have any ideas for how to prevent this, I'm happy to try and put together a PR.


A note for Mac Catalina users: exporting models may fail with:

```
[libprotobuf ERROR google/protobuf/descriptor_database.cc:394] Invalid file descriptor data passed to EncodedDescriptorDatabase::Add().
[libprotobuf FATAL google/protobuf/descriptor.cc:1356] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
libc++abi.dylib: terminating with uncaught exception of type google::protobuf::FatalException: CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
```

Use an older version of protobuf to avoid this (https://github.com/onnx/onnx/issues/2940#issuecomment-669979419):

```
pip3 uninstall protobuf
pip3 install protobuf==3.11.3
```

All 6 comments

Hey, I tried exporting T5 for summarization but got the error below:

```
You have to specify either decoder_input_ids or decoder_inputs_embeds
```

I get a similar error for translation pipeline as well. Any workarounds available for this?
@patrickvonplaten @sshleifer

Hey @amanpreet692 - you need to provide both input_ids and decoder_input_ids for EncoderDecoderModels.

Hey @patrickvonplaten, yep, I get that, but the current implementation doesn't let us pass sample inputs for the ONNX export; the sample tokens are generated directly inside the PyTorch onnx.export code, I think, and they are consumed by the encoder, so the decoder inputs end up empty.
I used https://github.com/huggingface/transformers/blob/master/notebooks/04-onnx-export.ipynb from @mfuntowicz as a reference, with the additional parameter of 'translation' as the pipeline.
Please let me know if there is an immediate solution, else I'll look into this next week :)
Thanks!

Hey @amanpreet692 - sorry this is not really my area of expertise here... @mfuntowicz - could you take a look?

Hey @amanpreet692, were you able to resolve this error while exporting T5: `You have to specify either decoder_input_ids or decoder_inputs_embeds`?

Was this ever resolved @amanpreet692 @dharm033075 @mfuntowicz? I am having the same issue trying to export t5.
