Transformers: Add support for Funnel-Transformer

Created on 8 Jun 2020 · 3 comments · Source: huggingface/transformers

🌟 New model addition

Model description

The recently introduced Funnel-Transformer architecture and models would be a great feature for Transformers:

Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduces the computation cost. More importantly, by re-investing the saved FLOPs from length reduction in constructing a deeper or wider model, Funnel-Transformer usually has a higher capacity given the same FLOPs. In addition, with a decoder, Funnel-Transformer is able to recover the token-level deep representation for each token from the reduced hidden sequence, which enables standard pretraining.

The paper can be found here: https://arxiv.org/abs/2006.03236.
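To make the compression idea concrete, here is a minimal, simplified sketch in PyTorch. It only illustrates pooling the hidden-state sequence between self-attention layers; the actual architecture pools the attention query rather than the full hidden states and pairs the encoder with an up-sampling decoder. `PooledBlock` and all dimensions here are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class PooledBlock(nn.Module):
    """One encoder block that halves the sequence length before self-attention.
    Hypothetical sketch of the pooling idea, not the paper's exact scheme."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Mean-pool adjacent positions: (batch, seq, d_model) -> (batch, seq // 2, d_model).
        pooled = nn.functional.avg_pool1d(
            hidden.transpose(1, 2), kernel_size=2, stride=2
        ).transpose(1, 2)
        # Deeper layers now attend over a shorter sequence, saving FLOPs.
        return self.layer(pooled)

hidden = torch.randn(1, 128, 256)           # (batch, seq_len, d_model)
block = PooledBlock(d_model=256, n_heads=4)
print(block(hidden).shape)                  # torch.Size([1, 64, 256])
```

The decoder then up-samples the compressed sequence back to full length, which is what allows token-level pretraining objectives such as MLM to still apply.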

Open source status

New model


All 3 comments

Will start to look into this.

@sgugger Any updates on this? Thanks!

The first models are uploaded, and the base models are available in PyTorch in this branch (FunnelModel has the encoder + decoder; FunnelBaseModel has just the encoder, for sequence classification and multiple choice). All checkpoints should be on the HuggingFace S3, and all PyTorch models on the same branch, by the end of this week.

Note that there might be some changes in the names as this goes under review once it's ready.
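For reference, the encoder/decoder split described above would presumably be used along these lines. This is a sketch only: given the note above about possible renames, the class names `FunnelModel`, `FunnelBaseModel`, and `FunnelTokenizer`, and the checkpoint identifiers `funnel-transformer/small` and `funnel-transformer/small-base`, are assumptions, not final:

```python
from transformers import FunnelBaseModel, FunnelModel, FunnelTokenizer

# Hypothetical usage sketch; class and checkpoint names may change during review.
tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
inputs = tokenizer("Hello world", return_tensors="pt")

# Encoder + decoder: full-length, token-level hidden states.
model = FunnelModel.from_pretrained("funnel-transformer/small")
token_states = model(**inputs).last_hidden_state

# Encoder only: compressed hidden states, intended for sequence
# classification and multiple-choice heads.
base_model = FunnelBaseModel.from_pretrained("funnel-transformer/small-base")
compressed_states = base_model(**inputs).last_hidden_state
```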
