New Transformer-based model: ELECTRA
Hi guys, did you see the following paper: https://openreview.net/forum?id=r1xMH1BtvB ? There is a new Transformer-based model called ELECTRA that seems very interesting and promising. It would be very useful to have an implementation of the model in PyTorch.
Hi @josecannete
Thanks for the tip! We are busy building other awesome things at the moment, but feel free to start a PR with a first draft and we will be happy to have a look at it 😄
And note that it's probably better to wait for the author's original code and pretrained weights.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Any news? Do you know if the code was released?
waiting...
The original code is available from here: https://github.com/google-research/electra
Anyone wants to give it a go? We can help!
We're on it! :hugs:
@LysandreJik I can help you with evaluating the model on downstream tasks to compare it with the original implementation - I'm currently training an ELECTRA model on GPU, so I'm highly interested in using it with Transformers 😅
@LysandreJik If it helps, I believe ELECTRA weights are drop-in replacements into the BERT codebase except we do not use a pooler layer and just take the final [CLS] hidden state for sentence representations.
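To illustrate the difference, here is a minimal NumPy sketch (with made-up dimensions, not the real model sizes) contrasting BERT's pooler with just taking the final [CLS] hidden state:

```python
import numpy as np

# Hypothetical dimensions for illustration only.
batch, seq_len, hidden = 2, 8, 4
rng = np.random.default_rng(0)

# Final-layer hidden states from the encoder: (batch, seq_len, hidden).
hidden_states = rng.standard_normal((batch, seq_len, hidden))

# BERT-style pooler: dense layer + tanh applied to the [CLS] position
# (index 0 of the sequence).
W = rng.standard_normal((hidden, hidden))
b = np.zeros(hidden)
bert_pooled = np.tanh(hidden_states[:, 0, :] @ W + b)

# ELECTRA-style sentence representation: the final [CLS] hidden state,
# with no pooler layer on top.
electra_sentence_repr = hidden_states[:, 0, :]

print(bert_pooled.shape)            # (2, 4)
print(electra_sentence_repr.shape)  # (2, 4)
```

So when porting the weights, the pooler parameters simply have no ELECTRA counterpart; everything else should map one-to-one.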
waiting...+10086
Since v2.8.0 ELECTRA is in the library :)
@LysandreJik Is pretraining ELECTRA from scratch supported now, using the default run_language_modeling.py script?