New Transformer-based model: ELECTRA
Hi guys, did you see the following paper: https://openreview.net/forum?id=r1xMH1BtvB ? There is a new Transformer-based model called ELECTRA that seems very interesting and promising. It would be very useful to have an implementation of the model in PyTorch.
Hi @josecannete
Thanks for the tip! We are busy building other awesome things at the moment, but feel free to start a PR with a first draft and we will be happy to have a look at it 😄
And note that it's probably better to wait for the author's original code and pretrained weights.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Any news? Do you know if the code was released?
waiting...
The original code is available from here: https://github.com/google-research/electra
Anyone wants to give it a go? We can help!
We're on it! :hugs:
@LysandreJik I can help you with evaluating the model on downstream tasks to compare it with the original implementation - I'm currently training an ELECTRA model on GPU, so I'm highly interested in using it with Transformers 😅
@LysandreJik If it helps, I believe ELECTRA weights are drop-in replacements into the BERT codebase except we do not use a pooler layer and just take the final [CLS] hidden state for sentence representations.
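To illustrate the difference, here is a minimal NumPy sketch (with made-up dimensions, not the real model sizes) contrasting BERT's pooler with just taking the final [CLS] hidden state:

```python
import numpy as np

# Hypothetical dimensions for illustration only.
batch, seq_len, hidden = 2, 8, 4
rng = np.random.default_rng(0)

# Final-layer hidden states from the encoder: (batch, seq_len, hidden).
hidden_states = rng.standard_normal((batch, seq_len, hidden))

# BERT-style pooler: dense layer + tanh applied to the [CLS] position
# (index 0 of the sequence).
W = rng.standard_normal((hidden, hidden))
b = np.zeros(hidden)
bert_pooled = np.tanh(hidden_states[:, 0, :] @ W + b)

# ELECTRA-style sentence representation: the final [CLS] hidden state,
# with no pooler layer on top.
electra_sentence_repr = hidden_states[:, 0, :]

print(bert_pooled.shape)            # (2, 4)
print(electra_sentence_repr.shape)  # (2, 4)
```

So when porting the weights, the pooler parameters simply have no ELECTRA counterpart; everything else should map one-to-one.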
waiting...+10086
Since v2.8.0 ELECTRA is in the library :)
@LysandreJik Is pretraining ELECTRA from scratch supported now, using the default run_language_modeling.py script?