Transformers: consider adding ALBERT?

Created on 29 Sep 2019 · 13 comments · Source: huggingface/transformers

🚀 Feature

Motivation

Additional context

wontfix

Most helpful comment

The official code and models got released 🙂
https://github.com/google-research/google-research/tree/master/albert

All 13 comments

Would definitely love to see an implementation of ALBERT added to this repository. Just for completeness, the paper is here: https://arxiv.org/abs/1909.11942

That said, it could be even more interesting to implement the core improvements (factorized embedding parameterization, cross-layer parameter sharing) from ALBERT in (some?/all?) other transformers as optional features?
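
For anyone curious what those two tricks would look like, here is a minimal PyTorch sketch (not ALBERT's actual implementation; the class names are made up for illustration, and the dimensions are just the base-config values from the paper):

```python
import torch
import torch.nn as nn

class FactorizedEmbedding(nn.Module):
    """Factorized embedding parameterization: V x E plus E x H parameters
    instead of V x H. With V=30000, E=128, H=768 that is roughly 3.9M
    parameters versus roughly 23M for a full V x H embedding table."""
    def __init__(self, vocab_size=30000, embedding_size=128, hidden_size=768):
        super().__init__()
        self.word_embeddings = nn.Embedding(vocab_size, embedding_size)
        self.projection = nn.Linear(embedding_size, hidden_size)

    def forward(self, input_ids):
        return self.projection(self.word_embeddings(input_ids))

class CrossLayerSharedEncoder(nn.Module):
    """Cross-layer parameter sharing: a single Transformer layer whose
    weights are reused on every one of the num_layers passes."""
    def __init__(self, hidden_size=768, num_heads=12, num_layers=12):
        super().__init__()
        self.shared_layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=num_heads)
        self.num_layers = num_layers

    def forward(self, hidden_states):
        for _ in range(self.num_layers):
            # Same module object each iteration, so parameters are shared.
            hidden_states = self.shared_layer(hidden_states)
        return hidden_states

# Quick smoke test; nn.TransformerEncoderLayer expects (seq_len, batch, dim).
embeddings = FactorizedEmbedding()
encoder = CrossLayerSharedEncoder()
input_ids = torch.randint(0, 30000, (16, 2))
hidden = encoder(embeddings(input_ids))  # -> (16, 2, 768)
```

Retrofitting other models would then mostly come down to swapping in the factorized embedding module and tying the encoder layers' weights.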

Knowing how fast the team works, I would expect ALBERT to be implemented quite soon. That being said, I haven't had time to read the ALBERT paper yet, so it might be more difficult than previous BERT iterations such as DistilBERT and RoBERTa.

I think ALBERT is very cool! Expect...

And in PyTorch (using code from this repo and weights from brightmart): https://github.com/lonePatient/albert_pytorch

Any update on the progress?

The ALBERT paper will be presented at ICLR in April 2020. From what I last heard, the Hugging Face team has been talking with the people over at Google AI to share the details of the model, but I can imagine that the researchers would rather wait until the paper has been presented. One reason is that they want citations to point at their ICLR publication rather than an arXiv preprint, which, in the field, is "worth less" than a big conference proceeding.

For now, just be patient. I am sure that the Hugging Face team will make a big announcement (follow their Twitter/LinkedIn channels) along with a new version bump. No need to keep bumping this topic.

The official code and models got released 🙂
https://github.com/google-research/google-research/tree/master/albert

[WIP]
ALBERT in TensorFlow 2.0
https://github.com/kamalkraj/ALBERT-TF2.0

https://github.com/lonePatient/albert_pytorch

Dataset: MNLI
Model: ALBERT_BASE_V2
Dev accuracy: 0.8418

Dataset: SST-2
Model: ALBERT_BASE_V2
Dev accuracy: 0.926

[WIP]
ALBERT in TensorFlow 2.0
https://github.com/kamalkraj/ALBERT-TF2.0

Version 2 weights added.
Support for SQuAD 1.1 and 2.0 added.
Reproduces the same results as the paper. From my experiments, the ALBERT model is very sensitive to hyperparameters like batch size. Fine-tuning uses AdamW as the default, as in the original repo. AdamW performs better than LAMB for model fine-tuning.
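
For reference, a minimal sketch of such a fine-tuning setup, assuming the Albert* classes and `albert-base-v2` checkpoint that later shipped in huggingface/transformers (recent versions) plus stock PyTorch; the learning rate, weight decay, and label mapping below are illustrative guesses, not the hyperparameters behind the numbers above:

```python
import torch
from transformers import AlbertTokenizer, AlbertForSequenceClassification

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
model = AlbertForSequenceClassification.from_pretrained(
    "albert-base-v2", num_labels=3)  # 3 classes for MNLI

# AdamW for fine-tuning, as discussed above; batch size and learning rate
# are exactly the knobs the model is sensitive to, so expect to sweep them.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

# One illustrative MNLI-style premise/hypothesis pair.
inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.",
                   return_tensors="pt")
labels = torch.tensor([0])  # hypothetical label id for "entailment"

model.train()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

LAMB, by contrast, was designed for very large-batch pretraining, which may be why AdamW comes out ahead at the much smaller batch sizes typical of fine-tuning.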

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
