Flair: Add Albert embeddings?

Created on 12 Oct 2019 · 10 Comments · Source: flairNLP/flair

A recent paper (https://arxiv.org/pdf/1909.11942.pdf) proposes ALBERT, a smaller BERT-like model that is reported to reach SOTA results. Wouldn't it be nice to include this model in Flair?

All 10 comments

Hi @krzysztoffsiuwa,

please have a look at this issue: https://github.com/huggingface/transformers/issues/1370

Once the ALBERT model is included in transformers, we can support it in Flair (I'll add an embedding layer as soon as it is fully supported).

I'm also very excited about the NER results for ALBERT.

Thanks for the very quick response. Great, so it's probably just a matter of time now.

Hello, huggingface/transformers has been updated to v2.2.0. It supports ALBERT. Thanks!

So now we are only waiting for Flair support.

I'm currently running some NER experiments with ALBERT and Flair. I'll post some results today and add the implementation :)

With the xxlarge (v1, uncased) ALBERT model I could achieve 92.03% on CoNLL-2003 for NER 😄

So it outperforms all other "classic" BERT models, see a comparison here.

PR is coming soon!

@stefan-it So I guess loading an ALBERT model with the BertEmbeddings class is wrong? It doesn't raise an error.
I can test your PR ;)

ALBERT for token classification was recently added to the Transformers library.

I'll look into it again, maybe it will be ready by the end of this week :)

ALBERT embeddings were already added with https://github.com/flairNLP/flair/pull/1333, so I'm closing here.

You just need to pass the ALBERT model name to the BertEmbeddings instance:

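from flair.data import Sentence
from flair.embeddings import BertEmbeddings

# Passing an ALBERT model name loads ALBERT through the BertEmbeddings class
embeddings = BertEmbeddings(bert_model_or_path="albert-base-v2")

# Embed a pre-tokenized example sentence
sent = Sentence("Berlin and Munich are nice cities .")
embeddings.embed(sent)

# Print the embedding vector of each token
for token in sent.tokens:
    print(token.embedding)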

Thanks. I had a problem with the model type being recognized from the path/name. The solution was to rename the model directory so that its name contains "albert".
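
The rename fix above is consistent with the model type being inferred from substrings in the model name or path. A minimal sketch of that kind of check follows; the helper `guess_model_family` is hypothetical and written only for illustration, it is not part of Flair's or Transformers' API, and the exact rules Flair applies may differ (for instance, the lowercasing here is an assumption).

```python
def guess_model_family(name_or_path: str) -> str:
    """Hypothetical helper: guess the model family from a name or directory path."""
    lowered = name_or_path.lower()  # assumption: case-insensitive matching
    # Check the more specific names first, since both "distilbert" and
    # "albert" also contain the substring "bert".
    if "distilbert" in lowered:
        return "distilbert"
    if "albert" in lowered:
        return "albert"
    # Anything else falls back to plain BERT, silently and without an error.
    return "bert"


# A hub-style model name is recognized directly.
print(guess_model_family("albert-base-v2"))
# A local directory without "albert" in its path falls back to "bert".
print(guess_model_family("/models/my-finetuned-model"))
# Renaming the directory so it contains "albert" changes the result.
print(guess_model_family("/models/albert-my-finetuned-model"))
```

Under this assumption, a local ALBERT checkpoint stored in a directory whose path lacks "albert" would be treated as a plain BERT model, which matches the report that the directory had to be renamed.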
