Flair: Add Albert embeddings?

Created on 12 Oct 2019 · 10 Comments · Source: flairNLP/flair

A recent paper (https://arxiv.org/pdf/1909.11942.pdf) proposes a small BERT-like model that is said to be SOTA. Wouldn't it be nice to include this model in Flair?

All 10 comments

Hi @krzysztoffsiuwa ,

please have a look at this issue: https://github.com/huggingface/transformers/issues/1370

Once the ALBERT model is included in transformers, we can support it in Flair (I'll add an embedding layer as soon as it is fully supported).

I'm also very excited about the NER results for ALBERT.

Thanks for the very quick response. Great, so it's probably just a matter of time now.

Hello, huggingface/transformers has been updated to v2.2.0. It supports ALBERT. Thanks!

So now we are only waiting for Flair support.

I'm currently doing some experiments for NER with ALBERT and Flair. I'll post some results today + add the implementation :)

With the xxlarge (v1, uncased) ALBERT model I could achieve 92.03% on CoNLL-2003 for NER 😄

So it outperforms all other "classic" BERT models, see a comparison here.

PR is coming soon!

@stefan-it So, I guess loading an ALBERT model with the BertEmbeddings class is wrong? There is no error.
I can test your PR ;)

ALBERT for token classification has recently been added to the Transformers library.

I'll look into it again, maybe it will be ready by the end of this week :)

ALBERT embeddings were already added with https://github.com/flairNLP/flair/pull/1333, so I'm closing here.

You just need to pass the ALBERT model name to the BertEmbeddings instance:

from flair.data import Sentence
from flair.embeddings import BertEmbeddings

# Pass the ALBERT model name to BertEmbeddings
embeddings = BertEmbeddings(bert_model_or_path="albert-base-v2")

# Embed an example sentence in place
sent = Sentence("Berlin and Munich are nice cities .")
embeddings.embed(sent)

# Each token now carries its embedding vector
for token in sent.tokens:
    print(token.embedding)

Thanks. I had a problem with the model type being recognized from the path/name. The solution was to rename the model directory so that it contains "albert".
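
The fix described in the comment above suggests that the model type is picked from the name or path string. A minimal sketch of that kind of check, assuming plain substring matching; guess_model_family is a hypothetical helper written for illustration and is not part of Flair or Transformers:

```python
# Hypothetical helper (an assumption, not Flair's actual code): guess the
# model family from a substring of the model name or directory path.
def guess_model_family(model_name_or_path: str) -> str:
    name = str(model_name_or_path).lower()
    # Check the more specific names first, because both "albert" and
    # "distilbert" also contain the substring "bert".
    for family in ("distilbert", "albert"):
        if family in name:
            return family
    # Fall back to plain BERT when no other family name is found.
    return "bert"


if __name__ == "__main__":
    # A hub-style model name that contains "albert".
    print(guess_model_family("albert-base-v2"))
    # A local directory without "albert" in its name falls back to "bert".
    print(guess_model_family("/models/my-finetuned"))
    # Renaming the directory so that it contains "albert" changes the result.
    print(guess_model_family("/models/albert-finetuned"))
```

Under this assumption, a locally saved ALBERT checkpoint in a directory without "albert" in its name would be treated as a plain BERT model, which would match the behaviour reported above.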
