Flair: Add ELMo embeddings

Created on 28 Nov 2018 · 4 comments · Source: flairNLP/flair

Just like #251 for BERT embeddings, we want to add ELMo embeddings to the Flair framework.

This will make it easier to compare Flair, BERT and ELMo embeddings against each other in various NLP tasks, and even combine these embeddings using the StackedEmbeddings class.
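Conceptually, StackedEmbeddings combines models by concatenating each model's vector for a token into one longer vector. A minimal, framework-free sketch of that idea (the names and dimensions below are illustrative, not Flair's API):

```python
def stack_token_embeddings(vectors):
    """Concatenate per-model embedding vectors for one token into a single vector."""
    stacked = []
    for vec in vectors:
        stacked.extend(vec)
    return stacked

# Pretend per-token outputs: a 4-dim Flair vector and a 3-dim ELMo vector.
flair_vec = [0.1, 0.2, 0.3, 0.4]
elmo_vec = [0.5, 0.6, 0.7]

token_vec = stack_token_embeddings([flair_vec, elmo_vec])
print(len(token_vec))  # 4 + 3 = 7: downstream models see one combined vector
```

A downstream tagger then only ever sees the single concatenated vector per token, which is what makes mixing Flair, BERT and ELMo embeddings straightforward.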

This will also make it easier to train and deploy models that use one or a combination of these embeddings.

Labels: feature, release-0.4

All 4 comments

Relates to #173

Added to release-0.4 branch. Will do some testing of Flair+ELMo combinations.

The new ELMoEmbeddings class can be used like any other embeddings class in the Flair framework. For example:

from flair.data import Sentence
from flair.embeddings import ELMoEmbeddings

# instantiate ELMo embeddings
elmo_embeddings = ELMoEmbeddings()

# make example sentence
sentence = Sentence('I love Berlin.', use_tokenizer=True)

# embed sentence
elmo_embeddings.embed(sentence)

# print embedded tokens
for token in sentence:
    print(token)
    print(token.embedding)

We package several pre-trained ELMo models. The large model from the ELMo paper is used by default; you can also instantiate the smaller models:

# small ELMo embeddings
elmo_embeddings_small = ELMoEmbeddings('small')

# medium ELMo embeddings
elmo_embeddings_medium = ELMoEmbeddings('medium')

We also include the Portuguese model:

# Portuguese ELMo embeddings
elmo_embeddings_portuguese = ELMoEmbeddings('portuguese')

HOWEVER, to use these embeddings you need to `pip install allennlp` alongside flair. We are not including allennlp as a default dependency, since the library is large and would pull in many dependencies that we don't otherwise need.
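A common way to keep a heavyweight package like allennlp optional is a lazy import that fails with a clear install hint. The helper below is an illustrative sketch of that pattern, not Flair's actual code:

```python
import importlib


def optional_import(module_name, install_hint):
    """Import a module on demand, raising a helpful error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for this feature; {install_hint}"
        ) from err


# A class like ELMoEmbeddings could then defer the import until construction:
# allennlp = optional_import('allennlp', 'run: pip install allennlp')
```

This way `import flair` stays cheap, and users only pay the dependency cost if they actually instantiate ELMoEmbeddings.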

... Enthusiastically waiting for the testing results ...

We are going to share any testing results here: https://github.com/zalandoresearch/flair/issues/308. Feel free to also share your results in that thread. Thanks!

