Flair: Add ELMo embeddings

Created on 28 Nov 2018 · 4 comments · Source: flairNLP/flair

Just like #251 for BERT embeddings, we want to add ELMo embeddings to the Flair framework.

This will make it easier to compare Flair, BERT and ELMo embeddings against each other in various NLP tasks, and even combine these embeddings using the StackedEmbeddings class.
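Conceptually, StackedEmbeddings combines models by concatenating each model's vector for a token into one longer vector. A minimal, framework-free sketch of that idea (the names and dimensions below are illustrative, not Flair's API):

```python
def stack_token_embeddings(vectors):
    """Concatenate per-model embedding vectors for one token into a single vector."""
    stacked = []
    for vec in vectors:
        stacked.extend(vec)
    return stacked

# Pretend per-token outputs: a 4-dim Flair vector and a 3-dim ELMo vector.
flair_vec = [0.1, 0.2, 0.3, 0.4]
elmo_vec = [0.5, 0.6, 0.7]

token_vec = stack_token_embeddings([flair_vec, elmo_vec])
print(len(token_vec))  # 4 + 3 = 7: downstream models see one combined vector
```

A downstream tagger then only ever sees the single concatenated vector per token, which is what makes mixing Flair, BERT and ELMo embeddings straightforward.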

This will also make it easier to train and deploy models that use one or a combination of these embeddings.

Labels: feature, release-0.4

All 4 comments

Relates to #173

Added to release-0.4 branch. Will do some testing of Flair+ELMo combinations.

The new ELMoEmbeddings class can be used like any other embeddings class in the Flair framework. For example:

from flair.data import Sentence
from flair.embeddings import ELMoEmbeddings

# instantiate ELMo embeddings
elmo_embeddings = ELMoEmbeddings()

# make example sentence
sentence = Sentence('I love Berlin.', use_tokenizer=True)

# embed sentence
elmo_embeddings.embed(sentence)

# print embedded tokens
for token in sentence:
    print(token)
    print(token.embedding)

We package several pre-trained ELMo models. The large model from the ELMo paper is used by default; you can also instantiate the smaller models:

# small ELMo embeddings
elmo_embeddings_small = ELMoEmbeddings('small')

# medium ELMo embeddings
elmo_embeddings_medium = ELMoEmbeddings('medium')

We also include the Portuguese model:

# Portuguese ELMo embeddings
elmo_embeddings_portuguese = ELMoEmbeddings('portuguese')

HOWEVER, to use these embeddings you need to `pip install allennlp` alongside flair. We are not including allennlp as a default dependency, since the library is large and would pull in many dependencies that we don't otherwise need.
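A common way to keep a heavyweight package like allennlp optional is a lazy import that fails with a clear install hint. The helper below is an illustrative sketch of that pattern, not Flair's actual code:

```python
import importlib


def optional_import(module_name, install_hint):
    """Import a module on demand, raising a helpful error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for this feature; {install_hint}"
        ) from err


# A class like ELMoEmbeddings could then defer the import until construction:
# allennlp = optional_import('allennlp', 'run: pip install allennlp')
```

This way `import flair` stays cheap, and users only pay the dependency cost if they actually instantiate ELMoEmbeddings.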

... Enthusiastically waiting for the testing results ...

We are going to share any testing results here: https://github.com/zalandoresearch/flair/issues/308. Feel free to also share your results in that thread. Thanks!

