This is the code that I am using:

from flair.data import Sentence
from flair.embeddings import CharacterEmbeddings

word_string_1 = "Example"
word_string_2 = "Example"

# two separately created CharacterEmbeddings instances
c_embeddings_1 = CharacterEmbeddings()
c_embeddings_2 = CharacterEmbeddings()

sentence_embed_it_1 = Sentence(word_string_1)
sentence_embed_it_2 = Sentence(word_string_2)

c_embeddings_1.embed(sentence_embed_it_1)
c_embeddings_2.embed(sentence_embed_it_2)

for token in sentence_embed_it_1:
    emb_1 = token.embedding.data
for token in sentence_embed_it_2:
    emb_2 = token.embedding.data
emb_1 and emb_2 end up holding different values, even though both embed the same word. Why is that?
Hello @n-ibrahimov01, yes, the reason for this is that CharacterEmbeddings are randomly initialized and only become meaningful once they are trained on a downstream task. So you can use them when training your own model: during model training, these embeddings get tuned to be useful for that task.
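The effect can be sketched without flair at all: two freshly created, randomly initialized embedding tables will map the same characters to different vectors. Here is a minimal NumPy illustration (the toy vocabulary, dimension, and mean-pooling are made up for the example; flair actually runs a char-RNN over the character vectors):

```python
import numpy as np

CHAR_VOCAB = list("abcdefghijklmnopqrstuvwxyzE")  # toy character vocabulary
DIM = 8  # toy embedding dimension

def fresh_char_table(seed):
    """Simulate a freshly (randomly) initialized character embedding table."""
    rng = np.random.default_rng(seed)
    return {ch: rng.standard_normal(DIM) for ch in CHAR_VOCAB}

def embed_word(table, word):
    """Average the character vectors: a crude stand-in for flair's char-RNN."""
    return np.mean([table[ch] for ch in word], axis=0)

table_1 = fresh_char_table(seed=1)  # like c_embeddings_1
table_2 = fresh_char_table(seed=2)  # like c_embeddings_2

emb_1 = embed_word(table_1, "Example")
emb_2 = embed_word(table_2, "Example")

# Same word, but the two tables were initialized independently,
# so the resulting vectors differ.
print(np.allclose(emb_1, emb_2))  # False
```

Each individual table is deterministic (re-embedding the same word with the same table gives the same vector); it is only across independently initialized instances that the vectors disagree.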
If you're just interested in embedding text, rather than training a model for a downstream task, you should use one of the pre-trained embeddings, such as WordEmbeddings, FlairEmbeddings or BertEmbeddings.
Why don't large pretrained character embeddings models exist yet?
@Hellisotherpeople the FlairEmbeddings are large pre-trained character embeddings.
They're different in that FlairEmbeddings are contextualized and pre-trained, whereas CharacterEmbeddings are uncontextualized and need to be trained on a task first. We did some comparisons of the two in our COLING 2018 paper: at least on the tasks we looked at, FlairEmbeddings were much better, and once we used them, task-trained character features were no longer necessary.
I should have read your paper more closely 😁.
:D no worries - will close this issue, but feel free to reopen if you have more questions!