Trying to explore the contextual side of Flair embeddings with a simple example:
```python
import torch
from flair.data import Sentence
from flair.embeddings import DocumentPoolEmbeddings, FlairEmbeddings

# your query
query = 'The capital of Washington'

# some texts
sentences = [
    'George Washington addressed his supporters',
    'Taking a flight to Washington tonight',
    'Arkansaw is a lovely state',
    'George Washington was a great president',
]

# first, declare how you want to embed
embeddings = DocumentPoolEmbeddings([FlairEmbeddings('news-forward'),
                                     FlairEmbeddings('news-backward')])

# embed the query
q = Sentence(query)
embeddings.embed(q)

# use cosine similarity
cos = torch.nn.CosineSimilarity(dim=0, eps=1e-6)
for sentence in sentences:
    s = Sentence(sentence)
    embeddings.embed(s)
    prox = cos(q.embedding, s.embedding)
    print(query, ' - ', sentence, ' - ', prox)
```
Results:
```
The capital of Washington - George Washington addressed his supporters - 0.3869
The capital of Washington - Taking a flight to Washington tonight - 0.4389
The capital of Washington - Arkansaw is a lovely state - 0.3746
The capital of Washington - George Washington was a great president - 0.3629
```
I would've expected much higher scores on the geo-context sentences.
Am I doing something wrong?
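For what it's worth, here is what the pipeline above boils down to mathematically: `DocumentPoolEmbeddings` mean-pools the token embeddings into one document vector, and the score is the cosine of the two pooled vectors. A toy pure-Python sketch (made-up 2-d vectors, not real Flair embeddings) to make the computation concrete:

```python
from math import sqrt

def mean_pool(token_vectors):
    # average token embeddings into one document vector
    # (the default pooling in DocumentPoolEmbeddings)
    dim = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / len(token_vectors) for i in range(dim)]

def cosine(a, b):
    # cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# two toy "documents": doc_a pools to [0.5, 0.5], doc_b is [1.0, 1.0]
doc_a = [[1.0, 0.0], [0.0, 1.0]]
doc_b = [[1.0, 1.0]]

print(cosine(mean_pool(doc_a), mean_pool(doc_b)))  # ~1.0: pooled vectors point the same way
```

Note that pooling can make very different token sets look similar once averaged, which may be part of why the absolute scores bunch together.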
Technically, the code looks good. Here are some other comparisons with BERT and ELMo:
| LM | Sentence | Similarity |
| ------------------------ | ------------------------------------------ | ----------- |
| BERT (bert-base-uncased) | George Washington addressed his supporters | 0.6652
| BERT (bert-base-uncased) | Taking a flight to Washington tonight | 0.6186
| BERT (bert-base-uncased) | Arkansaw is a lovely state | 0.5656
| BERT (bert-base-uncased) | George Washington was a great president | 0.6955
| BERT (bert-base-cased) | George Washington addressed his supporters | 0.8641
| BERT (bert-base-cased) | Taking a flight to Washington tonight | 0.8477
| BERT (bert-base-cased) | Arkansaw is a lovely state | 0.8385
| BERT (bert-base-cased) | George Washington was a great president | 0.8622
| BERT (bert-large-uncased)| George Washington addressed his supporters | 0.7823
| BERT (bert-large-uncased)| Taking a flight to Washington tonight | 0.7476
| BERT (bert-large-uncased)| Arkansaw is a lovely state | 0.7185
| BERT (bert-large-uncased)| George Washington was a great president | 0.8058
| BERT (bert-large-cased) | George Washington addressed his supporters | 0.8190
| BERT (bert-large-cased) | Taking a flight to Washington tonight | 0.7761
| BERT (bert-large-cased) | Arkansaw is a lovely state | 0.7934
| BERT (bert-large-cased) | George Washington was a great president | 0.8424
| ELMo | George Washington addressed his supporters | 0.3986
| ELMo | Taking a flight to Washington tonight | 0.4577
| ELMo | Arkansaw is a lovely state | 0.3902
| ELMo | George Washington was a great president | 0.3886
| GPT-1 | George Washington addressed his supporters | 0.8232
| GPT-1 | Taking a flight to Washington tonight | 0.8396
| GPT-1 | Arkansaw is a lovely state | 0.7307
| GPT-1 | George Washington was a great president | 0.8003
| Transformer-XL | George Washington addressed his supporters | 0.2481
| Transformer-XL | Taking a flight to Washington tonight | 0.1841
| Transformer-XL | Arkansaw is a lovely state | 0.3009
| Transformer-XL | George Washington was a great president | 0.2997
ELMo looks quite similar to the result with Flair Embeddings :)
Spelling Arkansas correctly may help the model realize that it's a geolocation.
Also, given the 4 sentences, Flair correctly ranks "Taking a flight to Washington tonight" as the most similar, so I don't see a problem. Maybe you'd just like the gap in similarity to be larger.
I'd like to see how TransformerXL and GPT-2 do on this, and maybe even word2vec / fasttext
@stefan-it - thanks for the table that's quite interesting.
@Hellisotherpeople Arkansaw is a town in Wisconsin. Would expect it to pick up on that as geolocation too.
Sorry just realised I said it's a lovely state in the example - I see how that's misleading.
As requested, I added the scores for GPT-1 and Transformer-XL
Thanks @stefan-it
Tried some other examples with Flair - these actually work well:
```
the bucket and mop are in the closet - he kicked the bucket - 0.5848
the bucket and mop are in the closet - i have yet to cross-off all the items on my bucket list - 0.5263
the bucket and mop are in the closet - the bucket was filled with water - 0.6970

he is currently resting at home - the dog sleeps in the kennel - 0.4730
he is currently resting at home - he lived in a beautiful mansion - 0.5347
he is currently resting at home - the home office issued penalties for late filing - 0.4030
he is currently resting at home - press the home button on your phone - 0.3302
```
Anyone have any further insight or ideas?
If not I'll close this out later on
Hello @eliehamouche @stefan-it thanks for sharing these results!
Another idea would be not to use the cosine of document vectors as a measure of similarity, but a measure that derives document similarity directly from the word embeddings. An example of this is the word mover's distance: like document pool embeddings, it need not be trained, so it can be used without supervision. We don't yet have it in Flair, but I think it's probably not difficult to implement and experiment with. It would be interesting to see how well the word mover's distance works with different types of contextualized word embeddings.
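To make the idea concrete, here is a toy pure-Python sketch of the *relaxed* word mover's distance (a cheap lower bound on the full optimal-transport WMD): each word vector in one document "travels" to its nearest word vector in the other, and those distances are averaged. The 2-d vectors are made up for illustration; nothing here is Flair API.

```python
from math import sqrt

def euclid(a, b):
    # Euclidean distance between two word vectors
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def relaxed_wmd(doc_a, doc_b):
    # one-sided relaxed word mover's distance: each word in doc_a
    # moves to its nearest word in doc_b; average those distances.
    # A lower bound on the full (optimal-transport) WMD.
    return sum(min(euclid(v, w) for w in doc_b) for v in doc_a) / len(doc_a)

# identical toy documents have distance 0
doc_a = [[0.0, 1.0], [1.0, 0.0]]
doc_b = [[0.0, 1.0], [1.0, 0.0]]
print(relaxed_wmd(doc_a, doc_b))  # → 0.0
```

Unlike pooled-vector cosine, this keeps per-word information, so a single well-matched word can't be washed out by averaging.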
Hey @alanakbik - sorry for the delay, I missed the notification.
That looks quite interesting actually; I'll do a quick comparison and report back.
@alanakbik @eliehamouche Hello, thank you for providing the Transformer-XL embedding, but I have a question: if I train my own Transformer-XL model, it seems it cannot be integrated as an embedding in Flair the way ELMo can, where I just provide the option_file and the weight_file.
@songtaoshi I will push a follow-up PR for passing custom models into the newly added embeddings very soon (I've also trained a few XLNet models) :)
@stefan-it Wow, great! Thanks for your reply. Really looking forward to the new PR.