How can we get the word embedding vectors in GPT-2? I followed the guidance for BERT (model.embeddings.word_embeddings.weight), but it raises ''GPT2LMHeadModel' object has no attribute 'embeddings''.
Please help me with that. Thank you in advance.
Hi, indeed GPT-2 has a slightly different implementation than BERT. In order to have access to the embeddings, you would have to do the following:
from transformers import GPT2LMHeadModel
model = GPT2LMHeadModel.from_pretrained('gpt2') # or any other checkpoint
word_embeddings = model.transformer.wte.weight # Word Token Embeddings
position_embeddings = model.transformer.wpe.weight # Word Position Embeddings
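For reference, a quick shape check (assuming the base 'gpt2' checkpoint, which has a 50257-token vocabulary, 768-dimensional embeddings and 1024 positions):

print(word_embeddings.shape)      # torch.Size([50257, 768])
print(position_embeddings.shape)  # torch.Size([1024, 768])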
Hi,
Thank you for your reply! So if I want to get the vector for 'man', it would be like this:
from transformers import GPT2Tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
text_index = tokenizer.encode('man', add_prefix_space=True)  # one token id for ' man'
vector = model.transformer.wte.weight[text_index, :]  # the corresponding row(s) of the token embedding matrix
Is it correct?
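For what it's worth, one quick way to sanity-check that lookup (assuming the snippet above has been run with the model loaded earlier in the thread) is to compare it against calling the embedding layer directly:

import torch
same = model.transformer.wte(torch.tensor(text_index))  # embedding-layer lookup for the same id(s)
print(torch.allclose(vector, same))                      # expected: True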
Just wondering, how do I transform a word vector back into a word? Imagine taking a word vector and changing a few elements; how can I find the closest word in the GPT-2 model?
So for each token in the vocabulary there is a static embedding (at layer 0). You can use cosine similarity to find the closest static embedding to the transformed vector. That should help you find the word.
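For example, a minimal sketch of that nearest-token lookup over the static embedding matrix (the perturbed query_vector here is just illustrative):

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
embeddings = model.transformer.wte.weight  # (vocab_size, hidden) static token embeddings

# Some modified vector with the same dimensionality as the embeddings
query_vector = embeddings[tokenizer.encode(' man')[0]] + 0.01 * torch.randn(embeddings.shape[1])

# Cosine similarity between the query and every row of the embedding matrix
scores = torch.nn.functional.cosine_similarity(query_vector.unsqueeze(0), embeddings, dim=-1)
closest_token = scores.argmax().item()
print(tokenizer.decode([closest_token]))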
Thanks. That means for every word vector I have to compute cosine similarity against all vocab_size (~50K) embeddings. Is that right?
I guess so, unless you can use some property to narrow the search first.
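That said, a full pass over the ~50K rows is cheap if you normalize once and do it as a single matrix product; a hedged sketch (variable names are just illustrative):

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

with torch.no_grad():
    emb = model.transformer.wte.weight              # (vocab_size, hidden)
    emb_norm = emb / emb.norm(dim=-1, keepdim=True) # L2-normalize each row once

    query = emb[tokenizer.encode(' king')[0]]       # any query vector of size hidden
    query_norm = query / query.norm()

    # Cosine similarities against the whole vocabulary in one matrix-vector product
    scores = emb_norm @ query_norm                  # (vocab_size,)
    top = torch.topk(scores, k=5)
    print([tokenizer.decode([i]) for i in top.indices.tolist()])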
OK. Three more questions: 1) is there any resource on how to generate a fixed-length sentence (a sentence with N words that ends with "." or "!")? 2) what is the most effective parameter for hyper-parameter tuning (e.g. temperature)? 3) is there any Slack channel to discuss these types of questions?
About 1), I don't think there is any. You can use web scraping for such specific sentences, or download a corpus and use regex to extract the desired sentences.
2) I don't really know.
3) If you find any, please share it with me too. Thanks! 😄
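Regarding 2), most of those knobs live on the generate API; a minimal, hedged sketch (prompt and values are placeholders, and note that max_length caps tokens rather than words, so getting exactly N words ending in "." still needs post-processing as suggested above):

from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

input_ids = tokenizer.encode("The weather today is", return_tensors='pt')
output = model.generate(
    input_ids,
    do_sample=True,    # sample instead of greedy decoding
    max_length=40,     # hard cap on length in tokens (not words)
    temperature=0.7,   # lower -> more conservative, higher -> more random
    top_k=50,          # restrict sampling to the 50 most likely tokens
    top_p=0.95,        # nucleus sampling
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))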
Did you succeed? I'm pursuing the same goal and I don't know how to validate my findings. I have tested some king - man + woman stuff, but it didn't work.
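For reference, this is roughly how that analogy test is usually run on the static token embeddings (a hedged sketch; there is no guarantee GPT-2's input embeddings are organized so that such analogies hold):

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
emb = model.transformer.wte.weight

def vec(word):
    # Assumes the word maps to a single BPE token when prefixed with a space
    ids = tokenizer.encode(word, add_prefix_space=True)
    return emb[ids[0]]

query = vec('king') - vec('man') + vec('woman')
scores = torch.nn.functional.cosine_similarity(query.unsqueeze(0), emb, dim=-1)
top = torch.topk(scores, k=5)
print([tokenizer.decode([i]) for i in top.indices.tolist()])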
How did it go? I am stuck here too.