Feature/Question: With GPT-2 is it possible to get previous word prediction?
Hi,
I ask this after seeing this article: https://towardsdatascience.com/deconstructing-bert-distilling-6-patterns-from-100-million-parameters-b49113672f77
I'm wondering how I could write a method that would allow me to predict the previous word (ideally with GPT-2)?
Many thanks,
Vince.
Hi! There is one big difference between BERT and GPT-2, in that BERT is trained using masked language modeling, whereas GPT-2 is trained using causal language modeling.
During pre-training, BERT learns to predict masked words given a bi-directional context. GPT-2, on the other hand, learns to predict a word given only its left context. This is why GPT-2 is very good at text generation (it only needs the left-hand side context), while BERT isn't.
Given this, GPT-2 won't be able to do previous word prediction, as it does not handle the right-hand side context.
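To make that concrete, here is a minimal sketch (using the `transformers` library and the base `gpt2` checkpoint) of what the model actually exposes: the logits at the last position score every candidate *next* token, and there is no analogous head for the token that comes *before* the input. The example text is just an illustration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "The capital of France is"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Logits at the final position rank candidate *next* tokens;
# GPT-2 has no comparable output for the token preceding the input.
next_token_logits = logits[0, -1]
top_ids = torch.topk(next_token_logits, k=5).indices
print([tokenizer.decode(i) for i in top_ids])
```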
If you want to train your own GPT-2 model to predict previous words, you could feed in your entire training set in reverse word order. GPT-2 would then learn to generate text backwards, and that model would be able to tell you which word should come before a given piece of text.
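Here is a rough sketch of that idea, with the caveat that it is not a ready-made solution: the `reverse_words` helper is my own, and the base `gpt2` checkpoint below is only a placeholder for a model you would first fine-tune on the word-reversed corpus (without that fine-tuning step, the predictions will not be meaningful).

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

def reverse_words(text: str) -> str:
    """Reverse word order, e.g. 'the cat sat' -> 'sat cat the'."""
    return " ".join(reversed(text.split()))

# 1. Prepare the training data in reversed word order, then fine-tune
#    GPT-2 on it with the usual causal language modeling objective.
corpus = ["the cat sat on the mat", "GPT-2 reads text left to right"]
reversed_corpus = [reverse_words(line) for line in corpus]

# 2. Load the fine-tuned model (placeholder: base gpt2 shown here).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # replace with your fine-tuned checkpoint
model.eval()

# 3. To ask "what word comes before this phrase?", reverse the phrase
#    and read off the model's next-word prediction.
phrase = "on the mat"
inputs = tokenizer(reverse_words(phrase), return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
prev_word_ids = torch.topk(logits[0, -1], k=5).indices
print([tokenizer.decode(i) for i in prev_word_ids])
```

Note that this reverses word order rather than token order, since GPT-2's BPE tokenizer splits words into subword pieces that you would normally want to keep intact.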