Transformers: Fine tuned to Multi-choice dataset?

Created on 5 Dec 2018 · 7 Comments · Source: huggingface/transformers

Is it possible to fine-tune on multiple-choice problems, which usually have one passage, a question, and four options (A, B, C, D)?

Qzsl123

All 7 comments

Yes, it is. The code is not written yet, but I'm planning to work on it. The idea is to format the input data the same way the authors of "Improving Language Understanding with Unsupervised Learning" did:

[Figure: Multiple choice GPT]

You run inference on one (context, choice) pair per choice, pass each [CLS] token representation through a linear layer with a single output, and then compute a softmax over the outputs of all choices.

I will try to create a PR with this code very soon.
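
The scoring step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from the eventual PR: the names `linear_score`, `softmax`, and `choice_probabilities` are hypothetical, the [CLS] vectors are random stand-ins for what the encoder would produce for each (context, choice) pair, and the cross-entropy line at the end is an assumed training objective that the thread does not spell out.

```python
import math
import random


def linear_score(cls_vector, weights, bias):
    """Linear layer with a single output: maps one [CLS] vector to a scalar."""
    return sum(w * x for w, x in zip(weights, cls_vector)) + bias


def softmax(scores):
    """Numerically stable softmax over the per-choice scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def choice_probabilities(cls_vectors, weights, bias):
    """cls_vectors holds one [CLS] vector per (context, choice) pair."""
    scores = [linear_score(v, weights, bias) for v in cls_vectors]
    return softmax(scores)


# Hypothetical stand-ins: a real model would run the encoder once per
# (context, choice) pair and take the [CLS] vector; here they are random.
random.seed(0)
hidden_size = 8
num_choices = 4
cls_vectors = [
    [random.gauss(0, 1) for _ in range(hidden_size)] for _ in range(num_choices)
]
weights = [random.gauss(0, 1) for _ in range(hidden_size)]
bias = 0.0

probs = choice_probabilities(cls_vectors, weights, bias)
predicted = max(range(num_choices), key=probs.__getitem__)

# Assumed objective: cross-entropy against the correct choice (index 2 here).
loss = -math.log(probs[2])
```

The same linear layer is shared across all choices, so the model can handle any number of options; only the softmax ties the per-choice scores together.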

rodgzilla on 5 Dec 2018

👍1

Thx for the reply.
Actually, I have the same plan, but I am not sure whether it will work. Anyway, I will give it a try.

Qzsl123 on 5 Dec 2018

If it worked in the OpenAI paper, I don't really see why it wouldn't work with this architecture.

rodgzilla on 5 Dec 2018

@Qzsl123 The code for the multiple-choice task is available in PR #96 if you want to test it.

rodgzilla on 7 Dec 2018

@rodgzilla Yeah, I am trying to run it. Thanks for the wonderful work!

Qzsl123 on 7 Dec 2018

Yes, it is. The code is not written yet, but I'm planning to work on it. The idea is to format the input data the same way the authors of "Improving Language Understanding with Unsupervised Learning" did.

You run inference on one (context, choice) pair per choice, pass each [CLS] token representation through a linear layer with a single output, and then compute a softmax over the outputs of all choices.

I will try to create a PR with this code very soon.

Hi, multiple-choice problems usually have one passage, a question, and four options (A, B, C, D). In your model, does "context" mean the passage and the question together?

DukeZhu on 25 Dec 2018

👍2

Any update on this issue?