Transformers: GLUE test set predictions

Created on 8 Mar 2020 · 7 Comments · Source: huggingface/transformers

🚀 Feature request

Motivation

The run_glue script is super helpful, but it currently can't produce predictions on the test datasets for the GLUE tasks. I think this would be extremely helpful for a lot of people. I'm sure plenty of people have implemented this functionality themselves, but I haven't found any public implementations. Since transformers already supports train and dev for GLUE, it would be cool to complete the feature set by providing test set predictions.

Your contribution

I'm personally working on a branch that extends the glue_processors to support the test sets (which are already downloaded by the recommended download_glue.py script). I also update the run_glue.py script to produce the *.tsv files required by the GLUE online submission interface.

I think I'm a couple of days out from testing/completing my implementation. I'm also sure plenty of implementations of this already exist. If there are no other plans in the works to support this, I'm happy to submit a PR.
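
For illustration, here is a minimal sketch of what writing such a prediction file could look like for a classification task. It assumes the submission format is a tab-separated file with an `index` and a `prediction` column; the function name `write_glue_predictions` and the sample labels are hypothetical and are not taken from the branch or from run_glue.py. Regression tasks such as STS-B would write the numeric score instead of a label string.

```python
import csv
import io


def write_glue_predictions(predictions, label_list, out_file):
    """Write one row per test example: its index and its predicted label.

    `predictions` holds predicted class ids, `label_list` maps ids to label
    strings. Both names are hypothetical, chosen for this sketch only.
    """
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    # Assumed header: the two columns expected by the submission interface.
    writer.writerow(["index", "prediction"])
    for index, class_id in enumerate(predictions):
        writer.writerow([index, label_list[class_id]])


# Usage with an in-memory buffer; pass an open file handle to write to disk.
buffer = io.StringIO()
write_glue_predictions([1, 0, 1], ["not_entailment", "entailment"], buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

Each GLUE task would get its own file produced this way, with the label strings drawn from that task's processor.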

shoarora

👍5

All 7 comments

hi @shoarora

Can you share the script to report performance on the test set?

Mahmedturk on 25 Apr 2020

@Mahmedturk you can check out the branch in PR #3405

It has diverged pretty heavily from master and I haven't updated it yet, but you should still be able to run run_glue.py off that branch with the --do_test flag that I added, and it should produce the .tsv files required for submission.

shoarora on 25 Apr 2020

@shoarora I pulled the repo to update run_glue.py as I wanted to use this new feature. However, I now get an error when I run run_glue.py! Please see the error output below. It looks like previous versions of GlueDataset() did not have a keyword argument named "mode". Is that possible?

Traceback (most recent call last):
  File "./transformers/examples/text-classification/run_glue.py", line 228, in <module>
    main()
  File "./transformers/examples/text-classification/run_glue.py", line 139, in main
    test_dataset = GlueDataset(data_args, tokenizer=tokenizer, mode="test") if training_args.do_predict else None
TypeError: __init__() got an unexpected keyword argument 'mode'

AMChierici on 25 May 2020

@AMChierici I didn't author #4463, which is what has made it to master to enable this feature. I haven't played with it yet, so sorry I can't be of more help.

shoarora on 25 May 2020

👍1

@AMChierici make sure you run from master, there's indeed a mode kwarg now.

@shoarora Thanks for this first PR and I did check yours while merging the other (to make sure that the indices in csv parsing, etc. were correct)

julien-c on 25 May 2020

😕1 🎉1

Thanks, @julien-c. Yes, solved. In fact, I was not running from master.

AMChierici on 26 May 2020

Downloaded master just now:
File "examples/text-classification/run_glue.py", line 143, in main
if training_args.do_eval
TypeError: __init__() got an unexpected keyword argument 'mode'
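
A TypeError like this means the GlueDataset class that Python actually imported does not accept `mode`. One possible cause (an assumption, not confirmed in this thread) is that an older installed copy of the package is being imported instead of the master checkout. A minimal sketch for checking whether a class accepts a given keyword argument follows; `accepts_kwarg`, `OldDataset` and `NewDataset` are hypothetical stand-ins and do not reproduce the real GlueDataset signature.

```python
import inspect


def accepts_kwarg(callable_obj, name):
    """Return True if `callable_obj` can be called with keyword `name`."""
    try:
        params = inspect.signature(callable_obj).parameters
    except (TypeError, ValueError):
        # Some builtins expose no signature; treat that as "unknown, assume no".
        return False
    param = params.get(name)
    if param is not None and param.kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs catch-all also accepts the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer dataset class.
class OldDataset:
    def __init__(self, args, tokenizer, limit_length=None):
        pass


class NewDataset:
    def __init__(self, args, tokenizer, limit_length=None, mode="train"):
        pass


print(accepts_kwarg(OldDataset, "mode"))
print(accepts_kwarg(NewDataset, "mode"))
```

Running the same check against the imported GlueDataset, and printing `transformers.__file__`, would show which copy of the library the script is really using.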