Fairseq: A rookie question about RoBERTa fine-tuning on the RACE dataset

Created on 28 Mar 2020 · 2 Comments · Source: pytorch/fairseq

🐛 Bug

Hi, I am trying to run the RoBERTa fine-tuning code for the RACE dataset on Colab with a GPU, but the command from the following section does not run there. Does anyone know how to solve this? Thanks a lot.
https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.race.md#3-fine-tuning-on-race

Here is the error message:

  File "<ipython-input-8-17151a1d11e0>", line 9
    CUDA_VISIBLE_DEVICES=0 fairseq-train $DATA_DIR --ddp-backend=no_c10d   --restore-file $ROBERTA_PATH   --reset-optimizer --reset-dataloader --reset-meters   --best-checkpoint-metric accuracy --maximize-best-checkpoint-metric   --task sentence_ranking   --num-classes $NUM_CLASSES   --init-token 0 --separator-token 2   --max-option-length 128   --max-positions 512   --truncate-sequence   --arch roberta_large   --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01   --criterion sentence_ranking   --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-06   --clip-norm 0.0   --lr-scheduler fixed --lr $LR   --fp16 --fp16-init-scale 4 --threshold-loss-scale 1 --fp16-scale-window 128   --max-sentences $MAX_SENTENCES   --required-batch-size-multiple 1   --update-freq $UPDATE_FREQ   --max-epoch $MAX_EPOCH
                                 ^
SyntaxError: invalid syntax

The editor also shows a red wavy underline on this line:

"CUDA_VISIBLE_DEVICES=0 fairseq-train $DATA_DIR --ddp-backend=no_c10d \"

I guess the problem comes from "fairseq-train". Does anyone know what is wrong?

Many thanks!

All 2 comments

It seems like you're trying to run this in a Python cell, so the notebook parses the shell command as Python code and raises the SyntaxError. What happens when you add a ! to the beginning of the command so it's interpreted as a Unix command, i.e.,

! CUDA_VISIBLE_DEVICES=0 fairseq-train $DATA_DIR --ddp-backend=no_c10d --restore-file $ROBERTA_PATH --reset-optimizer --reset-dataloader --reset-meters --best-checkpoint-metric accuracy --maximize-best-checkpoint-metric --task sentence_ranking --num-classes $NUM_CLASSES --init-token 0 --separator-token 2 --max-option-length 128 --max-positions 512 --truncate-sequence --arch roberta_large --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 --criterion sentence_ranking --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-06 --clip-norm 0.0 --lr-scheduler fixed --lr $LR --fp16 --fp16-init-scale 4 --threshold-loss-scale 1 --fp16-scale-window 128 --max-sentences $MAX_SENTENCES --required-batch-size-multiple 1 --update-freq $UPDATE_FREQ --max-epoch $MAX_EPOCH
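To illustrate the difference, here is a minimal Python sketch (not part of the fairseq README). It first shows that the command line is not valid Python, which is what the SyntaxError reports, and then hands a command to a shell, which is roughly what a `!` line does. The helper `run_shell`, the stand-in `echo` command, and the path `/path/to/race-bin` are all hypothetical and only there so the sketch runs without fairseq installed.

```python
import os
import subprocess


def run_shell(command, extra_env=None):
    """Run a command line through a shell instead of the Python parser.

    Hypothetical helper for illustration; extra_env holds variables such as
    DATA_DIR that the command expects to find in its environment.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    return subprocess.run(
        command, shell=True, env=env, capture_output=True, text=True
    )


# A shell command line is not valid Python, so parsing it as Python fails.
try:
    compile("CUDA_VISIBLE_DEVICES=0 fairseq-train $DATA_DIR", "<cell>", "exec")
    parsed_as_python = True
except SyntaxError:
    parsed_as_python = False

# Stand-in for fairseq-train: `echo` just prints the variable the shell expands.
# The path is a placeholder assumption, not a real dataset location.
result = run_shell(
    'CUDA_VISIBLE_DEVICES=0 echo "$DATA_DIR"',
    {"DATA_DIR": "/path/to/race-bin"},
)
print(parsed_as_python, result.stdout.strip())
```

Each `!` line runs in its own temporary subshell, so variables such as `$DATA_DIR` and `$ROBERTA_PATH` have to be visible to that one command; an `export` in an earlier `!` line does not carry over.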