Transformers: model.to(args.device) in run_glue.py taking around 10 minutes. Is this normal?

Created on 11 Oct 2019 · 7 comments · Source: huggingface/transformers

โ“ Questions & Help

Currently, line 484 of run_glue.py, `model.to(args.device)`, is taking close to 10 minutes to complete when loading the bert-base pretrained model. This seems like a long time compared to what I was seeing in pytorch-transformers.

My configuration:
Tesla V100 - Driver 418.87.00
CUDA toolkit 10.1
PyTorch 1.3.0

The code I am running is:
```bash
python examples/run_glue.py \
  --model_type bert \
  --model_name_or_path bert-base-uncased \
  --task_name $(MY_TASK) \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir $(MY_DIR) \
  --max_seq_length 128 \
  --per_gpu_eval_batch_size=64 \
  --per_gpu_train_batch_size=64 \
  --learning_rate 2e-5 \
  --num_train_epochs 3.0 \
  --output_dir $(MY_OUTDIR) \
  --overwrite_output_dir \
  --fp16
```

Is this behavior expected or am I doing something wrong? Thanks!
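
To isolate the transfer from the rest of the script, a minimal standalone sketch along these lines (assuming bert-base-uncased and a CUDA device; the `torch.cuda.synchronize()` call matters because CUDA copies are asynchronous) times just the `model.to` step:

```python
import time

import torch
from transformers import BertModel

# Load the pretrained weights onto the CPU first
model = BertModel.from_pretrained("bert-base-uncased")

# Time only the host-to-device transfer; synchronize so the
# asynchronous CUDA work has actually finished before stopping the clock
start = time.time()
model.to("cuda")
torch.cuda.synchronize()
print(f"model.to('cuda') took {time.time() - start:.1f}s")
```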

All 7 comments

This seems weird; I'm looking into it.

Running the run_glue.py script as-is with your exact parameters, I timed `model.to` and it took 6.4 seconds.

Ok, thanks for looking into that! I'm using my own dataset so I made adjustments to the processor, but I don't think that should be causing the issue when transferring the model to the GPU. I'll run a few more tests and see if I can pinpoint what is going on. It's super helpful to know that you are seeing it take only 6.4 seconds. Thank you!

I just tested again using the SST-2 data, keeping the run_glue.py code as-is, and I'm still having the same issue. My guess is that something in my VM setup is causing the hang, but I'm having a hard time pinpointing the exact cause.

Hmm, do you think you can reproduce it on another VM? Are you running into the same issue if you simply put the model on the device in a standalone script?

Ok, it's definitely an issue with my setup. I have the same issue when running the following:
```python
from torchvision import models

# Loading the pretrained weights is fast; it's the transfer to the GPU that hangs
model = models.densenet121(pretrained=True)
model.to('cuda')
```
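
One thing worth noting when timing a repro like this: the very first CUDA operation in a process also pays the one-time context-initialization (and possibly kernel JIT-compilation) cost, which is exactly where toolkit/driver mismatches tend to show up. A sketch that separates the two, assuming a CUDA device:

```python
import time

import torch
from torchvision import models

model = models.densenet121(pretrained=True)

# Warm up CUDA with a tiny transfer so context creation and any
# JIT compilation are paid here, not inside the timed section
torch.ones(1).to("cuda")
torch.cuda.synchronize()

start = time.time()
model.to("cuda")
torch.cuda.synchronize()
print(f"model transfer alone took {time.time() - start:.2f}s")
```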

I'll close the issue and keep troubleshooting on my end. Thanks!

Reopening because I found the issue and hopefully it can help someone else. I was comparing model loading times to what I was seeing on the hosted runtimes in Google Colab notebooks.

Even though `!nvidia-smi` shows CUDA 10.1, `torch.version.cuda` reports 10.0.130 — that is the toolkit PyTorch was actually built against. Colab is also running PyTorch 1.2.0.
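
A quick way to check what an environment is actually running — `torch.version.cuda` reports the toolkit PyTorch was compiled against, which need not match what nvidia-smi shows:

```python
import torch

print(torch.__version__)          # PyTorch build, e.g. 1.2.0
print(torch.version.cuda)         # CUDA toolkit PyTorch was built with, e.g. 10.0.130
print(torch.cuda.is_available())  # whether the driver/toolkit combo works at all
```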

I downgraded my environment to match, and the model from models.densenet121(pretrained=True) loaded in 4.9 seconds.

Thanks for the help!
