Hi!
I would like to try out the run_squad.py script (with AWS SageMaker in a PyTorch container).
I will use 8 x V100 16 GB GPUs for the training.
How should I set the local_rank parameter in this case?
(I tried to understand it from the code, but I couldn't really figure it out.)
Thank you for the help!
The easiest way is to use the torch launch script. It will automatically set the local rank correctly. It would look something like this (can't test, am on phone):
python -m torch.distributed.launch --nproc_per_node 8 run_squad.py <your arguments>
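For reference, this is roughly what run_squad.py does with the flag (a paraphrased sketch, not the exact code): you don't pick the value yourself; the launcher passes a different --local_rank to each of the 8 processes, and the default of -1 simply means "not distributed".

import argparse
import torch

parser = argparse.ArgumentParser()
# --local_rank defaults to -1; torch.distributed.launch overrides it per process
parser.add_argument("--local_rank", type=int, default=-1,
                    help="local_rank for distributed training on gpus")
args = parser.parse_args()

if args.local_rank == -1:
    # single-process mode: one process drives all visible GPUs (DataParallel)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
else:
    # distributed mode: one process per GPU, each pinned to its own device
    torch.cuda.set_device(args.local_rank)
    device = torch.device("cuda", args.local_rank)
    torch.distributed.init_process_group(backend="nccl")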
Hi,
Thanks for the fast answer!
Yes, I saw this solution in the examples, but I am interested in the case where I am using the PyTorch container and I have to set up an entry point for the training (= run_squad.py) and its parameters. In that case, how should I set it? Or should I just leave it at -1?
(Or do you recommend creating a bash file as the entry point, where I start this torch launch?)
Thanks again!
If you want to run it manually, you'll have to run the script once for each GPU, and set the local rank to the GPU ID for each process. It might help to look at the contents of the launch script that I mentioned before. It shows you how to set the local rank automatically for multiple processes, which I think is what you want.
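To make the "one process per GPU" idea concrete, here is a stripped-down sketch of what torch.distributed.launch does on a single node (simplified and untested; the real launcher handles more cases): it spawns one process per GPU, sets the rendezvous environment variables, and passes the GPU index as --local_rank.

import os
import subprocess
import sys

NPROC_PER_NODE = 8  # one worker process per GPU

procs = []
for local_rank in range(NPROC_PER_NODE):
    env = os.environ.copy()
    # rendezvous info read by torch.distributed.init_process_group(backend="nccl")
    env["MASTER_ADDR"] = "127.0.0.1"
    env["MASTER_PORT"] = "29500"
    env["WORLD_SIZE"] = str(NPROC_PER_NODE)
    env["RANK"] = str(local_rank)  # single node, so global rank == local rank
    # plus your other run_squad.py arguments
    cmd = [sys.executable, "run_squad.py", "--local_rank", str(local_rank)]
    procs.append(subprocess.Popen(cmd, env=env))

for p in procs:
    p.wait()

So inside a SageMaker entry point you can either do something like this yourself, or just use a small wrapper that calls python -m torch.distributed.launch ... as shown above.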
Ok, thanks for the response! I will try that!
If your problem is fixed, please do close this issue.
@tothniki Did you have to modify the script very much to run with SM? Attempting to do so now, as well.
@petulla No, in the end I didn't modify anything regarding the multi-GPU setup (of course I had to modify the read-in and the saving to an S3 bucket). I tried it with SageMaker as it was, and it seemed to me that the distribution between GPUs worked.
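The kind of I/O change I mean looks roughly like this (an illustrative sketch, not my exact code; it assumes the standard SageMaker PyTorch container environment variables, and the channel name "train" is just an example):

import os

# SageMaker mounts the S3 input channel here before training starts
data_dir = os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train")
# whatever is written here is uploaded back to S3 when the job finishes
output_dir = os.environ.get("SM_MODEL_DIR", "/opt/ml/model")

# then point run_squad.py at these, e.g. via --train_file and --output_dir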
Hi @ugent
what about run_language_modeling.py?
Does passing local_rank = 0 to it mean it will automatically run the task on, for example, the 4 GPUs we have available, and that training will be 4 times faster (via distributed training)?
Or do we have to run the script with python -m torch.distributed.launch ...?
@mahdirezaey
Please use the correct tag when tagging...
No, it will not do this automatically, you have to use the launch utility.
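Something like this should do it (untested; adjust --nproc_per_node to your number of GPUs and add your usual arguments):
python -m torch.distributed.launch --nproc_per_node 4 run_language_modeling.py <your arguments>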