Transformers: How to set local_rank argument in run_squad.py

Created on 28 Oct 2019 · 9 comments · Source: huggingface/transformers

Hi!

I would like to try out the run_squad.py script (with AWS SageMaker in a PyTorch container).
I will use 8 x V100 16 GB GPUs for the training.
How should I set the local_rank parameter in this case?
(I tried to understand it from the code, but I couldn't really figure it out.)

Thank you for the help!

All 9 comments

The easiest way is to use the torch launch script. It will automatically set the local rank correctly. It would look something like this (can't test, am on phone):

python -m torch.distributed.launch --nproc_per_node 8 run_squad.py <your arguments>
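
For reference, a fuller command might look like the sketch below. The run_squad.py arguments are only illustrative (in the style of the examples README of that era, with $SQUAD_DIR as a placeholder for your data directory), so check `python run_squad.py --help` for the exact set:

```bash
# Illustrative sketch; verify the argument names against your version of
# run_squad.py. $SQUAD_DIR is a placeholder for the directory holding the
# SQuAD JSON files.
python -m torch.distributed.launch --nproc_per_node 8 run_squad.py \
  --model_type bert \
  --model_name_or_path bert-base-uncased \
  --do_train \
  --train_file $SQUAD_DIR/train-v1.1.json \
  --predict_file $SQUAD_DIR/dev-v1.1.json \
  --output_dir /tmp/squad_out
```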

Hi,

Thanks for the fast answer!

Yes, I saw this solution in the examples, but I am interested in the case where I am using a PyTorch container and have to set an entry point for the training (= run_squad.py) plus its parameters. How should I set local_rank in that case? Or should I just leave it at -1?

(Or do you recommend, in that case, creating a bash file as the entry point, which starts this torch launch script?)

Thanks again!
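
For the SageMaker container case above, one option is indeed a small bash entry point that wraps the launch utility. A minimal sketch (untested; it assumes the SM_NUM_GPUS environment variable that SageMaker's training containers set, falling back to 8 otherwise):

```bash
#!/bin/bash
# Hypothetical SageMaker entry point (e.g. train.sh) -- untested sketch.
# SM_NUM_GPUS is set by SageMaker inside its training containers; fall
# back to 8 (one per V100 on a p3.16xlarge) if it is missing.
NUM_GPUS=${SM_NUM_GPUS:-8}

# Forward whatever hyperparameter arguments SageMaker passes to the entry
# point straight through to run_squad.py.
exec python -m torch.distributed.launch --nproc_per_node "$NUM_GPUS" \
  run_squad.py "$@"
```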

If you want to run it manually, you'll have to run the script once for each GPU, and set the local rank to the GPU ID for each process. It might help to look at the contents of the launch script that I mentioned before. It shows you how to set the local rank automatically for multiple processes, which I think is what you want.
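
Concretely, a hand-rolled equivalent of the launch utility on one 8-GPU machine could look like the sketch below (untested; MASTER_ADDR, MASTER_PORT, WORLD_SIZE and RANK are the environment variables that init_process_group's default env:// rendezvous reads):

```bash
# Untested sketch of launching one process per GPU by hand, on a single
# machine with 8 GPUs. torch.distributed.launch does all of this for you.
export MASTER_ADDR=127.0.0.1   # rendezvous address for init_process_group
export MASTER_PORT=29500       # any free port
export WORLD_SIZE=8            # total number of processes

for GPU_ID in $(seq 0 7); do
  # RANK and --local_rank both equal the GPU ID on a single machine
  RANK=$GPU_ID python run_squad.py --local_rank $GPU_ID <your arguments> &
done
wait   # block until all 8 training processes exit
```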

Ok, thanks for the response! I will try that!

If your problem is fixed, please do close this issue.

@tothniki Did you have to modify the script very much to run with SM? Attempting to do so now, as well.

@petulla No, in the end I didn't modify anything related to the multi-GPU setup (of course, I had to modify the data read-in and the save to an S3 bucket). I tried it with SageMaker as it was, and it seemed to me that the distribution across the GPUs worked.

> The easiest way is to use the torch launch script. It will automatically set the local rank correctly. It would look something like this (can't test, am on phone):
>
> python -m torch.distributed.launch --nproc_per_node 8 run_squad.py <your arguments>

Hi @ugent

What about run_language_modeling.py?
Does passing local_rank = 0 to it mean it will automatically run the task on, say, the 4 GPUs we have available, so our speed will be 4 times faster (by distributed training)?

Or do we have to run the script with python -m torch.distributed.launch .....?

@mahdirezaey

Please use the correct tag when tagging...

No, it will not do this automatically, you have to use the launch utility.
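
For context, the examples scripts of that era picked their device mode from local_rank roughly as in the paraphrased sketch below (not a verbatim copy of the script): passing 0 only pins a single process to GPU 0; the speedup comes from the launch utility starting one such process per GPU.

```python
# Paraphrased device setup from the examples scripts (not a verbatim copy):
# local_rank == -1 means "no distributed training".
import argparse

import torch

parser = argparse.ArgumentParser()
# torch.distributed.launch fills this in, one value per spawned process
parser.add_argument("--local_rank", type=int, default=-1)
args = parser.parse_args()

if args.local_rank == -1:
    # single process: DataParallel across all visible GPUs, which is not
    # the same as (and is slower than) distributed training
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    n_gpu = torch.cuda.device_count()
else:
    # one process per GPU; each process pins itself to its own device
    torch.cuda.set_device(args.local_rank)
    device = torch.device("cuda", args.local_rank)
    torch.distributed.init_process_group(backend="nccl")
    n_gpu = 1
```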
