Ignite: Is there a way to select which GPUs to use during training with ignite.distributed?

Created on 4 Nov 2020 · 4 Comments · Source: pytorch/ignite

❓ Questions/Help/Support

With idist.Parallel in Ignite, is there a way to select which GPUs to use during training?

Related to the question in Issue #1118


All 4 comments

@ryanwongsa currently the most reliable way to select which GPUs to run on is at the script level, with CUDA_VISIBLE_DEVICES="0,1,2,3". Does that work for your use case?

Currently I don't have a use case; it was more of a general question, since I noticed that other higher-level PyTorch frameworks have an option for selecting GPUs.

Would the above solution work if I want to run multiple scripts on different GPUs? Say I have 4 GPUs and want to train one model on GPUs 0 and 1 and another model on GPUs 2 and 3 simultaneously, e.g.:

CUDA_VISIBLE_DEVICES="0,1" python train1.py

CUDA_VISIBLE_DEVICES="2,3" python train2.py

Maybe this is a PyTorch question rather than an Ignite question, though.

Yes, it should work as long as the device sets do not overlap: 0,1 for train1 and 2,3 for train2. Each script would then behave as if the machine had only 2 GPUs. Some more examples:

CUDA_VISIBLE_DEVICES=0 python -c "import torch; print(torch.cuda.device_count())"
> 1
CUDA_VISIBLE_DEVICES=0,1 python -c "import torch; print(torch.cuda.device_count())"
> 2

or

# terminal 1
CUDA_VISIBLE_DEVICES=0 python -c "import torch; torch.rand(64, 128, 512, 512, device='cuda'); import time; time.sleep(60);"
# terminal 2
CUDA_VISIBLE_DEVICES=1 python -c "import torch; torch.rand(32, 128, 512, 512, device='cuda'); import time; time.sleep(60);"

> nvidia-smi output (abridged):
| GPU 0 |  0%   39C    P2    57W / 280W |   8743MiB / 11178MiB |      0%      Default |
| GPU 1 | 24%   47C    P2    60W / 250W |   4647MiB / 11176MiB |      0%      Default |
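The same split can also be scripted from Python instead of typing the variable in each terminal. The following is only a minimal sketch using the standard library: the helper names (build_env, check_disjoint, launch_all) are hypothetical and not part of Ignite or PyTorch, and the script names are the ones from the question above.

```python
import os
import subprocess
import sys


def build_env(gpu_ids):
    """Return a copy of the current environment restricted to the given GPU ids."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def check_disjoint(assignments):
    """Raise ValueError if a GPU id is assigned to more than one script."""
    seen = {}
    for script, gpu_ids in assignments.items():
        for gpu in gpu_ids:
            if gpu in seen:
                raise ValueError(
                    f"GPU {gpu} assigned to both {seen[gpu]} and {script}"
                )
            seen[gpu] = script


def launch_all(assignments):
    """Start one process per script, then wait for all of them to finish."""
    check_disjoint(assignments)
    procs = [
        subprocess.Popen([sys.executable, script], env=build_env(gpu_ids))
        for script, gpu_ids in assignments.items()
    ]
    # Exit codes, in the same order as the assignments.
    return [p.wait() for p in procs]


# Example call (hypothetical layout: 4 GPUs, two jobs with 2 GPUs each):
# launch_all({"train1.py": [0, 1], "train2.py": [2, 3]})
```

Because the variable is set in each child's environment before the interpreter starts, each script only ever sees its own devices, renumbered from 0.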

Another thing to keep in mind is CPU (num_workers) and RAM usage, since independent scripts share the same CPU resources.
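One way to keep concurrent jobs from oversubscribing the CPU is to derive each job's DataLoader num_workers from the core count. The sketch below is a rough heuristic, not a recommendation from Ignite or PyTorch: the function name workers_per_job and the choice to reserve one core per job for its main process are assumptions made for illustration.

```python
import os


def workers_per_job(n_jobs, total_cpus=None, reserve=1):
    """Split CPU cores evenly across concurrent training jobs.

    `reserve` cores per job are held back for the job's main process
    (an assumed heuristic). The result may be 0, which for a DataLoader
    means data is loaded in the main process.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    if total_cpus is None:
        # os.cpu_count() can return None; fall back to a single core.
        total_cpus = os.cpu_count() or 1
    usable = max(total_cpus - reserve * n_jobs, 0)
    return usable // n_jobs
```

For example, with 8 cores and two jobs this gives 3 workers per job, which would then be passed as num_workers in each training script.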

Great, thanks. That looks good enough for future use cases.
