Pytorch-lightning: Tidy up returns from `ddp_train()` of accelerators

Created on 20 Oct 2020  路  2Comments  路  Source: PyTorchLightning/pytorch-lightning

馃悰 Bug

There are currently inconsistent returns from the method ddp_train of accelerator classes.

For example, the method ddp_train of DDP2Accelerator returns results while ddp_train of DDPSpawnAccelerator returns None. Although I am not familiar with distributed training, it seems that both of the methods should return results (or both should return None) for consistency.
https://github.com/PyTorchLightning/pytorch-lightning/blob/eddf35a379efbd1b8fbedddb0d4b550e8e5ca48d/pytorch_lightning/accelerators/ddp_spawn_accelerator.py#L76
https://github.com/PyTorchLightning/pytorch-lightning/blob/f37444fa3e82011e4a71b5ca8dc897eff9ba0fa3/pytorch_lightning/accelerators/ddp2_accelerator.py#L120

Expected returns

All methods whose names are the same (i.e. ddp_train()) should return the same type of object (e.g. results).

Additional context

This inconsistency was found while handling #4232.

DDP Priority P0 bug / fix help wanted

All 2 comments

I think this is also where more typehint coverage + Pyre would detect issues based on static analysis

I鈥檓 not sure this is accurate.

sometimes they can鈥檛 really return something or it doesn鈥檛 make sense to.

ddp spawn can鈥檛 return anything :)
the functions are in a subprocess, and you can鈥檛 share objects between the processes unless you use a queue.

so, if you notice, ddpspawn train does not return anything but i think it tries to use a queue to pull out info and put it back on the main process.

in the example you mentioned, at the end of ddp spawn there is no return BUT there is a transfer state function... this function does return a few things but via the correct way which is a queue.

i suggest to watch the video on how ddp works which will explain how forking into processes happens (ie: mp.spawn, and hence the name), and why the use of spawn introduces limitations and renders returns impossible

Was this page helpful?
0 / 5 - 0 ratings

Related issues

edenlightning picture edenlightning  路  3Comments

baeseongsu picture baeseongsu  路  3Comments

DavidRuhe picture DavidRuhe  路  3Comments

srush picture srush  路  3Comments

Vichoko picture Vichoko  路  3Comments