Is it possible to run parallel training on multiple machines? I use computation-heavy perception methods, and my training is much slower than in the example scenes. To speed up training I tried running multiple Unity instances, but it didn't help much. Is distributed training across different machines possible, and if not, will it be included in future releases?
From their blog post: _Our work doesn't stop here; we are also working on techniques to train multiple levels concurrently by scaling out training across multiple machines._
@roboserg You are right, we are still working on this.
Hi @ertugrulerdogan and @roboserg - I've documented this and will update when we make more progress.
@unityjeffrey The project I am currently working on requires this feature. Is there any ETA? If not, is it possible for me to work on this feature? Thank you so much :D
@Taikatou You can try using RLlib (https://ray.readthedocs.io/en/latest/rllib.html) along with the gym wrapper we provide for this. Our own parallel training across multiple machines will need more time.
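For anyone who wants to try the RLlib route, here is a minimal sketch of wiring the ML-Agents gym wrapper into RLlib. This is untested and makes assumptions: `ENV_PATH` is a placeholder for your built Unity executable, and you need `ray`, `gym_unity`, and a Unity build on each machine. The per-worker `worker_id` trick avoids port collisions between concurrent Unity instances.

```python
# Sketch (untested): training a Unity environment with RLlib's PPO.
# Assumes ray and gym_unity are installed; ENV_PATH is a placeholder.
import ray
from ray import tune
from ray.tune.registry import register_env
from gym_unity.envs import UnityEnv

ENV_PATH = "./builds/MyEnvironment"  # placeholder: path to your Unity build


def make_unity_env(env_config):
    # Each RLlib rollout worker gets a distinct worker_id so that the
    # Unity instances listen on different ports instead of colliding.
    return UnityEnv(ENV_PATH, worker_id=env_config.worker_index)


register_env("unity_env", make_unity_env)

ray.init()
tune.run(
    "PPO",
    config={
        "env": "unity_env",
        "num_workers": 4,  # one Unity instance per rollout worker
    },
)
```

With `ray.init(address=...)` pointed at a Ray cluster, the same script scales the rollout workers out across machines, which is effectively the distributed setup asked about above.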
Thank you for submitting this request. We've added it to our internal tracker. I'm going to close this issue for now, but we'll ping back with any updates.
Also note that parallel environments were improved in 0.9; they no longer block each other during training. Give it a go. Also, if the environments are the bottleneck, the SAC trainer in v0.10 should help quite a bit, even in single-machine training.
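For anyone trying this: parallel environments are enabled with the `--num-envs` option to `mlagents-learn`, and switching a brain to SAC is a small change in `trainer_config.yaml`. A minimal sketch of the config change (the brain name and hyperparameter values here are placeholders; check the trainer documentation for the full list of SAC options):

```yaml
MyBrain:
    trainer: sac        # was: ppo
    buffer_size: 50000  # SAC is off-policy, so it benefits from a large replay buffer
    batch_size: 128
```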