Dear all,
Thank you for this excellent library! I'm just wondering if there is intent to add support at some point for multi-agent environments?
Hello,
> if there is intent to add support at some point for multi-agent environments?
This is not part of the current roadmap (see projects). For now we are focusing on the model-free, single-agent setting, and in fact on improving the current library (cf. milestones).
However, the implementation can be discussed here. What should be implemented and how?
You can also take a look at this related issue: https://github.com/hill-a/stable-baselines/issues/181
(if you think this is a duplicate, then please close the issue ;))
Thanks for the response! I think those issues are similar but my request is a slight superset of theirs.
It does look like a fair bit of refactoring would be needed to add multi-agent support to this library. The way I imagine it, you would have a class that tracks multi-agent batches in a dict keyed by policy ID and then passes the appropriate samples to the policy graph with that ID. However, this would require moving the runner out of the training loop of every algorithm. I do really like the library, so I might see if we can get someone from our team (https://flow-project.github.io/team.html) to give it a try.
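To make the proposal a bit more concrete, here is a minimal sketch of what such a batch-tracking class might look like. Everything in it is an assumption for illustration: `MultiAgentBatch`, `agent_to_policy`, `add`, and `dispatch` are hypothetical names, not part of stable-baselines, and the "policies" are stand-in callables rather than real policy graphs.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# A single transition; the keys used below are illustrative only.
Transition = Dict[str, Any]


class MultiAgentBatch:
    """Hypothetical helper: groups samples by the policy that should train on them."""

    def __init__(self, agent_to_policy: Dict[str, str]) -> None:
        # Maps each agent id to the id of the policy that controls it.
        self.agent_to_policy = agent_to_policy
        self._samples: Dict[str, List[Transition]] = defaultdict(list)

    def add(self, agent_id: str, transition: Transition) -> None:
        """Store a transition under the policy id of the agent that produced it."""
        policy_id = self.agent_to_policy[agent_id]
        self._samples[policy_id].append(transition)

    def dispatch(
        self, policies: Dict[str, Callable[[List[Transition]], Any]]
    ) -> Dict[str, Any]:
        """Pass each policy its own samples, then clear the batch."""
        results = {
            policy_id: policies[policy_id](samples)
            for policy_id, samples in self._samples.items()
        }
        self._samples.clear()
        return results


# Two agents share one policy; a third agent has its own.
batch = MultiAgentBatch({"car_0": "shared", "car_1": "shared", "light_0": "signal"})
batch.add("car_0", {"obs": 0.1, "action": 1, "reward": 0.5})
batch.add("car_1", {"obs": 0.3, "action": 0, "reward": 0.2})
batch.add("light_0", {"obs": 0.9, "action": 2, "reward": 1.0})

# Stand-in "policies" that just count the samples they receive.
results = batch.dispatch({"shared": len, "signal": len})
```

In this sketch the rollout collection lives outside any one algorithm, which is the part that would require pulling the runner out of each training loop.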
There's a multi-agent version of Gym now:
Closing this in favor of https://github.com/DLR-RM/stable-baselines3/issues/69