Ray: [rllib] Make reverb available as a type of replay actor?

Created on 1 Jun 2020 · 4 comments · Source: ray-project/ray

Describe your feature request

https://deepmind.com/research/open-source/Reverb looks like it could be quite efficient as a replay buffer implementation. Currently, our replay buffer code is written in Python / numpy.
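For context, a minimal sketch of Reverb's server/client replay pattern (the table name "replay", the sizes, and the port are illustrative choices, not anything proposed in this issue):

```python
# Minimal Reverb sketch: a uniform-sampling replay table served over gRPC.
# Requires `pip install dm-reverb`; names, sizes, and port are illustrative.
import numpy as np
import reverb

server = reverb.Server(
    tables=[
        reverb.Table(
            name="replay",
            sampler=reverb.selectors.Uniform(),   # sample items uniformly
            remover=reverb.selectors.Fifo(),      # evict oldest items when full
            max_size=100_000,
            rate_limiter=reverb.rate_limiters.MinSize(1),
        )
    ],
    port=8000,
)

client = reverb.Client("localhost:8000")
# Insert one dummy (obs, action, reward) transition; a real integration would
# insert RLlib sample batch columns instead.
client.insert([np.zeros(4), np.int64(0), np.float64(1.0)], priorities={"replay": 1.0})
# Sample it back out.
for sample in client.sample("replay", num_samples=1):
    print(sample[0].data)
```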

Labels: enhancement, rllib, stale


All 4 comments

Hey Eric, could you comment on the relationship between DM's acme and ray/rllib? I.e., what does one have that the other doesn't, and how can the two interoperate, if at all?

@yutaizhou, from a brief look, these are the differences that stand out to me:

(1) target users -- acme is explicitly targeted at research reproducibility, and has more extensive support for DM benchmark suites. RLlib is more generally targeted at complex applied RL and multi-agent use cases. However, both are fairly general libraries, much more so than most others, I think.

(2) architecture choices -- there are a couple of things, the first being "push vs pull" of experiences. RLlib pulls experiences from RolloutWorkers (unless you are using client/server mode), while Acme pushes them from individual, unmanaged env loops. The managed rollout generation in RLlib enables multi-agent support and optimizations such as inference vectorization. The second difference is that RLlib builds on a general-purpose distributed system, whereas Acme relies on Reverb for distributed coordination.

(3) framework support -- RLlib supports TF(2) / PyTorch, Acme supports TF2 / JAX.

With respect to interop, leveraging reverb is a possible integration point (similar to how DD-PPO leverages torch.distributed in RLlib); see the sketch below. In an ideal world, users would be able to write policies that can execute in either RLlib or acme without too much fuss. Whether this is possible requires more investigation.
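As a rough illustration of that integration point, a Reverb table could be fronted by a Ray actor so that rollout workers push experiences into it much like they would with the existing Python replay actors. This is only a sketch under assumptions; `ReverbReplayActor`, its methods, and the port choice are hypothetical and not an existing RLlib API:

```python
# Hypothetical sketch: wrapping a Reverb server/client behind a Ray actor.
import numpy as np
import ray
import reverb


@ray.remote
class ReverbReplayActor:
    """Hypothetical Ray actor that fronts a Reverb server as a replay buffer."""

    def __init__(self, max_size: int = 100_000, port: int = 8000):
        # Each actor hosts its own Reverb server with a single uniform table.
        self._server = reverb.Server(
            tables=[
                reverb.Table(
                    name="replay",
                    sampler=reverb.selectors.Uniform(),
                    remover=reverb.selectors.Fifo(),
                    max_size=max_size,
                    rate_limiter=reverb.rate_limiters.MinSize(1),
                )
            ],
            port=port,
        )
        self._client = reverb.Client(f"localhost:{port}")

    def add(self, transition):
        # A real integration would insert RLlib SampleBatch columns here.
        self._client.insert(transition, priorities={"replay": 1.0})

    def sample(self, batch_size: int):
        samples = self._client.sample("replay", num_samples=batch_size)
        return [s[0].data for s in samples]


# Usage sketch: push one dummy transition and sample it back.
ray.init(ignore_reinit_error=True)
buf = ReverbReplayActor.remote()
ray.get(buf.add.remote([np.zeros(4), np.int64(1), np.float64(0.5)]))
print(ray.get(buf.sample.remote(1)))
```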

@yutaizhou We will also take a look at possibly adding JAX support to RLlib.

Hi, I'm a bot from the Ray team :)

To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public Slack channel.
