Stable-baselines: RDPG implementation?

Created on 25 Jul 2019 · 3 Comments · Source: hill-a/stable-baselines

Good morning,
I was wondering if you are going to implement the RDPG algorithm and if the answer is yes, within what time-frame?
Thank you.

enhancement question

All 3 comments

Hello,
I assume you are talking about that paper?
Its implementation is currently not on the roadmap (see the milestones and the project tab), but we welcome contributions ;)

Thank you for your answer, and yes, that is the paper.
I will try to implement it over the next few weeks. Do you have any suggestions on how to approach the implementation? (For example, whether I can reuse some portions of code, like the recurrent policies of PPO2, or whether it's useful to start from the DDPG implementation.) Note that if I'm successful, I also want to incorporate the HER part.

I would start from the TD3 implementation (which is cleaner than the current DDPG). You can find it on the td3 branch (it will be merged with master soon anyway).
Anyway, I don't know much about this paper, so I'll let you find out ;) (And the PPO2 implementation of recurrent policies is quite complicated; I would try to do something simpler.)
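One possible "simpler" starting point for a recurrent off-policy method is a replay buffer that stores whole episodes and samples fixed-length contiguous sequences, so a recurrent network can be unrolled over them. The sketch below is only an illustration under that assumption: it is not part of stable-baselines or its td3 branch, and `EpisodeReplayBuffer` and all of its methods are hypothetical names.

```python
import random
from collections import deque


class EpisodeReplayBuffer:
    """Hypothetical helper (not a stable-baselines class).

    Stores whole episodes and samples contiguous sequences of transitions.
    """

    def __init__(self, max_episodes, seed=None):
        self.episodes = deque(maxlen=max_episodes)  # oldest episodes are evicted first
        self.current = []  # transitions of the episode in progress
        self.rng = random.Random(seed)

    def add(self, obs, action, reward, next_obs, done):
        self.current.append((obs, action, reward, next_obs, done))
        if done:
            # Close the episode so sampled sequences never span a boundary.
            self.episodes.append(self.current)
            self.current = []

    def sample(self, batch_size, seq_len):
        # Only episodes with at least seq_len steps can provide a full sequence.
        candidates = [ep for ep in self.episodes if len(ep) >= seq_len]
        if not candidates:
            raise ValueError("no stored episode is long enough")
        batch = []
        for _ in range(batch_size):
            episode = self.rng.choice(candidates)
            start = self.rng.randint(0, len(episode) - seq_len)  # inclusive bounds
            batch.append(episode[start:start + seq_len])
        return batch


# Usage: one dummy 10-step episode, then a batch of 4 sequences of length 5.
buffer = EpisodeReplayBuffer(max_episodes=100, seed=0)
for t in range(10):
    buffer.add(obs=t, action=0.0, reward=1.0, next_obs=t + 1, done=(t == 9))
batch = buffer.sample(batch_size=4, seq_len=5)
```

Each element of `batch` is a list of `(obs, action, reward, next_obs, done)` tuples in time order. How the hidden state is initialised at the start of each sampled sequence is a separate design choice that this sketch does not address.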
