Stable-baselines: [question] ppo2 and prioritized experience replay

Created on 14 Dec 2018 · 4 Comments · Source: hill-a/stable-baselines

I need to test ppo2 with a prioritized experience replay and I wonder if anyone wrote a similar integration before I go ahead and write it from scratch.

question

All 4 comments

Hello,
PPO is meant to be on-policy: the policy that generates the samples needs to be the same one that is being optimized, not an older version. So I don't think it really makes sense to use experience replay in that case.
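
For context, here is a sketch of where the on-policy requirement shows up in PPO's objective. The notation follows the usual presentation of the clipped surrogate loss and is not taken from this thread: $\pi_{\theta_\text{old}}$ is the policy that collected the batch, $\hat{A}_t$ is an advantage estimate, and $\epsilon$ is the clip range.

```latex
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_\text{old}}(a_t \mid s_t)},
\qquad
L^{\text{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[
  \min\!\left( r_t(\theta)\,\hat{A}_t,\;
  \operatorname{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right)
\right]
```

The ratio $r_t(\theta)$ only corrects for the small gap between the current policy and the one that collected the data. Transitions drawn from an old replay buffer would typically push the ratio far outside $[1-\epsilon,\, 1+\epsilon]$, where the clipping removes most of the learning signal.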

My bad, I meant to say dueling DQN. I was coding a PPO2 and it was stuck in my head.

Well then, I don't really understand your question either. Prioritized experience replay is already implemented for DDQN in stable-baselines.
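
In stable-baselines this appears to be toggled through the `prioritized_replay` argument of the `DQN` constructor, but check the version you have installed. To show what the technique does, here is a minimal standard-library sketch of proportional prioritized replay. It is not the stable-baselines implementation: the class name, the list-based storage, and the default values of `alpha`, `beta`, and `eps` are assumptions chosen for illustration, and a real buffer would normally use a sum tree instead of O(n) sampling.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay (hypothetical, list-based)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps          # keeps every priority strictly positive
        self.storage = []
        self.priorities = []
        self.next_idx = 0

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so they are likely to be sampled before their TD error is known.
        max_prio = max(self.priorities, default=1.0)
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
            self.priorities.append(max_prio)
        else:
            # Buffer is full: overwrite the oldest slot (ring buffer).
            self.storage[self.next_idx] = transition
            self.priorities[self.next_idx] = max_prio
        self.next_idx = (self.next_idx + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.storage)
        idxs = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        # They are normalized by the batch maximum here for simplicity.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.storage[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priority becomes |TD error| + eps.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

A training loop would call `sample`, scale each transition's loss by its weight, and then pass the new TD errors to `update_priorities`.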

I see, I missed that in the code. I'll give it a second look. Thanks!
