Stable-baselines: [question] ppo2 and prioritized experience replay

Created on 14 Dec 2018 · 4 Comments · Source: hill-a/stable-baselines

I need to test ppo2 with prioritized experience replay, and I wonder if anyone has written a similar integration before I go ahead and write it from scratch.

AloshkaD

All 4 comments

Hello,
PPO is meant to be on-policy: the policy that generates the samples needs to be the same one that is being optimized, not an older version. So I don't think it really makes sense to have an experience replay in that case.

araffin on 14 Dec 2018
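
To illustrate the on-policy point above, here is a minimal sketch of PPO's clipped surrogate objective for a single sample. It is not taken from stable-baselines; the function name `clipped_surrogate` and the probabilities used are made up for illustration. The point it shows: for a stale sample, such as one a replay buffer would return, the probability ratio can fall far outside the clip range, where the objective is flat and the sample contributes no gradient.

```python
import math

def clipped_surrogate(logp_new, logp_old, advantage, clip_range=0.2):
    """PPO clipped objective for one sample (hypothetical helper, to be maximized).

    logp_old is the log-probability under the policy that collected the sample,
    logp_new is the log-probability under the policy being optimized.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    objective = min(ratio * advantage, clipped_ratio * advantage)
    return objective, ratio

# Fresh sample: the collecting policy is the current policy, so the ratio is 1.
fresh_obj, fresh_ratio = clipped_surrogate(math.log(0.5), math.log(0.5), advantage=1.0)

# Stale sample: the current policy has drifted far from the one that collected it.
# The ratio is about 3, well above 1 + clip_range, so the objective is clipped
# and no longer changes when the ratio changes.
stale_obj, stale_ratio = clipped_surrogate(math.log(0.9), math.log(0.3), advantage=1.0)

print(fresh_ratio, fresh_obj)
print(stale_ratio, stale_obj)
```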

👍2

My bad, I meant to say dueling DQN. I was coding with ppo2 and it was stuck in my head.

AloshkaD on 14 Dec 2018

Well then, I don't really understand your question either. Prioritized experience replay is already implemented for DDQN in stable-baselines.

araffin on 14 Dec 2018

I see, I missed that in the code. I'll give it a second look. Thanks!

AloshkaD on 14 Dec 2018
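
For readers landing on this thread: as far as I know, the stable-baselines `DQN` constructor exposes this through a `prioritized_replay=True` argument, but check the documentation for your version. The sketch below is not the library's code. It is a small, self-contained illustration of proportional prioritized sampling with importance-sampling weights, and the class name `TinyPrioritizedBuffer` and its parameters are hypothetical.

```python
import random

class TinyPrioritizedBuffer:
    """Illustrative proportional prioritized replay (O(n) sampling, no sum tree)."""

    def __init__(self, alpha=0.6):
        self.alpha = alpha          # how strongly priorities skew sampling (0 = uniform)
        self.items = []
        self.priorities = []

    def add(self, transition, td_error, eps=1e-6):
        # Priority grows with the magnitude of the TD error; eps keeps it non-zero.
        self.items.append(transition)
        self.priorities.append((abs(td_error) + eps) ** self.alpha)

    def sample(self, batch_size, beta=0.4, rng=random):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = rng.choices(range(len(self.items)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        # They are normalized by the largest possible weight so that all are <= 1.
        n = len(self.items)
        max_weight = (n * min(probs)) ** (-beta)
        weights = [((n * probs[i]) ** (-beta)) / max_weight for i in idxs]
        return [self.items[i] for i in idxs], weights, idxs

buffer = TinyPrioritizedBuffer()
buffer.add("low_error_transition", td_error=0.1)
buffer.add("high_error_transition", td_error=10.0)

batch, weights, idxs = buffer.sample(1000, rng=random.Random(0))
high_count = batch.count("high_error_transition")
low_count = batch.count("low_error_transition")
print(high_count, low_count)
```

The transition with the larger TD error is drawn far more often, and it receives a smaller importance-sampling weight to compensate. A real implementation would also update priorities after each training step and anneal `beta` toward 1.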
