Hi,
I was wondering whether you happen to have A2C hyperparameters for MuJoCo that reproduce results close to those in the PPO paper [PPO paper, Figure 3, A2C results]? Or any A2C hyperparameters that work for MuJoCo?
Thanks.
Hello,
Please wait a bit or use the gail-test branch (see PR #206), which will be merged into master soon.
In the master branch, there is a tricky bug in A2C with continuous actions, but fortunately it is easy to fix (see https://github.com/hill-a/stable-baselines/pull/206/commits/689afd16f5b07d2fead1fa5e8474a8efa2826a64 for the fix).
For the hyperparameters, I would recommend taking a look at the rl baselines zoo on the add-trpo branch. It has hyperparameters for PyBullet envs, which are similar to the MuJoCo ones and a bit harder.
From what I remember, the default hyperparameters were working quite well for A2C.
EDIT: it seems that A2C needs some hyperparameter tuning for MuJoCo (I'm currently running some)
EDIT: the branch is now merged with master ;)
Hi,
Thanks.
It would be very helpful if you could share the A2C hyperparameters once you have them. It seems A2C needs different hyperparameters for MuJoCo than for Atari.
Thanks again for your help.
Hey,
here are the best hyperparameters found so far (using the add-trpo branch of the rl baselines zoo) with stable-baselines v2.5.0 (please upgrade ;)):
HalfCheetahBulletEnv-v0:
  normalize: true
  n_envs: 8
  n_timesteps: !!float 2e6
  policy: 'MlpPolicy'
  ent_coef: 0.0
  n_steps: 32
  vf_coef: 0.5
  lr_schedule: 'linear'
  gamma: 0.99
  learning_rate: 0.0013
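For anyone who wants to try these values outside the zoo, here is a rough sketch of what they imply. It is not the zoo's own training script. The helper names (`rollout_batch_size`, `n_updates`, `linear_lr`, `build_model`) are made up for illustration, and `linear_lr` assumes that 'linear' means the learning rate decays linearly from its initial value to zero over training. `build_model` assumes stable-baselines 2.x, gym and pybullet are installed and uses `DummyVecEnv` plus `VecNormalize` as a stand-in for `n_envs` and `normalize`; I have not checked it against v2.5.0 specifically.

```python
# Hyperparameters copied from the HalfCheetahBulletEnv-v0 entry above.
A2C_PARAMS = {
    "normalize": True,
    "n_envs": 8,
    "n_timesteps": 2e6,
    "policy": "MlpPolicy",
    "ent_coef": 0.0,
    "n_steps": 32,
    "vf_coef": 0.5,
    "lr_schedule": "linear",
    "gamma": 0.99,
    "learning_rate": 0.0013,
}


def rollout_batch_size(params):
    """Transitions collected per gradient update: n_envs * n_steps."""
    return params["n_envs"] * params["n_steps"]


def n_updates(params):
    """Number of full updates that fit into the timestep budget."""
    return int(params["n_timesteps"]) // rollout_batch_size(params)


def linear_lr(initial_lr, step, total_steps):
    """Assumed 'linear' schedule: decay from initial_lr to 0 over total_steps."""
    return initial_lr * (1.0 - step / total_steps)


def build_model(params=A2C_PARAMS, env_id="HalfCheetahBulletEnv-v0"):
    """Hypothetical helper: build an A2C model from the dict above."""
    # Imports are local so the arithmetic helpers work without these packages.
    import gym
    import pybullet_envs  # noqa: F401  (importing registers the Bullet envs)
    from stable_baselines import A2C
    from stable_baselines.common.vec_env import DummyVecEnv, VecNormalize

    env = DummyVecEnv([lambda: gym.make(env_id) for _ in range(params["n_envs"])])
    if params["normalize"]:
        env = VecNormalize(env)
    return A2C(
        params["policy"],
        env,
        gamma=params["gamma"],
        n_steps=params["n_steps"],
        vf_coef=params["vf_coef"],
        ent_coef=params["ent_coef"],
        learning_rate=params["learning_rate"],
        lr_schedule=params["lr_schedule"],
        verbose=1,
    )


if __name__ == "__main__":
    print(rollout_batch_size(A2C_PARAMS))  # 256
    print(n_updates(A2C_PARAMS))           # 7812
```

With the packages installed, training would then be something like `build_model().learn(total_timesteps=int(A2C_PARAMS["n_timesteps"]))`. Note that each update uses 256 transitions here, compared with 5 steps per environment in the A2C defaults.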
Thanks a lot. I really appreciate the update.
You're welcome. By the way, since I don't have a MuJoCo licence, I would be interested in your results ;)
I just published a paper with optimized parameters for A2C on pybullet environments: