Stable-baselines: DDPG and SAC for discrete action space.

Created on 26 Jul 2019 · 4 Comments · Source: hill-a/stable-baselines

[question] Is there any reason why DDPG and SAC have no implementation for discrete action spaces? I would also appreciate any suggestions for applying DDPG, which assumes a continuous action space, to a discrete one. Thanks!

duplicate question

All 4 comments

Hello,
For DDPG, you can already find an answer here: https://github.com/hill-a/stable-baselines/issues/37
For SAC, an implementation with discrete actions is not trivial, and the algorithm was developed for use on robots, which means continuous actions. Those are the main reasons. Meanwhile, if you want to work with discrete actions, plenty of other algorithms support them (ACER, PPO, DQN, A2C, ACKTR, ...).
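
On the first part of the question (running a continuous-action algorithm such as DDPG on a discrete task), one common workaround is to let the agent act in a continuous space and convert its output to a discrete index before stepping the environment. The sketch below only illustrates that idea. It is not necessarily what issue #37 recommends, and it is not part of the stable-baselines API; the function names `continuous_to_discrete` and `scores_to_discrete` are hypothetical.

```python
def continuous_to_discrete(action, n_actions, low=-1.0, high=1.0):
    """Map a scalar continuous action in [low, high] to an index in [0, n_actions - 1].

    Hypothetical helper: the range [low, high] is an assumption about how the
    continuous policy's output is bounded.
    """
    if n_actions < 1:
        raise ValueError("n_actions must be at least 1")
    # Clip so that out-of-range policy outputs still map to a valid index.
    clipped = min(max(action, low), high)
    # Rescale to [0, 1], then split that interval into n_actions equal bins.
    fraction = (clipped - low) / (high - low)
    # The upper bound itself would land in bin n_actions, so cap the index.
    return min(int(fraction * n_actions), n_actions - 1)


def scores_to_discrete(scores):
    """Pick the index of the largest entry when the policy emits one score per action."""
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    # A scalar output in [-1, 1] mapped onto 4 discrete actions.
    print(continuous_to_discrete(0.0, n_actions=4))
    # A vector output with one score per discrete action.
    print(scores_to_discrete([0.1, 0.7, 0.2]))
```

In practice a conversion like this would probably live in an environment wrapper (for example a `gym.ActionWrapper`) so the agent only ever sees a continuous `Box` action space. Binning imposes an ordering on actions that may not exist in the task, so it can learn poorly; for genuinely discrete problems the algorithms listed above are likely the safer choice.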

That was more than a year ago. Any recent news on this topic, @araffin?
It would be nice if SAC could take a discrete action space as input.

We have an issue about that in the Stable-Baselines3 repo: https://github.com/DLR-RM/stable-baselines3/issues/157

But I would favor QR-DQN first in the contrib repo.

Thank you for letting me know @araffin
