Stable-baselines: [Question] DDPG action space symmetric?

Created on 27 Nov 2018 · 5 Comments · Source: hill-a/stable-baselines

Hi again,
Why does the action space need to be symmetric in DDPG learning?

RGring

All 5 comments

DDPG uses tanh before output (so its output lies in [-1, 1]) and then this output is rescaled.
Because of that, it can only handle symmetric action spaces.
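
A minimal sketch of why this leads to symmetric bounds, assuming the rescaling is a plain multiplication by one positive bound per action dimension (an assumption for illustration; the thread does not show the actual stable-baselines code). `scale_symmetric` is a hypothetical helper name.

```python
import math


def scale_symmetric(pre_activation, bound):
    """Squash with tanh, then scale by a single positive bound (assumed scheme)."""
    return math.tanh(pre_activation) * bound


# With bound = 2.0 the reachable actions are centered on zero and span
# roughly -2 to 2; a range such as [0, 1] cannot be written as bound * [-1, 1].
lowest = scale_symmetric(-20.0, 2.0)
middle = scale_symmetric(0.0, 2.0)
highest = scale_symmetric(20.0, 2.0)
print(lowest, middle, highest)
```

Under this assumption, covering an asymmetric range needs an extra offset term on top of the scaling.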

araffin on 27 Nov 2018

👍2

Okay. Is it not possible to just remap, e.g. [-1, 1] --> [0, 1]?

RGring on 27 Nov 2018

Well, nothing prevents you from doing that in your env.
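
A minimal sketch of what such an env-side remapping could look like, assuming an env whose native action range is [0, 1] and an agent that emits scalar actions in [-1, 1]. `DummyEnv` and `RemapActionEnv` are hypothetical names for illustration and not part of stable-baselines; with Gym, the same mapping would typically go into a `gym.ActionWrapper` subclass by overriding `action`.

```python
class DummyEnv:
    """Hypothetical stand-in env that expects scalar actions in [0, 1]."""

    low, high = 0.0, 1.0

    def step(self, action):
        assert self.low <= action <= self.high, "action out of bounds"
        # Echo the action back as the observation so the mapping is visible.
        return action, 0.0, False, {}


class RemapActionEnv:
    """Expose a symmetric [-1, 1] action interface over an env with [low, high]."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        # Clip to guard against small numerical overshoot, then map linearly.
        action = max(-1.0, min(1.0, action))
        low, high = self.env.low, self.env.high
        return self.env.step(low + (action + 1.0) * (high - low) / 2.0)


env = RemapActionEnv(DummyEnv())
obs, reward, done, info = env.step(-1.0)  # the wrapped env receives 0.0
```

The agent then only ever sees the symmetric interface, while the wrapped env keeps its original bounds.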

araffin on 27 Nov 2018

@araffin Why not just rescale the agent's actions within the agent code to fit the bounds of the action space? Seems like it would make the agent code more generic.

Any reason that you couldn't just apply a linear transformation?

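def rescale_actions(tanh_output, low, high):
    # Map a tanh output in [-1, 1] linearly onto [low, high].
    half_range = (high - low) / 2.0  # avoids shadowing the built-in `range`
    return low + (tanh_output + 1.0) * half_range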

csaroff on 22 Mar 2019

👍1

Yes, I agree with @csaroff, it's cumbersome to do it ourselves.

yutao-li on 1 May 2019
