Stable-baselines: [feature request] custom transformation of observation space

Created on 6 Dec 2018 · 6 Comments · Source: hill-a/stable-baselines

Hello,

I often need to manually transform the observation space shape and associated observations in order to match custom policies I'm using. Would it be interesting to add a pre-processing mechanism that would:

- customize the shape of the input. For now the shape is fixed, as in this Box space case: https://github.com/hill-a/stable-baselines/blob/a0b35d1f87046802baadbcbed59d3619a5f9bd92/stable_baselines/common/input.py#L25
  I guess there are other places where this would need to be adapted.
- transform the observation to fit the desired shape. An example of that would be:

def transform(obs):
    return np.reshape(obs, ...)
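
To make the idea concrete, here is a minimal sketch of such a transformer using only NumPy. The target shape `(4, 4, 1)` and the 16-feature observation are assumptions chosen purely for illustration; they do not come from stable-baselines or from any particular environment.

```python
import numpy as np

# Hypothetical target shape (height, width, channels), chosen only for illustration.
TARGET_SHAPE = (4, 4, 1)


def transform(obs, target_shape=TARGET_SHAPE):
    """Reshape a flat observation into an image-like array of target_shape."""
    obs = np.asarray(obs)
    expected_size = int(np.prod(target_shape))
    if obs.size != expected_size:
        # Fail early with a readable message instead of a bare reshape error.
        raise ValueError(
            "cannot reshape observation of size {} into shape {}".format(obs.size, target_shape)
        )
    return obs.reshape(target_shape)


flat_obs = np.arange(16, dtype=np.float32)  # stand-in for a 16-feature observation
image_like = transform(flat_obs)
print(image_like.shape)  # (4, 4, 1)
```

The observation space advertised to the model would need the same reshape applied to its bounds, otherwise the input placeholder and the observations would disagree.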

I guess there would be at least 2 options to expose a custom transformer: add it as a parameter to the algorithm, or register it (2nd option preferred I think).

Labels: custom gym env, enhancement


antoine-galataud

All 6 comments

I'm not sure I understand what you mean.

If you want to change the observation shape from the environment, you can use a custom environment wrapper that can transform your observation before it is used by the model.

If you want to change the way the batch shapes are handled, I wouldn't mind an example, as I'm not sure how this could be used.

Or do you mean something else?

hill-a on 6 Dec 2018

👍1

I mostly agree with @hill-a. I would add one thing: what you described seems related to a custom environment.
Does something prevent you from doing the transformation inside the environment?

araffin on 6 Dec 2018

Well I think my use case is quite specific then :)

An example of what I meant would be using policies with convolutional NNs on an environment that doesn't have images as observations. This would require transforming both the observation (I agree this can be done in the environment) and the input shape (AFAIK this requires a modification of the input shape that can't be done in the custom policy alone).

antoine-galataud on 6 Dec 2018

> convolutional NNs but with an environment that doesn't have images as observations

If your observation is not an image, I would recommend using MLP policies, unless it is something other than a feature vector. If your observation is a 3-dimensional tensor (i.e. it is shaped like an image), then it should work.

araffin on 6 Dec 2018

Wouldn't this work? (Granted, the doc examples do not show that this is possible.)

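import numpy as np
from gym import spaces

from stable_baselines.common.vec_env import VecEnvWrapper


class CustomVecEnvWrapper(VecEnvWrapper):
    """
    A custom vectorized environment wrapper that reshapes observations and actions.
    Both spaces of the wrapped environment are assumed to be Box spaces.

    :param venv: (VecEnv) the vectorized environment to wrap
    :param obs_shape: (tuple) shape of a single observation, as seen by the model
    :param action_shape: (tuple) shape of a single action, as seen by the model
    """

    def __init__(self, venv, obs_shape, action_shape):
        self.venv = venv
        self.obs_shape = tuple(obs_shape)
        self.action_shape = tuple(action_shape)
        # Shape of a single action as expected by the wrapped environment
        self.venv_action_shape = venv.action_space.shape

        obs_low = venv.observation_space.low.reshape(self.obs_shape)
        obs_high = venv.observation_space.high.reshape(self.obs_shape)
        observation_space = spaces.Box(low=obs_low, high=obs_high, dtype=venv.observation_space.dtype)

        action_low = venv.action_space.low.reshape(self.action_shape)
        action_high = venv.action_space.high.reshape(self.action_shape)
        action_space = spaces.Box(low=action_low, high=action_high, dtype=venv.action_space.dtype)

        VecEnvWrapper.__init__(self, venv, observation_space=observation_space, action_space=action_space)

    def step_async(self, actions):
        # Convert actions back to the shape the wrapped environment expects,
        # keeping the leading n_envs axis
        actions = np.asarray(actions).reshape((-1,) + self.venv_action_shape)
        self.venv.step_async(actions)

    def step_wait(self):
        observations, rewards, dones, infos = self.venv.step_wait()
        # Keep the leading n_envs axis and reshape only the per-environment part
        return observations.reshape((-1,) + self.obs_shape), rewards, dones, infos

    def reset(self):
        """
        Reset all environments
        """
        obs = self.venv.reset()
        return obs.reshape((-1,) + self.obs_shape)

    def close(self):
        self.venv.close()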

EDIT: added action reshaping as well
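
One detail worth checking in a wrapper like this: a vectorized environment returns observations stacked along a leading `n_envs` axis, so any reshape has to preserve that axis. The sketch below shows this with plain NumPy; `n_envs`, `obs_shape` and the stand-in batch are hypothetical values used only for illustration, not output from a real `VecEnv`.

```python
import numpy as np

n_envs = 3
obs_shape = (2, 2, 1)  # hypothetical per-environment target shape

# Stand-in for a batch of observations: one flat 4-feature row per environment.
batched_obs = np.arange(n_envs * 4, dtype=np.float32).reshape(n_envs, 4)

# Keep the leading batch axis and reshape only the per-environment part.
reshaped = batched_obs.reshape((-1,) + obs_shape)
print(reshaped.shape)  # (3, 2, 2, 1)

# Reshaping to obs_shape alone only works when n_envs == 1.
try:
    batched_obs.reshape(obs_shape)
except ValueError as err:
    print("reshape without the batch axis failed:", err)
```

Because NumPy reshapes in row-major order, each environment's data stays in its own slice of the result.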

hill-a on 6 Dec 2018

@hill-a thanks. I was about to write my own :)

antoine-galataud on 6 Dec 2018

👍2
