Stable-baselines: Is VecNormalize intentionally not changing the observation space dtype?

Created on 6 Feb 2020 · 9 Comments · Source: hill-a/stable-baselines

Pretty much the title: I found that VecNormalize does not change the data type of the observation space.

If you have an env that returns uint8 observations and wrap it in a VecNormalize and then a VecFrameStack, you end up with uint8 observations instead of the float observations produced by the normalization (because VecFrameStack converts them back).

This might actually work in some cases, since VecNormalize normalizes between two arbitrary values, but for me it caused some confusion. Maybe the output dtype could be a parameter of VecNormalize?

enhancement

NeoExtended
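
To make the report above concrete, here is a minimal NumPy-only sketch of the effect. It does not use stable-baselines itself: the `normalize` helper, the running statistics, and the `stacked` buffer are hypothetical stand-ins, under the assumption (taken from the issue description) that the frame-stacking buffer is allocated with the dtype that the observation space still declares.

```python
import numpy as np

def normalize(obs, mean, var, clip=10.0, eps=1e-8):
    """Mean/std normalization with clipping; the result is float32."""
    out = (obs.astype(np.float64) - mean) / np.sqrt(var + eps)
    return np.clip(out, -clip, clip).astype(np.float32)

# Made-up uint8 "image" of four pixels and made-up running statistics.
obs = np.array([0, 64, 128, 255], dtype=np.uint8)
mean = np.zeros(4)
var = np.full(4, 100.0 ** 2)

normalized = normalize(obs, mean, var)  # roughly [0.0, 0.64, 1.28, 2.55]

# Assumption: the observation space still advertises uint8, so a buffer
# allocated from that dtype is uint8 as well.
declared_dtype = obs.dtype
stacked = np.zeros(obs.shape, dtype=declared_dtype)
stacked[...] = normalized  # assignment silently casts float32 -> uint8

print(normalized.dtype, stacked.dtype)
print(stacked)
```

The fractional part of every normalized value is dropped on assignment, which is why only a handful of distinct integer values survive. Negative normalized values are left out of the sketch on purpose, since casting them to an unsigned type is not well defined.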

All 9 comments

Do your observations happen to be images? I believe VecNormalize is designed for robotics environments, where observation variables can range over unknown intervals (in which case a running mean/std, as done by VecNormalize, is required). If you know your values lie in [0, 255], then you can divide them by 255 and things should work fine.

In any case I agree, the output of VecNormalize should be a float/double, in case somebody uses e.g. int32s. @AdamGleave any comments?

Miffyli on 6 Feb 2020
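
For the known-range case mentioned in this comment, a minimal sketch of the divide-by-255 approach could look like the following. The function name `scale_uint8_image` is hypothetical and not part of stable-baselines; the only assumption is that the observations are uint8 arrays.

```python
import numpy as np

def scale_uint8_image(obs):
    """Map uint8 pixel values in [0, 255] to float32 values in [0, 1]."""
    if obs.dtype != np.uint8:
        raise TypeError("expected uint8 observations, got {}".format(obs.dtype))
    return obs.astype(np.float32) / 255.0

# Made-up 2x2 grayscale frame.
frame = np.array([[0, 51], [102, 255]], dtype=np.uint8)
scaled = scale_uint8_image(frame)
print(scaled.dtype, scaled.min(), scaled.max())
```

Unlike a running mean/std, this needs no statistics to be tracked or saved, but it only makes sense when the value range is known in advance.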

Yep, they are images. I used VecNormalize in the beginning, when I switched over from another environment, and then wondered why my FPS went down when I used the ScaledFloatFrame wrapper instead (which just divides everything by 255.0).

But it's funny that the agent learned anything at all, even with each observation squished down to integer values from 1 to 10.

NeoExtended on 6 Feb 2020

In any case I agree, the output of VecNormalize should be a float/double, in case somebody uses e.g. int32s. @AdamGleave any comments?

Is there a reasonable use case for applying VecNormalize to int data types? It feels odd to try to normalize discrete data, although if the discrete data has a large enough range perhaps it makes sense.

If we don't intend VecNormalize to be used in this case, I'd be leaning towards adding input validation that raises an error if the venv being wrapped does not have a float-type observation_space.

AdamGleave on 6 Feb 2020

👍1
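
The input validation suggested in this comment could be sketched roughly as below. This is not the library's implementation: `check_float_observation_space` is a hypothetical helper, and `SimpleNamespace` objects stand in for real observation spaces, assuming only that a space exposes a `dtype` attribute.

```python
from types import SimpleNamespace

import numpy as np

def check_float_observation_space(observation_space):
    """Raise ValueError unless the space declares a floating-point dtype."""
    dtype = np.dtype(observation_space.dtype)
    if not np.issubdtype(dtype, np.floating):
        raise ValueError(
            "expected a float observation space, got dtype {}".format(dtype)
        )

# Stand-ins for real spaces (assumption: only `.dtype` is needed).
float_space = SimpleNamespace(dtype=np.float32)
uint8_space = SimpleNamespace(dtype=np.uint8)

check_float_observation_space(float_space)  # passes silently

try:
    check_float_observation_space(uint8_space)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

Failing early like this would trade flexibility for clarity: integer observations could no longer be normalized at all, but the silent dtype mismatch described in the issue could not occur.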

Yep they are images

If you use CnnPolicies, images are normalized automatically (cf. the doc).

araffin on 6 Feb 2020

I think there is a use case for applying normalization to image data. For example, OpenAI used it in their large-scale study on curiosity-driven learning. I think the difference between the learned normalization and just dividing everything by 255 is that pixels which never change are actually set to 0 instead of something positive. I think that results in a better focus when learning, but I can't really prove that at the moment; it's just a guess.

@araffin Oh I didn't know that. I now see it in the code, but where is it in the doc? At least not under CNN Policies

EDIT: Formatting issues

NeoExtended on 7 Feb 2020
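
The guess about constant pixels can be illustrated with a small NumPy sketch. The frames are made up, and the per-pixel mean/variance are computed over this tiny batch rather than as a true running estimate, so this only illustrates the idea and is not what VecNormalize does internally.

```python
import numpy as np

# Three made-up frames with two pixels each.
# Pixel 0 never changes, pixel 1 varies between frames.
frames = np.array([[200, 10],
                   [200, 50],
                   [200, 90]], dtype=np.uint8).astype(np.float64)

eps = 1e-8  # avoids division by zero for constant pixels
mean = frames.mean(axis=0)
var = frames.var(axis=0)

normalized = (frames - mean) / np.sqrt(var + eps)  # mean/std normalization
scaled = frames / 255.0                            # fixed scaling

print(normalized[:, 0])  # constant pixel under mean/std normalization
print(scaled[:, 0])      # constant pixel under division by 255
```

Under mean/std normalization the constant pixel is exactly zero in every frame, whereas division by 255 leaves it at a positive constant. Whether this actually helps learning is, as the comment says, only a guess.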

It's here and there

araffin on 7 Feb 2020

👍2

Oh okay...I must have been blind there, thanks.

NeoExtended on 7 Feb 2020