Gym: env.action_space.sample() doesn't follow env.seed() ?

Created on 8 Aug 2017 · 3 Comments · Source: openai/gym

When I set env.seed(0) (or some other seed), I expected all random elements of env to behave deterministically. However, env.action_space.sample() still seems to produce random output.

import gym

a1 = []
a2 = []

# First environment, seeded with 0
env1 = gym.make('FrozenLake-v0')
env1.seed(0)
s1 = env1.reset()

for _ in range(4):
    a1.append(env1.action_space.sample())

# Second environment, seeded identically
env2 = gym.make('FrozenLake-v0')
env2.seed(0)
s2 = env2.reset()

for _ in range(4):
    a2.append(env2.action_space.sample())

print(a1)
print(a2)

This produces different results for a1 and a2. For example:

[1, 0, 2, 2]
[0, 3, 2, 1]

Perhaps this was/is desired, but as mentioned above, I thought that setting env.seed() would override that.

tylerlekang

All 3 comments

See the gym source code for how spaces sample, e.g. https://github.com/openai/gym/blob/339415aa03a9b039a51f67798a44f8cd21464091/gym/spaces/box.py#L28-L29. They use a separate random number generator that lives in gym.spaces.prng, so env.seed() does not affect it. If you want an action or observation space to sample deterministically, you will need to seed that generator yourself:

from gym.spaces.prng import seed
seed(123)

fjwolski on 9 Aug 2017
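
To illustrate the point about separate generators, here is a minimal sketch that uses only the Python standard library. ToySpace, ToyEnv and sample_actions are hypothetical names invented for this example, not gym classes; they only mimic the arrangement described above, where the environment and its action space each hold their own random number generator.

```python
import random


class ToySpace:
    """Hypothetical stand-in for a discrete action space with its own RNG."""

    def __init__(self, n):
        self.n = n
        # Separate generator; unseeded, so it starts from OS entropy.
        self.rng = random.Random()

    def seed(self, seed):
        self.rng.seed(seed)

    def sample(self):
        return self.rng.randrange(self.n)


class ToyEnv:
    """Hypothetical stand-in for an environment; not gym's implementation."""

    def __init__(self):
        # This generator would drive the environment dynamics only.
        self.rng = random.Random()
        self.action_space = ToySpace(4)

    def seed(self, seed):
        # Seeds only the environment's generator; the space's RNG is untouched.
        self.rng.seed(seed)


def sample_actions(space_seed=None, count=4):
    """Seed the env with 0, optionally seed the space, then sample actions."""
    env = ToyEnv()
    env.seed(0)
    if space_seed is not None:
        env.action_space.seed(space_seed)
    return [env.action_space.sample() for _ in range(count)]


if __name__ == "__main__":
    # Env seeded, space not seeded: the two lists will usually differ.
    print(sample_actions(), sample_actions())
    # Space seeded as well: the two lists match.
    print(sample_actions(space_seed=123), sample_actions(space_seed=123))
```

Under these assumptions, seeding the environment alone leaves the sampled actions unseeded, which matches the behavior reported in the question; seeding the space's generator as well makes the samples repeatable.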

OK, thanks for that info.

I was questioning if that should be the case, given a seemingly "overarching" nature of a simple line like env.seed(). BUT, if that is the way they want it to be done (or perhaps how it has to be done), I'm fine with that.

tylerlekang on 9 Aug 2017

For newer versions, use env.action_space.np_random.seed(123). Depending on the specific environment, you might also need env.seed(123) for deterministic behavior.

jfaleiro on 16 Feb 2019

👍10
