Gym: env.action_space.sample() doesn't follow env.seed()?

Created on 8 Aug 2017 · 3 Comments · Source: openai/gym

When I set env.seed(0) (or some other seed), I expected all random elements of env to behave deterministically. However, the env.action_space.sample() function still seems to produce random output.

import gym

a1 = []
a2 = []

# First environment: seed it, reset, then sample four actions.
env1 = gym.make('FrozenLake-v0')
env1.seed(0)

s1 = env1.reset()

for _ in range(4):
    a1.append(env1.action_space.sample())

# Second environment: same seed, same steps.
env2 = gym.make('FrozenLake-v0')
env2.seed(0)

s2 = env2.reset()

for _ in range(4):
    a2.append(env2.action_space.sample())

print(a1)
print(a2)

This produces different results for a1 and a2. For example:

[1, 0, 2, 2]
[0, 3, 2, 1]

Perhaps this was/is desired, but as mentioned above, I thought that setting env.seed() would override that.

All 3 comments

See in the gym source code how spaces sample, e.g. https://github.com/openai/gym/blob/339415aa03a9b039a51f67798a44f8cd21464091/gym/spaces/box.py#L28-L29. They use a separate random number generator that lives in gym.spaces.prng. If you want the action / observation space to sample deterministically, you will need to run:

from gym.spaces.prng import seed
seed(123)
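
To illustrate why seeding the environment alone does not fix the action samples, here is a minimal sketch using only the standard library. ToyEnv and ToySpace are hypothetical stand-ins, not gym's actual classes, and the sketch simplifies things by giving each space its own generator rather than a shared module-level one like gym.spaces.prng:

```python
import random


class ToySpace:
    """Hypothetical stand-in for an action space that owns its own RNG."""

    def __init__(self, n):
        self.n = n
        self.rng = random.Random()  # seeded from OS entropy, not by the env

    def sample(self):
        return self.rng.randrange(self.n)


class ToyEnv:
    """Hypothetical stand-in for an environment; not gym's real code."""

    def __init__(self):
        self.rng = random.Random()  # drives the environment dynamics only
        self.action_space = ToySpace(4)

    def seed(self, seed):
        self.rng.seed(seed)  # the space's RNG is left untouched


def sample_actions(env, count):
    return [env.action_space.sample() for _ in range(count)]


env1, env2 = ToyEnv(), ToyEnv()
env1.seed(0)
env2.seed(0)

# The environment RNGs now agree with each other...
env_draws_match = env1.rng.random() == env2.rng.random()

# ...but the action samples only line up once the spaces are seeded as well.
env1.action_space.rng.seed(123)
env2.action_space.rng.seed(123)
actions1 = sample_actions(env1, 20)
actions2 = sample_actions(env2, 20)
```

Without the two action_space.rng.seed(123) calls, actions1 and actions2 would generally differ, which matches the behavior reported in the question.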

OK, thanks for that info.

I was questioning whether that should be the case, given the seemingly "overarching" nature of a simple call like env.seed(). But if that is the way they want it to be done (or perhaps how it has to be done), I'm fine with that.

For newer versions, use env.action_space.np_random.seed(123). Depending on the specific environment, you might also need env.seed(123) for deterministic behavior.
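
A rough sketch of combining both calls in one place might look like the following. The helper seed_env_and_space is hypothetical, and the check for a seed() method on the space is an assumption about some gym releases rather than something stated in this thread. The stub classes exist only so the snippet runs without gym installed:

```python
import random


def seed_env_and_space(env, seed):
    """Hypothetical helper: seed the env and its action space where possible."""
    if hasattr(env, "seed"):
        env.seed(seed)
    space = env.action_space
    if hasattr(space, "seed"):
        # Assumption: some gym releases expose seed() directly on the space.
        space.seed(seed)
    elif hasattr(space, "np_random"):
        # The call suggested in the comment above.
        space.np_random.seed(seed)


class StubSpace:
    """Stand-in for an action space; random.Random plays the np_random role."""

    def __init__(self):
        self.np_random = random.Random()

    def sample(self):
        return self.np_random.randrange(4)


class StubEnv:
    """Stand-in for an environment that records the seed it was given."""

    def __init__(self):
        self.action_space = StubSpace()
        self.last_seed = None

    def seed(self, seed):
        self.last_seed = seed


env_a, env_b = StubEnv(), StubEnv()
seed_env_and_space(env_a, 123)
seed_env_and_space(env_b, 123)
samples_a = [env_a.action_space.sample() for _ in range(10)]
samples_b = [env_b.action_space.sample() for _ in range(10)]
```

With a real environment, the same helper would be called once right after gym.make(...), before reset() and before any sampling.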
