Gym: Random seed not working correctly

Created on 9 Nov 2017 · 3 Comments · Source: openai/gym

I am trying to reproduce data using OpenAI Gym, and I noticed that I could not get deterministic results after setting a random seed. For example, I tried the following code:

import gym

eps_num = 1000
eps_limit = 1000
seed_num = 20


def run_random_samples(env, data):
    for _ in range(eps_num):
        env.reset()
        for _ in range(eps_limit):
            action = env.action_space.sample()
            observation, reward, done, _ = env.step(action)
            data.append([observation, reward, done])
            if done:
                break


def main():

    # Set the random seed.
    env = gym.make('CartPole-v0')
    env.seed(seed_num)

    # Initialize data list.
    data_1 = []
    data_2 = []

    # Run first samples with the random seed.
    run_random_samples(env, data_1)

    # Set the random seed again.
    env.seed(seed_num)
    run_random_samples(env, data_2)

    print(data_1[0], data_2[0])


if __name__ == "__main__":
    main()

If we run the above code, the printed data_1[0] and data_2[0] are not equal. I tried other seeds as well and still could not get the same result. I wonder whether something is wrong in the code or whether I missed something.

P.S. The above code was run with gym 0.9.4 on a MacBook Pro with Python 3.5.2.

All 3 comments

I ran into the same issue and reported it here, but apparently it hasn't been addressed.

Your problem is this line:

action = env.action_space.sample()

In gym, spaces are seeded separately from the environment. The space's seed is constant, and it is set here:
https://github.com/openai/gym/blob/master/gym/spaces/prng.py

So the environment is seeded and working correctly, but data_1 and data_2 are taking different actions, because env.seed() does not reset the generator that env.action_space.sample() draws from. Try resetting the space's prng between the trials; that should do the trick.
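
To make the mechanism concrete, here is a minimal sketch that uses only the standard library and no gym at all. ToySpace, ToyEnv and rollout are hypothetical stand-ins invented for illustration, not gym classes. They only mimic the situation described above, where the environment and its action space each hold their own random generator:

```python
import random


class ToySpace:
    """Hypothetical stand-in for an action space with its own RNG."""

    def __init__(self):
        self._rng = random.Random()

    def seed(self, seed):
        self._rng.seed(seed)

    def sample(self):
        # Pick one of two discrete actions, like CartPole's left/right.
        return self._rng.randrange(2)


class ToyEnv:
    """Hypothetical stand-in for an environment with a separate RNG."""

    def __init__(self):
        self.action_space = ToySpace()
        self._rng = random.Random()

    def seed(self, seed):
        # Seeds only the environment's generator, not the space's.
        self._rng.seed(seed)

    def reset(self):
        return self._rng.random()


def rollout(env, steps=50):
    """Return the initial observation and the list of sampled actions."""
    observation = env.reset()
    actions = [env.action_space.sample() for _ in range(steps)]
    return observation, actions


env = ToyEnv()
env.action_space.seed(0)   # the space starts from a constant seed

env.seed(20)
obs_1, actions_1 = rollout(env)

env.seed(20)               # reseed the environment only
obs_2, actions_2 = rollout(env)

env.seed(20)
env.action_space.seed(0)   # reseed the space as well
obs_3, actions_3 = rollout(env)

print(obs_1 == obs_2)          # the environment side is reproducible
print(actions_1 == actions_2)  # the space kept advancing, so actions differ
print(actions_1 == actions_3)  # reseeding the space restores the actions
```

The second rollout continues the space's random stream where the first one stopped, which is why reseeding the environment alone does not bring back the same actions.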

Thanks for answering this @FirefoxMetzger, closing in favor of https://github.com/openai/gym/issues/667

@christopherhesse the prng module linked above does not exist anymore, so I think this issue is still open. I run into it when I sample from action_space.
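
For gym releases without the prng module, one possible approach is to seed the action space directly. The sketch below assumes the installed gym version gives spaces their own seed() method, so please check that yours does. It only compares the sampled actions and does not step the environment:

```python
import gym

seed_num = 20
env = gym.make('CartPole-v0')

# Assumption: this gym version exposes seed() on the space itself.
# Seed the space's generator before each batch of samples.
env.action_space.seed(seed_num)
actions_1 = [env.action_space.sample() for _ in range(10)]

env.action_space.seed(seed_num)
actions_2 = [env.action_space.sample() for _ in range(10)]

# Both batches start from the same seed, so they should match.
print(actions_1 == actions_2)
```

In the original script, the equivalent change would be to seed the action space next to each env.seed(seed_num) call.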
