I am trying to reproduce data using OpenAI Gym, and I noticed that I can't get deterministic results even after setting a random seed. For example, I tried the following code:
import gym

eps_num = 1000
eps_limit = 1000
seed_num = 20


def run_random_samples(env, data):
    for _ in range(eps_num):
        env.reset()
        for _ in range(eps_limit):
            action = env.action_space.sample()
            observation, reward, done, _ = env.step(action)
            data.append([observation, reward, done])
            if done:
                break


def main():
    # Set the random seed.
    env = gym.make('CartPole-v0')
    env.seed(seed_num)
    # Initialize data list.
    data_1 = []
    data_2 = []
    # Run first samples with the random seed.
    run_random_samples(env, data_1)
    # Set the random seed again.
    env.seed(seed_num)
    run_random_samples(env, data_2)
    print(data_1[0], data_2[0])


if __name__ == "__main__":
    main()
If we run the above code, the printed results show that data_1[0] and data_2[0] are not equal. I also tried other seeds and still could not get the same result. I wonder whether something is wrong in the code or whether I missed something.
P.S. The above code was run with gym 0.9.4 on a MacBook Pro with Python 3.5.2.
I ran into the same issue and reported it here, but apparently it hasn't been addressed.
Your problem is
action = env.action_space.sample()
In gym, spaces are seeded separately from the environment, so env.seed() does not affect action_space.sample(). The spaces' seed is set here:
https://github.com/openai/gym/blob/master/gym/spaces/prng.py
So the environment is seeded and working correctly, but data_1 and data_2 are taking different actions. Try reseeding the space's prng between the trials; that should do the trick.
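To make the mechanism concrete, here is a minimal sketch that does not use gym at all. `ToySpace`, `ToyEnv`, `rollout`, and `run_twice` are hypothetical stand-ins written only for illustration; they assume nothing more than what is described above, namely that the environment and its action space each own a separate random generator.

```python
import random


class ToySpace:
    """Hypothetical stand-in for an action space with its own generator."""

    def __init__(self, n):
        self.n = n
        self.rng = random.Random()  # independent of the environment's generator

    def seed(self, seed):
        self.rng.seed(seed)

    def sample(self):
        return self.rng.randrange(self.n)


class ToyEnv:
    """Hypothetical stand-in for an environment; seed() ignores the space."""

    def __init__(self):
        self.action_space = ToySpace(2)
        self.rng = random.Random()
        self.state = 0.0

    def seed(self, seed):
        self.rng.seed(seed)  # only the environment's generator is reset

    def reset(self):
        self.state = self.rng.uniform(-0.05, 0.05)
        return self.state

    def step(self, action):
        self.state += 0.1 if action == 1 else -0.1
        return self.state


def rollout(env, steps=50):
    """Record the initial state and the state after each random action."""
    trace = [env.reset()]
    for _ in range(steps):
        trace.append(env.step(env.action_space.sample()))
    return trace


def run_twice(seed, reseed_space):
    """Run two rollouts, reseeding the env (and optionally the space) each time."""
    env = ToyEnv()
    traces = []
    for _ in range(2):
        env.seed(seed)
        if reseed_space:
            env.action_space.seed(seed)
        traces.append(rollout(env))
    return traces


if __name__ == "__main__":
    a, b = run_twice(20, reseed_space=False)
    print("env only:      ", a == b)
    a, b = run_twice(20, reseed_space=True)
    print("env and space: ", a == b)
```

With only the environment reseeded, both rollouts start from the same state but then diverge because the sampled actions differ; reseeding the space as well makes the two rollouts identical. In real gym code the equivalent step depends on the installed version: in 0.9.4 the generator lives in the gym/spaces/prng.py module linked above, while later releases, as far as I know, expose a seed() method on each space (for example env.action_space.seed(seed_num)). Check this against the version you have installed.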
Thanks for answering this @FirefoxMetzger, closing in favor of https://github.com/openai/gym/issues/667
@christopherhesse the prng module linked above does not exist anymore, so in my view this issue is still open. I still run into it when I sample from action_space.