Stable-baselines: Outputs from runs with same random seed are not identical

Created on 23 Dec 2018 · 9 Comments · Source: hill-a/stable-baselines

Description of the bug

I have been unable to get reproducible results when using the same
seed for the random number generators.

Code example

Starting from the example described at

https://stable-baselines.readthedocs.io/en/master/modules/ppo1.html

I can create the following script:

import gym

from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines import PPO1

env = gym.make('CartPole-v1')
# The algorithms require a vectorized environment, so wrap the single env.
env = DummyVecEnv([lambda: env])

model = PPO1(MlpPolicy, env, verbose=1)
# seed=100 added to fix the random seed.
model.learn(total_timesteps=5000, seed=100)
model.save("ppo1_cartpole")

Note that I have added seed=100 to model.learn().

Running this example prints output to the screen and writes the
ppo1_cartpole.pkl file.

Running the exact same code twice (with the same seed value) produces
different screen outputs and different ppo1_cartpole.pkl files.
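One way to check the second symptom is to compare checksums of the files saved by two runs. The sketch below is only an illustration: the helper names `file_digest` and `same_output` are hypothetical, and it assumes byte-identical files are the criterion of interest (a saved file could in principle differ in bytes for reasons unrelated to the learned weights).

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large model files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(path_a, path_b):
    """Return True if the two files are byte-for-byte identical."""
    return file_digest(path_a) == file_digest(path_b)
```

For example, after renaming the file from the first run (say to a hypothetical `run1.pkl`) and running the script again, `same_output("run1.pkl", "ppo1_cartpole.pkl")` would report whether the two saved models match.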

System Info
My environment:

  • Installed by pip into virtual environment.
  • Stable Baselines version 2.3.0
  • Python version 3.5.2 is installed in the virtual environment.
  • TensorFlow version 1.12.0.
  • OpenAI Gym version 0.10.9.
  • My OS is Ubuntu-16.04.
  • No GPUs.

Additional context

It appears from the code that when seed is not None in learn() the
function set_global_seeds(seed) is called. I can see that this
function initialises the following random number generators with the
specified seed:

def set_global_seeds(seed):
    """
    set the seed for python random, tensorflow, numpy and gym spaces

    :param seed: (int) the seed
    """
    tf.set_random_seed(seed)
    np.random.seed(seed)
    random.seed(seed)
    gym.spaces.prng.seed(seed)

Because of this I also tried including the code lines

from stable_baselines.common import set_global_seeds
set_global_seeds(100)

before the call to gym.make() in the above example, but it did not help.

Labels: enhancement, help wanted

All 9 comments

Hello,
This is a known issue and is on the roadmap. It apparently comes from TensorFlow, and any help is appreciated ;)

Among the issues I found:

  • when using cuDNN, you have to force things to be deterministic (cf. PyTorch)
  • on CPU, TensorFlow may use threads to accelerate matrix multiplication, and this may also lead to non-determinism
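A plausible mechanism behind the second point (an assumption here, not something confirmed in this thread) is that floating-point addition is not associative. If threads combine partial sums in a different order from run to run, the last bits of the result can change, and those small differences can then grow during training. A minimal plain-Python sketch of the underlying effect:

```python
# The same three numbers, summed in two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)  # False
print(left_to_right, right_to_left)
```

The two results differ only in the last bit, which is why this kind of non-determinism is easy to miss in a single operation but visible after many training updates.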

The only case where I have reproducible results is when I test a learned policy with deterministic=True using the predict() method.
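To illustrate why deterministic prediction is repeatable, here is a sketch of the general idea for a discrete action space. It is not stable-baselines' implementation: `choose_action` is a hypothetical helper, and it assumes that deterministic mode means taking the most probable action while the stochastic mode samples from the action distribution.

```python
import random


def choose_action(probabilities, deterministic, rng=random):
    """Pick an action index from a list of action probabilities."""
    indices = range(len(probabilities))
    if deterministic:
        # Most probable action: no random numbers are consumed.
        return max(indices, key=probabilities.__getitem__)
    # Otherwise sample an index in proportion to its probability.
    return rng.choices(indices, weights=probabilities, k=1)[0]


probs = [0.2, 0.7, 0.1]
print([choose_action(probs, deterministic=True) for _ in range(5)])
# [1, 1, 1, 1, 1]
```

With `deterministic=False` the result depends on the state of the random number generator, so repeated calls can return different actions.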

The problem might be related to the bug reported at https://github.com/keras-team/keras/issues/2280

However, I tried the suggestion, made at https://github.com/keras-team/keras/issues/2280#issuecomment-411542012, to use

PYTHONHASHSEED=0 python

but it didn't help.
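For reference, PYTHONHASHSEED only controls Python's hash randomization for str and bytes objects; it does not seed TensorFlow's or NumPy's random number generators, which is consistent with it making no difference here. A small sketch of what the variable does affect (the helper name is hypothetical):

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(hash_seed):
    """Start a new Python process and return hash('stable-baselines') from it."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('stable-baselines'))"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return int(result.stdout)


first = string_hash_in_fresh_interpreter(0)
second = string_hash_in_fresh_interpreter(0)
print(first == second)  # True: a fixed hash seed gives a repeatable string hash
```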

Ultimately this appears to be a TensorFlow bug; see https://github.com/tensorflow/tensorflow/issues/9171

The bug report also indicates it won't be fixed until TensorFlow 2.0.

Basically the only way to get around this is to set the seed at the operation level, i.e. when a random number is generated.

EDIT: They point to a set of stateless random number generators. I don't know if these help at all.
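The idea of seeding at the operation level can be illustrated without TensorFlow. The sketch below is an analogy in plain Python, not TensorFlow's actual mechanism, and the function names are hypothetical. A draw from the shared global generator depends on everything drawn before it, whereas a draw from a generator seeded for that one operation does not.

```python
import random


def draw_global(n):
    """Draw from the shared global generator; depends on earlier draws."""
    return [random.random() for _ in range(n)]


def draw_with_op_seed(n, op_seed):
    """Draw from a private generator seeded for this call only."""
    rng = random.Random(op_seed)
    return [rng.random() for _ in range(n)]


random.seed(100)
first_global = draw_global(3)
first_local = draw_with_op_seed(3, op_seed=7)

random.seed(100)
random.random()  # an unrelated draw happens first this time
second_global = draw_global(3)
second_local = draw_with_op_seed(3, op_seed=7)

print(first_global == second_global)  # False: the global stream was shifted
print(first_local == second_local)    # True: the per-operation seed is unaffected
```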

@crobarcro thanks for pointing out that issue, that was what I was afraid of... So we need to seed the weight initialization and the random sampling done in common.distributions.

@crobarcro @pstansell I tweaked the code a bit and managed to get reproducible results for A2C, ACER, PPO1, PPO2 and TRPO (it is not working with ACKTR yet, and I did not try the others).
You can find details on the deterministic-fix branch.

Hi everyone, I've got to ask, are there any plans to merge the deterministic fix with the master branch?

Hello,
If you look at the roadmap and the milestones, it is planned for the next releases. However, there is no due date, and we would appreciate contributions to help us finish it.

"we would appreciate contribution to help us finish it."
I would like to help and can normally contribute some time each day. Does the project have a Slack channel?

"does the project have a slack channel?"

We don't have a Slack channel. However, we have a roadmap, milestones and issues ;)
If you start working on something, just comment on the appropriate issue.
