I use the default init function "uniform" to initialize weights, but each time I get a different result with the same code and data, so I can't reproduce my results. I wonder if there is any way to note down or control the random seeds?
Thank you!
Easy: call numpy.random.seed() before you import anything from Keras. Most examples in the examples folder follow this scheme.
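A minimal sketch of that ordering (the model itself is just illustrative):

```python
import numpy as np
np.random.seed(1337)  # seed NumPy *before* any Keras import

from keras.models import Sequential
from keras.layers import Dense

# with the seed fixed above, the default 'uniform' weight
# initialization produces the same weights on every run
model = Sequential()
model.add(Dense(10, input_dim=20, init='uniform'))
```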
Thank you!
The imdb_lstm.py example has np.random.seed(1337) at the top, before the keras imports. But I get different results (test accuracy) every time I run it. What am I missing?
I can confirm this issue. I realized after implementing my code that I was previously running on CPU using Torch. I ran a bunch of tests on GPU (Tesla K40c) and obtained different results with the same hyperparameters.
I also verified this on the imdb_cnn.py example and can confirm the issue there. What am I missing?
First run:
20000/20000 [==============================] - 30s - loss: 0.3159 - acc: 0.8649 - val_loss: 0.3745 - val_acc: 0.8274
Second run:
20000/20000 [==============================] - 30s - loss: 0.3174 - acc: 0.8636 - val_loss: 0.3757 - val_acc: 0.8260
Third run:
20000/20000 [==============================] - 30s - loss: 0.3169 - acc: 0.8634 - val_loss: 0.3815 - val_acc: 0.8228
The variation in validation accuracy is small here, but it is more noticeable in my own code.
4634/4634 [==============================] - 7s - loss: 0.0065 - acc: 0.9978 - val_loss: 0.7320 - val_acc: 0.8778
4634/4634 [==============================] - 7s - loss: 0.0068 - acc: 0.9978 - val_loss: 0.7667 - val_acc: 0.8802
4634/4634 [==============================] - 7s - loss: 0.0046 - acc: 0.9978 - val_loss: 0.8354 - val_acc: 0.8667
I am unable to get Keras to reproduce the same result despite setting the numpy seed before importing anything. And I am running on the CPU on OS X.
I've set the numpy.random.seed before importing anything. The result of each run is different. Any help would be appreciated.
Be sure to use the Theano dev version; otherwise Theano sometimes uses the numpy random generator in a way that is very hard to predict, so it looks like the seed doesn't work:
http://deeplearning.net/software/theano/install.html#bleeding-edge-install-instructions
Just in case this is your problem.
@nouiz's solution worked for me. Thanks.
@nouiz I am using Keras 0.3.1 and Theano 0.7.1a1, and I am setting the numpy seed just after the numpy import and before importing any keras- or numpy-related stuff, and I still have this non-reproducibility problem. @nouiz, does your answer mean that those "hard to predict" ways in which Theano uses numpy are corrected in the current release candidate (0.8), or were they supposed to be corrected in 0.7.1a1?
I don't remember when that was fixed. You can use Theano 0.8rc1 or the development version and it should work.
@volvador, if you're using Python 3, you also need to set a flag before starting Python: PYTHONHASHSEED=0 (e.g. launch as `PYTHONHASHSEED=0 python your_script.py`; setting it inside the script has no effect).
(I've just commented about this in https://github.com/fchollet/keras/issues/850#issuecomment-231350983.)
@fchollet: I am looking for something slightly different. I want to repeat training/evaluation 15 times, each time with a different random seed, then take the average of the f-scores. How do I do this in Python?
Should I reload numpy and keras? Should I delete numpy and keras from memory?
```python
for i in range(0, 15):
    import numpy as np
    import datetime
    np.random.seed(datetime.datetime.now().microsecond)
    from keras import ....
    train()
    evaluate()
    A_CODE_TO_UNLOAD_OR_RESET_NUMPY_KERAS()
```
Does `del np, del keras` work? What about reloading the modules? What is the correct way of doing this?
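One possible approach (a sketch, assuming the TensorFlow backend and a Keras version that provides keras.backend.clear_session(); train() and evaluate() stand in for the question's own functions): rather than unloading modules, clear the backend session and reseed at the top of each iteration.

```python
import numpy as np
from keras import backend as K

f_scores = []
for i in range(15):
    K.clear_session()            # discard the previous model/graph instead of reloading modules
    np.random.seed(i)            # a different, but recorded, seed for each run
    train()                      # placeholder: build and fit a fresh model here
    f_scores.append(evaluate())  # placeholder: returns this run's f-score

print('mean f-score:', np.mean(f_scores))
```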
Sorry to hijack this issue. I have read a few times about the PYTHONHASHSEED=0 trick for Python 3. Is this needed only for the Theano back-end, or also for TensorFlow? Mostly, I would like to know if this is something we should change in Theano (and not in user code or Keras). If it is in Theano, do you have code that reproduces it? I don't know when we can work on it, but we should make an issue on Theano if this is the case.
Thanks
@nouiz, as I understand it, this flag only affects Python variables whose _"types [are] covered by the hash randomization"_. So this might affect Theano only if it somehow relies on the order of elements in a dictionary, for example. What has a much higher chance of being affected by this flag is actually user code that calls Theano, which probably has some mappings between tokens and ids, for example (and is thus affected by random initialization of weights, etc.).
For the TensorFlow backend, please see https://github.com/fchollet/keras/issues/2280
For Theano: the numpy seed worked, @nouiz! :) It was indeed crucial to upgrade to the Theano dev version.
For me it was not necessary to import numpy and call numpy.random.seed(42) _before_ any Keras imports. It worked even though I called it in __main__:
```python
import numpy as np
import matplotlib.pyplot as plt
from keras.engine import Input
from keras.engine import Model
from keras.layers import Convolution2D, MaxPooling2D


class MaxPoolTests:
    def __init__(self, filter_sizes, nb_row, nb_col):
        in_x = Input(shape=(nb_row, nb_col, 1), name='in_x')
        convolutions = list()
        for window_size in filter_sizes:
            conv = Convolution2D(nb_filter=1, nb_row=window_size, nb_col=nb_col,
                                 border_mode='valid',
                                 activation='tanh',
                                 name='conv_{:d}'.format(window_size))(in_x)
            max_pool = MaxPooling2D(name='maxpool_{:d}'.format(window_size),
                                    pool_size=(1, 1),
                                    border_mode='valid')(conv)
            convolutions.append(max_pool)
        self.model = Model(input=[in_x], output=convolutions)


if __name__ == '__main__':
    np.random.seed(42)
    m = 10; n = 10
    nn = MaxPoolTests([2, 3], nb_row=m, nb_col=n)
    nn.model.compile(optimizer='adam', loss='mse')
    x = np.zeros((1, m, n, 1))
    for i, v in enumerate(np.linspace(-1, 1, m)):
        x[0, i, i, 0] = v
    y = nn.model.predict(x)
    plt.figure()
    plt.imshow(y[0][0, :, :, 0])
    plt.show()
    print('All done.')
```
I recently found that using the ADAM optimizer has a stochastic element to it that may be causing the randomness you are seeing if the previously suggested solutions did not work. I switched my optimizer to Adagrad to remove the randomness from the ADAM optimizer, and that seems to allow for repeatable results. I tested by taking a snapshot of the weight values (using the model.save_weights('*.h5') method) and used the h5diff tool to check that the weights were indeed equal across repeated tests after 1 epoch of training. Hopefully this helps some people trying to get repeatable results for research/testing.
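For reference, the same weight comparison can be done in-process without h5diff (a sketch; model_a and model_b are assumed to be two Keras models built and trained the same way in separate runs):

```python
import numpy as np

def weights_equal(model_a, model_b):
    """True if every weight array matches exactly between two Keras models."""
    wa, wb = model_a.get_weights(), model_b.get_weights()
    return len(wa) == len(wb) and all(np.array_equal(a, b) for a, b in zip(wa, wb))
```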
@ProgrammerJed can you please point out where exactly that randomness is in the ADAM optimizer (I mean in the code)?
@ProgrammerJed and @MahmoudIsmail88 - I am using the adam optimizer and have no problems with randomness as long as I set the seed before running model.compile and model.fit. Have a look at the code below; I always get the same results with the same seed.
```python
import numpy as np
from keras.models import Sequential
from keras.layers import Conv1D, GlobalMaxPooling1D, Dense

def cnn_model():
    # create model
    model = Sequential()
    model.add(Conv1D(filters=150, kernel_size=3, padding='same', activation='relu', input_shape=(30, 400)))
    model.add(GlobalMaxPooling1D())
    model.add(Dense(250, activation='relu'))
    model.add(Dense(3, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# seed before compile/fit; train_xx and one_hot_y are the poster's own data
np.random.seed(21)
m = cnn_model()
m.fit(train_xx, one_hot_y, epochs=5, batch_size=320)
```
@farmeh Did you ever find anything out about that? I am also curious about the best way to select seeds in a loop.
@ProgrammerJed
Thank you man.
@stevenzim did you test with TensorFlow on GPU?
Calling `np.random.seed(42)` before building any Keras model worked for me.
One other thing to be careful of: if you run any other sort of randomness before you initialize the variables, any change there will affect the random assignment.
There are many possible sources of variability: random number generators (including Python's hash function, which is used for example in dicts and sets), and multithreading, in particular when using some optimized CUDA functions (e.g., TensorFlow's reduce_mean() function relies on such an optimized function, so using it on a GPU means that the output will be fast but not perfectly deterministic).
Note that PYTHONHASHSEED should be set before starting Python (if you set it within your Python program, it has no effect).
Check out this video for more details: https://youtu.be/Ys8ofBeR2kA
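Putting the pieces from this thread together, here is a sketch of the commonly suggested recipe for the TensorFlow 1.x backend (the single-threaded session config trades speed for determinism, and PYTHONHASHSEED still has to be set in the shell before Python starts):

```python
import random
import numpy as np
import tensorflow as tf

random.seed(42)          # Python's built-in RNG
np.random.seed(42)       # NumPy (used by Keras weight initializers)
tf.set_random_seed(42)   # TensorFlow graph-level seed

# force single-threaded execution to avoid nondeterministic op ordering
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)

from keras import backend as K
K.set_session(tf.Session(graph=tf.get_default_graph(), config=session_conf))
```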
In TensorFlow 2.0 you can set the random seed like this:

```python
import tensorflow as tf
tf.random.set_seed(221)

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(2, name='one'),
    layers.Dense(3, activation='sigmoid', name='two'),
    layers.Dense(2, name='three')])

x = tf.random.uniform((12, 12))
model(x)
```