Keras: Results are not reproducible with the TensorFlow backend if another (unused) session is created beforehand

Created on 10 Feb 2019 · 6 Comments · Source: keras-team/keras

System

Mac OS X 10.14 (Mojave)
Tensorflow 1.12.2
Keras 2.2.4
Installed via pip and virtualenv
Runs on CPU

I'm trying to get reproducible results between two identical models. It works in general, but whenever I create a TensorFlow Session at the beginning, one that is not used anywhere, the results become stochastic.

Here is a script that fails on the weights comparison:

import random

import numpy as np
import tensorflow as tf
from keras.models import Model
from keras import initializers
from keras.layers import Input, Dense
from keras.datasets import boston_housing
from keras import backend as K


def get_weights_and_preds():
    (x_train, y_train), (x_test, y_test) = boston_housing.load_data(seed=42, test_split=0.2)

    inputs = Input(batch_shape=(None, x_train.shape[1]))

    outputs = Dense(
        units=32, activation='relu',
        kernel_initializer=initializers.get({'class_name': 'he_uniform', 'config': {'seed': 42}}),
        use_bias=False,
        bias_initializer='zeros')(inputs)

    outputs = Dense(
        units=1, activation='linear',
        kernel_initializer=initializers.get({'class_name': 'he_uniform', 'config': {'seed': 42}}),
        use_bias=False,
        bias_initializer='zeros')(outputs)

    model = Model(inputs, outputs)
    model.compile(optimizer='adam', loss='mean_absolute_error')

    model.fit(x_train, y_train,
              batch_size=32, epochs=5, shuffle=True, verbose=True)

    return model.get_weights(), model.predict(x_test)


tf.Session()  # created but never used; deleting this line makes the test pass

random.seed(42)
tf.set_random_seed(42)
np.random.seed(42)

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(config=session_conf)
K.set_session(sess)

w1, p1 = get_weights_and_preds()
w2, p2 = get_weights_and_preds()

for i in range(len(w1)):
    np.testing.assert_array_equal(w1[i], w2[i])

np.testing.assert_array_equal(p1, p2)

The error is as follows:

Traceback (most recent call last):
  File "../keras_random.py", line 56, in <module>
    np.testing.assert_array_equal(w1[i], w2[i])
  File "/Users/viktor.kovryzhkin/.virtualenvs/keras-test/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 896, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/Users/viktor.kovryzhkin/.virtualenvs/keras-test/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 819, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not equal

Mismatch: 74.5%
Max absolute difference: 0.00156796
Max relative difference: 0.08005489
 x: array([[ 0.516751,  0.13484 , -0.187835,  0.305085,  0.22556 ,  0.412772,
        -0.423129,  0.496579,  0.267026,  0.108562,  0.188156, -0.692385,
        -0.421752,  0.694956, -0.009893, -0.36041 ,  0.192134, -0.023192,...
 y: array([[ 0.516684,  0.13477 , -0.187908,  0.305013,  0.225632,  0.412699,
        -0.423058,  0.496579,  0.266948,  0.108562,  0.188156, -0.692316,
        -0.42183 ,  0.694885, -0.009967, -0.36034 ,  0.192205, -0.023128,...

If I delete the line that creates the TensorFlow Session (which is not used anywhere), the test passes. Why is this happening? Am I missing something?

To investigate tensorflow

vikua

👍1

Most helpful comment

I have to thank you for reporting this issue, @vikua !
I wasn't able to get reproducible results in my research before, even though I seeded the random number generators. Using the TF AdamOptimizer solved the problem instantly.

joosephook on 15 Feb 2019

👍4

All 6 comments

I tried running the code without fitting the model.
The weights are initialized identically, so the mismatch starts somewhere in the middle of training.
In addition, the batch_ids seemed to be the same for both training runs.

I was able to avoid the error by using a different optimizer:

    model.compile(optimizer='rmsprop', loss='mse')

so perhaps it's related to the choice of optimizer?
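
A check like this is easier to read when the two weight lists are compared array by array instead of stopping at the first failed assertion. The helper below is a minimal sketch using only NumPy. `summarize_mismatch` is a hypothetical name, not a Keras or TensorFlow function, and it assumes both lists hold equally shaped arrays, such as the values returned by `model.get_weights()` for two runs.

```python
import numpy as np


def summarize_mismatch(weights_a, weights_b):
    """Return one (index, mismatch_fraction, max_abs_diff) tuple per array pair.

    Hypothetical helper: expects two lists of equally shaped arrays,
    for example the results of model.get_weights() from two runs.
    """
    report = []
    for i, (a, b) in enumerate(zip(weights_a, weights_b)):
        a = np.asarray(a)
        b = np.asarray(b)
        if a.size == 0:
            # Nothing to compare in an empty array.
            report.append((i, 0.0, 0.0))
            continue
        # Fraction of elements that are not exactly equal.
        mismatch_fraction = float(np.mean(a != b))
        # Largest element-wise deviation between the two runs.
        max_abs_diff = float(np.max(np.abs(a - b)))
        report.append((i, mismatch_fraction, max_abs_diff))
    return report
```

Running it once on the weights taken before `model.fit` and once on the weights taken after it should show whether the runs already differ at initialization or only diverge during training.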

joosephook on 11 Feb 2019

Interesting.

I did my initial tests on several optimizers (adam, rmsprop, adagrad, adadelta, sgd), but on tensorflow 1.6.0 (which is a version I use in production). This test failed for all of them.

I tried this on TensorFlow 1.12.2 with adam just to confirm that it also reproduces on a newer version.
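
A sweep over several optimizers can be sketched roughly as follows. This is only an illustration: `train_fn` is a hypothetical callable that takes an optimizer name, resets the seeds, trains a fresh model, and returns its list of weight arrays (for instance, a variant of `get_weights_and_preds` with an optimizer argument). `is_reproducible` and `check_optimizers` are also made-up names.

```python
import numpy as np

# Optimizer names mentioned in this thread.
OPTIMIZERS = ["adam", "rmsprop", "adagrad", "adadelta", "sgd"]


def is_reproducible(train_fn, optimizer_name):
    """Call train_fn twice with the same optimizer and compare weights exactly."""
    w1 = train_fn(optimizer_name)
    w2 = train_fn(optimizer_name)
    if len(w1) != len(w2):
        return False
    return all(np.array_equal(a, b) for a, b in zip(w1, w2))


def check_optimizers(train_fn, names=OPTIMIZERS):
    """Map each optimizer name to True if two runs gave identical weights."""
    return {name: is_reproducible(train_fn, name) for name in names}
```

The training code is passed in as a callable so that the comparison logic stays independent of the Keras and TensorFlow versions being tested.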

vikua on 11 Feb 2019

It looks like the problem is somewhere in the Keras optimizers. If I compile the model with TensorFlow's tf.train.AdamOptimizer (or any other TensorFlow optimizer), the issue no longer reproduces:

    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    model.compile(optimizer=optimizer, loss='mse')

vikua on 12 Feb 2019

👍2

@vikua I ran the code in TF 1.12 and don't see any issues. Thanks!
Downloading data from https://s3.amazonaws.com/keras-datasets/boston_housing.npz
57344/57026 [==============================] - 0s 2us/step
Epoch 1/5
404/404 [==============================] - 1s 2ms/step - loss: 268.9717
Epoch 2/5
404/404 [==============================] - 0s 83us/step - loss: 188.8862
Epoch 3/5
404/404 [==============================] - 0s 87us/step - loss: 111.6779
Epoch 4/5
404/404 [==============================] - 0s 97us/step - loss: 55.7327
Epoch 5/5
404/404 [==============================] - 0s 97us/step - loss: 42.7104
Epoch 1/5
404/404 [==============================] - 1s 2ms/step - loss: 268.9717
Epoch 2/5
404/404 [==============================] - 0s 88us/step - loss: 188.8862
Epoch 3/5
404/404 [==============================] - 0s 90us/step - loss: 111.6779
Epoch 4/5
404/404 [==============================] - 0s 86us/step - loss: 55.7327
Epoch 5/5
404/404 [==============================] - 0s 93us/step - loss: 42.7104

jvishnuvardhan on 22 Feb 2019

Looks like it is something in Keras optimizer. If I compile model with tensorflow tf.train.AdamOptimizer (or any other optimizer) the issue is not reproducible:
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    model.compile(optimizer=optimizer, loss='mse')

Thank you so much @vikua and @joosephook for this discussion. I had searched everywhere and tried everything, and nothing helped. After following your suggestions, the results are now reproducible, even across different machines running different operating systems!