Keras: Multiprocessing: Failed to get device properties

Created on 10 May 2018 · 15 comments · Source: keras-team/keras

I would like to use a keras model in a multiprocessing setup.
The model is used in a generator, which produces data to train another model.

As long as I don't use multiprocessing, everything works fine.
But with multiprocessing, I get the following error:

E tensorflow/core/grappler/clusters/utils.cc:81] Failed to get device properties, error code: 3

I searched how to use Keras in a multithreaded context and found this:
https://github.com/keras-team/keras/issues/5640

Apparently, I need to call _make_predict_function and get the tensorflow graph.
I added this to my code:

before any training:

    q_approximator = create_model()
    q_approximator_fixed = create_model()

    q_approximator._make_predict_function()
    q_approximator_fixed._make_predict_function()

    # only this one will be trained
    q_approximator.compile(RMSprop(LEARNING_RATE, rho=RHO, epsilon=EPSILON), loss=huber_loss)

    graph = tf.get_default_graph()
    # graph = K.get_session().graph    # this way also doesn't work

inside the generator:
```
with graph.as_default():
    q_values = q_approximator_fixed.predict([state.reshape(1, *INPUT_SHAPE),
                                             np.ones((1, NUM_ACTIONS))])
```

and finally, the training setup:

```
q_approximator.fit_generator(interaction_generator(q_approximator_fixed,
                                                   replay_memory,
                                                   exploration,
                                                   interaction_counter,
                                                   interaction_lock),
                             epochs=10, steps_per_epoch=BATCH_SIZE * 1000,
                             use_multiprocessing=True,
                             workers=1)
```

With just one worker and no multiprocessing, everything works fine. Multiple workers without multiprocessing are also fine.
But a single worker with multiprocessing makes the program crash with the error message above.

How can I use a keras model in a multiprocessing context?

All 15 comments

Since the error message complains about missing device information, I thought it was probably a problem of initializing tensorflow correctly in the other processes.

So I decided to try something else first:
Instead of sharing the model, which apparently doesn't carry the necessary device information across processes, I would recreate the model in every process.

Now I pass the model's weights as a parameter to the generator and have this code within the generator:

    q_approximator_fixed = create_model()
    q_approximator_fixed.set_weights(weights)

It hangs on loading the weights.

Ok, so I googled tensorflow variable initialization in a multiprocess setup. This issue, https://github.com/tensorflow/tensorflow/issues/5448, seems to describe exactly my problem:
they create a new session in each subprocess and hand the existing graph to that session.
I added this code inside my generator:

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    config.log_device_placement = True

    # create a fresh session in this process, reusing the existing graph
    sess = tf.Session(config=config, graph=graph)
    K.set_session(sess)

    q_approximator_fixed = create_model()
    q_approximator_fixed.set_weights(weights)

Now it logs where the variables are placed:

I tensorflow/core/common_runtime/placer.cc:886] input_1: (Placeholder)/job:localhost/replica:0/task:0/device:CPU:0

But they get placed on the CPU, which is not what I want, and the code still hangs during or after allocation.

The issue also says that moving the imports into the subprocess fixed the problem. So I added the following at the top of my generator, but that doesn't change anything:

    import keras
    import tensorflow as tf
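For context, my reading of the suggestion in that tensorflow issue, as a minimal self-contained sketch: spawn the process yourself and do all tensorflow/keras imports inside the child, so the CUDA context is created fresh there. The tiny Dense model, the stand-in weights and the 'spawn' start method are illustrative choices, not code from my project:

```
import multiprocessing as mp
import numpy as np

def worker(weights, queue):
    # all tensorflow/keras imports happen inside the child process,
    # so the CUDA context is created here and not inherited from the parent
    import tensorflow as tf
    from keras import backend as K
    from keras.layers import Input, Dense
    from keras.models import Model

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    K.set_session(tf.Session(config=config))

    input_layer = Input((10,))
    model = Model(inputs=input_layer, outputs=Dense(10)(input_layer))
    model.set_weights(weights)

    queue.put(model.predict(np.random.rand(10, 10)))

if __name__ == '__main__':
    # 'spawn' starts the child with a fresh interpreter, even on Linux
    ctx = mp.get_context('spawn')
    queue = ctx.Queue()

    # stand-in weights for the Dense(10) layer: kernel (10, 10) and bias (10,)
    weights = [np.random.rand(10, 10), np.random.rand(10)]

    p = ctx.Process(target=worker, args=(weights, queue))
    p.start()
    print(queue.get())
    p.join()
```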

Here is a minimal working example to reproduce the error.

Please note: I have not applied any of the proposed solutions here.
They don't work for me, and I didn't want to clutter the code with them.

    import numpy as np

    from keras.layers import Input, Dense
    from keras.models import Model
    from keras.optimizers import Adam

    def create_model():
        input_layer = Input((10,))
        dense = Dense(10)(input_layer)

        return Model(inputs=input_layer, outputs=dense)

    model_outside = create_model()
    model_outside.compile(Adam(1e-3), "mse")

    def subprocess_routine(weights):
        model_inside = create_model()
        model_inside.set_weights(weights)

        while True:
            batch = np.random.rand(10, 10)
            prediction = model_inside.predict(batch)

            yield batch, prediction

    weights = model_outside.get_weights()

    model_outside.fit_generator(subprocess_routine(weights),
                                epochs=10,
                                steps_per_epoch=100,
                                use_multiprocessing=True,
                                workers=1)

I have the exact same problem.

I got the same problem when creating a Keras model with multiprocessing on Linux.
But it does not happen on Windows 10.

I got this problem too. Ubuntu 16.04 + CUDA 9.0 + tensorflow-gpu 1.8 + Keras 2.2.0

Same here, on an Ubuntu 16.04 nvidia-docker container with tensorflow-gpu 1.8 and Keras 2.2.0

I am having the same issue. Initially I was fixing a problem with resetting states from a Sequence object I use with fit_generator, where tensorflow would complain about tensors being in different graphs. Now, when I have use_multiprocessing=True and do a

    with session.as_default():
        with graph.as_default():
            model.reset_states()

in my Sequence object's __getitem__ function (which I have to, because I use stateful LSTMs), I get the mentioned error:

2018-08-01 23:52:33.053069: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3

If I don't use multiprocessing, the training proceeds as usual. Not being able to use multiprocessing is horrible for my pipeline, so use_multiprocessing=False is not really an option. I am thus a bit stuck. Any ideas?

Linux Mint 18.2 (xenial-based), Cuda 9.0.176, keras 2.2.2, tensorflow-gpu 1.9.0
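
For reference, the pattern I am describing looks roughly like this as a complete Sequence (a sketch with illustrative names, capturing the graph and session at construction time):

```
import tensorflow as tf
from keras import backend as K
from keras.utils import Sequence

class StatefulSequence(Sequence):
    def __init__(self, model, x, y, batch_size):
        self.model = model
        self.x, self.y = x, y
        self.batch_size = batch_size
        # capture the graph and session the model was built in
        self.graph = tf.get_default_graph()
        self.session = K.get_session()

    def __len__(self):
        return len(self.x) // self.batch_size

    def __getitem__(self, idx):
        # re-enter the original graph/session before touching the model
        with self.session.as_default():
            with self.graph.as_default():
                self.model.reset_states()
        start = idx * self.batch_size
        return (self.x[start:start + self.batch_size],
                self.y[start:start + self.batch_size])
```

This runs fine with use_multiprocessing=False; with multiprocessing enabled, the Sequence is copied into the worker processes, and that is exactly where the error above shows up.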

I fixed this problem by using train_on_batch instead of fit.
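
A minimal sketch of that workaround, under the assumption that only the main process may touch keras/tensorflow: a producer process ships plain numpy batches over a queue, and train_on_batch runs in the main process (the model and batch shapes are illustrative):

```
import multiprocessing as mp
import numpy as np

def produce_batches(queue, n_batches):
    # the producer only touches numpy, so it never creates a CUDA context
    for _ in range(n_batches):
        queue.put((np.random.rand(32, 10), np.random.rand(32, 10)))
    queue.put(None)  # sentinel: no more batches

if __name__ == '__main__':
    queue = mp.Queue(maxsize=8)
    producer = mp.Process(target=produce_batches, args=(queue, 1000))
    producer.start()

    # build the model only after the producer has started,
    # so the child never inherits a CUDA context
    from keras.layers import Input, Dense
    from keras.models import Model

    input_layer = Input((10,))
    model = Model(inputs=input_layer, outputs=Dense(10)(input_layer))
    model.compile('adam', 'mse')

    while True:
        item = queue.get()
        if item is None:
            break
        x, y = item
        model.train_on_batch(x, y)  # all GPU work stays in the main process

    producer.join()
```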

Closing as this is resolved

@lhk Could you please post your complete solution? Thank you

I fixed this error by reinstalling my GPU driver.

I have the same issue. I tried using train_on_batch instead of fit and it still did not work. I'm not sure why the issue was closed.

Try reinstalling your graphics driver

Closing as this is resolved

It's not resolved; he just found a strange workaround
