Keras: TypeError: can't pickle _thread.lock objects

Created on 1 Nov 2017 · 24 comments · Source: keras-team/keras

Information:

  • Keras version 2.0.8
  • Tensorflow version 1.3.0
  • Python 3.6

Minimal example to reproduce the error:

from keras.layers import Input, Lambda, Dense
from keras.models import Model
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
import tensorflow as tf
import numpy as np

x = Input(shape=(30,3))
low = tf.constant(np.random.rand(30, 3).astype('float32'))
high = tf.constant(1 + np.random.rand(30, 3).astype('float32'))
clipped_out_position = Lambda(lambda x, low, high: tf.clip_by_value(x, low, high),
                                      arguments={'low': low, 'high': high})(x)

model = Model(inputs=x, outputs=[clipped_out_position])
optimizer = Adam(lr=.1)
model.compile(optimizer=optimizer, loss="mean_squared_error")
checkpoint = ModelCheckpoint("debug.hdf", monitor="val_loss", verbose=1, save_best_only=True, mode="min")
training_callbacks = [checkpoint]
model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)

Error output:

Train on 67 samples, validate on 33 samples
Epoch 1/50
10/67 [===>..........................] - ETA: 0s - loss: 0.1627Epoch 00001: val_loss improved from inf to 0.17002, saving model to debug.hdf
Traceback (most recent call last):
  File "debug_multitask_inverter.py", line 19, in <module>
    model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1631, in fit

â–½
    validation_steps=validation_steps)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1233, in _fit_loop
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 414, in on_epoch_end
    self.model.save(filepath, overwrite=True)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2556, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/models.py", line 107, in save_model
    'config': model.get_config()
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2397, in get_config
    return copy.deepcopy(config)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 215, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 169, in deepcopy
    rv = reductor(4)
TypeError: can't pickle _thread.lock objects

It seems like the error has occurred in the past in different contexts here, but I'm not dumping the model directly -- I'm using the ModelCheckpoint callback. Any idea what could be going wrong?
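(A quick way to narrow this down, sketched here against the model built above: replay the failing deepcopy layer by layer, and the layer whose config can't be copied is the culprit.)

import copy

# Deep-copy each layer's config separately; the layer that raises is the one
# whose config holds an unpicklable object (here, presumably the Lambda
# carrying the tf constants).
for layer in model.layers:
    try:
        copy.deepcopy(layer.get_config())
    except TypeError as exc:
        print('offending layer: %s (%s): %s'
              % (layer.name, layer.__class__.__name__, exc))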

tensorflow

Most helpful comment

This exception is raised because you're trying to serialize an unserializable object.
In this context, the "unserializable" object is a raw tf.Tensor.

So remember: don't let raw tf.Tensors wander around in your model.

In my case, I was trying to use K.shape() to get the shape of a tensor and reuse it later, like this:

        x_shape = K.shape(x)
        x = SomeLayers(x)
        x = Lambda( lambda x: K.reshape(x, [x_shape[0], x_shape[1]]))(x)

x_shape is a raw TensorFlow tensor; it is not associated with any Keras layer. That's why I call it a lonely, wandering tensor, and it is what causes the "can't pickle _thread.RLock objects" error.

A possible solution would be:

    x_shape = Lambda(lambda x: K.shape(x), output_shape=(you_should_know, ))(x)
    x = SomeLayers(x)
    x = Lambda(lambda xs: K.reshape(xs[0], [xs[1][0], xs[1][1]]), output_shape=(you_should_know, ))([x, x_shape])

No wandering TensorFlow tensors, no errors.

All 24 comments

I looked into it some more, and it seems to have something to do with the Lambda layer when it hits this line. Attempting to call get_config on the Lambda layer seems to lock the config object somehow?

Edit: It seems this only happens when the arguments are TensorFlow tensors. Is there any way to get around this issue so that I can wrap a tf function in a Lambda layer?
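(One possible workaround for the clipping example above, sketched on the assumption that the bounds are known up front: pass them as plain Python lists through a named function, so nothing stored in the Lambda config is a raw tf.Tensor. A full model.save() also needs the arguments to be JSON-serializable, which lists are.)

from keras.layers import Input, Lambda
from keras.models import Model
import tensorflow as tf
import numpy as np

def clip(x, low, high):
    # low/high arrive as nested Python lists; TF converts them to constants
    # inside the graph, so no tf.Tensor ever ends up in the layer's config.
    return tf.clip_by_value(x, low, high)

low = np.random.rand(30, 3).astype('float32').tolist()
high = (1 + np.random.rand(30, 3).astype('float32')).tolist()

x = Input(shape=(30, 3))
clipped = Lambda(clip, arguments={'low': low, 'high': high})(x)
model = Model(inputs=x, outputs=clipped)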

I'm hitting this problem with a VAE, without any GRU/RNN/LSTM. The strange thing is that it was fine before, and it only started acting up after I put the model into a class.

Fixed. The VAE has a lambda that refers to instance attributes of the class. Assign them to local variables and let the lambda use only those locals.

@lyxm Hi, I'm encountering the same issue as you. Would you mind describing the solution in more detail?

I'd need to see your code. The basic idea is to check for loops in your data references.

@lyxm, do you mind showing your code before (the non-working version) and after the fix? That would be enough for us to understand in which direction to think. Thanks in advance.

class Model:
    .....

    def make_model(self):
        .....
        x = Input(shape=(self.input_dim,))
        z_mean = Dense(self.latent_dim, activation='elu')(x)
        z_log_var = Dense(self.latent_dim, activation='elu')(x)

        # this makes it work
        latent_dim = self.latent_dim
        epsilon_std = self.epsilon_std

        def sampling(args):
            z_mean, z_log_var = args
            epsilon = K.random_normal(shape=(latent_dim,),
                                      mean=0., stddev=epsilon_std)
            return z_mean + K.exp(z_log_var/2) * epsilon

        z = Lambda(sampling, output_shape=(self.latent_dim,))([z_mean, z_log_var])

The code above works.

If you refer to self.latent_dim and self.epsilon_std directly inside sampling, it would complain.
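For contrast, the failing variant would presumably be the one below: referring to self pulls the whole instance into the function's closure, Keras stores that closure in the Lambda config, and deepcopy then trips over whatever unpicklable TF state the instance reaches.

        def sampling(args):
            z_mean, z_log_var = args
            # self in the closure means the entire instance (and any tf
            # graph/session objects it holds) gets dragged into the Lambda
            # config, which copy.deepcopy() cannot handle.
            epsilon = K.random_normal(shape=(self.latent_dim,),
                                      mean=0., stddev=self.epsilon_std)
            return z_mean + K.exp(z_log_var / 2) * epsilon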

Is the problem solved? I see the same problem.

I also face this problem (my model has a Lambda layer). I can avoid this error by setting save_weights_only=True in ModelCheckpoint.
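For reference, that looks like the following; only a weights file is written, so model.save() / get_config() is never called and unpicklable objects in the config can't break checkpointing:

from keras.callbacks import ModelCheckpoint

# Save weights only; the model config (which may hold unpicklable objects)
# is never serialized by this callback.
checkpoint = ModelCheckpoint("weights.{epoch:02d}.hdf5", monitor="val_loss",
                             verbose=1, save_best_only=True,
                             save_weights_only=True, mode="min")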

Python cannot pickle lambda expressions. You may want to try replacing them (e.g. the one you passed to your Lambda layer) with named functions, as @lyxm suggested.
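A minimal illustration of that suggestion, using a trivial scaling operation just for the example:

from keras.layers import Input, Lambda

def scale_half(t):
    # A module-level named function is stored by name in the Lambda config;
    # an inline lambda has to be byte-serialized instead.
    return t * 0.5

x = Input(shape=(10,))
y = Lambda(scale_half)(x)   # instead of: Lambda(lambda t: t * 0.5)(x)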

I have the same problem. I have a tensorflow model which extends BaseEstimator and ClassifierMixin:

class Model(BaseEstimator, ClassifierMixin):

    def __init__(self, image, label):
        self.image = image
        self.label = label
        self.x_ = None
        self.predict
        self.optimize
        self.error

    def optimize(self):
        current_error = self.error
        logprob = tf.log(self.predict + 1e-12) * (1 - current_error)
        cross_entropy = -tf.reduce_sum(self.label * logprob)
        optimizer = tf.train.RMSPropOptimizer(0.03)
        return optimizer.minimize(cross_entropy)

    def error(self):
        mistakes = tf.not_equal(
            tf.argmax(self.label, 1), tf.argmax(self.predict, 1)
        )
        return tf.reduce_mean(tf.cast(mistakes, tf.float32))

    def fit(self, images, labels):
        Xtr, ytr = images, labels
        with tf.Session() as sess:
            sess.run(tf.initialize_all_variables())
            for epoch in range(10):
                for batch in range(60):
                    X = Xtr[batch * 100:(batch + 1) * 100]
                    y = ytr[batch * 100:(batch + 1) * 100]
                    sess.run(self.optimize, {self.image: X, self.label: y})

        return self

mnist = input_data.read_data_sets('./data/', one_hot=True)
Xte, yte = mnist.test.images, mnist.test.labels
Xtr, ytr = mnist.train.images, mnist.train.labels
image = tf.placeholder(tf.float32, [None, 784])
label = tf.placeholder(tf.float32, [None, 10])
model = Model(image, label)
classifier = OneVsRestClassifier(model)
y = np.random.randint(0, 100, len(
classifier.fit(Xtr, y)

TypeError: can't pickle _thread.RLock objects                                                                                                                                                                       

Am I missing something here? Any comments from sklearn developers would be really helpful so I can catch and learn from my mistakes.

@kirk86 My guess is that tf.Session() acquires a lock on a resource (GPU?) and Sklearn cannot serialize the lock.

with tf.Session() as sess: should end the session and release resources at the end of the scope; I'm not sure why that doesn't happen here, maybe because of self.optimize.

Can you try with a dummy fit or by explicitly ending the session?

@mratsim I'll definitely try ending the session explicitly. From your comment, can I infer that sklearn serializes all estimators?
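(For what it's worth: scikit-learn doesn't pickle the estimator during a plain fit, but meta-estimators like OneVsRestClassifier call clone() on it, and clone() deep-copies the constructor parameters returned by get_params(). Deep-copying a TF 1.x placeholder is typically enough to trigger the lock error; a minimal sketch of just that step:)

import copy
import tensorflow as tf

image = tf.placeholder(tf.float32, [None, 784])
# sklearn.base.clone() effectively does this to every constructor parameter;
# on TF 1.x it usually fails with a can't-pickle-lock error because the
# tensor drags the graph (and its internal locks) along with it.
copy.deepcopy(image)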

I am also encountering this issue when trying to use the ELMo embedding via a lambda layer:

def __init__(self):
  ...
  self.elmo_model = hub.Module("https://tfhub.dev/google/elmo/1", trainable=True)
  sess.run(tf.global_variables_initializer())
  sess.run(tf.tables_initializer())

def build_model(self):
  elmo_model = self.elmo_model

  def ElmoEmbedding(x):
    squeezed = tf.squeeze(tf.cast(x, tf.string), axis=1)
    return elmo_model(squeezed, signature="default", as_dict=True)["elmo"]

  input_text = Input(shape=(1,), dtype=tf.string, name='words_input')
  tokens = Lambda(ElmoEmbedding, output_shape=(None,1024), name='word_embeddings')(input_text)

  inputNodes = [input_text]
  ...
  model = Model(inputs=inputNodes, outputs=[output])

I have tried @lyxm's approach of referencing only a local variable (as you can see from the elmo_model variable), as well as both answers on this Stack Overflow thread. However, neither works for me. Do you have any idea how I can fix this? Thank you.

The full traceback (Keras 2.1.6, tensorflow 1.8.0, python 3.6.6):

Traceback (most recent call last):
  File "Train.py", line 81, in <module>
    model.fit(epochs=25)
  File "xxxxx/BiLSTM.py", line 435, in fit
    self.saveModel(modelName, epoch, dev_score, test_score)
  File "xxxxx/BiLSTM.py", line 626, in saveModel
    self.models[modelName].save(savePath, True)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2591, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/usr/local/lib/python3.6/dist-packages/keras/models.py", line 126, in save_model
    'config': model.get_config()
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2432, in get_config
    return copy.deepcopy(config)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 215, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 220, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 220, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 220, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 220, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 169, in deepcopy
    rv = reductor(4)
TypeError: can't pickle _thread.RLock objects

It looks like it works if I change return copy.deepcopy(config) to just return config in Model.get_config() in topology.py (Keras 2.1.6). However, are there any consequences to doing that?

=====

Just to be safe, instead of doing the above directly, I added a no_deep_copy parameter to Model.get_config() that defaults to False, and changed the return statement to return config if no_deep_copy else copy.deepcopy(config). I then changed the 'config': model.get_config() line (which caused the problem; see the traceback above) to 'config': model.get_config(no_deep_copy=True) in save_model(model, filepath, overwrite=True, include_optimizer=True) in models.py (Keras 2.1.6). Could you let me know if there are any consequences to doing that? Thanks!

I get a similar error message using Python 3.6 on Windows, but it is not related to Keras. Some things to try that might help you trace the cause:

  1. Try Python 3.7 on Windows. In my project this works, but Python 3.6 on Windows does not.
  2. Try Python 3 on Linux. I have used Python 3.5 on Linux with the same code and it does not raise this error.
  3. On Linux, try adding the line multiprocessing.set_start_method('spawn') (shown below); note that this may break Linux code that was previously working.

I am working on a multiprocessing project, developing on a Windows computer but ultimately deploying to a Raspberry Pi (Linux). It has been tricky to get everything to work, especially on Windows.
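For reference, the set_start_method call from item 3 has to run once, early, and under the main guard (a minimal sketch):

import multiprocessing

if __name__ == '__main__':
    # Must be called at most once, before any Pool/Process is created.
    multiprocessing.set_start_method('spawn')
    # ... start training / worker processes here ...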

@ZhaofengWu Did you ever find a solution to that ELMo problem?

This exception is raised because you're trying to serialize an unserializable object.
In this context, the "unserializable" object is a raw tf.Tensor.

So remember: don't let raw tf.Tensors wander around in your model.

In my case, I was trying to use K.shape() to get the shape of a tensor and reuse it later, like this:

        x_shape = K.shape(x)
        x = SomeLayers(x)
        x = Lambda( lambda x: K.reshape(x, [x_shape[0], x_shape[1]]))(x)

x_shape is a raw TensorFlow tensor; it is not associated with any Keras layer. That's why I call it a lonely, wandering tensor, and it is what causes the "can't pickle _thread.RLock objects" error.

A possible solution would be:

    x_shape = Lambda(lambda x: K.shape(x), output_shape=(you_should_know, ))(x)
    x = SomeLayers(x)
    x = Lambda(lambda xs: K.reshape(xs[0], [xs[1][0], xs[1][1]]), output_shape=(you_should_know, ))([x, x_shape])

No wandering TensorFlow tensors, no errors.

I looked into it some more, and it seems to have something to do with the Lambda layer when it hits this line. Attempting to call get_config on the Lambda layer seems to lock the config object somehow?

Edit: It seems this only happens when the arguments are TensorFlow tensors. Is there any way to get around this issue so that I can wrap a tf function in a Lambda layer?

I have the same error. How did you wrap a tf function in a Lambda layer?

Closing this issue since several workarounds are provided in the SO link given above. Feel free to reopen if the issue still persists. Thanks!

I am still facing the same issue and have not been able to solve it with the solutions mentioned above. Can someone please advise on my case, described here: https://stackoverflow.com/questions/57233539/typeerror-cant-pickle-thread-rlock-objects

I am facing an issue when saving the following model:

from keras import backend as K
K.clear_session()
latent_dim = 300
embedding_dim=100

# Encoder
encoder_inputs = Input(shape=(max_text_len,))

# embedding layer
enc_emb = Embedding(x_voc, embedding_dim, trainable=True)(encoder_inputs)

# encoder lstm 1
encoder_lstm1 = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.4)
encoder_output1, state_h1, state_c1 = encoder_lstm1(enc_emb)

# encoder lstm 2
encoder_lstm2 = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.4)
encoder_output2, state_h2, state_c2 = encoder_lstm2(encoder_output1)

# encoder lstm 3
encoder_lstm3 = LSTM(latent_dim, return_state=True, return_sequences=True, dropout=0.4, recurrent_dropout=0.4)
encoder_outputs, state_h, state_c = encoder_lstm3(encoder_output2)

# Set up the decoder, using encoder_states as initial state.
decoder_inputs = Input(shape=(None,))

# embedding layer
dec_emb_layer = Embedding(y_voc, embedding_dim, trainable=True)
dec_emb = dec_emb_layer(decoder_inputs)
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.2)
decoder_outputs, decoder_fwd_state, decoder_back_state = decoder_lstm(dec_emb, initial_state=[state_h, state_c])

# Attention layer
attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([encoder_outputs, decoder_outputs])

# Concat attention input and decoder LSTM output
decoder_concat_input = Concatenate(axis=-1, name='concat_layer')([decoder_outputs, attn_out])

# dense layer
decoder_dense = TimeDistributed(Dense(y_voc, activation='softmax'))
decoder_outputs = decoder_dense(decoder_concat_input)

# Define the model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.summary()

Error:
TypeError: can't pickle _thread.RLock objects
