Keras: TypeError: can't pickle _thread.lock objects

Created on 1 Nov 2017 · 24 comments · Source: keras-team/keras

Information:

  • Keras version 2.0.8
  • Tensorflow version 1.3.0
  • Python 3.6

Minimal example to reproduce the error:

from keras.layers import Input, Lambda, Dense
from keras.models import Model
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
import tensorflow as tf
import numpy as np

x = Input(shape=(30,3))
low = tf.constant(np.random.rand(30, 3).astype('float32'))
high = tf.constant(1 + np.random.rand(30, 3).astype('float32'))
clipped_out_position = Lambda(lambda x, low, high: tf.clip_by_value(x, low, high),
                                      arguments={'low': low, 'high': high})(x)

model = Model(inputs=x, outputs=[clipped_out_position])
optimizer = Adam(lr=.1)
model.compile(optimizer=optimizer, loss="mean_squared_error")
checkpoint = ModelCheckpoint("debug.hdf", monitor="val_loss", verbose=1, save_best_only=True, mode="min")
training_callbacks = [checkpoint]
model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)

Error output:

Train on 67 samples, validate on 33 samples
Epoch 1/50
10/67 [===>..........................] - ETA: 0s - loss: 0.1627Epoch 00001: val_loss improved from inf to 0.17002, saving model to debug.hdf
Traceback (most recent call last):
  File "debug_multitask_inverter.py", line 19, in <module>
    model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1631, in fit

â–½
    validation_steps=validation_steps)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1233, in _fit_loop
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 414, in on_epoch_end
    self.model.save(filepath, overwrite=True)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2556, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/models.py", line 107, in save_model
    'config': model.get_config()
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2397, in get_config
    return copy.deepcopy(config)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 215, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 169, in deepcopy
    rv = reductor(4)
TypeError: can't pickle _thread.lock objects

It seems like the error has occurred in the past in different contexts here, but I'm not dumping the model directly -- I'm using the ModelCheckpoint callback. Any idea what could be going wrong?
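(A quick way to narrow this down, sketched here against the model built above: replay the failing deepcopy layer by layer, and the layer whose config can't be copied is the culprit.)

import copy

# Deep-copy each layer's config separately; the layer that raises is the one
# whose config holds an unpicklable object (here, presumably the Lambda
# carrying the tf constants).
for layer in model.layers:
    try:
        copy.deepcopy(layer.get_config())
    except TypeError as exc:
        print('offending layer: %s (%s): %s'
              % (layer.name, layer.__class__.__name__, exc))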

tensorflow

Most helpful comment

This exception is raised because you're trying to serialize an unserializable object.
In this context, the "unserializable" object is a raw tf.Tensor.

So remember: don't let raw tf.Tensors wander around in your model.

In my case, I was trying to use K.shape() to get the shape of a tensor and reuse it later, like this:

        x_shape = K.shape(x)
        x = SomeLayers(x)
        x = Lambda( lambda x: K.reshape(x, [x_shape[0], x_shape[1]]))(x)

x_shape is a raw TensorFlow tensor; it is not associated with any Keras layer. That's why I call it a lonely, wandering tensor, and it is what causes the "can't pickle _thread.RLock objects" error.

A possible solution would be:

    x_shape = Lambda(lambda x: K.shape(x), output_shape=(you_should_know, ))(x)
    x = SomeLayers(x)
    x = Lambda(lambda xs: K.reshape(xs[0], [xs[1][0], xs[1][1]]), output_shape=(you_should_know, ))([x, x_shape])

No wandering TensorFlow tensors, no errors.

All 24 comments

I looked into it some more, and it seems to have something to do with the Lambda layer when it hits this line. Attempting to call get_config on the Lambda layer seems to lock the config object somehow?

Edit: It seems this only happens when the arguments are TensorFlow tensors. Is there any way to get around this issue so that I can wrap a tf function in a Lambda layer?
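(One possible workaround for the clipping example above, sketched on the assumption that the bounds are known up front: pass them as plain Python lists through a named function, so nothing stored in the Lambda config is a raw tf.Tensor. A full model.save() also needs the arguments to be JSON-serializable, which lists are.)

from keras.layers import Input, Lambda
from keras.models import Model
import tensorflow as tf
import numpy as np

def clip(x, low, high):
    # low/high arrive as nested Python lists; TF converts them to constants
    # inside the graph, so no tf.Tensor ever ends up in the layer's config.
    return tf.clip_by_value(x, low, high)

low = np.random.rand(30, 3).astype('float32').tolist()
high = (1 + np.random.rand(30, 3).astype('float32')).tolist()

x = Input(shape=(30, 3))
clipped = Lambda(clip, arguments={'low': low, 'high': high})(x)
model = Model(inputs=x, outputs=clipped)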

I'm hitting this problem with a VAE, without any GRU/RNN/LSTM. The strange thing is that it was fine before, and it only started acting up after I put the model into a class.

Fixed. The VAE has a lambda that refers to instance attributes of the class. Assign them to local variables and let the lambda use only those locals.

@lyxm Hi, I'm encountering the same issue as you. Would you mind describing the solution in more detail?

I'd need to see your code. The basic idea is to check for loops in your data references.

@lyxm, do you mind showing your code before (the non-working version) and after the fix? That would be enough for us to understand in which direction to think. Thanks in advance.

class Model:
    .....

    def make_model(self):
        .....
        x = Input(shape=(self.input_dim,))
        z_mean = Dense(self.latent_dim, activation='elu')(x)
        z_log_var = Dense(self.latent_dim, activation='elu')(x)

        # this makes it work
        latent_dim = self.latent_dim
        epsilon_std = self.epsilon_std

        def sampling(args):
            z_mean, z_log_var = args
            epsilon = K.random_normal(shape=(latent_dim,),
                                      mean=0., stddev=epsilon_std)
            return z_mean + K.exp(z_log_var/2) * epsilon

        z = Lambda(sampling, output_shape=(self.latent_dim,))([z_mean, z_log_var])

The code above works.

If you refer to self.latent_dim and self.epsilon_std directly inside sampling, it would complain.
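For contrast, the failing variant would presumably be the one below: referring to self pulls the whole instance into the function's closure, Keras stores that closure in the Lambda config, and deepcopy then trips over whatever unpicklable TF state the instance reaches.

        def sampling(args):
            z_mean, z_log_var = args
            # self in the closure means the entire instance (and any tf
            # graph/session objects it holds) gets dragged into the Lambda
            # config, which copy.deepcopy() cannot handle.
            epsilon = K.random_normal(shape=(self.latent_dim,),
                                      mean=0., stddev=self.epsilon_std)
            return z_mean + K.exp(z_log_var / 2) * epsilon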

Is the problem solved? I see the same problem.

I also face this problem (my model has a Lambda layer). I can avoid this error by setting save_weights_only=True in ModelCheckpoint.
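For reference, that looks like the following; only a weights file is written, so model.save() / get_config() is never called and unpicklable objects in the config can't break checkpointing:

from keras.callbacks import ModelCheckpoint

# Save weights only; the model config (which may hold unpicklable objects)
# is never serialized by this callback.
checkpoint = ModelCheckpoint("weights.{epoch:02d}.hdf5", monitor="val_loss",
                             verbose=1, save_best_only=True,
                             save_weights_only=True, mode="min")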

Python cannot pickle lambda expressions. You may want to try replacing them (e.g. the one you passed to your Lambda layer) with named functions, as @lyxm suggested.
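A minimal illustration of that suggestion, using a trivial scaling operation just for the example:

from keras.layers import Input, Lambda

def scale_half(t):
    # A module-level named function is stored by name in the Lambda config;
    # an inline lambda has to be byte-serialized instead.
    return t * 0.5

x = Input(shape=(10,))
y = Lambda(scale_half)(x)   # instead of: Lambda(lambda t: t * 0.5)(x)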

I have the same problem. I have a tensorflow model which extends BaseEstimator and ClassifierMixin:

class Model(BaseEstimator, ClassifierMixin):

    def __init__(self, image, label):
        self.image = image
        self.label = label
        self.x_ = None
        self.predict
        self.optimize
        self.error

    def optimize(self):
        current_error = self.error
        logprob = tf.log(self.predict + 1e-12) * (1 - current_error)
        cross_entropy = -tf.reduce_sum(self.label * logprob)
        optimizer = tf.train.RMSPropOptimizer(0.03)
        return optimizer.minimize(cross_entropy)

    def error(self):
        mistakes = tf.not_equal(
            tf.argmax(self.label, 1), tf.argmax(self.predict, 1)
        )
        return tf.reduce_mean(tf.cast(mistakes, tf.float32))

    def fit(self, images, labels):
        Xtr, ytr = images, labels
        with tf.Session() as sess:
            sess.run(tf.initialize_all_variables())
            for epoch in range(10):
                for batch in range(60):
                    X = Xtr[batch * 100:(batch + 1) * 100]
                    y = ytr[batch * 100:(batch + 1) * 100]
                    sess.run(self.optimize, {self.image: X, self.label: y})

        return self

mnist = input_data.read_data_sets('./data/', one_hot=True)
Xte, yte = mnist.test.images, mnist.test.labels
Xtr, ytr = mnist.train.images, mnist.train.labels
image = tf.placeholder(tf.float32, [None, 784])
label = tf.placeholder(tf.float32, [None, 10])
model = Model(image, label)
classifier = OneVsRestClassifier(model)
y = np.random.randint(0, 100, len(
classifier.fit(Xtr, y)

TypeError: can't pickle _thread.RLock objects                                                                                                                                                                       

Am I missing something here? Any comments from sklearn developers would be really helpful so I can catch and learn from my mistakes.

@kirk86 My guess is that tf.Session() acquires a lock on a resource (GPU?) and Sklearn cannot serialize the lock.

with tf.Session() as sess: should end the session and release resources at the end of the scope; I'm not sure why that doesn't happen here, maybe because of self.optimize.

Can you try with a dummy fit or by explicitly ending the session?

@mratsim I'll definitely try ending the session explicitly. From your comment, can I infer that sklearn serializes all estimators?
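(For what it's worth: scikit-learn doesn't pickle the estimator during a plain fit, but meta-estimators like OneVsRestClassifier call clone() on it, and clone() deep-copies the constructor parameters returned by get_params(). Deep-copying a TF 1.x placeholder is typically enough to trigger the lock error; a minimal sketch of just that step:)

import copy
import tensorflow as tf

image = tf.placeholder(tf.float32, [None, 784])
# sklearn.base.clone() effectively does this to every constructor parameter;
# on TF 1.x it usually fails with a can't-pickle-lock error because the
# tensor drags the graph (and its internal locks) along with it.
copy.deepcopy(image)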

I am also encountering this issue when trying to use the ELMo embedding via a lambda layer:

def __init__(self):
  ...
  self.elmo_model = hub.Module("https://tfhub.dev/google/elmo/1", trainable=True)
  sess.run(tf.global_variables_initializer())
  sess.run(tf.tables_initializer())

def build_model(self):
  elmo_model = self.elmo_model

  def ElmoEmbedding(x):
    squeezed = tf.squeeze(tf.cast(x, tf.string), axis=1)
    return elmo_model(squeezed, signature="default", as_dict=True)["elmo"]

  input_text = Input(shape=(1,), dtype=tf.string, name='words_input')
  tokens = Lambda(ElmoEmbedding, output_shape=(None,1024), name='word_embeddings')(input_text)

  inputNodes = [input_text]
  ...
  model = Model(inputs=inputNodes, outputs=[output])

I have tried @lyxm's approach of referencing only a local variable (as you can see from the elmo_model variable), as well as both answers on this Stack Overflow thread. However, neither works for me. Do you have any idea how I can fix this? Thank you.

The full traceback (Keras 2.1.6, tensorflow 1.8.0, python 3.6.6):

Traceback (most recent call last):
  File "Train.py", line 81, in <module>
    model.fit(epochs=25)
  File "xxxxx/BiLSTM.py", line 435, in fit
    self.saveModel(modelName, epoch, dev_score, test_score)
  File "xxxxx/BiLSTM.py", line 626, in saveModel
    self.models[modelName].save(savePath, True)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2591, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/usr/local/lib/python3.6/dist-packages/keras/models.py", line 126, in save_model
    'config': model.get_config()
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2432, in get_config
    return copy.deepcopy(config)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 215, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 220, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 220, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 220, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 220, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.6/copy.py", line 169, in deepcopy
    rv = reductor(4)
TypeError: can't pickle _thread.RLock objects

It looks like it works if I change return copy.deepcopy(config) to just return config in Model.get_config() in topology.py (Keras 2.1.6). However, are there any consequences to doing that?

=====

Just to be safe, instead of doing the above directly, I added a no_deep_copy parameter to Model.get_config() that defaults to False, and changed the return statement to return config if no_deep_copy else copy.deepcopy(config). I then changed the 'config': model.get_config() line (which caused the problem; see the traceback above) to 'config': model.get_config(no_deep_copy=True) in save_model(model, filepath, overwrite=True, include_optimizer=True) in models.py (Keras 2.1.6). Could you let me know if there are any consequences to doing that? Thanks!

I get a similar error message using Python 3.6 on Windows, but it is not related to Keras. Some things to try that might help you trace the cause:

  1. Try Python 3.7 on Windows. In my project this works, but Python 3.6 on Windows does not.
  2. Try Python 3 on Linux. I have used Python 3.5 on Linux with the same code and it does not raise this error.
  3. On Linux, try adding the line multiprocessing.set_start_method('spawn') (shown below); note that this may break Linux code that was previously working.

I am working on a multiprocessing project, developing on a Windows computer but ultimately deploying to a Raspberry Pi (Linux). It has been tricky to get everything to work, especially on Windows.
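For reference, the set_start_method call from item 3 has to run once, early, and under the main guard (a minimal sketch):

import multiprocessing

if __name__ == '__main__':
    # Must be called at most once, before any Pool/Process is created.
    multiprocessing.set_start_method('spawn')
    # ... start training / worker processes here ...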

@ZhaofengWu Did you ever find a solution to that ELMo problem?

This exception is raised because you're trying to serialize an unserializable object.
In this context, the "unserializable" object is a raw tf.Tensor.

So remember: don't let raw tf.Tensors wander around in your model.

In my case, I was trying to use K.shape() to get the shape of a tensor and reuse it later, like this:

        x_shape = K.shape(x)
        x = SomeLayers(x)
        x = Lambda( lambda x: K.reshape(x, [x_shape[0], x_shape[1]]))(x)

x_shape is a raw TensorFlow tensor; it is not associated with any Keras layer. That's why I call it a lonely, wandering tensor, and it is what causes the "can't pickle _thread.RLock objects" error.

A possible solution would be:

    x_shape = Lambda(lambda x: K.shape(x), output_shape=(you_should_know, ))(x)
    x = SomeLayers(x)
    x = Lambda(lambda xs: K.reshape(xs[0], [xs[1][0], xs[1][1]]), output_shape=(you_should_know, ))([x, x_shape])

No wandering TensorFlow tensors, no errors.

I looked into it some more, and it seems to have something to do with the Lambda layer when it hits this line. Attempting to call get_config on the Lambda layer seems to lock the config object somehow?

Edit: It seems this only happens when the arguments are TensorFlow tensors. Is there any way to get around this issue so that I can wrap a tf function in a Lambda layer?

I have the same error. How did you wrap a tf function in a Lambda layer?

Closing this issue since several workarounds are provided in the SO link given above. Feel free to reopen if the issue still persists. Thanks!

I am still facing the same issue and have not been able to solve it with the solutions mentioned above. Can someone please advise on my case, described here: https://stackoverflow.com/questions/57233539/typeerror-cant-pickle-thread-rlock-objects

I am facing an issue when saving the following model:

from keras import backend as K
K.clear_session()
latent_dim = 300
embedding_dim=100

# Encoder
encoder_inputs = Input(shape=(max_text_len,))

# embedding layer
enc_emb = Embedding(x_voc, embedding_dim, trainable=True)(encoder_inputs)

# encoder lstm 1
encoder_lstm1 = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.4)
encoder_output1, state_h1, state_c1 = encoder_lstm1(enc_emb)

# encoder lstm 2
encoder_lstm2 = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.4)
encoder_output2, state_h2, state_c2 = encoder_lstm2(encoder_output1)

# encoder lstm 3
encoder_lstm3 = LSTM(latent_dim, return_state=True, return_sequences=True, dropout=0.4, recurrent_dropout=0.4)
encoder_outputs, state_h, state_c = encoder_lstm3(encoder_output2)

# Set up the decoder, using encoder_states as initial state.
decoder_inputs = Input(shape=(None,))

# embedding layer
dec_emb_layer = Embedding(y_voc, embedding_dim, trainable=True)
dec_emb = dec_emb_layer(decoder_inputs)
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.2)
decoder_outputs, decoder_fwd_state, decoder_back_state = decoder_lstm(dec_emb, initial_state=[state_h, state_c])

# Attention layer
attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([encoder_outputs, decoder_outputs])

# Concat attention input and decoder LSTM output
decoder_concat_input = Concatenate(axis=-1, name='concat_layer')([decoder_outputs, attn_out])

# dense layer
decoder_dense = TimeDistributed(Dense(y_voc, activation='softmax'))
decoder_outputs = decoder_dense(decoder_concat_input)

# Define the model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.summary()

Error:
TypeError: can't pickle _thread.RLock objects
