Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.
Thank you!
[x] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps
[x] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.
[ ] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
[x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py
Basic Issue:
I'm running on TensorFlow.
In the Keras example linked above, the file lstm_seq2seq.py generates an error.
line 153: model.save('s2s.h5') returns
2379: UserWarning: Layer lstm_2 was passed non-serializable keyword arguments: {'initial_state': [<tf.Tensor 'lstm_1/while/Exit_2:0' shape=(?, 256) dtype=float32>, <tf.Tensor 'lstm_1/while/Exit_3:0' shape=(?, 256) dtype=float32>]}. They will not be included in the serialized model (and thus will be missing at deserialization time).
str(node.arguments) + '. They will not be included '
Although this is phrased as a warning rather than an error, the result seems to be that the saved model is missing required information.
I've successfully saved other models in the past, so it's something specific to this model.
Other information
I've tried breaking up the model and saving the weights and config separately (see below), but model.get_weights() returns the same error.
# alternative method to save model by breaking it up into weights and config
import os
import pickle

def save_model(model, MODEL_DIR):
    if not os.path.isdir(MODEL_DIR):
        os.makedirs(MODEL_DIR)
    weights = model.get_weights()
    with open(os.path.join(MODEL_DIR, 'model'), 'wb') as file_:
        pickle.dump(weights[1:], file_)
    with open(os.path.join(MODEL_DIR, 'config.json'), 'w') as file_:
        file_.write(model.to_json())

save_model(model, 'model_dir')
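For completeness, here is a rough sketch of a matching load step (the load_model_from_dir helper below is hypothetical, not from the original script; it rebuilds the architecture from config.json and restores the pickled weight list):

import os
import pickle
from keras.models import model_from_json

def load_model_from_dir(MODEL_DIR):
    # Rebuild the architecture from the saved JSON config.
    with open(os.path.join(MODEL_DIR, 'config.json'), 'r') as file_:
        model = model_from_json(file_.read())
    # Restore the pickled weight list.
    with open(os.path.join(MODEL_DIR, 'model'), 'rb') as file_:
        weights = pickle.load(file_)
    # Note: save_model above pickles weights[1:], so the first weight array is
    # missing; set_weights() needs the full list, so that entry would have to be
    # restored separately (or the save routine changed to dump all the weights).
    model.set_weights(weights)
    return model

model = load_model_from_dir('model_dir')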
I tried to look into how model.get_weights() is implemented. It's just a loop that calls layer.get_weights() for each layer of the model.
I am also getting the same "warning". Any solution to this problem yet?
Also having this issue. Calling it a warning is definitely an understatement: after saving and loading my model, the performance is way off. I believe this is related (if not identical) to issue #8428. It is also referenced in several Google Group discussions, e.g. here. @fchollet mentions there that the model should be saved properly, but I'm quite sure it's not.
I'm also having the same issue. BTW, how would I save the encoder states? Any ideas, folks?
I am also getting the same "warning". Any solution to this problem yet?
@martvl When you load the saved model, are you also redefining encoder_model and decoder_model? If, when you say the performance is way off, you mean that the decoded sentences look completely different to how they did when you first trained this model, then I had the same concern, but found that redefining encoder_model and decoder_model according to the git code fixed the issue and recovered the behaviour of the original model. I hope that helps!
As for the warnings, I was under the impression they were referring to the fact that the outputs of the encoder are being discarded during training. If that's the case, then it should be nothing to worry about (although I would like to know how to stop the warning from showing). I'm not completely certain so someone please feel free to confirm/correct.
@chrispyT I tried reloading the model using keras.models.load_model() from the .HDF5 file that produced the "warning", that indeed resulted in completely different sentences than the original model was producing. It might be that redefining the model with the correct layers and then loading the weights fixes the issue. However, I'm currently not using this model anymore for the project I was working on, so I won't be trying this solution anytime soon. Maybe someone else can check if that fixes the issue?
@chrispyT I have tried your approach and unfortunately it does not fix the issue. Here was my process:
@microdave There are two versions of the encoder/decoder constructor; the one at
https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py
(as linked to by OP) only works if you have just trained the model, because it relies on already having encoder_inputs and encoder_states defined when it assigns:
encoder_model = Model(encoder_inputs, encoder_states)
It has these because encoder_inputs and encoder_states are defined during model setup. The other version at
https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq_restore.py
is needed if you are reloading the model: it dissects the layers of the loaded model and picks out the bits it needs to reconstruct everything. E.g. it precedes the above line with
encoder_inputs = model.input[0] # input_1
encoder_outputs, state_h_enc, state_c_enc = model.layers[2].output # lstm_1
encoder_states = [state_h_enc, state_c_enc]
I had the same experience as you until I realised this second version was needed, so hopefully this will fix your issue.
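For reference, here is a rough sketch of the rest of that reconstruction, along the lines of lstm_seq2seq_restore.py (the layer indices and latent_dim follow the example script and would need adjusting for other models):

from keras.models import Model, load_model
from keras.layers import Input

latent_dim = 256  # as in the example script
model = load_model('s2s.h5')

# Rebuild the encoder from the loaded layers (the same lines as above).
encoder_inputs = model.input[0]  # input_1
encoder_outputs, state_h_enc, state_c_enc = model.layers[2].output  # lstm_1
encoder_states = [state_h_enc, state_c_enc]
encoder_model = Model(encoder_inputs, encoder_states)

# Rebuild the decoder, feeding its initial states in as new inputs.
decoder_inputs = model.input[1]  # input_2
decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_lstm = model.layers[3]
decoder_outputs, state_h_dec, state_c_dec = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h_dec, state_c_dec]
decoder_dense = model.layers[4]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)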
Has this been solved in some way? @chrispyT's suggestion does not fix the problem, since the internal states are not saved and restored in the model. Saving/loading changes the predictions (only if you restart the python kernel/notebook; it works fine as long as the original model is still in memory, for some reason), and the loss is as bad as it is initially (in my case). This means that it is currently not possible in Keras to do seq2seq with storing/loading the model in between training runs, or am I missing something?
@da-bu Have you tried the solution(?) from my most recent comment on Aug 8? I believe it addresses the saving and reloading of the internal state weights. It is (or at least was) definitely possible to save and reload a seq2seq model in Keras (edit: I realise now this refers to reloading for prediction only).
@chrispyT Wow thank you for your super-fast reply! :) Yes I've tried the code you've linked to. This works fine for reconstructing the model as long as saving and loading happens in the same python session. However, if I save the model, restart my python notebook, and then load the model again it 1) yields different predictions, and 2) seems to be "untrained" again as it has a high loss.
So training, saving, loading, training some more is not possible at the moment. I think the issue is that it only saves the weights but not the internal states if I understand this correctly. I also get the user warning that the original poster had (with the latest versions of keras and tensorflow).
Edit: The lstm_seq2seq_restore code (https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq_restore.py) only addresses restoring the models for prediction (and that doesn't work across python sessions, it seems), but I am also interested in continuing the training of the model.
Daniel, I was unable to fix this issue as well. As you indicated, it saves the weights but not the internal states.
@KingDavidOfTepper: Thank you for the reply! Have you found a workaround for your case?
Also, should we open a new issue for this? Or what would be a way to bring attention to it? Given that it's an issue with a mainstream model architecture, which is even part of the official Keras examples, I think this deserves a fix or more "official" workaround suggestions.
@da-bu Sorry, I misunderstood, so I retract my statement about it being 'definitely' possible! I'm afraid I didn't encounter this problem myself as I wasn't retraining my models.
So, is there any solution to this problem yet? Anyone? Any idea @fchollet ?
For info: this is on Keras v2.2.4, installed using pip inside an Anaconda virtual env.
Thank you all for the input and confirmations. There is definitely something that is not saved and restored as expected, but I wonder if my understanding is correct. Shouldn't the values of the internal states be a result of the LSTM processing the input sequence, i.e. not something that needs to be stored after all? (See the stateful attribute; I use stateful=False, e.g. see http://philipperemy.github.io/keras-stateful-lstm/.) But if so, what is it that isn't properly stored and retrieved? A weight matrix?
I have the same problem. Has anyone found a solution?
The model is saved with this warning, and when I load the model the predictions are not the same.
I haven't found a solution, unfortunately. Well, in a way I have: I switched to using PyTorch instead...
I look forward to this problem being resolved soon.
Will this be easier to track/fix with the changes to SavedModel in TensorFlow 2.0?
So, is there any solution to this problem yet?
Got the same problem. Any solution??
Having the same problem. There was a suggestion elsewhere that using a checkpoint and reloading from the checkpoint would solve the problem, but it doesn't :(
Can anyone advise on how I can install this francoisdelarbre fix into my Colab notebook?
Could we just save the states after training with open('seq2seq.states', 'wb').write(states) ???
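Note that encoder_states as defined during training is a list of symbolic tensors, so it can't be written to a file directly like that. What can be saved are the concrete state arrays the encoder produces for a given input; a minimal sketch, assuming encoder_model and input_seq from the example script:

import pickle

# states_value is a list of NumPy arrays [h, c] for this particular input sequence.
states_value = encoder_model.predict(input_seq)
with open('seq2seq.states', 'wb') as f:
    pickle.dump(states_value, f)

# Later, e.g. in another session:
with open('seq2seq.states', 'rb') as f:
    states_value = pickle.load(f)

These states are a function of the input rather than trainable parameters, so saving them does not by itself solve the retrain-after-reload problem discussed above.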
I tried to reach out to @fchollet on twitter with this issue. Maybe he will respond: https://twitter.com/em_aryan/status/1136256827996327937
@nixon-nyx Did you try to save and reload the states? Did it work, and if it did, can you please share the code?
Yes. The code is simply based on the Keras seq2seq demo.
It even works fine for a bidirectional seq2seq model.
@nixon-nyx I got the training and sampling code to work and saved the model, but how to save and reload the states is where I am stuck, because without the encoder states the saved model is no good. Can you please help me there?
@aryancodify Sure.
from __future__ import print_function
from sklearn.model_selection import train_test_split
from keras.models import Model, load_model
from keras.layers import Dense, Input, Embedding, Bidirectional, Concatenate, LSTM, CuDNNLSTM
from keras.preprocessing.sequence import pad_sequences
from keras.callbacks import TensorBoard, Callback#, ModelCheckpoint
from keras.optimizers import RMSprop
import keras.backend as K
# Seq2Seq model
latent_dim = 256
enc_input = Input(shape=(None, ), name='enc_input')
enc_emb = Embedding(input_dim=self.p_vocab_size, output_dim=128, input_length=self.maxlen, name='enc_emb')
enc_lstm = Bidirectional(CuDNNLSTM(units=latent_dim, return_state=True, name='enc_lstm'))
enc_out, forward_h, forward_c, back_h, back_c = enc_lstm(enc_emb(enc_input))
enc_states = [Concatenate()([forward_h, back_h]), Concatenate()([forward_c, back_c])]
dec_input = Input(shape=(None, self.c_vocab_size), name='dec_input')
dec_lstm = CuDNNLSTM(units=latent_dim*2, return_sequences=True, return_state=True, name='dec_lstm')
dec_out, _, _ = dec_lstm(dec_input, initial_state=enc_states)
dec_dense = Dense(units=self.c_vocab_size, activation='softmax', name='dec_dense')
dec_target = dec_dense(dec_out)
model = Model([enc_input, dec_input], dec_target)
model.compile(loss='categorical_crossentropy', optimizer=self.optimizer)
model.summary()
'''
use this only if you want to train on a big dataset, as a generator
enables you to go through your dataset in small batches
'''
model.fit_generator(generator=train_gen,
                    steps_per_epoch=train_batches,
                    epochs=self.epochs,
                    verbose=1,
                    validation_data=test_gen,
                    validation_steps=test_batches,
                    callbacks=[checkpoint, tensorboard])
model.save_weights(self.weight_filepath)
You will need to change the vocab size and the max sentence length to match your own data.
For inference, you simply need to build the model once again and load the weights using model.load_weights(weight_filepath).
# Seq2Seq model
latent_dim = 256
enc_input = Input(shape=(None, ), name='enc_input')
enc_emb = Embedding(input_dim=self.p_vocab_size, output_dim=128, input_length=self.maxlen, name='enc_emb')
enc_lstm = Bidirectional(CuDNNLSTM(units=latent_dim, return_state=True, name='enc_lstm'))
enc_out, forward_h, forward_c, back_h, back_c = enc_lstm(enc_emb(enc_input))
enc_states = [Concatenate()([forward_h, back_h]), Concatenate()([forward_c, back_c])]
dec_input = Input(shape=(None, self.c_vocab_size), name='dec_input')
dec_lstm = CuDNNLSTM(units=latent_dim*2, return_sequences=True, return_state=True, name='dec_lstm')
dec_out, _, _ = dec_lstm(dec_input, initial_state=enc_states)
dec_dense = Dense(units=self.c_vocab_size, activation='softmax', name='dec_dense')
dec_target = dec_dense(dec_out)
model = Model([enc_input, dec_input], dec_target)
model.compile(loss='categorical_crossentropy', optimizer=self.optimizer)
model.load_weights(inf_weight_filepath)
# inference
enc_model = Model(enc_input, enc_states)
dec_state_in = [Input(shape=(latent_dim*2,)), Input(shape=(latent_dim*2,))]
inf_dec_out, h, c = dec_lstm(dec_input, initial_state=dec_state_in)
dec_states = [h, c]
dec_out = dec_dense(inf_dec_out)
dec_model = Model([dec_input] + dec_state_in, [dec_out] + dec_states)
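For anyone following along, here is a rough sketch of how those two inference models might be used for greedy decoding (c_vocab_size, start_token_index, stop_token_index and max_decoder_seq_length are placeholders to adapt to your own data):

import numpy as np

def decode_sequence(input_seq):
    # Encode the input sequence to get the initial decoder states.
    states_value = enc_model.predict(input_seq)

    # Start with a one-hot vector for the start-of-sequence token.
    target_seq = np.zeros((1, 1, c_vocab_size))
    target_seq[0, 0, start_token_index] = 1.0

    decoded_tokens = []
    for _ in range(max_decoder_seq_length):
        output_tokens, h, c = dec_model.predict([target_seq] + states_value)

        # Greedily pick the most likely next token and stop on the end token.
        sampled_index = int(np.argmax(output_tokens[0, -1, :]))
        if sampled_index == stop_token_index:
            break
        decoded_tokens.append(sampled_index)

        # Feed the sampled token and the updated states back in.
        target_seq = np.zeros((1, 1, c_vocab_size))
        target_seq[0, 0, sampled_index] = 1.0
        states_value = [h, c]

    return decoded_tokens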
Still no solution for this problem?