Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.
Thank you!
[x] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
[x] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.
[ ] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
[x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
I trained a model using Python 2.7, and now I need to load it using Python 3.4. The model includes a simple Lambda layer. The simplified script below reproduces the error:
from keras.models import Model, load_model
from keras.layers import Input, Lambda
import sys
if sys.version_info < (3, 4):
    inp = Input(shape=(28, 28, 1))
    x = Lambda(lambda x: x + 1)(inp)
    model = Model(inp, x)
    model.save('lambdamodel.hdf5')
    print('Done')
else:
    model = load_model('lambdamodel.hdf5')  # Error here.
    model.summary()
The model gets created and saved fine in a Python 2.7 virtualenv:
(py2keras) kzh@otter:tmp$ python --version
Python 2.7.12
(py2keras) kzh@otter:tmp$ pip list
backports.weakref (1.0rc1)
bleach (1.5.0)
funcsigs (1.0.2)
h5py (2.7.0)
html5lib (0.9999999)
Keras (2.0.6)
Markdown (2.6.8)
mock (2.0.0)
numpy (1.13.1)
pbr (3.1.1)
pip (9.0.1)
protobuf (3.3.0)
PyYAML (3.12)
scipy (0.19.1)
setuptools (36.2.3)
six (1.10.0)
tensorflow (1.2.1)
Theano (0.9.0)
Werkzeug (0.12.2)
wheel (0.29.0)
(py2keras) kzh@otter:tmp$ python lambdabug.py
Using TensorFlow backend.
Done
The error comes up when loading in a Python 3 virtualenv:
(py3keras) kzh@otter:tmp$ python --version
Python 3.5.2
(py3keras) kzh@otter:tmp$ pip list
backports.weakref (1.0rc1)
bleach (1.5.0)
h5py (2.7.0)
html5lib (0.9999999)
Keras (2.0.6)
Markdown (2.6.8)
numpy (1.13.1)
pip (9.0.1)
protobuf (3.3.0)
PyYAML (3.12)
scipy (0.19.1)
setuptools (36.2.3)
six (1.10.0)
tensorflow (1.2.1)
Theano (0.9.0)
Werkzeug (0.12.2)
wheel (0.29.0)
(py3keras) kzh@otter:tmp$ python lambdabug.py
Using TensorFlow backend.
Traceback (most recent call last):
  File "lambdabug.py", line 11, in <module>
    model = load_model('lambdamodel.hdf5')
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/models.py", line 233, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/models.py", line 307, in model_from_config
    return layer_module.deserialize(config, custom_objects=custom_objects)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 139, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/engine/topology.py", line 2476, in from_config
    process_layer(layer_data)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/engine/topology.py", line 2462, in process_layer
    custom_objects=custom_objects)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 139, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/layers/core.py", line 697, in from_config
    function = func_load(config['function'], globs=globs)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 200, in func_load
    code = marshal.loads(code.encode('raw_unicode_escape'))
ValueError: bad marshal data (unknown type code)
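The final ValueError is the generic error marshal raises for any byte stream whose leading type code it does not recognize, which is what happens when bytecode marshalled by one interpreter version is read by another. A minimal sketch of that error class, using only the standard library (the payload here is deliberately invalid, not real cross-version bytecode):

```python
import marshal

# Feeding marshal.loads a stream it cannot recognize produces the same
# generic "bad marshal data" ValueError seen in the traceback above.
try:
    marshal.loads(b'\x00definitely-not-bytecode')
except ValueError as e:
    print(e)
```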
I've never used marshal directly myself and don't have time to dig much further into this. In the meantime I'll keep using python2.7 for the code I was planning to move to 3.4. Any tips or fixes are appreciated.
I am also having this problem going the other way. I train on a cluster with Python 3.x and need to load the model using Python 2.7 to convert it with CoreML. Unfortunately, I cannot have both on the same machine due to restrictions on the cluster. My guess is that Python 3 saves with a Unicode encoding that Python 2.7 handles differently.
What is your output of the following on 2.x?
>>> import sys
>>> print sys.maxunicode
@rmsmith88 I get 1114111
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.maxunicode
1114111
@alexklibisz Same problem here. Did you solve it?
No, I haven't figured it out yet.
Same problem here. My model was converted from Darknet format using tools that require Python 3. When loading the model in Python 2, I hit this problem. I'm also facing the general issue of converting and importing models to CoreML.
One (somewhat hacky) fix is the following: if you can recreate the architecture (i.e. you have the original code used to generate it), you can instantiate the model from that code and then use model.load_weights('your_model_file.hdf5') to load in the weights. This works for me but it isn't an option if you don't have the code used to create the original architecture.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
I had the same problem and noticed that coremltools had a couple commits to support Python 3. You can clone the repo, download cmake for mac, build from source, and add the coremltools repo directory to your PYTHONPATH environment variable, then try using coremltools with python 3. It's working so far for me.
For a hacky solution to CoreML not handling the Lambda layer:
https://github.com/allanzelener/YAD2K/issues/80#issuecomment-343685850
I'm hitting the same problem. My model is trained and saved as .h5 with Python 3.4.5, and I want to load it in Python 2.7. But in both Pythons, sys.maxunicode is the same: 1114111.
Any way to fix this?
Does the saved data contain integers in the ranges -9223372036854775808 to -2147483649 or 2147483648 to 9223372036854775807? Can the failure be reproduced on a 32-bit platform?
I have the same problem, but my model is both trained and loaded on Python 3.5.2??
Guys, it was solved for me when I updated both Keras and TensorFlow to the latest versions:
pip install --upgrade tensorflow
pip install --upgrade keras
In my case, it was solved when I matched the Keras versions used to save and load the model.
Thanks @MyadaRoshdi !
Solved when I upgraded Keras from 2.0.2 to 2.1.3.
I got the same problem. I've tried upgrading both Keras and TF but it still doesn't work. I have a self-defined layer in my model; not sure if this is what causes the problem.
Anyone know how to fix it? Massive thanks.
Solved when I upgraded to the latest version of Keras (2.1.4).
I have the latest versions of TensorFlow (1.6.0) and Keras (2.1.5) and am getting the problem... the models were saved on Python 2.7, possibly using older versions of Keras/TF... loading under Python 3.6 with the latest versions gives the error above.
Is marshal the best choice to be using here?
https://docs.python.org/3/library/marshal.html
The marshal module exists mainly to support reading and writing the “pseudo-compiled” code for Python modules of .pyc files. Therefore, the Python maintainers reserve the right to modify the marshal format in backward incompatible ways should the need arise. If you’re serializing and de-serializing Python objects, use the pickle module instead – the performance is comparable, version independence is guaranteed, and pickle supports a substantially wider range of objects than marshal.
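For context, here is a minimal sketch of the marshal round-trip that Keras's func_dump/func_load perform on a Lambda function (standard library only; the lambda is illustrative, matching the one in the repro script). It only works when dumping and loading happen under the same interpreter version, which is exactly the limitation hit in this issue:

```python
import marshal
import types

# Dump: serialize the lambda's compiled bytecode, roughly what func_dump does.
f = lambda x: x + 1
payload = marshal.dumps(f.__code__)

# Load: rebuild a function from the bytecode, roughly what func_load does.
code = marshal.loads(payload)
g = types.FunctionType(code, globals())
assert g(41) == 42  # works, but only under the same Python version
```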
Trained with Python 2.7.4 + TensorFlow 1.2.0 (maybe) + Keras (unknown version). Reloaded with Python 3.6, TF 1.8, Keras 2.1.6. @alexklibisz's answer solved my problem. I first tried to use model.to_json and model_from_json(json_string) to rebuild the model architecture, but that failed. Then I ran the init code again (Model, Dense, ReLU, etc.) followed by model.load_weights, and it works now.
The main problem here is loading models in Python 2 that were built in Python 3. With Keras==2.1.5 and TensorFlow==1.8.0, I can load simple models but still cannot load sophisticated models with merging layers; some error occurs with model_from_json. Any solution to this (other than initializing the model and calling load_weights)?
One (somewhat hacky) fix is the following: if you can recreate the architecture (i.e. you have the original code used to generate it), you can instantiate the model from that code and then use model.load_weights('your_model_file.hdf5') to load in the weights. This works for me but it isn't an option if you don't have the code used to create the original architecture.
This solved my problem! Thanks!
Closing as this is resolved
Just in case anyone else gets here and is as frustrated as I was....
This is not caused by Keras encoding of the model but by the serialization of the custom objects. So the only way of solving this is actually by re-creating the model with the custom objects in python 3.x, loading the weights and saving the model again.
The code will look something like:
# create_model returns the model
m = create_model((512, 512, 3), 2)
m.load_weights('xxx.h5')  # note that weights can be loaded from a full save, not only from a save_weights file
m.save('xxx_3.5.h5')
@anentropic

Is marshal the best choice to be using here?

Absolutely not IMO. Using marshal breaks load_model across even _minor_ versions of Python, e.g. 3.5 <-> 3.6! I am not aware of any benefit to marshal over pickle.
The reason this happens is that your model has custom objects. Keras compiles the objects and stores them in the model. One can argue whether this is actually the right thing to do instead of saving the text, as function compilation time is really negligible compared to model compilation time. It is definitely a pain. Models like SE-ResNeXt or ShuffleNet that have layers not implemented in TF are not very portable. The only way of bypassing this is to include the model architecture code, compile it, and load the file as weights.
The reason this happens is that your model has custom objects.

Yes, that much is clear.

The backwards incompatibility issues stem from the serialization method, namely marshal. Keras uses marshal to dump and load custom objects to/from string. Yet, the documentation for marshal clearly states that:

... the Python maintainers reserve the right to modify the marshal format in backward incompatible ways should the need arise. If you’re serializing and de-serializing Python objects, use the pickle module instead ...

In fact, I believe pickle could serve as a drop-in replacement for marshal as-implemented, because it provides dumps and loads methods with the same signature. If there is a reason why this wasn't done to begin with, I would love to know.
The Python compilation uses marshal coding. Python, not Keras. Why this is done is probably lost in the fogs of time. Keras serializes the compiled Python code in the h5 file.
There is no other "native" way of serializing Python other than this or plain text.
Just to be clear, I think the reference you gave is for data objects; for those, pickling is definitely better. However, for code objects, I don't think there is any better (or worse) way. The only way, IMHO, is actually to serialize the plain text instead of the compiled Python.
Sure, to be clear, I am talking about serializing Python objects, specifically the function argument to the Lambda layer. When Keras saves a Lambda layer in an HDF5 file, it must first serialize this function. Currently Keras does this with the marshal module, as seen here.

Yet, pickle will also serialize arbitrary Python classes and functions perfectly well, and is, in my opinion, much better suited to this task than marshal. From pickle's documentation:

The pickle module implements binary protocols for serializing and de-serializing a Python object structure.
...
Python has a more primitive serialization module called marshal, but in general pickle should always be the preferred way to serialize Python objects.

The benefits in our particular case are clear: pickle maintains backward compatibility, while marshal does not.
OK, I stand corrected. However, please note that even with pickling, the bytecode is not guaranteed to work between versions. It would be much, much better to at least have the option of saving the function in plain text in the model.
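Worth noting in this exchange: pickle does not serialize a function's bytecode at all; it stores a reference (module plus qualified name) and re-imports the function on load. A quick standard-library check (the function name is illustrative):

```python
import pickle

def add_one(x):
    return x + 1

# A module-level function round-trips, because pickle stores only a
# reference to it (its module and qualified name), not its bytecode.
restored = pickle.loads(pickle.dumps(add_one))
assert restored(1) == 2

# A plain lambda has no importable name, so pickling it fails outright.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
assert not lambda_pickled
```

So swapping pickle in for marshal would not, by itself, make arbitrary Lambda functions portable.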
From the Python docs:

The pickle serialization format is guaranteed to be backwards compatible across Python releases provided a compatible pickle protocol is chosen and pickling and unpickling code deals with Python 2 to Python 3 type differences

i.e. you have to be sure the actual Python code you're unpickling is 2/3-compatible, but otherwise pickle works
The problem in this case is not the serialization but the bytecode itself. It is not compatible between versions, even minor ones. The docs also state that the next version might use a completely different internal representation, perhaps to address this problem.

It is my understanding that whenever you serialize a Lambda layer, you are serializing the bytecode (see https://docs.python.org/3/glossary.html#term-bytecode). This is supported by the implementation link quoted above, where Keras is marshal-dumping __code__, which is the bytecode. As such, this will not work between versions of Python, no matter how you serialize it.

So, again, the only solution is to store the text itself.
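A minimal sketch of that plain-text alternative (hypothetical; this is not what Keras actually does): store the function's source string in the model file and exec it at load time. This survives interpreter upgrades as long as the source itself parses on both Pythons:

```python
# Save side: keep the function as source text instead of compiled bytecode.
src = "def add_one(x):\n    return x + 1\n"

# Load side: rebuild the function by executing the stored text. Any Python
# version that can parse the source can reconstruct the function.
namespace = {}
exec(src, namespace)
add_one = namespace['add_one']
assert add_one(1) == 2
```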