Keras: Can't load_model with error “Optimizer weight shape (256, 32) not compatible with provided weight shape (4,)”

Created on 13 Oct 2016  ·  28 Comments  ·  Source: keras-team/keras

I have three trained model files:

  1. left_branch.h5
  2. right_branch.h5
  3. concat.h5

The model concat.h5 was fine-tuned by concatenating the two pre-trained models (left_branch.h5, right_branch.h5) as the initial model.
While the left_branch.h5 and right_branch.h5 model files can be loaded with keras.models.load_model(),
when I load the trained concat.h5 model file, I get the error below.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 167, in load_model
    model.optimizer.set_weights(optimizer_weight_values)
  File "/usr/local/lib/python2.7/dist-packages/keras/optimizers.py", line 97, in set_weights
    'provided weight shape ' + str(w.shape))
Exception: Optimizer weight shape (256, 32) not compatible with provided weight shape (4,)


All 28 comments

A similar problem is mentioned in issue #3964. I tried hadikazemi's solution, and it worked for me.

@xisnu Thanks for your advice! I trained many models and saved them into many files, so it is laborious to delete the optimizer_weights manually as advised in issue #3964.
Finally, I solved it by using load_weights, as follows:

# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model into h5 file")

# later...

# load json and create model
from keras.models import model_from_json

with open('model.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("model.h5")
print("Loaded model from disk")

# evaluate loaded model on test data
loaded_model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
score = loaded_model.evaluate(X, Y, verbose=0)

In my case, everything was fine on one machine, while on another I got the "Optimizer weight shape ... not compatible with provided weight shape ..." exception.

The problem turned out to be that Keras had been installed using different methods on the two machines.

Although both the machine that built the model and the machine that loaded it had the same version, 1.1.0, the files in /usr/local/lib/python2.7/dist-packages/keras/ were different (in particular, /usr/local/lib/python2.7/dist-packages/keras/models.py).

Uninstalling and reinstalling Keras on the problematic machine solved the error, since the files in /usr/local/lib/python2.7/dist-packages/keras/ were then identical on both machines.

I can confirm this issue when loading models stored after training. The issue is with the weight shapes; removing the optimizer weights from the model file helps, and the model then loads successfully. I updated Keras to the recent 1.1.0 and use Python 2.7.12 from Anaconda 4.0.0, but it does not help. I also found that if you load the model twice in Spyder, the second load is fine. It would be great to see this issue fixed. Thanks.

@fsonntag
I think the problem comes from https://github.com/fchollet/keras/commit/028aae19bf5ae6efe0b32d25d1c700224eebfcf9

If you do not use weights.sort(key=lambda x: x.name),
the weight order will be different from when the model was saved.
weights.sort(key=lambda x: x.auto_name) should not be used.

All the weights should have proper names and their order should be stable.

Sorry, I don't really get your point. So you're saying name should always be set in order to solve this issue? Since Theano doesn't set name, I introduced sorting by auto_name. Alternatively, one can copy the auto_name attribute to name.

Since Theano doesn't set name

@fsonntag
Is that correct?
I think Theano has both name and auto_name.
I believe you do not need to use auto_name.

https://github.com/fchollet/keras/blob/master/keras/backend/theano_backend.py#L67
https://github.com/Theano/Theano/blob/master/theano/gof/graph.py#L387

Yes, Theano variables have a name property, but it is set to None when trying to order the weights. That was part of the fix in 028aae1. As name was None under Theano, the sorting didn't work. Under Python 2 no error was raised; the key was just ignored and the list not sorted. Python 3 raised an error, which is how I discovered this issue.

Yes, Theano variables have a name property, but it is set to None when trying to order the weights.

Do you mean creating a variable without a name?
I think all the variables created in keras.layers classes are assigned names.

Maybe you should change it to weights.sort(key=lambda x: x.name if x.name else x.auto_name).
I checked that it solves the load_model() problem.
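
For illustration, a minimal self-contained sketch of that fallback sort (the SimpleVar class and sample names are hypothetical stand-ins for the backend variables, which carry a user-assigned name that may be None plus a generated auto_name):

class SimpleVar(object):
    """Hypothetical stand-in for a Theano/TensorFlow backend variable."""
    def __init__(self, name, auto_name):
        self.name = name            # user-assigned; may be None under Theano
        self.auto_name = auto_name  # always generated by the backend

weights = [SimpleVar(None, 'auto_2'), SimpleVar('dense_1_W', 'auto_1')]

# Fall back to auto_name only when name is missing, so explicitly named
# weights keep a stable order across save and load.
weights.sort(key=lambda x: x.name if x.name else x.auto_name)
print([w.name or w.auto_name for w in weights])  # ['auto_2', 'dense_1_W']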

I still face the same problem when I call load_model().
I've upgraded to the latest version of Keras.
Is this issue solved?

You have to save with the latest git version and call load_model() with the latest git version.
Did you do that?
It is impossible to load a model saved with Keras 1.1.0 or 1.1.1.
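
If you are unsure which version each machine is running, a quick sanity check:

import keras
print(keras.__version__)  # must be the same on the saving and the loading machine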

I fixed it by cloning the repo and reinstalling Keras
(I used pip to uninstall and install again, but it didn't work until I installed Keras from the cloned repo).
Thanks!

My model has already finished training, so I'm not going to reinstall.
Instead, I modified /usr/local/lib/python2.7/site-packages/keras/optimizers.py, adding the following around line 92 so that saved optimizer weights are matched to parameters by shape instead of by position:

amap = {}
for w in weights:
    amap[str(w.shape)] = w
for pv, p, w in zip(param_values, params, weights):
    w = amap[str(pv.shape)]
    if pv.shape != w.shape:
        print(p, w)
        raise Exception('Optimizer weight shape ' +
                        str(pv.shape) +
                        ' not compatible with '
                        'provided weight shape ' + str(w.shape))
    weight_value_tuples.append((p, w))

One pitfall: this only works when the values in the "Param #" column printed by model.summary() are all different, because the lookup keys on shape alone.

I am using Keras 1.1.1 and am having the same problem. Deleting the optimizer weights as a workaround works for me.

Just in case someone needs to do the same, here's the code:

import h5py
f = h5py.File('model_file.h5', 'r+')
del f['optimizer_weights']
f.close()

I can confirm the same issue. Deleting the optimizer_weights as suggested by @dteoh works if your model is already trained.

However, I'm working on a project where I train for N epochs, stop training, adjust the learning rate, then re-load the model and continue training.

In this case, deleting my optimizer_weights would lead to the error:

AttributeError: 'Model' object has no attribute 'optimizer'

This error makes sense, since the optimizer state was deleted from the file. But as it currently stands, I can't figure out how to:

  1. Train a (non-Sequential) model.
  2. Serialize it.
  3. Load it from disk.
  4. Adjust learning rate.
  5. Continue training.

without the error others in this thread have mentioned.

EDIT: Spun up a new virtual environment with Keras > 1.2 installed and was able to resume training.
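
For anyone stuck on this workflow, a minimal sketch of steps 3-5 on a version where load_model works (the file name and learning-rate value here are illustrative):

from keras import backend as K
from keras.models import load_model

# Load the trained model from disk (file name is illustrative).
model = load_model('checkpoint.h5')

# Adjust the learning rate in place; the rest of the optimizer state
# (e.g. momentum accumulators) is preserved.
K.set_value(model.optimizer.lr, 1e-4)

# Continue training from where the previous run stopped, e.g.:
# model.fit(X_train, Y_train, nb_epoch=10)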

Another solution is to name each layer in the model.
This helps if you're using checkpointing or something similar, where you cannot save the weights separately.
For example: model.add(LSTM(10, name='lstm1'))
This worked for me with Keras 1.1.1 and Theano 0.8.2.

@WarpTime Have you tested it for a model with CTC loss? Because that has been a problem for me.

@xisnu No. I have only tested it on crossentropy.

Same problem here using SGD and categorical_crossentropy.

Something like this....

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

timesteps, data_dim, nb_classes = 10, 32, 5  # example dimensions

model = Sequential()
model.add(LSTM(100, return_sequences=True, input_shape=(timesteps, data_dim), name='input'))
model.add(Dropout(0.5, name='dr1'))
model.add(LSTM(100, name='ls1'))
model.add(Dropout(0.5, name='dr2'))
model.add(Dense(100, name='d1'))
model.add(Dense(nb_classes, activation='softmax', name='output'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

After updating from Keras 1.X to Keras 2.0.1, I have the same problem:

  File "./ensemble.py", line 56, in <module>
    models = [load_model(model_path) for model_path in model_names]
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 272, in load_model
    model.optimizer.set_weights(optimizer_weight_values)
  File "/usr/local/lib/python2.7/dist-packages/keras/optimizers.py", line 79, in set_weights
    'provided weight shape ' + str(w.shape))
ValueError: Optimizer weight shape (32,) not compatible with provided weight shape (3, 3, 3, 32)

Deleting the optimizer_weights group (that is the right HDF5 term: a group, which contains the individual weight datasets) from the HDF5 file with hdf5view fixed it, as @dteoh suggested.

I updated all my models with

#!/usr/bin/env python

"""Make keras 1.x models usable in keras 2.x."""

import glob
import h5py

model_files = sorted(glob.glob('*.h5'))
for model_file in model_files:
    print("Update '{}'".format(model_file))
    with h5py.File(model_file, 'a') as f:
        if 'optimizer_weights' in f.keys():
            del f['optimizer_weights']

Now it is working again.

Also: Is it possible to save the model without optimizer parameters?
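
For what it's worth, newer Keras releases added an include_optimizer flag to model.save() for exactly this; a sketch, assuming a version that supports it:

# Assumes a Keras version whose save() accepts include_optimizer
# (not available in the 1.x / early 2.0.x versions discussed above).
model.save('model_without_optimizer.h5', include_optimizer=False)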

The following code really worked! Thank you!

    with h5py.File(model_file, 'a') as f:
        if 'optimizer_weights' in f.keys():
            del f['optimizer_weights']

I had the same problem, and what seemed to work for me was to set the compile flag to False in the load_model function. Afterwards, one can compile the model with the previously used optimizer.

model = load_model('my_model.hdf5', compile=False)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In case you are having issues deleting the keys from the model file, check this out:

https://stackoverflow.com/questions/44236522/keyerror-couldnt-delete-link-cant-delete-self

It seems that the problem of loading optimizer weights is patched in Keras 2.0.8. The patch does not address the reason why the weights have different sizes, but at least you no longer get an exception.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@oarriaga It works for me, thank you!

Closing as this is resolved

