The model includes a custom loss and two outputs.
The dump is below:
In [3]: keras.__version__
Out[3]: '2.1.3'
In [5]: tf.__version__
Out[5]: '1.6.0-rc0'
AttributeError Traceback (most recent call last)
~/virtualenvs/tensorflowgpu/lib/python3.6/site-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
51 try:
---> 52 return getattr(obj, method)(*args, **kwds)
53
AttributeError: 'Dataset' object has no attribute 'transpose'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
----> 1 model=load_model("/home/julian/kaggle/sbowl2018/saved_models/model.h5")
~/virtualenvs/tensorflowgpu/lib/python3.6/site-packages/keras/models.py in load_model(filepath, custom_objects, compile)
244
245 # set weights
--> 246 topology.load_weights_from_hdf5_group(f['model_weights'], model.layers)
247
248 # Early return if compilation is not required.
~/virtualenvs/tensorflowgpu/lib/python3.6/site-packages/keras/engine/topology.py in load_weights_from_hdf5_group(f, layers)
3151 weight_values,
3152 original_keras_version,
-> 3153 original_backend)
3154 if len(weight_values) != len(symbolic_weights):
3155 raise ValueError('Layer #' + str(k) +
~/virtualenvs/tensorflowgpu/lib/python3.6/site-packages/keras/engine/topology.py in preprocess_weights_for_loading(layer, weights, original_keras_version, original_backend)
3048 weights[1] = conv_utils.convert_kernel(weights[1])
3049 if K.int_shape(layer.weights[0]) != weights[0].shape:
-> 3050 weights[0] = np.transpose(weights[0], (3, 2, 0, 1))
3051 if layer.__class__.__name__ == 'ConvLSTM2D':
3052 weights[1] = np.transpose(weights[1], (3, 2, 0, 1))
~/virtualenvs/tensorflowgpu/lib/python3.6/site-packages/numpy/core/fromnumeric.py in transpose(a, axes)
573
574 """
--> 575 return _wrapfunc(a, 'transpose', axes)
576
577
~/virtualenvs/tensorflowgpu/lib/python3.6/site-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
60 # a downstream library like 'pandas'.
61 except (AttributeError, TypeError):
---> 62 return _wrapit(obj, method, *args, **kwds)
63
64
~/virtualenvs/tensorflowgpu/lib/python3.6/site-packages/numpy/core/fromnumeric.py in _wrapit(obj, method, *args, **kwds)
40 except AttributeError:
41 wrap = None
---> 42 result = getattr(asarray(obj), method)(*args, **kwds)
43 if wrap:
44 if not isinstance(result, mu.ndarray):
ValueError: axes don't match array
+1
Hi,
We could work on this better if you could provide a standalone reproducible example.
Some advice:
Thanks!
Dref360
Hi,
this has been overtaken by events / I used a workaround some time ago. I no longer have the code for these tests, so I regret I can't pitch in.
JC
I got a similar problem when loading a model trained on 2 GPUs. However, loading a model trained on 1 GPU works fine.
"ValueError: axes don't match array"
I get the same problem.
If I train a model on 2 GPUs, using multi_gpu_model and a ModelCheckpoint callback to save the best model, calling load_model on the checkpointed model gives this error.
I want to train the model on multiple GPUs but run it again on a single GPU; this seems to fail.
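For reference, a minimal sketch of the pattern being described (an illustrative script, not the original poster's code; it assumes Keras 2.x with two visible GPUs):

import numpy as np
from keras.models import Sequential, load_model
from keras.layers import Conv2D, Flatten, Dense
from keras.utils import multi_gpu_model
from keras.callbacks import ModelCheckpoint

# Single-GPU ("template") model.
base = Sequential([
    Conv2D(8, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    Flatten(),
    Dense(1),
])

# Multi-GPU wrapper used for training.
parallel = multi_gpu_model(base, gpus=2)
parallel.compile(optimizer='adam', loss='mse')

x = np.random.rand(64, 32, 32, 3)
y = np.random.rand(64, 1)

# ModelCheckpoint saves `parallel` (the wrapper), which is what later breaks load_model.
checkpoint = ModelCheckpoint('checkpoint.h5', save_best_only=True)
parallel.fit(x, y, epochs=2, validation_split=0.25, callbacks=[checkpoint])

model = load_model('checkpoint.h5')  # reportedly raises ValueError: axes don't match array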
I don't think this is the best solution, but I've found a workaround for the above issue.
Define the base model and the multi-GPU-wrapped model under different names:
from keras.applications.resnet50 import ResNet50
from keras.utils import multi_gpu_model

base_model = ResNet50(weights=None, include_top=True, input_shape=(224, 224, 3), pooling='max')
mgpu_model = multi_gpu_model(base_model)
mgpu_model.compile(optimizer='rmsprop', loss='mse')
mgpu_model.fit(...)  # train as usual
When saving, save 'base_model' and later load that model. You will have to compile it after loading if it was not compiled before saving.
This works because the base model and the model wrapped in multi_gpu_model share the same parameters.
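For example, a short sketch of that save/load flow (the custom-object name here is a placeholder for whatever custom loss or metric the model uses):

from keras.models import load_model

# Save the single-GPU base model after training through the multi-GPU wrapper.
base_model.save('base_model.h5')

# Later, e.g. on a single-GPU machine:
restored = load_model('base_model.h5', custom_objects={'my_custom_loss': my_custom_loss})
# Recompile if the base model was never compiled before saving.
restored.compile(optimizer='rmsprop', loss='mse')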
In case you are using callbacks to save checkpoints, extend the __init__ method of callbacks.ModelCheckpoint with two extra arguments: the base model to be saved and the path where you want to save it. Then, wherever the callback saves the model being trained, save base_model instead. Somewhat like this:
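A minimal sketch of that idea, written here as a small Callback subclass rather than a patched ModelCheckpoint (base_model and the save path are illustrative names):

from keras.callbacks import Callback

class BaseModelCheckpoint(Callback):
    """Saves the single-GPU base model at the end of every epoch."""
    def __init__(self, base_model, save_path):
        super(BaseModelCheckpoint, self).__init__()
        self.base_model = base_model
        self.save_path = save_path

    def on_epoch_end(self, epoch, logs=None):
        # The multi-GPU wrapper and the base model share weights,
        # so saving the base model captures the trained parameters.
        self.base_model.save(self.save_path)

# usage:
# mgpu_model.fit(x, y, callbacks=[BaseModelCheckpoint(base_model, 'base_model.h5')])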
model = load_model(model_to_use, custom_objects={'mean_iou': mean_iou, 'iou_bce_loss': iou_bce_loss})
produces the "axes don't match array" error. The model file from the checkpoint gives the same result as a model saved after completion of model.fit_generator. Two GPUs were used for training; the model ran for three days.
ValueError Traceback (most recent call last)
1 #model = load_model(model_to_use, custom_objects={'mean_iou': mean_iou,'iou_bce_loss':iou_bce_loss})
----> 2 model = load_model(model_to_use)
~\Anaconda3\lib\site-packages\keras\engine\saving.py in load_model(filepath, custom_objects, compile)
417 f = h5dict(filepath, 'r')
418 try:
--> 419 model = _deserialize_model(f, custom_objects, compile)
420 finally:
421 if opened_new_file:
~\Anaconda3\lib\site-packages\keras\engine\saving.py in _deserialize_model(f, custom_objects, compile)
272 original_keras_version,
273 original_backend,
--> 274 reshape=False)
275 if len(weight_values) != len(symbolic_weights):
276 raise ValueError('Layer #' + str(k) +
~\Anaconda3\lib\site-packages\keras\engine\saving.py in preprocess_weights_for_loading(layer, weights, original_keras_version, original_backend, reshape)
680 weights = convert_nested_time_distributed(weights)
681 elif layer.__class__.__name__ in ['Model', 'Sequential']:
--> 682 weights = convert_nested_model(weights)
683
684 if original_keras_version == '1':
~\Anaconda3\lib\site-packages\keras\engine\saving.py in convert_nested_model(weights)
668 weights=weights[:num_weights],
669 original_keras_version=original_keras_version,
--> 670 original_backend=original_backend))
671 weights = weights[num_weights:]
672 return new_weights
~\Anaconda3\lib\site-packages\keras\engine\saving.py in preprocess_weights_for_loading(layer, weights, original_keras_version, original_backend, reshape)
680 weights = convert_nested_time_distributed(weights)
681 elif layer.__class__.__name__ in ['Model', 'Sequential']:
--> 682 weights = convert_nested_model(weights)
683
684 if original_keras_version == '1':
~\Anaconda3\lib\site-packages\keras\engine\saving.py in convert_nested_model(weights)
656 weights=weights[:num_weights],
657 original_keras_version=original_keras_version,
--> 658 original_backend=original_backend))
659 weights = weights[num_weights:]
660
~\Anaconda3\lib\site-packages\keras\engine\saving.py in preprocess_weights_for_loading(layer, weights, original_keras_version, original_backend, reshape)
799 weights[0] = np.reshape(weights[0], layer_weights_shape)
800 elif layer_weights_shape != weights[0].shape:
--> 801 weights[0] = np.transpose(weights[0], (3, 2, 0, 1))
802 if layer.__class__.__name__ == 'ConvLSTM2D':
803 weights[1] = np.transpose(weights[1], (3, 2, 0, 1))
~\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py in transpose(a, axes)
596
597 """
--> 598 return _wrapfunc(a, 'transpose', axes)
599
600
~\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
49 def _wrapfunc(obj, method, *args, **kwds):
50 try:
---> 51 return getattr(obj, method)(*args, **kwds)
52
53 # An AttributeError occurs if the object does not have
This seems to be an issue that you're letting lie and fester, so I'm not going to spend more time documenting it, as I don't really have the skill set to provide everything you want. I will try the workaround.
For the solution of this problem, I got a suggestion saying that instead of saving and loading the multi_gpu_model, I should do:
parallel_model = multi_gpu_model(model, gpus=G)
model.save(...)
Instead of
parallel_model = multi_gpu_model(model, gpus=G)
parallel_model.save(...)
Please pay attention to the slight difference. I got this suggestion; however, I didn't try it, as I didn't have time and gave up on using multiple GPUs to train the model. I hope it helps with this problem.
Thanks, jimmy shen.
I have had great success over the last 6 weeks using two GPUs to train a bunch of different models on two different PCs.
Today's failure is unusual, but I did do something different: I trained the model, saved it and its weights, then loaded it and ran it some more with different parameters, all within the same script. I don't think I have done this before. To add mystery to the issue, I had to do a full clean install of Windows, Anaconda, etc., so the versions of the various packages that worked for the last 6 weeks are not the same versions as this morning's failure. Later I will try to load the model on my 2nd PC, which still has all the versions from 5-6 weeks ago rather than the latest (and often not the greatest) versions.
I am pretty new to Python, 8 weeks and counting, and I am finding that keeping Python running is a harder task than learning the code. It seems to be a very unstable process.
I will try the suggestion later today.
Thanks from another Jimmy
I just had this problem and just wanted to let any future readers know that @liketheflower and @PCJimmmy have it right. Save the model, not the multi_gpu_model.
Closing as this is resolved
But what if I want to save the multi_gpu_model during training, and not necessarily at its last epoch?
Does anyone know a workaround to load the problematic multi-GPU weights? I now know the right way to save the weights, but I don't have time to retrain the model. Is it still possible to load a (problematic) weight file?
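One approach some users have reported for weight files saved from the multi-GPU wrapper (untested here; create_model, G, and the file names are placeholders, and it assumes you can rebuild the exact same architecture and wrap it with the same number of GPUs used in training):

from keras.utils import multi_gpu_model

base = create_model()                      # rebuild the same architecture
parallel = multi_gpu_model(base, gpus=G)   # same GPU count as during training
parallel.load_weights('problematic_checkpoint.h5')

# The wrapper shares its variables with `base`, so the trained weights
# can now be re-saved in a form that loads on a single GPU.
base.save('single_gpu_model.h5')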
For those who want to save the multi_gpu model during training, you can change the model inside the callback (inspired by @RaiAbhishek):
from keras.callbacks import ModelCheckpoint
from keras.utils import multi_gpu_model

single_model = create_model()
parallel_model = multi_gpu_model(single_model, gpus=n)
parallel_model.compile(...)

class MyModelCheckPoint(ModelCheckpoint):
    def __init__(self, singlemodel, *args, **kwargs):
        self.singlemodel = singlemodel
        super(MyModelCheckPoint, self).__init__(*args, **kwargs)

    def on_epoch_end(self, epoch, logs=None):
        # Point the callback at the single-GPU model so that is what gets saved.
        self.model = self.singlemodel
        super(MyModelCheckPoint, self).on_epoch_end(epoch, logs)

checkpoint = MyModelCheckPoint(single_model, filepath, save_best_only=True, ...)
parallel_model.fit(..., callbacks=[checkpoint])
The same approach works for other types of callbacks.
For the solution of this problem, I got a suggestion saying that instead of saving and loading the multi_gpu_model, I should:
- define a model
- define and compile a multi_gpu_model
- train using the multi_gpu_model
- when saving, still save the model instead of the multi_gpu_model
- load the model
So the trick should be:
parallel_model = multi_gpu_model(model, gpus=G)
model.save(...)
Instead of
parallel_model = multi_gpu_model(model, gpus=G)
parallel_model.save(...)
Please pay attention to the slight difference. I got this suggestion; however, I didn't try it, as I didn't have time and gave up on using multiple GPUs to train the model. I hope it helps with this problem.
I don't know; I am having this issue even though I proceeded as you indicate, using model.save instead of the multi-GPU model's save.
I just had this problem and just wanted to let any future readers know that @liketheflower and @PCJimmmy have it right. Save the model, not the multi_gpu_model.
But is this really saving the trained weights? I made a little sample at https://github.com/LeninGF/Mnist_ConvNet_MultiGPU, and it turned out that saving with model.save returned a model that I was able to load, but it had no trained weights.
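Whether the weights are actually shared can be checked directly; a small sketch, assuming base_model, parallel_model, and some training data x_train / y_train are already defined:

import numpy as np

# Snapshot the base model's weights before training through the wrapper.
w_before = [w.copy() for w in base_model.get_weights()]

parallel_model.fit(x_train, y_train, epochs=1, batch_size=64)

# If the variables are shared, training the wrapper must have changed these arrays.
w_after = base_model.get_weights()
updated = any(not np.array_equal(a, b) for a, b in zip(w_before, w_after))
print('base_model weights updated by multi-GPU training:', updated)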
def fix_this_stupid_bug():
    # get model
    model = create_model()
    # freeze layers that were frozen during training
    for layer in model.layers:
        if should_freeze(layer):
            layer.trainable = False
    # load weights
    model.load_weights(weights_path)
    # unfreeze model
    for layer in model.layers:
        layer.trainable = True
    # save unfrozen weights
    model.save_weights(out_path)