Hi guys,
I am trying to load the saved weights of several layers of a Sequential model (in HDF5 format) to initialize the weights of several layers in a Graph model. It is easy to do this if both the source and target models are Sequential:
import h5py

f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
However, if the model is defined as a graph model, I have no idea how to do this, as the model does not have the attribute model.layers. Can someone share some hints on this issue? Thanks.
Having the same problem. I hope someone replies here.
You could use:
model.nodes
Note that you will have to be careful about the order and names of the layers.
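For example, a rough sketch of setting weights through that dict (graph_model and the node name 'conv1_1' are just placeholders for however you defined your Graph):
graph_model.nodes['conv1_1'].set_weights(loaded_weights)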
Another option is to define a Sequential model and add it to a Graph model if you want to add more layers or modify it at some point.
partially_loaded_model = Sequential()
# part where you define the structure of the partially loaded model and load your weights
...
# your new model
new_model = Graph()
new_model.add_input(...)
# add your partially loaded model as a node (add_node also takes name= and input= arguments)
new_model.add_node(partially_loaded_model)
# add more nodes
new_model.add_node(another_layer)
...
I am getting an error like this:
KeyError: "Unable to open object (Object 'graph' doesn't exist)"
Hi. I am just getting the same error.
I think I didn't explain it carefully enough. Here is what you could do, using the VGG16 example from the Keras blog:
from keras.models import Sequential, Graph
from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.optimizers import SGD
import keras.backend as K
img_width, img_height = 128, 128
# build the VGG16 network with our input_img as input
first_layer = ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height))
model = Sequential()
model.add(first_layer)
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
# get the symbolic outputs of each "key" layer (we gave them unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])
# load the weights
import h5py
weights_path = 'vgg16_weights.h5'
f = h5py.File(weights_path)
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
print('Model loaded.')
# Here is what you want:
graph_m = Graph()
graph_m.add_input('my_inp', input_shape=(3, img_width, img_height))
graph_m.add_node(model, name='your_model', input='my_inp')
graph_m.add_node(Flatten(), name='Flatten', input='your_model')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense1', input='Flatten')
graph_m.add_node(Dropout(0.5), name='Dropout1', input='Dense1')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense2', input='Dropout1')
graph_m.add_node(Dropout(0.5), name='Dropout2', input='Dense2')
graph_m.add_node(Dense(1000, activation='softmax'), name='Final', input='Dropout2')
graph_m.add_output(name='out1', input='Final')
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
graph_m.compile(optimizer=sgd, loss={'out1': 'categorical_crossentropy'})
So basically here you load only the weights for the feature extraction part, add the nonlinear structure of the classifier, and fine-tune the model. You could also fix the weights of your feature extraction layers using the trainable attribute of the layers.
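As a rough usage sketch (X_train and y_train are placeholder arrays you would supply), fitting the compiled Graph takes a dict keyed by the input and output names:
graph_m.fit({'my_inp': X_train, 'out1': y_train}, batch_size=32, nb_epoch=10)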
@tboquet Thanks for your reply. Your code is clear and easy to understand. But I do not quite understand the last sentence, "You could also fix the weights of your feature extraction layers using the trainable attribute of the layers". When we do backprop using graph_m.fit after graph_m.compile, will the weights in the layers of the VGG part (which is a node now) be updated in the new network?
How do we use the trainable attribute of the layers? Would you please share some code for both cases (i.e., fixing the weights of the VGG part or updating them when training the new network)? Thanks.
From the docs, you just have to add trainable=False to freeze a layer during training.
Example, frozen:
...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=False))
...
Example, trainable:
...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=True))
...
trainable is True by default, so training happens normally if you don't know about the feature...
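If you prefer to freeze everything you loaded at once instead of passing trainable=False to each layer, here is a minimal sketch (assuming model is the Sequential VGG part from above and that your Keras version lets you set the attribute after construction):
for layer in model.layers:
    layer.trainable = False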
Thanks for the code @tboquet .
However, when I try to predict the class of an image using the code
import cv2
import numpy as np

im = cv2.resize(cv2.imread('cat.jpg'), (img_width, img_height)).astype(np.float32)
im[:, :, 0] -= 103.939
im[:, :, 1] -= 116.779
im[:, :, 2] -= 123.68
im = im.transpose((2, 0, 1))
im = np.expand_dims(im, axis=0)
out = graph_m.predict(im)
print(np.argmax(out))
The error I get is shown below:
Traceback (most recent call last):
File "main.py", line 94, in <module>
out = graph_m.predict(im)
File "/home/darkfantasy/anaconda2/lib/python2.7/site-packages/keras/models.py", line 1249, in predict
ins = [data[name] for name in self.input_order]
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
You should take a look at the Graph model API documentation. You now need to provide a dict of inputs:
out = graph_m.predict({'my_inp': im})
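And as a rough sketch of reading the result (predict on a Graph returns a dict keyed by output name, 'out1' in the example above):
out = graph_m.predict({'my_inp': im})
print(np.argmax(out['out1']))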
I tried this method.
The class I was getting from the original Sequential model was 281.
Now, using this Graph model, the class comes out to be 0, which is not correct.
What could be the possible reason?
Totally normal: the weights of the last layers are not trained, they still have the values from the default initialization method. What are you trying to achieve? Please post a detailed description of your goal.
From the post where I define the model:
So basically here you load only the weights for the feature extraction part, add the nonlinear structure of the classifier, and fine-tune the model.
So you need a dataset (ImageNet or another one) to train the new part of the model. It was just an example of how you could define a Graph structure having saved weights from a Sequential model.
I want to find the class of an image using the pretrained weights of VGG. But, due to the random initialization of weights, the output is wrong.
Can you tell me how to load the weights of the last fully connected layers of the Graph model?
Why don't you use the same Sequential structure?
Because I have to add extra LSTM layers that will require inputs from more than one layer, I have to use the Graph model.
OK, so just define the full Sequential model, use the load_weights method of the Sequential model, and once the weights are loaded, add this Sequential model to your Graph model using add_node. It should be easy to infer based on the previous discussion; a rough sketch follows.
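Something like this, assuming the saved weights file matches the full architecture (the node name 'vgg_full' is just a placeholder):
full_model = Sequential()
# ... define the full VGG16 architecture here, including the Dense classifier layers ...
full_model.load_weights('vgg16_weights.h5')  # loads weights for every layer by position
graph_m = Graph()
graph_m.add_input('my_inp', input_shape=(3, img_width, img_height))
graph_m.add_node(full_model, name='vgg_full', input='my_inp')
# ... then add the extra LSTM / multi-input nodes on top ...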
thanks.
Can I use set_weights the same way on the Graph model as well?
I'm not sure the Graph model will help you; consider posting this to the Google group with an exhaustive description of what you are trying to achieve: a description of the model, of the inputs and outputs, and a link to a related paper if there is one. Issues should be related to bugs, new feature requests, etc.; the Google group is the place to post these kinds of questions.
The set_weights method is available for every layer with trainable parameters.
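For example, a minimal sketch of copying weights between any two layers whose shapes match (the layer variables are placeholders):
source_weights = source_layer.get_weights()  # list of numpy arrays
target_layer.set_weights(source_weights)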
Thanks for your time and effort.
(y)
After removing the two lines which were not there in the original network,
graph_m.add_input('my_inp', input_shape=(3, img_width, img_height)) and
graph_m.add_output(name='out1', input='Final'), I get the following error:
AttributeError: 'float' object has no attribute 'type'
I suggest you read the documentation carefully and spend time on the examples so that you can see what the purpose of these lines is. You could take a look at this example.
A follow-up question: what if the model is not VGG but a non-Sequential model, say AlexNet, where we define model = Model(input=inputs, output=outputs) and the first layer of the model is Input((3, 227, 227))? If I use the graph
graph_m = Graph()
graph_m.add_input(name='main_input', input_shape=(3, 227, 227))
graph_m.add_node(model, name='alex', input='main_input')
graph_m.add_node(Dense(10, activation='softmax'), name='Final', input='alex')
graph_m.add_output(name='out1', input='Final')
I got the following error, which indicates the model and the graph_input is not connected.
File "/users/zaikun/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 1734, in init
str(layers_with_complete_input))
Exception: Graph disconnected: cannot obtain value for tensor main_input at layer "alex". The following previous layers were accessed without issue: []
Any idea how to fix this?
@ouceduxzk @tboquet I'm having the same problem (about Graph being disconnected), did you manage to solve it somehow?
For anyone who is getting an error with @tboquet's code (newer Keras HDF5 weight files use a different layout):
import h5py

f = h5py.File(weights_path4)
f = f['model_weights']
layer_names = [n.decode('utf8') for n in f.attrs['layer_names']]
g = f[layer_names[0]]
weight_names = [n.decode('utf8') for n in g.attrs['weight_names']]
weight_values = [g[weight_name] for weight_name in weight_names]
This gives you the weight_values for layer_names[0] (the first layer here), and you can set them on the corresponding layer with model.layers[k].set_weights(weight_values).
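A hedged extension of that snippet, looping over every saved layer (this assumes the saved layer order matches model.layers and that the attribute names are as above):
for k, layer_name in enumerate(layer_names):
    g = f[layer_name]
    weight_names = [n.decode('utf8') for n in g.attrs['weight_names']]
    weight_values = [g[weight_name][()] for weight_name in weight_names]
    model.layers[k].set_weights(weight_values)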
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.