Keras: Extracting a shared network from a triplet siamese model

Created on 10 Dec 2017 · 13 Comments · Source: keras-team/keras

I have a triplet architecture model with one shared CNN which takes three images as input (anchor, positive, negative), predicts feature vectors for each of them and distances between anchor-positive and anchor-negative are computed (distances are input to the loss function):

Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_anchor (InputLayer)       (None, 224, 224, 3)  0
__________________________________________________________________________________________________
input_pos (InputLayer)          (None, 224, 224, 3)  0
__________________________________________________________________________________________________
input_neg (InputLayer)          (None, 224, 224, 3)  0
__________________________________________________________________________________________________
resnet_model (Model)            (None, 512)          24899456    input_anchor[0][0]
                                                                 input_pos[0][0]
                                                                 input_neg[0][0]
__________________________________________________________________________________________________
pos_dist (Lambda)               (None, 1)            0           resnet_model[1][0]
                                                                 resnet_model[2][0]
__________________________________________________________________________________________________
neg_dist (Lambda)               (None, 1)            0           resnet_model[1][0]
                                                                 resnet_model[3][0]
__________________________________________________________________________________________________
stacked_dists (Lambda)          (None, 2, 1)         0           pos_dist[0][0]
                                                                 neg_dist[0][0]
==================================================================================================

I trained this architecture and saved it to a file. Now I want to load the submodel (resnet_model) and predict feature vectors of single images (one image at a time instead of three, for image retrieval). How can I do that?

from keras.models import model_from_json

# Rebuild the architecture from JSON, then load the trained weights into it.
with open('model.json', 'r') as json_file:
    loaded_model = model_from_json(json_file.read())

loaded_model.load_weights("model.h5")

This is my loaded model. Now how can I get the resnet_model (Model) part out of my architecture, so that it has a single input and a single output? I'm not sure if it is even possible in Keras, which is why I'm asking. Regular methods such as get_layer don't work here, because this part of the original model takes multiple inputs and has multiple outputs. But in theory it should be possible, as this is the main idea of siamese networks: train on pairs/triplets of images, then use the network on a single image.

This is how the whole architecture was created:

from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model

# resnet_model (the shared CNN) and euclidean_distance are defined elsewhere.
input_shape = (224, 224, 3)
input_anchor = Input(shape=input_shape, name='input_anchor')
input_positive = Input(shape=input_shape, name='input_pos')
input_negative = Input(shape=input_shape, name='input_neg')

# The same resnet_model instance is applied to all three inputs, so its weights are shared.
net_anchor = resnet_model(input_anchor)
net_positive = resnet_model(input_positive)
net_negative = resnet_model(input_negative)

positive_dist = Lambda(euclidean_distance, name='pos_dist')([net_anchor, net_positive])
negative_dist = Lambda(euclidean_distance, name='neg_dist')([net_anchor, net_negative])

stacked_dists = Lambda(
    lambda vects: K.stack(vects, axis=1),
    name='stacked_dists'
)([positive_dist, negative_dist])

model = Model([input_anchor, input_positive, input_negative], stacked_dists, name='triple_siamese')

All 13 comments

I am also very interested in this. I've tried returning two models in a def_model function, but it seems that training the siamese network doesn't train the subnetwork, even though they share many layers.

@marcusklaas I did it this way: Just create the model as in the code above, load the weights to the model and then use the resnet_model submodel for predicting feature vectors for individual images:

.
.
.
model = Model([input_anchor, input_positive, input_negative], stacked_dists, name='triple_siamese')

model.load_weights('./model_weights.h5')

# use resnet_model for predictions here

@tomassykora, can you elaborate on how you performed '# use resnet_model for predictions here'? Thanks in advance.

@chandansheth I just meant resnet_model.predict(data).
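Why this works: the triplet model calls the same resnet_model object three times, so any weights loaded into the triplet model are the weights of that object. Below is a minimal sketch of the idea, not the original poster's code. It assumes a tiny Dense network (named `base` here, a hypothetical stand-in for resnet_model), replaces the distance Lambdas with a Concatenate layer to keep the example short, and uses set_weights as a stand-in for load_weights.

```python
import numpy as np
from keras.layers import Input, Dense, Concatenate
from keras.models import Model

# Hypothetical stand-in for resnet_model: one Dense layer as the embedding net.
base_in = Input(shape=(8,))
base = Model(base_in, Dense(4)(base_in), name="base")

# Triplet wrapper that calls the same `base` object three times.
anchor = Input(shape=(8,), name="input_anchor")
pos = Input(shape=(8,), name="input_pos")
neg = Input(shape=(8,), name="input_neg")
merged = Concatenate()([base(anchor), base(pos), base(neg)])
triplet = Model([anchor, pos, neg], merged, name="triple_siamese")

# Stand-in for triplet.load_weights(...): overwrite every weight with ones,
# going through the triplet model only.
triplet.set_weights([np.ones_like(w) for w in triplet.get_weights()])

# The change is visible through `base`, which takes a single input.
features = base.predict(np.ones((2, 8), dtype="float32"), verbose=0)
print(features.shape)
```

With all weights set to one and an all-ones input of 8 features, each output is 8 * 1 + 1 (bias), which confirms that `base` sees the weights written through the triplet model.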

I got this error when feeding 1000 triplets to the following network, created using above code layout.

    base_network = create_base_network(in_dim)
    input_a = Input(shape=(in_dim,))
    input_p = Input(shape=(in_dim,))
    input_n = Input(shape=(in_dim,))
    processed_a = base_network(input_a)
    processed_p = base_network(input_p)
    processed_n = base_network(input_n)
    dist_p = Lambda(euclidean_distance)([processed_a, processed_p])
    dist_n = Lambda(euclidean_distance)([processed_a, processed_n])
    stacked_dists = Lambda(lambda vects: K.stack(vects, axis=1))([dist_p, dist_n])    
    model = Model([input_a, input_p, input_n], stacked_dists)    
    model.compile ...
_______________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_48 (InputLayer)           (None, 100)          0                                            
__________________________________________________________________________________________________
input_49 (InputLayer)           (None, 100)          0                                            
__________________________________________________________________________________________________
input_50 (InputLayer)           (None, 100)          0                                            
__________________________________________________________________________________________________
sequential_18 (Sequential)      (None, 100)          121200      input_48[0][0]                   
                                                                 input_49[0][0]                   
                                                                 input_50[0][0]                   
__________________________________________________________________________________________________
lambda_65 (Lambda)              (None, 1)            0           sequential_18[1][0]              
                                                                 sequential_18[2][0]              
__________________________________________________________________________________________________
lambda_66 (Lambda)              (None, 1)            0           sequential_18[1][0]              
                                                                 sequential_18[3][0]              
__________________________________________________________________________________________________
lambda_67 (Lambda)              (None, 2, 1)         0           lambda_65[0][0]                  
                                                                 lambda_66[0][0]                  
==================================================================================================
Total params: 121,200
Trainable params: 121,200
Non-trainable params: 0

ValueError: Cannot feed value of shape (1000, 2) for Tensor 'lambda_67_target_2:0', which has shape '(?, ?, ?)'

How can I fix this? Many thanks in advance

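I get the exact same error ;)
Edit: I think I fixed it by using K.concatenate(vects, axis=1) instead,
since K.stack adds an extra dimension: the output becomes (None, 2, 1) rather than (None, 2), which does not match targets of shape (1000, 2).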
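The shape difference can be checked without building a model. The sketch below uses NumPy's stack and concatenate as stand-ins for K.stack and K.concatenate (an assumption for illustration; they follow the same shape rules), with two dummy distance columns of shape (1000, 1).

```python
import numpy as np

n = 1000
pos_dist = np.zeros((n, 1))  # dummy anchor-positive distances
neg_dist = np.ones((n, 1))   # dummy anchor-negative distances

# stack inserts a new axis at position 1, giving rank 3
stacked = np.stack([pos_dist, neg_dist], axis=1)

# concatenate joins along the existing axis 1, keeping rank 2
joined = np.concatenate([pos_dist, neg_dist], axis=1)

print(stacked.shape, joined.shape)
```

Only the concatenated result has the same rank as a (1000, 2) target array.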

for the triplet_loss function you can then use:

def triplet_loss(_, y_pred):
    margin = K.constant(1)
    return 0.5*K.mean(K.maximum(K.constant(0), K.square(y_pred[:,0]) - K.square(y_pred[:,1]) + margin))

(the 0.5 is just there for the derivative)
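To see what this loss computes, here is the same formula in NumPy on two hand-picked rows. The function name triplet_loss_np and the sample distances are made up for illustration; column 0 is taken to be the anchor-positive distance and column 1 the anchor-negative distance, as in the code above.

```python
import numpy as np

def triplet_loss_np(y_pred, margin=1.0):
    # y_pred[:, 0]: anchor-positive distance, y_pred[:, 1]: anchor-negative distance
    d_pos = y_pred[:, 0]
    d_neg = y_pred[:, 1]
    return 0.5 * np.mean(np.maximum(0.0, d_pos**2 - d_neg**2 + margin))

y_pred = np.array([
    [1.0, 2.0],  # positive is closer: 1 - 4 + 1 = -2, clipped to 0
    [2.0, 1.0],  # negative is closer: 4 - 1 + 1 = 4
])
loss = triplet_loss_np(y_pred)
print(loss)  # 0.5 * mean([0, 4]) = 1.0
```

A triplet contributes nothing once the squared negative distance exceeds the squared positive distance by at least the margin.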

This really requires clarification because there is no documentation covering this specific problem.
If the network is trained with multiple inputs (as in the case of siamese networks) but later predictions are to be done based on a single input (image) (maybe because you already saved encodings of anchor images in a database), how do you really save and restore the model, and then make predictions? I am a bit surprised that no one has given a clear answer to the question.

@tomassykora The way I did it was to define a new_model with the same architecture as the resnet_model (Model) you are using, and then use set_weights to set its weights equal to the trained weights.

The exact method will be:

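resnet_index = ...  # placeholder: position of the resnet submodel in siamese_model.layers
weights1 = siamese_model.get_layer(index=resnet_index).get_weights()
new_model = resnet()  # assumed to return a fresh model with the resnet architecture
new_model.set_weights(weights1)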

Now, you can use your model separately. :)

None of the methods mentioned above worked for me, however with what @agshar96 said and this I could make it work!

Probably there is an easier way to do this, but the pipeline is:

  • Define a blank model (probably with the same code/methods you used to define your network in training), e.g. targetModel.
  • Load the model/weights of a pre-trained model, e.g. sourceModel.
  • Iterate over the layers of both:
    - get_weights from a layer of sourceModel
    - set_weights to a layer of targetModel

from keras.models import load_model

siamese_model = load_model("path_to_model")

# In my case, this returns a Model() representing the desired shared network.
# You will need to change the index (you can also use the name parameter).
sourceModel = siamese_model.get_layer(index=9)

targetModel = densenet(...)  # this returns a blank Model

# Copy the weights layer by layer
for l_tg, l_sr in zip(targetModel.layers, sourceModel.layers):
    wk0 = l_sr.get_weights()
    l_tg.set_weights(wk0)
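The copy loop can be tried on a toy network to confirm that both models end up computing the same function. This is a sketch under assumptions: build_net is a hypothetical builder standing in for densenet(...), and the source model is freshly initialised rather than extracted from a trained siamese model.

```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

def build_net():
    # Hypothetical builder standing in for densenet(...) / resnet().
    inp = Input(shape=(8,))
    x = Dense(16, activation="relu")(inp)
    out = Dense(4)(x)
    return Model(inp, out)

source_model = build_net()  # pretend this came out of the trained siamese model
target_model = build_net()  # blank model with the same architecture

# Copy weights layer by layer; layers without weights return an empty list.
for target_layer, source_layer in zip(target_model.layers, source_model.layers):
    target_layer.set_weights(source_layer.get_weights())

x = np.random.rand(3, 8).astype("float32")
same = np.allclose(source_model.predict(x, verbose=0),
                   target_model.predict(x, verbose=0))
print(same)
```

The loop relies on both models listing their layers in the same order, which holds when they are built by the same code.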

How are you training this with the triplet loss and minimizing the loss function?
I am new to this, so please let me know.
I have to do this in my project as well.

I did:

  1. Load the saved siamese model.
  2. Extract the layer that holds the trained shared network.
  3. Create a new model from a new input layer and the extracted layer.

from keras.layers import Input
from keras.models import Model, load_model

trained_model = load_model('mnist_siamese.model', compile=False)
base_model = trained_model.get_layer('model_1')

input_shape = (28, 28)

inputs = Input(shape=input_shape)  # named `inputs` to avoid shadowing the builtin `input`
x = base_model(inputs)

model = Model(inputs, x)
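The same three steps can be checked end to end on a toy model built in memory instead of loaded from disk. In this sketch the layer name "shared_net" and the Dense embedding are hypothetical, and a Concatenate layer stands in for the distance computation; the point is only that the extracted encoder reproduces what the siamese model computes per branch.

```python
import numpy as np
from keras.layers import Input, Dense, Concatenate
from keras.models import Model

# Hypothetical siamese model with a shared submodel named "shared_net".
base_in = Input(shape=(8,))
base = Model(base_in, Dense(4)(base_in), name="shared_net")
left = Input(shape=(8,))
right = Input(shape=(8,))
siamese = Model([left, right], Concatenate()([base(left), base(right)]))

# Steps 2 and 3: pull the shared layer out by name and give it a fresh input.
extracted = siamese.get_layer("shared_net")
new_input = Input(shape=(8,))
encoder = Model(new_input, extracted(new_input))

x = np.random.rand(5, 8).astype("float32")
single = encoder.predict(x, verbose=0)        # one embedding per image
paired = siamese.predict([x, x], verbose=0)   # both branches side by side
print(single.shape, paired.shape)
```

Each half of the siamese output matches the encoder output, since both branches and the encoder run the same weights.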

I encountered the same problem and my solution is:

  1. Create the submodel, e.g. model=resnet_model()
  2. Load the weights from the h5 file, specifying the layer name with the 'by_name' parameter, which in your case is model.load_weights(weightsFile,by_name='resnet_model')

The by_name argument of the load_weights function is a boolean (True or False). Why do you set it to 'resnet_model'?

@kleysonr Thanks I was able to extract features from the trained model.

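Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_17 (InputLayer)           (None, 1, 80, 80)    0
__________________________________________________________________________________________________
input_18 (InputLayer)           (None, 1, 80, 80)    0
__________________________________________________________________________________________________
sequential_2 (Sequential)       (None, 25)           1055501     input_17[0][0]
                                                                 input_18[0][0]
__________________________________________________________________________________________________
lambda_2 (Lambda)               (None, 1)            0           sequential_2[1][0]
                                                                 sequential_2[2][0]
==================================================================================================
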
I selected one of the networks and was able to extract features, but I am facing an issue: I am not clear which layer of the selected network the features are predicted from. How can I determine that?

Thanks
