Keras: Implementation advice

Created on 29 Oct 2015 · 8 Comments · Source: keras-team/keras

Howdy,

I'd like to implement this paper: http://arxiv.org/pdf/1504.03410v1.pdf

The inputs are triplets of images, so my plan is to use Graph to create three identical networks and then tie their weights/params together for every layer, like Francois describes in #783. So first question is: is that the most efficient way to create a shared pipeline that 3 different images will go through?

If so, the final layer would be a merge (concatenate) layer that would emit the three final encodings.
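As a rough illustration of the shared pipeline described above, here is a minimal NumPy sketch, not Keras code. The names `encode`, `anchor`, `positive`, `negative` and all sizes are assumptions made up for the example, and the three encodings are stacked along a new leading axis instead of concatenated so that each branch can be indexed directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8-dimensional inputs mapped to 4-dimensional embeddings.
W = rng.normal(size=(8, 4))
b = np.zeros(4)

def encode(x):
    """One shared 'network': the same W and b are applied to every input."""
    return np.tanh(x @ W + b)

# A batch of 5 triplets (illustrative random data).
anchor = rng.normal(size=(5, 8))
positive = rng.normal(size=(5, 8))
negative = rng.normal(size=(5, 8))

# Stack the three encodings so that embeddings[0], embeddings[1] and
# embeddings[2] are the outputs of the three weight-tied branches.
embeddings = np.stack([encode(anchor), encode(positive), encode(negative)])
print(embeddings.shape)
```

Because all three branches call the same `encode`, there is only one set of parameters to update, which is the point of tying the weights.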

Now, the loss function doesn't need a Y: it uses the information inherent in the ordering of the 3 concatenated outputs. The training data is ordered such that the first of the 3 images should be closer to the second than to the third.

So let's say the output is 3 rows of embeddings/hashes as follows:

[embedding_1, 
embedding_2, 
embedding_3]

The loss function would then be just like eq (2) in the paper:

T.maximum(0, ||embedding_1 - embedding_2||**2 - ||embedding_1 - embedding_3||**2 + 1)
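To check the arithmetic, here is a small NumPy sketch of this hinge: the squared distance from the first embedding to the second, minus the squared distance from the first to the third, plus a margin of 1. The function name `triplet_hinge_loss` and the toy vectors are assumptions for illustration only.

```python
import numpy as np

def triplet_hinge_loss(e1, e2, e3, margin=1.0):
    """Per-triplet hinge on squared Euclidean distances."""
    d_pos = np.sum((e1 - e2) ** 2, axis=-1)  # first vs. second (should be small)
    d_neg = np.sum((e1 - e3) ** 2, axis=-1)  # first vs. third (should be large)
    return np.maximum(0.0, d_pos - d_neg + margin)

# Two toy triplets of 2-dimensional embeddings.
e1 = np.array([[0.0, 0.0], [0.0, 0.0]])
e2 = np.array([[1.0, 0.0], [2.0, 0.0]])
e3 = np.array([[3.0, 0.0], [1.0, 0.0]])

losses = triplet_hinge_loss(e1, e2, e3)
print(losses)  # [0. 4.]
```

The first triplet is already ordered correctly by more than the margin, so its loss is zero; the second has the third embedding closer than the second, so it is penalized.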

Interestingly, there is no need for a Y. Is there a natural way to make a loss function that doesn't need a Y, or will Keras always require some Y, even if it is a dummy one?

Thanks.

sergeyf

All 8 comments

Similar topic #894

EderSantana on 29 Oct 2015

Thanks for the pointer! Sounds like my Graph idea will work - just have to figure out how to do the loss function without Y's.

sergeyf on 29 Oct 2015

To make this more concrete, here is the loss function I'd like to use:

import theano.tensor as T

def triplet_loss(y_true, y_pred):
    mse1 = T.sqr(y_pred[0] - y_pred[1]).mean(axis=-1)
    mse2 = T.sqr(y_pred[0] - y_pred[2]).mean(axis=-1)
    return T.maximum(0, mse1 - mse2 + 1)

Note that it doesn't actually make use of y_true, but if you just do that, you get the following Theano error:

UnusedInputError: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 3 is not part of the computational graph needed to compute the outputs: Elemwise{second,no_inplace}.0.
To make this error into a warning, you can pass the parameter on_unused_input='warn' to theano.function. To disable it completely, use on_unused_input='ignore'.

I'm not sure where to pass the parameter on_unused_input='ignore'. Any ideas?

sergeyf on 29 Oct 2015

I'm having a delightful conversation with myself =)

Here's a supremely ugly hack:

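def triplet_loss(y_true, y_pred):
    mse1 = T.sqr(y_pred[0] - y_pred[1]).mean(axis=-1)
    mse2 = T.sqr(y_pred[0] - y_pred[2]).mean(axis=-1)
    # y_true[0]*0 contributes nothing; it only puts y_true into the graph
    return T.maximum(0, mse1 - mse2 + 1) - y_true[0]*0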

Look! y_true is being used!
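For anyone wondering whether the hack changes the loss, here is a NumPy stand-in (not Theano, and not tested inside Keras) suggesting it does not, as long as the dummy targets are finite. The name `triplet_loss_np`, the dummy array `y_dummy` and its shape are assumptions chosen only for the demo.

```python
import numpy as np

def triplet_loss_np(y_true, y_pred):
    mse1 = np.square(y_pred[0] - y_pred[1]).mean(axis=-1)
    mse2 = np.square(y_pred[0] - y_pred[2]).mean(axis=-1)
    # y_true[0] * 0 is all zeros, so it leaves the value unchanged
    return np.maximum(0, mse1 - mse2 + 1) - y_true[0] * 0

# One triplet of 2-dimensional embeddings, stacked as (3, batch, dim).
y_pred = np.array([[[0.0, 0.0]], [[1.0, 1.0]], [[0.5, 0.5]]])

# Hypothetical dummy targets: any finite values give the same loss.
y_dummy = np.zeros((3, 1))
y_other = np.full((3, 1), 7.0)

loss = triplet_loss_np(y_dummy, y_pred)
loss_other = triplet_loss_np(y_other, y_pred)
print(loss, loss_other)
```

So the targets passed to training can be an array of zeros; only their presence matters, not their content.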

sergeyf on 29 Oct 2015

👍7

Another approach was discussed for Siamese networks in this ticket: https://github.com/fchollet/keras/issues/242#issuecomment-114996378. You can probably use a similar implementation for a triplet loss. If you use that approach, be sure to modify the fit function to preserve the triplets.
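One way to keep triplets together while still shuffling is to permute whole triplets instead of individual rows. The sketch below assumes a hypothetical layout in which rows 3k, 3k+1 and 3k+2 of `X` hold the k-th triplet; the helper `shuffle_triplets` is made up for illustration and is not part of Keras or of the linked ticket.

```python
import numpy as np

def shuffle_triplets(X, rng):
    """Shuffle groups of 3 consecutive rows without splitting any group."""
    triplets = X.reshape(-1, 3, X.shape[1])   # (n_triplets, 3, n_features)
    order = rng.permutation(len(triplets))    # permute triplet indices only
    return triplets[order].reshape(X.shape)

rng = np.random.default_rng(0)
n_triplets, n_features = 4, 2
X = np.arange(n_triplets * 3 * n_features, dtype=float).reshape(-1, n_features)

X_shuffled = shuffle_triplets(X, rng)
print(X_shuffled.reshape(n_triplets, 3, n_features)[:, :, 0])
```

With data shuffled this way up front, the per-sample shuffling in training would need to be switched off, otherwise the rows of a triplet end up in different positions again.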

mmmikael on 3 Nov 2015

Thanks @mmmikael !

sergeyf on 3 Nov 2015

@sergeyf Hey, may I ask how you implemented the semi-hard choice of negative pairs? The triplet loss function itself doesn't seem very difficult to implement, but how to choose "good" positive and negative pairs ONLINE bothers me a lot. Could you share your implementation? Thanks!
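For reference, "semi-hard" is commonly taken to mean a negative that is farther from the anchor than the positive, but by less than the margin. The NumPy sketch below picks such negatives within one batch under that definition. It is not sergeyf's implementation; the function `pick_semi_hard_negatives`, its arguments and the toy data are all assumptions for illustration.

```python
import numpy as np

def pick_semi_hard_negatives(emb, labels, anchor_idx, positive_idx, margin=1.0):
    """For each (anchor, positive) pair return the index of the closest
    semi-hard negative in the batch, or -1 if there is none."""
    a = emb[anchor_idx]
    p = emb[positive_idx]
    d_ap = np.sum((a - p) ** 2, axis=1)                       # (P,)
    d_an = np.sum((a[:, None, :] - emb[None, :, :]) ** 2, 2)  # (P, N)
    is_negative = labels[None, :] != labels[anchor_idx][:, None]
    # semi-hard: d(a, p) < d(a, n) < d(a, p) + margin
    semi_hard = is_negative & (d_an > d_ap[:, None]) & (d_an < d_ap[:, None] + margin)
    masked = np.where(semi_hard, d_an, np.inf)
    choice = masked.argmin(axis=1)
    choice[~semi_hard.any(axis=1)] = -1
    return choice

# Toy batch: two samples of class 0, three of class 1.
emb = np.array([[0.0, 0.0], [0.5, 0.0], [0.9, 0.0], [3.0, 0.0], [0.2, 0.0]])
labels = np.array([0, 0, 1, 1, 1])

chosen = pick_semi_hard_negatives(emb, labels, np.array([0]), np.array([1]))
print(chosen)  # [2]
```

In the toy batch, sample 4 is closer to the anchor than the positive (too hard under this definition) and sample 3 is beyond the margin (too easy), so sample 2 is selected. Doing this online would mean recomputing the embeddings with the current weights for each batch before selecting.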