Keras: Multi Loss Function

Created on 20 Oct 2016 · 10 Comments · Source: keras-team/keras

Hi,
I want to implement a neural network with two loss functions, as in http://ydwen.github.io/papers/WenECCV16.pdf

How can I implement this in Keras?

Do I have to split my last dense layer and attach a loss function to each output?
Or do I have to create my own custom layer which computes both the softmax loss and the center loss?

Regards

Sebastien

All 10 comments

I have an idea

I started to create a network with 2 outputs, such as:

inputs = Input(shape=(100, 100, 3))
...
fc = Dense(100)(#previousLayer#)
softmax = Activation('softmax')(fc)
model = Model(input=inputs, output=[softmax, fc])
model.compile(optimizer='sgd',
              loss=['categorical_crossentropy', 'center_loss'],  # 'center_loss' still needs to be defined
              metrics=['accuracy'], loss_weights=[1., 0.2])

First of all, is this the right way to proceed?

Secondly, I don't know how to implement the center loss in Keras. Center loss looks like mean squared error, but instead of comparing values to fixed labels, it compares them to centers that are updated at each iteration. An objective function, however, only receives y_true and y_pred, so in my case the values being compared against would need to be updated at each iteration.
How can I do this?
Thank you for your help
Thank you for your help

Maybe you could also, instead of creating two output layers, create a custom loss function. That would also let you include the hyperparameter they mention in the paper that balances the two supervision signals.

I am not sure I understand correctly how this center loss works. At every iteration, do you update the mean of the predicted class, or of the class it should have been? Maybe you could somehow integrate this into your architecture. If you had some fixed-weight layers that compute the class centers, then instead of using the class center as the prediction you could create a 'subtraction layer' that computes the difference between the layer producing the right class center and the output of the network, and train that with MSE (with target 0). You could no longer use the custom loss function then, but you could still implement the hyperparameter via the fixed weights in this subtraction layer. If this doesn't make any sense, could you maybe tell me how you would compute y_pred in each iteration if you could just input it as you like?

Well, for me, creating one custom loss function that computes both losses is equivalent to having two different loss functions with the hyperparameter set via the loss_weights parameter of the compile function.

Secondly, using a subtraction layer is not possible, because each input of the subtraction layer would have to have subtracted from it the center defined by that input's label, and inside the subtraction layer I don't have any label information.

Let's simplify the problem and say I just want to apply the center loss function alone (no softmax loss). I need code like this:

inputs = Input(shape=(100, 100, 3))
...
fc = Dense(500)(#previousLayer#)
model = Model(input=inputs, output=fc)
model.compile(optimizer='sgd', loss=center_loss, metrics=['accuracy'])

Let's say I have 10,000 classes, so my centers array has size 10000 × 500.

And my center_loss function should look like this:

def center_loss(y_true, y_pred, centers):
    # get the batch centers corresponding to the labels in y_true
    # compute the sum of squared differences between y_pred and those batch centers

My questions are:

  • Where to store my centers array, since loss functions only take y_true and y_pred as parameters
  • How to update my centers array after backpropagation
  • How the gradient for backpropagation can be computed for this kind of loss function
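
One workaround that gets suggested for exactly these questions is the "embedding trick": store the centers in an Embedding layer, feed the integer labels as a second input, and let SGD on that embedding play the role of the center-update step, so the loss function never has to mutate external state. A minimal sketch, not verified here — the 0.2 weight and all layer names are illustrative assumptions:

from keras.layers import Input, Dense, Flatten, Embedding, Lambda
from keras.models import Model
from keras import backend as K

num_classes = 10000
feat_dim = 500

inputs = Input(shape=(100, 100, 3))
x = Flatten()(inputs)    # stand-in for the real convolutional trunk
fc = Dense(feat_dim)(x)  # the 'deep feature' layer

labels = Input(shape=(1,), dtype='int32')           # integer class labels
centers = Embedding(num_classes, feat_dim)(labels)  # (batch, 1, feat_dim)
centers = Flatten()(centers)                        # (batch, feat_dim)

# 0.5 * ||x_i - c_{y_i}||^2 per sample, emitted as a second model output
center_term = Lambda(
    lambda t: 0.5 * K.sum(K.square(t[0] - t[1]), axis=1, keepdims=True),
    output_shape=(1,))([fc, centers])

softmax = Dense(num_classes, activation='softmax')(fc)

model = Model(input=[inputs, labels], output=[softmax, center_term])
model.compile(optimizer='sgd',
              loss=['categorical_crossentropy', lambda y_true, y_pred: y_pred],
              loss_weights=[1., 0.2])  # 0.2 plays the role of lambda

Since the center term is computed inside the graph, the second "loss" just passes it through, and you train with dummy zero targets for that output, e.g. model.fit([X, y_int], [y_onehot, np.zeros((len(X), 1))]).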

@slegall56 I'm doing the same thing right now. Your initial idea is a good try. Unlike Caffe, Keras can weight the losses through the loss_weights argument of compile, but the center loss itself calls for a self-defined loss function. I think a model with two outputs is OK:
model.compile(optimizer='sgd', loss=['categorical_crossentropy', 'center_loss'], metrics=['accuracy'], loss_weights=[1., 0.2])

Well, indeed, 2 outputs with the loss_weights parameter seem to work. We just need to define our own center loss function in objectives.py.

I recently tried something similar. Give this a shot. I have not verified it, but it seems to be working. It might as well serve as a verification.

def prepare_model():
    inputs = Input(shape=(100, 100, 3))
    ...
    fc = Dense(100)(#previousLayer#)
    softmax = Activation('softmax')(fc)

    def custom_loss(y_true, y_pred):
        # softmax (cross-entropy) loss on the class predictions
        loss1 = categorical_crossentropy(y_true, y_pred)
        # center loss on the fc features, captured from the enclosing scope
        loss2 = center_loss(y_true, fc)
        # lambda_c is the balancing hyperparameter ('lambda' is reserved in Python)
        return loss1 + lambda_c * loss2

    model = Model(input=inputs, output=softmax)
    model.compile(optimizer='sgd',
                  loss=custom_loss,
                  metrics=['accuracy'])
    return model

@biprajiman how should I understand loss2 = center_loss(y_true, fc)? Could you explain it in detail,
or give a snippet for center_loss? Thanks.

@slegall56 Did you get it running?

@alyato if you have a look at the paper... it is a simple squared L2 norm, or you can use any other loss function based on your requirements.
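
For reference, a minimal sketch of that squared-L2 form, assuming the per-sample centers have already been gathered into a batch_centers tensor (that gathering is exactly the hard part discussed above; batch_centers is a hypothetical name, not from this thread):

from keras import backend as K

def center_loss(features, batch_centers):
    # L_C = 0.5 * sum_i ||x_i - c_{y_i}||^2, averaged over the batch
    return 0.5 * K.mean(K.sum(K.square(features - batch_centers), axis=1))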

Hi,
suppose I have 2 loss functions and some of the weights are common to both outputs. How will backpropagation update those shared weights?
Thanks in advance.
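
For what it's worth: Keras minimizes the single scalar sum of the weighted losses, so at every shared layer the gradient contributions coming back from both heads are simply added together. A minimal sketch with illustrative shapes and names (none of them from this thread):

from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(32,))
shared = Dense(64, activation='relu')(inp)        # weights used by both heads
head_a = Dense(10, activation='softmax')(shared)  # output for loss 1
head_b = Dense(1)(shared)                         # output for loss 2

model = Model(input=inp, output=[head_a, head_b])
# total loss = 1.0 * crossentropy + 0.2 * mse; its gradient with respect to
# the shared Dense(64) weights is the weighted sum of both contributions
model.compile(optimizer='sgd',
              loss=['categorical_crossentropy', 'mse'],
              loss_weights=[1., 0.2])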
