Keras: Cost function that isn't just y_pred, y_true?

Created on 19 Jul 2017 · 5 comments · Source: keras-team/keras

I have a cost function that, in addition to using the overall network output, needs to multiply it by another function of the network weights (specifically, the partial derivative of the output with respect to one of the inputs). Custom cost functions are parameterised as f(y_true, y_pred), so they cannot be given this second function of the weights.
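For concreteness, this is the standard pattern I mean (a minimal sketch; the model and loss are purely illustrative):

```python
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Dense

# The usual custom-loss contract: Keras passes only y_true and y_pred,
# so the loss cannot directly see the model's weights or inputs.
def custom_loss(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

inp = Input(shape=(3,))
out = Dense(1)(inp)
model = Model(inp, out)
model.compile(optimizer='adam', loss=custom_loss)
```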

I've seen a similar issue where @shamidreza stated that they had to use Theano directly for this functionality.

Is that still the best option? I've only used Keras from R, so I have no experience with either TensorFlow or Theano; would either be usable from R?

All 5 comments

@stulacy This might be something you can implement in a custom loss function using the following approach (I'm not sure, but I'm offering it as a suggestion). (1) The current weights of the model are available to your custom cost function via model.get_weights(). (2) Consider using a direct TensorFlow call to symbolically compute the partial derivative in question (using zero for the predicted output so that the gradient is the model output with respect to the node weights). See:
https://keras.io/backend/#using-the-abstract-keras-backend-to-write-new-code
and https://www.tensorflow.org/api_docs/python/tf/gradients

(3) Write the final cost expression in terms of these derived parameters.

This is an experimental idea that may or may not work, but I wanted to offer a suggestion. A rough sketch is below. Hope this helps. Thanks.
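A rough sketch of (2) in Python, assuming the TensorFlow backend (the model, the gradient penalty, and the weighting factor alpha are illustrative assumptions, not a tested recipe):

```python
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Dense

# Hypothetical single-input regression model.
inp = Input(shape=(3,))
out = Dense(1)(Dense(16, activation='tanh')(inp))
model = Model(inp, out)

def make_gradient_loss(model, alpha=0.1):
    # Close over the model so the loss can see its input/output tensors.
    def loss(y_true, y_pred):
        mse = K.mean(K.square(y_pred - y_true), axis=-1)
        # d(output)/d(input); K.gradients returns a list of tensors.
        d_out_d_x = K.gradients(model.output, model.input)[0]
        # Penalise the derivative with respect to the first input feature.
        return mse + alpha * K.mean(K.square(d_out_d_x[:, 0]))
    return loss

model.compile(optimizer='adam', loss=make_gradient_loss(model))
```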

It sounds like this example implements what you need. It's not ideal in my opinion; it feels a bit like a workaround, but it appears to work.

I'm going to close this issue, as the custom Layer method suggested by @hgaiser works for my use case. Essentially, rather than adding a loss function explicitly, you create a custom layer that calculates the loss. When using the network for prediction, you create a new model that uses an earlier layer as the output of interest. The K.gradients function also provided the differentiation I required.
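For anyone finding this later, here is roughly the shape of the solution, sketched in Python (the layer, the gradient penalty, and the 0.1 weighting are specific to my problem and purely illustrative):

```python
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Dense, Layer

class LossLayer(Layer):
    # Computes the loss inside the graph via add_loss, so it can use
    # the input tensor as well as y_true and y_pred.
    def call(self, inputs):
        x, y_true, y_pred = inputs
        mse = K.mean(K.square(y_pred - y_true))
        # K.gradients gives d(y_pred)/d(x); penalise the derivative
        # with respect to the first input feature.
        d_dx = K.gradients(y_pred, x)[0]
        self.add_loss(mse + 0.1 * K.mean(K.square(d_dx[:, 0])))
        return y_pred  # pass the prediction through unchanged

x = Input(shape=(3,))
y_true = Input(shape=(1,))
y_pred = Dense(1)(Dense(16, activation='tanh')(x))

train_model = Model([x, y_true], LossLayer()([x, y_true, y_pred]))
train_model.compile(optimizer='adam', loss=None)  # loss comes from the layer

# For prediction, build a second model that exposes the layer of interest.
predict_model = Model(x, y_pred)
```

Training then uses train_model.fit([X, Y]) with no separate target, while predict_model.predict(X) gives ordinary predictions.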

Any chance you could show an example of how you implemented this in Keras for R? That is, if you did in fact add the custom layer in R. Thanks.

The model.compile(...) function takes a target_tensors argument. You can use it with an extended output of your model.
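For example (a minimal sketch assuming the TensorFlow backend; the placeholder shape is illustrative):

```python
import tensorflow as tf
from keras.models import Model
from keras.layers import Input, Dense

inp = Input(shape=(3,))
out = Dense(1)(inp)
model = Model(inp, out)

# Supply the training target as a tensor rather than numpy data;
# the tensor could be derived from an extended model output.
target = tf.placeholder(tf.float32, shape=(None, 1))
model.compile(optimizer='adam', loss='mse', target_tensors=[target])
```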

