Hi folks,
I'm wondering if there is an easy way to change the "loss_weights" for a network (with multiple outputs) after every iteration, when I can only use the "train_on_batch" function. I've seen people suggesting changing the "loss_weights" through callbacks, as in #2595. But since I will use train_on_batch, I guess callbacks are no longer available (please correct me if I'm wrong).
Any idea will be greatly appreciated.
Thanks,
JC
If you want to change the loss_weights argument in model.compile(), you need to recompile the model. This may be slow and expensive. Definitely not something you can do at each batch.
Anything that you want to change at every batch should be cast as an input to the model (for the sake of efficiency).
Consider the following 2-output model:
from keras import layers
from keras import Model
inputs = layers.Input(...)
out1 = layers.Dense(...)(inputs)
out2 = layers.Dense(...)(inputs)
model = Model(inputs, [out1, out2])
model.compile(optimizer, loss=['binary_crossentropy', 'binary_crossentropy'])
You could add an input for loss weights and for targets:
inputs = layers.Input(...)
loss_weights = layers.Input(shape=(1,))
targets1 = layers.Input(...)
targets2 = layers.Input(...)
out1 = layers.Dense(...)(inputs)
out2 = layers.Dense(...)(inputs)
model = Model([inputs, loss_weights, targets1, targets2], [out1, out2])
The loss you would use would be, say, loss_weights for out1 and 1 - loss_weights for out2:
from keras import losses
loss = (loss_weights * losses.binary_crossentropy(targets1, out1)
        + (1 - loss_weights) * losses.binary_crossentropy(targets2, out2))
model.add_loss(loss)
Compile:
model.compile(optimizer)
Then train like that:
import numpy as np
input_data = np.random.random((32, 3))
loss_data = np.array([0.5 for _ in range(len(input_data))])
target1_data = np.random.random((32, 3))
target2_data = np.random.random((32, 3))
model.train_on_batch([input_data, loss_data, target1_data, target2_data])
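Since out1 and out2 depend only on inputs, you can also build a second Model that shares the same layers for inference, so you don't have to feed dummy targets and weights at prediction time. A minimal sketch, reusing the tensors defined above:

pred_model = Model(inputs, [out1, out2])
preds1, preds2 = pred_model.predict(input_data)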
Let me know if that works for you. There are other possibilities as well.
Hi @fchollet ,
Thank you very much for providing the solution. After two small modifications, your code works like a charm.
First modification:
model.compile(optimizer) --> model.compile(optimizer, loss = None)
Second modification:
model.train_on_batch([input_data, loss_data, target1_data, target2_data]) -->
model.train_on_batch([input_data, loss_data, target1_data, target2_data], y = None)
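For anyone who wants the whole thing in one place, here is a runnable version with both changes applied. The shapes are assumptions chosen to match the random example data above, 'adam' is just a placeholder optimizer, and I flatten the weight input so it lines up with the per-sample losses:

import numpy as np
from keras import layers, losses, Model
from keras import backend as K

inputs = layers.Input(shape=(3,))
loss_weights = layers.Input(shape=(1,))
targets1 = layers.Input(shape=(3,))
targets2 = layers.Input(shape=(3,))
out1 = layers.Dense(3, activation='sigmoid')(inputs)
out2 = layers.Dense(3, activation='sigmoid')(inputs)
model = Model([inputs, loss_weights, targets1, targets2], [out1, out2])

# weighted sum of the two per-sample losses, reduced to a scalar
w = K.flatten(loss_weights)
loss = K.mean(w * losses.binary_crossentropy(targets1, out1)
              + (1 - w) * losses.binary_crossentropy(targets2, out2))
model.add_loss(loss)
model.compile(optimizer='adam', loss=None)

input_data = np.random.random((32, 3))
loss_data = np.full((32, 1), 0.5)  # change this array every batch to reweight
target1_data = np.random.random((32, 3))
target2_data = np.random.random((32, 3))
model.train_on_batch([input_data, loss_data, target1_data, target2_data], y=None)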
Bests,
JC
@yushuinanrong out of curiosity, which version of keras are you using? Your API for model.add_loss appears to be different from what's on the master branch: https://github.com/keras-team/keras/blob/e6d2179/keras/engine/base_layer.py#L949
Hi @AvantiShri ,
My earlier comment was wrong. What I actually changed is:
model.compile(optimizer) --> model.compile(optimizer, loss = None)
JC
I can't see your comment on github. Have you solved your problem?
On Mon, Aug 13, 2018 at 11:41 AM, bob2277 notifications@github.com wrote:
@yushuinanrong I am still pretty new to Keras and can't understand one thing about your solution.
When you are training on batch you pass y as None, but the model has two outputs. So where do you pass the true labels of those outputs? I mean, you need the true labels to train the network, right?
Please tell me where I am going wrong. Thanks
Yeah, I did. It was a mistake on my part, so I removed the comment.
Thanks
I know this is an old thread, but I am trying to have dynamic loss weights in a single network that is composed of multiple networks. I am trying to implement this in CycleGAN with a perceptual reconstruction loss that I implemented; here is a link to the code:
https://gist.github.com/Shalomash/6c3e9beb2fe3cba703acf3a5896981d1
I am trying to add a separate input to the Generator + Discriminator so that I can feed a loss weight to it and have the perceptual reconstruction loss dynamically weighted. Unfortunately, I am having trouble adding this input because the network is a graph of connected models. What would be the most effective way to implement this? Roughly, what I'm attempting looks like the sketch below (with tiny stand-in models instead of my real generator and discriminator):
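from keras import layers, Model

# tiny stand-ins for the real generator and discriminator (placeholders only)
g_in = layers.Input(shape=(64, 64, 3))
generator = Model(g_in, layers.Conv2D(3, 3, padding='same')(g_in))
d_in = layers.Input(shape=(64, 64, 3))
discriminator = Model(d_in, layers.Dense(1)(layers.GlobalAveragePooling2D()(d_in)))

image = layers.Input(shape=(64, 64, 3))
recon_weight = layers.Input(shape=(1,))  # extra input carrying the dynamic loss weight

fake = generator(image)
validity = discriminator(fake)

# the weight input has to be listed among the combined model's inputs even though
# no layer consumes it; it would only appear in an add_loss expression
combined = Model([image, recon_weight], [fake, validity])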
Thanks
I am trying to implement dynamic loss weighting for a dual-loss GAN too, but I am facing an InvalidArgumentError ("You must feed a value for placeholder tensor"). The problem is explained in https://github.com/keras-team/keras/issues/9385#issuecomment-523941651.
Is there some other approach to dynamic loss weighting that bypasses this error? Thank you!
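One alternative that avoids the extra placeholder input entirely is the backend-variable approach from #2595: keep each weight in a K.variable, close over it in the loss functions, and update it with K.set_value between batches, with no recompiling and no extra model inputs. A minimal sketch (the two-output model and 'adam' optimizer are assumptions for illustration):

import numpy as np
from keras import layers, Model
from keras import backend as K

alpha = K.variable(0.5)  # loss weight lives in a backend variable

def weighted_bce(weight):
    # closure so the compiled graph reads the variable's current value
    def loss_fn(y_true, y_pred):
        return weight * K.binary_crossentropy(y_true, y_pred)
    return loss_fn

inputs = layers.Input(shape=(3,))
out1 = layers.Dense(3, activation='sigmoid')(inputs)
out2 = layers.Dense(3, activation='sigmoid')(inputs)
model = Model(inputs, [out1, out2])
model.compile(optimizer='adam',
              loss=[weighted_bce(alpha), weighted_bce(1 - alpha)])

x = np.random.random((32, 3))
y1 = np.random.random((32, 3))
y2 = np.random.random((32, 3))
K.set_value(alpha, 0.3)  # adjust the weight before any batch, no recompile needed
model.train_on_batch(x, [y1, y2])

Since nothing extra is fed through placeholders, predict() keeps working on the same model.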