Hi all, what's an easy way to set changeable loss weights for multiple outputs? For example, is there a way to modify the loss weights in a callback?
You can pass a list of K.variable into loss_weights.
from keras import backend as K

alpha = K.variable(0.5)
beta = K.variable(0.5)
model.compile(..., loss_weights=[alpha, beta], ...)
and define your own Callback
class MyCallback(Callback):
    def __init__(self, alpha, beta):
        self.alpha = alpha
        self.beta = beta

    # customize your behavior
    def on_epoch_end(self, epoch, logs={}):
        self.alpha = self.alpha - 0.1
        self.beta = self.beta + 0.1
then pass it into fit
model.fit(..., callbacks=[MyCallback(alpha, beta)], ...)
P.S. Haven't thoroughly tested it.
The idea works, thanks!
In callback, I used something like:
def on_epoch_end(self, epoch, logs={}):
    if epoch == 2:
        K.set_value(self.alpha, K.get_value(self.alpha) / 1.5)
        K.set_value(self.beta, K.get_value(self.beta) * 1.5)
    logger.info("epoch %s, alpha = %s, beta = %s" % (epoch, K.get_value(self.alpha), K.get_value(self.beta)))
Hi guys. I can't find any documentation on what the loss_weights parameter actually does. @joelthchao can you provide an intuitive explanation?
@pGit1 Please check this example. End of this section provides detail explanations and usages.
@joelthchao thanks for this. I have read this example actually but I don't see the intuition behind choosing ".2". It says:
We compile the model and assign a weight of 0.2 to the auxiliary loss.
Why was this number chosen? What is the intuition behind this choice? That is what I don't understand.
I mean if I have a classification output and a regression output I would think I would want to have a small loss weight for the regression output so the internal representation of the network would not be dominated by the MSE since those numbers could be huge. Is this understanding accurate? Or should loss_weights be treated as a hyper-parameter in multi output problems (even still an intuition would be nice to understand so as to avoid blindly picking values)?
It's a hyper-parameter; usually we need to adjust it according to 1) the importance of the losses and 2) the actual magnitude of the losses. We do need experiments to choose a suitable number, to prevent one loss from dominating the others.
Makes sense. Thank you!
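For concreteness, loss_weights simply scale each output's loss before the per-output losses are summed into the single total that gets optimized, so a suitable value depends on the relative scale of the losses. A rough back-of-the-envelope illustration with hypothetical numbers:

# Hypothetical per-output loss values for one batch.
cls_loss = 0.69    # binary crossentropy on a classification head
reg_loss = 250.0   # MSE on a regression head with large-valued targets

# With equal weights the regression term swamps the total loss:
total_equal = 1.0 * cls_loss + 1.0 * reg_loss      # 250.69

# A small weight on the regression output rebalances the objective:
total_weighted = 1.0 * cls_loss + 0.01 * reg_loss  # 3.19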
@joelthchao I want to update my parameter alpha in the loss in a similar fashion with a callback.
def custom_loss(x, x_pred):
    loss1 = binary_crossentropy(x, x_pred)
    return K.pow(loss1, alpha)
How can I change the alpha in the callback so that it actually changes the alpha in the loss function? It is not correct for my application to pass alpha as a loss_weight when compiling the model.
Any help appreciated!
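Following the same K.variable pattern from above, one way this could look (a sketch only, not tested; the AlphaScheduler name is illustrative): make alpha a backend variable, let the custom loss close over it, and update it with K.set_value in the callback, so no recompile is needed.

from keras import backend as K
from keras.losses import binary_crossentropy
from keras.callbacks import Callback

# alpha lives in the graph, so updating its value changes the loss without recompiling.
alpha = K.variable(1.0)

def custom_loss(x, x_pred):
    loss1 = binary_crossentropy(x, x_pred)
    return K.pow(loss1, alpha)

class AlphaScheduler(Callback):
    def __init__(self, alpha):
        self.alpha = alpha

    def on_epoch_end(self, epoch, logs=None):
        # Decay alpha each epoch; update the existing variable, don't rebind it.
        K.set_value(self.alpha, K.get_value(self.alpha) * 0.9)

# model.compile(optimizer='adam', loss=custom_loss)
# model.fit(X, Y, callbacks=[AlphaScheduler(alpha)])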
@joelthchao don't we need to compile the model again in this case?
It doesn't seem like we need to recompile as long as we're using the variables.
I can confirm this through running the code myself, and by looking at this comment.
Let me know if you find some different behavior.
@mycal-tucker what do you mean by variables? If I have a function like this, will it update the params?
class ChangeLossWeight(Callback):
    def __init__(self, alpha, beta):
        self.alpha = alpha
        self.beta = beta

    # custom behaviour
    def on_epoch_end(self, epoch, logs={}):
        if epoch == 2:
            self.alpha = 0.7
            self.beta = 0.3
        if epoch == 5:
            self.alpha = 0.5
            self.beta = 0.5
I mean a Keras variable. I think your current setup wouldn't work because you're just updating the numbers. Here's something I did that doesn't perfectly match the Callback class model you've got but that does work. Porting my code over to your setup shouldn't be too much trouble.
from keras import backend as K
from keras.models import Model
from keras.optimizers import RMSprop
from keras.callbacks import LambdaCallback

class MyCustomModel:
    # Define a static variable for now because I don't know how else to get around the callback being static.
    # I am sure that there is a more elegant way, but for now this works.
    beta_var = None

    # Create a new model by stringing together two models.
    def __init__(self, model1, model2):
        model2_output = model2(model1.output)  # Feed model1 into model2
        self.model = Model(inputs=[model1.input],
                           outputs=[model1.output, model2_output],
                           name='my_custom_model')

    def compile(self, alpha=0, beta=0):
        MyCustomModel.beta_var = K.variable(beta)  # Create variable so it can be updated during training.
        # Two losses to optimize for: mse and binary_crossentropy.
        # Only update one of the weights, to show that we have that flexibility.
        self.model.compile(optimizer=RMSprop(lr=0.001),
                           loss=['mse', 'binary_crossentropy'],
                           loss_weights=[1.0, MyCustomModel.beta_var])

    def train(self, inputs, outputs, epochs, batch_size):
        callbacks = [LambdaCallback(on_epoch_end=MyCustomModel.on_epoch_end)]
        self.model.fit(inputs, outputs, epochs=epochs,
                       batch_size=batch_size,
                       callbacks=callbacks)

    @staticmethod
    def on_epoch_end(epoch, _):
        # Update loss weight parameters, as shown in https://github.com/keras-team/keras/issues/2595
        # Have the weight decrease geometrically, just for fun.
        K.set_value(MyCustomModel.beta_var, K.get_value(MyCustomModel.beta_var) * 0.95)
        print("Beta value", K.get_value(MyCustomModel.beta_var))
Does this make sense to you? As I point out in the comments, my static variable workaround is probably not the most elegant way of getting things working, but it at least works.
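For what it's worth, one way the static variable could probably be avoided (again just a sketch, untested; the class here is simplified to take an already-built model): keep beta_var on the instance and pass a bound method to LambdaCallback, so the callback carries self along.

from keras import backend as K
from keras.callbacks import LambdaCallback

class MyCustomModel:
    def __init__(self, model):
        self.model = model          # an already-built multi-output Keras Model
        self.beta_var = None

    def compile(self, beta=0.0):
        self.beta_var = K.variable(beta)   # instance attribute instead of class-level state
        self.model.compile(optimizer='rmsprop',
                           loss=['mse', 'binary_crossentropy'],
                           loss_weights=[1.0, self.beta_var])

    def train(self, inputs, outputs, epochs, batch_size):
        # A bound method closes over `self`, so no static state is needed.
        callbacks = [LambdaCallback(on_epoch_end=self.on_epoch_end)]
        self.model.fit(inputs, outputs, epochs=epochs,
                       batch_size=batch_size, callbacks=callbacks)

    def on_epoch_end(self, epoch, logs=None):
        # Same geometric decay as above, acting on the instance's variable.
        K.set_value(self.beta_var, K.get_value(self.beta_var) * 0.95)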
TypeError: ('Not JSON Serializable:')
This error comes while saving the model. Any updates on this?
https://github.com/keras-team/keras/issues/9444
I've come across the same error.
I have tried the trick from https://github.com/keras-team/keras/issues/9444#issuecomment-395260154.
Changing the loss_weights via the custom callback as described (@joelthchao and others) doesn't seem to work. The K.variable values are changed, but the total loss calculation doesn't take the new values into account. Consider this example, where the loss_weights were supposed to simply switch from the loss of one output (0, 1) to the loss of the other output (1, 0):
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
from keras.layers import *
from keras.callbacks import Callback
from keras.optimizers import Adam

input_1 = Input(shape=(1,))
hidden_0 = Dense(units=10, activation='tanh')(input_1)
dense_0 = Dense(1, activation='tanh')(hidden_0)
dense_1 = Dense(1, activation='tanh')(hidden_0)
model = keras.Model(inputs=input_1, outputs=[dense_0, dense_1])
model.summary()

alpha = K.variable(0)
beta = K.variable(1)
model.compile(optimizer=Adam(),
              loss=['mean_absolute_error', 'binary_crossentropy'],
              metrics=['accuracy'],
              loss_weights=[alpha, beta])

class CustomValidationLoss(Callback):
    def __init__(self, alpha, beta):
        self.alpha = alpha
        self.beta = beta

    def on_epoch_end(self, epoch, logs={}):
        print("loss_weights used:", K.get_value(self.alpha), K.get_value(self.beta))
        if epoch == 1:
            self.alpha = K.variable(1)
            self.beta = K.variable(0)

X = np.random.rand(100, 1)
Y1 = np.random.rand(100, 1)
Y2 = np.random.rand(100, 1)
model.fit(X, [Y1, Y2],
          epochs=4,
          verbose=2,
          callbacks=[CustomValidationLoss(alpha, beta)])
The specific numbers might change, but the total loss doesn't seem to use the updated loss_weights values in its calculation, even though they are printed fine. Example output:
- 1s - loss: 1.2862 - dense_2_loss: 0.4501 - dense_3_loss: 1.2862 - dense_2_acc: 0.0000e+00 - dense_3_acc: 0.0000e+00
loss_weights used: 0.0 1.0
Epoch 2/4
- 0s - loss: 1.1401 - dense_2_loss: 0.4611 - dense_3_loss: 1.1401 - dense_2_acc: 0.0000e+00 - dense_3_acc: 0.0000e+00
loss_weights used: 0.0 1.0
Epoch 3/4
- 0s - loss: 1.0499 - dense_2_loss: 0.4716 - dense_3_loss: 1.0499 - dense_2_acc: 0.0000e+00 - dense_3_acc: 0.0000e+00
loss_weights used: 1.0 0.0
Epoch 4/4
- 0s - loss: 0.9893 - dense_2_loss: 0.4814 - dense_3_loss: 0.9893 - dense_2_acc: 0.0000e+00 - dense_3_acc: 0.0000e+00
loss_weights used: 1.0 0.0
what am I missing?
I think that you shouldn't reassign the variables as new K.variable objects in:
if epoch == 1:
    self.alpha = K.variable(1)
    self.beta = K.variable(0)
Instead, just assign new values to the existing variables, like this:
if epoch == 1:
    K.set_value(self.alpha, 1)
    K.set_value(self.beta, 0)
That's just a tip, I haven't tested it, but it should work, since you are not creating a new K.variable, instead just assigning a new value to the existing one.
I did this and it was working.. Thanks @joelthchao & @xingdi-eric-yuan :)
class LossWeightAdjust(Callback):
    def __init__(self, alpha, beta, gamma, delta):
        self.alpha = alpha
        self.beta = beta
        self.gamma = gamma
        self.delta = delta

    # customize your behavior
    def on_epoch_end(self, epoch, logs):
        losses = np.array([v for k, v in logs.items()
                           if k in ['val_starts_0_loss', 'val_stops_0_loss',
                                    'val_starts_1_loss', 'val_stops_1_loss']],
                          dtype=np.float64)
        losses = (losses - 0.5 * losses.min()) / (losses.max() - 0.5 * losses.min())
        losses = losses / np.sum(losses)
        K.set_value(self.alpha, losses[0])
        K.set_value(self.beta, losses[1])
        K.set_value(self.gamma, losses[2])
        K.set_value(self.delta, losses[3])
        print("\n Loss weights recalibrated to alpha = %s, beta = %s, gamma = %s, delta = %s " %
              (np.round(losses[0], 2), np.round(losses[1], 2),
               np.round(losses[2], 2), np.round(losses[3], 2)))
        logger.info("Loss weights recalibrated to alpha = %s, beta = %s, gamma = %s, delta = %s " %
                    (K.get_value(self.alpha), K.get_value(self.beta),
                     K.get_value(self.gamma), K.get_value(self.delta)))
span_detection_model = build_model()
alpha = K.variable(0.25)
beta = K.variable(0.25)
gamma = K.variable(0.25)
delta = K.variable(0.75)
span_detection_model.compile(..., loss_weights={"starts_0": alpha, "stops_0": beta, "starts_1": gamma, "stops_1": delta})
This does not work for me. print(model.__dict__['compiled_loss'].__dict__['_user_loss_weights']) shows that the loss weights remain unchanged. (Below is the code from before, plus one import statement and the two print statements.) In another example I trained on X=[[1,0],[1,0],...] and y1=[[1],[1],...], y2=[[0],[0],...] with loss weights (0,1) in the beginning and loss weights (1,0) after a few epochs. There, the weights are almost constant after the loss weights change, even though y1 is extremely poorly fitted.
import numpy as np
import tensorflow as tf
K = tf.keras.backend
import keras
from keras.layers import *
from keras.callbacks import Callback

input_1 = Input(shape=(1,))
hidden_0 = Dense(units=10, activation='tanh')(input_1)
dense_0 = Dense(1, activation='tanh')(hidden_0)
dense_1 = Dense(1, activation='tanh')(hidden_0)
model = keras.Model(inputs=input_1, outputs=[dense_0, dense_1])
model.summary()

alpha = K.variable(0)
beta = K.variable(1)
model.compile(optimizer='adam',
              loss=['mean_absolute_error', 'binary_crossentropy'],
              metrics=['accuracy'],
              loss_weights=[alpha, beta])

class CustomValidationLoss(Callback):
    def __init__(self, alpha, beta):
        self.alpha = alpha
        self.beta = beta

    def on_epoch_end(self, epoch, logs={}):
        print("loss_weights used:", K.get_value(self.alpha), K.get_value(self.beta))
        if epoch == 10:
            self.alpha = K.variable(1)
            self.beta = K.variable(0)

X = np.random.rand(100, 1)
Y1 = np.random.rand(100, 1)
Y2 = np.random.rand(100, 1)

print(model.__dict__['compiled_loss'].__dict__['_user_loss_weights'])
model.fit(X, [Y1, Y2],
          epochs=20,
          verbose=2,
          callbacks=[CustomValidationLoss(alpha, beta)])
print(model.__dict__['compiled_loss'].__dict__['_user_loss_weights'])
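If newer tf.keras really does capture the loss weights at compile time (which is what the unchanged _user_loss_weights suggests), one workaround is to skip loss_weights entirely and fold the weights into the loss functions as non-trainable tf.Variables that are read at every training step. This is only a sketch under that assumption, untested across versions; the weighted_mae / weighted_bce / SwitchLossWeights names are made up for illustration.

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.callbacks import Callback
from tensorflow.keras.losses import mean_absolute_error, binary_crossentropy

# Non-trainable variables that the loss functions read at every training step.
alpha = tf.Variable(0.0, trainable=False, dtype=tf.float32)
beta = tf.Variable(1.0, trainable=False, dtype=tf.float32)

def weighted_mae(y_true, y_pred):
    return alpha * mean_absolute_error(y_true, y_pred)

def weighted_bce(y_true, y_pred):
    return beta * binary_crossentropy(y_true, y_pred)

class SwitchLossWeights(Callback):
    def on_epoch_end(self, epoch, logs=None):
        if epoch == 10:
            alpha.assign(1.0)  # update in place; no recompile needed
            beta.assign(0.0)
        print("loss weights now:", float(alpha.numpy()), float(beta.numpy()))

input_1 = Input(shape=(1,))
hidden_0 = Dense(10, activation='tanh')(input_1)
dense_0 = Dense(1, activation='tanh')(hidden_0)
dense_1 = Dense(1, activation='sigmoid')(hidden_0)
model = keras.Model(inputs=input_1, outputs=[dense_0, dense_1])

model.compile(optimizer='adam', loss=[weighted_mae, weighted_bce])

X = np.random.rand(100, 1)
Y1 = np.random.rand(100, 1)
Y2 = np.random.randint(0, 2, size=(100, 1)).astype('float32')
model.fit(X, [Y1, Y2], epochs=20, verbose=2, callbacks=[SwitchLossWeights()])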