Keras: Changing Learning Rate & Momentum During Training?

Created on 24 Oct 2015 · 9 comments · Source: keras-team/keras

I saw that @EderSantana (my hero) was working on changing the learning rate during training. I believe the relevant change was made here: https://github.com/fchollet/keras/issues/536

My question: I normally train my model for one epoch, make some predictions, and then train the next epoch. Can I simply modify the learning rate after, say, epoch 5?

For example:

for iteration in range(100):
    if iteration > 5:
        # optimizer is the SGD instance passed to model.compile;
        # lr and momentum are Theano shared variables, so set_value updates them in place
        optimizer.lr.set_value(0.02)
        optimizer.momentum.set_value(0.04)
    model.fit(x_embed, y, nb_epoch=1, batch_size=batch_size, show_accuracy=True)
    preds = model.predict(x_embed_sample, verbose=0, batch_size=batch_size)[0]

It's kind of hard to test whether this actually works or not.

All 9 comments

Your solution looks all right. You can always test it right before calling fit with:

print(model.lr.get_value())

"Shared variable" in Theano means that it will always use the latest value. I guess you were worried that with a compiled model the lr would be frozen somewhere. It is not: if you change it, the compiled model will use the new value.

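Concretely, here is a tiny standalone Theano sketch (nothing Keras-specific) of that behavior:

import numpy as np
import theano
import theano.tensor as T

lr = theano.shared(np.float32(0.1))   # a Theano shared variable
x = T.fscalar('x')
step = theano.function([x], x * lr)   # compiled once, with lr referenced as a shared variable

print(step(1.0))                      # 0.1
lr.set_value(np.float32(0.02))
print(step(1.0))                      # 0.02 -- the compiled function sees the new value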
You can also do it with a callback, so you don't need to intervene or halt fit to make the change. That very same PR added a LearningRateScheduler. For your problem you can use it like this:

def scheduler(epoch):
    if epoch == 5:
        model.lr.set_value(.02)
    return model.lr.get_value()

change_lr = LearningRateScheduler(scheduler)

# one long fit; the callback changes the learning rate at epoch 5
model.fit(x_embed, y, nb_epoch=100, batch_size=batch_size, show_accuracy=True,
          callbacks=[change_lr])

PS: Thanks for the kind words.

"Shared variable" in Theano means that it will always use the latest value. I guess you were worried that with a compiled model the lr would be frozen somewhere.

Yes! This was exactly my concern.

Thank you for such an in-depth reply. Really, dude, sometime I wouldn't mind buying you a few lattes -- you've made some really useful commits and helpful explanations!

@EderSantana I tried your code but it gives me this error:
AssertionError: The output of the "schedule" function should be float.

Hi guys,

I was wondering how to change the learning rate according to the validation loss. I have read in many papers that the learning rate is reduced by a factor of two when the validation loss does not improve beyond a certain threshold, and that training stops when the validation loss converges. How do I incorporate the validation loss into my learning rate schedule function?

Cheers
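Newer Keras versions cover exactly this pattern with the built-in ReduceLROnPlateau and EarlyStopping callbacks. A minimal sketch, assuming a Keras version recent enough to ship both; the factor matches "reduce by two", while the patience and min_delta values here are placeholders to tune:

from keras.callbacks import ReduceLROnPlateau, EarlyStopping

# halve the learning rate when val_loss has not improved by at least
# min_delta for `patience` consecutive epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                              patience=3, min_delta=1e-4)
# stop training once val_loss has stopped improving for 10 epochs
early_stop = EarlyStopping(monitor='val_loss', patience=10)

model.fit(x_embed, y, epochs=100, batch_size=batch_size,
          validation_split=0.1, callbacks=[reduce_lr, early_stop])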

@sunshineatnoon The schedule function has to return a Python float, for example:

def step_decay(epoch):
    initial_lrate = float(0.1)
    if epoch > 14:
        return float(0.0007)
    else:
        return initial_lrate
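It plugs into fit via LearningRateScheduler exactly as in the earlier example:

change_lr = LearningRateScheduler(step_decay)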

This thread is still at the top of Google despite being outdated.
Here is the new solution from #5724

from keras import backend as K

def scheduler(epoch):
    if epoch == 5:
        K.set_value(model.optimizer.lr, 0.02)
    return K.get_value(model.optimizer.lr)

Hi @Demetrio92, does that code work for you? I am getting an exception saying model is not defined:

File "train_model.py", line 241, in scheduler
    K.set_value(model.optimizer.lr, (K.get_value(model.optimizer.lr) / lr_rate))
NameError: name 'model' is not defined

@zaher88abd Look at the error you're getting; the problem is obviously not in the code you're quoting.

You should use the above code inside a Keras callback: https://keras.io/callbacks/
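Roughly, a minimal sketch of the full wiring, assuming the model is built and compiled before the scheduler is defined so the function can see it (make_model is a hypothetical helper standing in for your own model-building code):

from keras import backend as K
from keras.callbacks import LearningRateScheduler

model = make_model()  # hypothetical: build and compile your Keras model first

def scheduler(epoch):
    if epoch == 5:
        K.set_value(model.optimizer.lr, 0.02)
    return float(K.get_value(model.optimizer.lr))  # the schedule must return a plain float

model.fit(x_embed, y, epochs=100, batch_size=batch_size,
          callbacks=[LearningRateScheduler(scheduler)])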

Updated simpler solution here: https://github.com/keras-team/keras/issues/5724#issuecomment-614590419
