Keras: How to get the learning rate during training?

Created on 26 May 2016 · 16 comments · Source: keras-team/keras

I'm new to Keras. I want to read the learning rate while training an LSTM with the SGD optimizer. I have set the decay parameter and it seems to work, but when I use model.optimizer.lr.get_value() to read the learning rate, it doesn't change at all. My setup is as follows:

from keras.optimizers import SGD
from keras.callbacks import EarlyStopping

# model, X, y, X_val, y_val are defined elsewhere
lr_init = 0.12; decay_init = 1e-2; batch_size_init = 30
momentum_init = 0.9; np_epoch_init = 1
sgd = SGD(lr=lr_init, decay=decay_init, momentum=momentum_init, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
early_stopping = EarlyStopping(patience=2, verbose=1)
lr_record = []
for iteration in range(1, 60):
    print('Iteration', iteration)
    hist = model.fit(X, y, callbacks=[early_stopping], validation_data=(X_val, y_val),
                     batch_size=batch_size_init, nb_epoch=np_epoch_init)
    lr_temp = model.optimizer.lr.get_value()
    lr_record.append(lr_temp)

After running it, the learning rate recorded in lr_record did not change at all. I'm wondering if someone could help me. Thanks in advance!

tensorflow


All 16 comments

The base learning rate (lr) is unchanged by SGD; the effective learning rate is computed from the iteration count. Ref: code.
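As a quick illustration (just the time-based decay formula with the numbers from the question, not code from the Keras source): the value applied per update shrinks with the iteration counter while model.optimizer.lr stays fixed at 0.12.

# Time-based decay: the stored base lr never changes; the applied lr does.
lr_init, decay = 0.12, 1e-2

def effective_lr(iterations):
    return lr_init * (1. / (1. + decay * iterations))

print(effective_lr(0))     # 0.12
print(effective_lr(100))   # 0.06
print(effective_lr(1000))  # ~0.0109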

Thank you. But with the decay parameter (I think Keras uses 1/t decay), how can I monitor the learning rate at each epoch?

Use a callback to access model.optimizer.

import keras.backend as K
from keras.callbacks import Callback
from keras.optimizers import SGD

class SGDLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs={}):
        optimizer = self.model.optimizer
        # effective lr for SGD with time-based decay
        lr = K.eval(optimizer.lr * (1. / (1. + optimizer.decay * optimizer.iterations)))
        print('\nLR: {:.6f}\n'.format(lr))

# define your model

sgd = SGD(lr=0.01, decay=0.9)
model.compile(loss='mse', optimizer=sgd)
model.fit(x, y, callbacks=[SGDLearningRateTracker()])

Note: this only works for SGD. If you want to do the same thing with another optimizer, first understand what the learning rate means for that optimizer and modify the lr calculation accordingly.
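For instance, here is a rough sketch of an analogous tracker for Adam, assuming the bias-corrected step size used in Keras' Adam implementation; verify the exact formula against optimizers.py in your Keras version before relying on it.

import numpy as np
import keras.backend as K
from keras.callbacks import Callback

class AdamLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs=None):
        opt = self.model.optimizer
        lr = float(K.get_value(opt.lr))
        decay = float(K.get_value(opt.decay))
        iterations = float(K.get_value(opt.iterations))
        beta_1 = float(K.get_value(opt.beta_1))
        beta_2 = float(K.get_value(opt.beta_2))
        # optional time-based decay applied to the base lr
        lr = lr * (1. / (1. + decay * iterations))
        # bias-corrected step size; Adam uses t = iterations + 1
        t = iterations + 1.
        lr_t = lr * np.sqrt(1. - beta_2 ** t) / (1. - beta_1 ** t)
        print('\nAdam effective step size: {:.6f}\n'.format(lr_t))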

Thanks very much!

@joelthchao I tried your Callback class and it runs for SGD. However, when I set an RMSprop optimizer, I get the error:

AttributeError: 'RMSprop' object has no attribute 'iterations'
keras.__version__
'0.3.2'

I can see on https://github.com/fchollet/keras/blob/master/keras/optimizers.py that the RMSprop class does have an iterations attribute. Is my version too old?

@joelthchao, thank you for your suggested solution. However, you didn't include your imports. What is the "K" in "K.eval"? Importing keras alone does not work.

import keras.backend as K

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

And how does one deal with less trivial schedulers such as ReduceLROnPlateau? Is there a way to access the actual learning rate instead of recalculating it?

Judging from the code of ReduceLROnPlateau, lr = float(K.get_value(self.model.optimizer.lr)) should return it, so no arithmetic with optimizer.decay * optimizer.iterations etc. is needed. What is the rationale behind using those, @joelthchao?

@DSLituiev
To clarify, there are two different kinds of "lr":

  1. base_lr: optimizer.lr.
  2. the real lr applied to the gradients: the optimizer uses its algorithm and base_lr to calculate it (e.g. SGD: optimizer.lr * (1. / (1. + optimizer.decay * optimizer.iterations))).

In your case, ReduceLROnPlateau directly changes base_lr, so you still need to calculate the real lr depending on which optimizer you use. A minimal sketch of the difference is shown below.
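A minimal sketch of the two quantities for SGD, assuming a model compiled with SGD and a Keras version where lr, decay and iterations are backend variables:

import keras.backend as K

optimizer = model.optimizer

# 1. base_lr: the value ReduceLROnPlateau reads and overwrites
base_lr = float(K.get_value(optimizer.lr))

# 2. real lr applied to the gradients (SGD with time-based decay)
iterations = float(K.get_value(optimizer.iterations))
decay = float(K.get_value(optimizer.decay))
real_lr = base_lr * (1. / (1. + decay * iterations))

print('base lr: %.6f, real lr: %.6f' % (base_lr, real_lr))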

@joelthchao When I use this code, I get an error: Incompatible type conversion requested to type 'float32' for variable of type 'int64_ref'. I am using Keras with the TensorFlow backend. I then converted the values from the optimizer using tensorflow.to_float(x, name='ToFloat'), and it seems to work.

import tensorflow as tf
import keras.backend as K

optimizer = model.optimizer  # or self.model.optimizer inside a callback

# cast the optimizer variables to float32 before combining them
_lr = tf.to_float(optimizer.lr, name='ToFloat')
_decay = tf.to_float(optimizer.decay, name='ToFloat')
_iter = tf.to_float(optimizer.iterations, name='ToFloat')

lr = K.eval(_lr * (1. / (1. + _decay * _iter)))
print(' - LR: {:.6f}\n'.format(lr))

How does one feed the current lr into TensorBoard? A custom callback?

@vspinu That's what I did here, see my PR to Keras
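One way that does not require a PR is to copy the learning rate into the logs dict from a small callback, so the stock TensorBoard callback records it as a scalar. A minimal sketch (the LRToLogs name is made up here, and the lr-copying callback must be listed before TensorBoard so the value is present when TensorBoard writes its summaries):

import keras.backend as K
from keras.callbacks import Callback, TensorBoard

class LRToLogs(Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs if logs is not None else {}
        # copy the current base lr into logs so TensorBoard writes it as a scalar
        logs['lr'] = float(K.get_value(self.model.optimizer.lr))

tensorboard = TensorBoard(log_dir='./logs')
model.fit(x, y, callbacks=[LRToLogs(), tensorboard])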

Closing this issue since it's resolved. Thanks!

Using LambdaCallback and on_epoch_begin:

train_lr_callback = LambdaCallback(
    on_epoch_begin=lambda epoch, logs: print("LearningRate of %e" % K.eval(model.optimizer.lr)))

model.fit_generator(train_loader,
    steps_per_epoch = totalTrainImgsNum // batch_size,
    epochs = total_epoches,
    initial_epoch = 22,
    validation_data = test_loader,
    validation_steps = 35080 // batch_size,
    callbacks = [checkpoint, changelr, tensorboard, train_lr_callback])
