I'm new to Keras. I want to monitor the learning rate while training an LSTM with the SGD optimizer. I have set the decay parameter and training itself seems to work, but when I use model.optimizer.lr.get_value()
to read the learning rate, it doesn't change at all. My setup is as follows:
lr_init = 0.12; decay_init = 1e-2; batch_size_init = 30
momentum_init = 0.9; np_epoch_init = 1
sgd = SGD(lr=lr_init, decay=decay_init, momentum=momentum_init, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
early_stopping = EarlyStopping(patience=2, verbose=1)
lr_record = []
for iteration in range(1, 60):
    print('Iteration', iteration)
    hist = model.fit(X, y, callbacks=[early_stopping], validation_data=(X_val, y_val),
                     batch_size=batch_size_init, nb_epoch=np_epoch_init)
    lr_temp = model.optimizer.lr.get_value()
    lr_record.append(lr_temp)
After running it, the learning rate recorded in lr_record does not change at all. I'm wondering if someone could help me. Thanks in advance!
The base learning rate (lr) is never changed by SGD; the effective learning rate is recomputed from it at every iteration. Ref: code.
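For instance, with your settings (lr=0.12, decay=1e-2), you can estimate the effective rate at a given iteration by hand. This is just a sketch that re-applies the time-based decay formula lr_t = lr / (1 + decay * iterations) used by SGD:

# Sketch: effective SGD learning rate under time-based decay
lr_init = 0.12
decay_init = 1e-2
for iterations in (0, 100, 1000, 10000):
    lr_t = lr_init * (1. / (1. + decay_init * iterations))
    print('iteration %5d -> effective lr %.6f' % (iterations, lr_t))

Note that model.optimizer.lr.get_value() only ever returns the base value of 0.12, which is why lr_record never changes.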
Thank you, but with the decay parameter (I think Keras uses 1/t decay), how can I monitor the learning rate at each epoch?
Use a callback to access model.optimizer:
class SGDLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs={}):
        optimizer = self.model.optimizer
        lr = K.eval(optimizer.lr * (1. / (1. + optimizer.decay * optimizer.iterations)))
        print('\nLR: {:.6f}\n'.format(lr))
# define your model
sgd = SGD(lr=0.01, decay=0.9)
model.compile(loss='mse', optimizer=sgd)
model.fit(x, y, callbacks=[SGDLearningRateTracker()])
Notice: this only works for SGD. If you want to do the same thing with another optimizer, you need to understand how that optimizer derives its learning rate and modify the lr calculation accordingly.
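For example, an adaptation for Adam could look roughly like the sketch below. It assumes Adam's bias-corrected step size lr_t = lr * sqrt(1 - beta_2^t) / (1 - beta_1^t); please check the optimizer source for your Keras version before relying on it:

class AdamLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs={}):
        optimizer = self.model.optimizer
        # Bias-corrected step size used by Adam, recomputed from the optimizer's state
        t = K.cast(optimizer.iterations, K.floatx()) + 1
        lr_t = optimizer.lr * (K.sqrt(1. - K.pow(optimizer.beta_2, t)) /
                               (1. - K.pow(optimizer.beta_1, t)))
        print('\nLR: {:.6f}\n'.format(K.eval(lr_t)))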
Thanks very much!
@joelthchao I tried your Callback class and it works for SGD. However, when I set an RMSprop optimizer, I get the error:
AttributeError: 'RMSprop' object has no attribute 'iterations'
keras.__version__
'0.3.2'
I can see on https://github.com/fchollet/keras/blob/master/keras/optimizers.py that the RMSprop class does have an iterations attribute. Is my version too old?
@joelthchao, thank you for your suggested solution. However, you didn't include your imports. What is the "K" in "K.eval"? Importing keras alone does not work.
import keras.backend as K
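For completeness, a minimal set of imports that the tracker above assumes (module paths as in Keras 1.x/2.x; older versions may differ):

import keras.backend as K
from keras.callbacks import Callback
from keras.optimizers import SGD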
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
And how does one deal with less trivial schedulers such as ReduceLROnPlateau? Is there a way to access the actual learning rate instead of recalculating it?
Judging from the code of ReduceLROnPlateau, lr = float(K.get_value(self.model.optimizer.lr)) should return it, so no arithmetic with optimizer.decay * optimizer.iterations etc. is needed. What is the rationale behind using those, @joelthchao?
@DSLituiev
To clarify, there are two different kinds of "lr":
1. The base learning rate (base_lr): this is optimizer.lr, which does not change on its own during training.
2. The real learning rate: each optimizer uses base_lr to calculate it (e.g. SGD: optimizer.lr * (1. / (1. + optimizer.decay * optimizer.iterations))).
In your case, ReduceLROnPlateau directly changes base_lr; therefore, you still need to calculate the real lr depending on which optimizer you use.
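Putting the two together, a sketch of a callback that logs both values at once might look like this (the class name LRLogger is just illustrative, and the effective-lr formula assumes SGD with time-based decay):

class LRLogger(Callback):
    def on_epoch_end(self, epoch, logs={}):
        optimizer = self.model.optimizer
        # base_lr: what ReduceLROnPlateau manipulates directly
        base_lr = float(K.get_value(optimizer.lr))
        # effective lr: base_lr after SGD's time-based decay is applied
        effective_lr = K.eval(optimizer.lr * (1. / (1. + optimizer.decay * optimizer.iterations)))
        print('\nbase lr: {:.6f}, effective lr: {:.6f}\n'.format(base_lr, effective_lr))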
@joelthchao When I use this code, I get an error: Incompatible type conversion requested to type 'float32' for variable of type 'int64_ref'. I am using Keras with the TensorFlow backend. I then converted the optimizer's values with tensorflow.to_float(x, name='ToFloat'), and that seems to work:
_lr = tf.to_float(optimizer.lr, name='ToFloat')
_decay = tf.to_float(optimizer.decay, name='ToFloat')
_iter = tf.to_float(optimizer.iterations, name='ToFloat')
lr = K.eval(_lr * (1. / (1. + _decay * _iter)))
print(' - LR: {:.6f}\n'.format(lr))
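A backend-agnostic alternative, if I'm not mistaken, is to cast through the Keras backend instead of calling TensorFlow directly, which mirrors the cast that newer Keras versions perform inside the optimizer:

# Same calculation, but casting the integer iteration counter via the Keras backend
lr = K.eval(optimizer.lr * (1. / (1. + optimizer.decay *
                                  K.cast(optimizer.iterations, K.dtype(optimizer.decay)))))
print(' - LR: {:.6f}\n'.format(lr))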
How does one feed the current lr into TensorBoard? A custom callback?
@vspinu That's what I did here, see my PR to Keras
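One pattern that should work without touching Keras internals is to add the value to the logs dict before the TensorBoard callback runs, so it gets written out like any other scalar. The class name below is just illustrative; make sure it comes before TensorBoard in the callbacks list:

class LRToTensorBoard(Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # Expose the (base) learning rate so the TensorBoard callback logs it as a scalar
        logs['lr'] = K.eval(self.model.optimizer.lr)

# e.g. model.fit(x, y, callbacks=[LRToTensorBoard(), TensorBoard(log_dir='./logs')])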
Closing this issue since it's resolved. Thanks!
train_lr_callback = LambdaCallback(
    on_epoch_begin=lambda epoch, logs: print("LearningRate of %e" % K.eval(model.optimizer.lr)))
model.fit_generator(train_loader,
                    steps_per_epoch=totalTrainImgsNum // batch_size,
                    epochs=total_epoches,
                    initial_epoch=22,
                    validation_data=test_loader,
                    validation_steps=35080 // batch_size,
                    callbacks=[checkpoint, changelr, tensorboard, train_lr_callback])