In some other frameworks, learning rate decay is always applied once per epoch, with a formula like this:
lr = lr_0 * 1 / (1 + decay_rate * epoch_num)
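For illustration, a minimal sketch of that kind of schedule in plain Python (lr_0 and decay_rate are just example names and values, not Keras arguments):

    def epoch_decay(epoch_num, lr_0=0.01, decay_rate=0.1):
        # The rate only changes when epoch_num changes, i.e. once per epoch.
        return lr_0 * 1.0 / (1.0 + decay_rate * epoch_num)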
But looking at Keras's get_updates, the updates variable seems to be evaluated on every batch of the fit loop. Does that mean Keras applies the learning rate decay on every batch?
On every epoch.
But you can create your own callback to do it on every batch.
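For instance, a minimal sketch of such a callback (PerBatchDecay and its arguments are made-up names for illustration, not a Keras API):

    from keras import backend as K
    from keras.callbacks import Callback


    class PerBatchDecay(Callback):
        # Decays the optimizer's learning rate before every batch.
        def __init__(self, initial_lr=0.01, decay_rate=1e-4):
            super(PerBatchDecay, self).__init__()
            self.initial_lr = initial_lr
            self.decay_rate = decay_rate
            self.batch_count = 0

        def on_batch_begin(self, batch, logs=None):
            # `batch` resets to 0 each epoch, so keep a global counter instead.
            lr = self.initial_lr / (1.0 + self.decay_rate * self.batch_count)
            K.set_value(self.model.optimizer.lr, lr)
            self.batch_count += 1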
This says it updates per batch.
Thanks @gattia. It seems @joelthchao confirmed that the learning rate is updated every batch. I'm just curious why Keras implements learning rate decay this way.
I can't say I have the real answer as to why, but I would assume it is to be more flexible. By allowing updates per batch, it is easy to make a calculation that produces a given effect per epoch. However, if the rate were only updated per epoch, there would be no way to change it before an epoch finishes. If someone is using lots of data, epochs might be very slow to update.
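For example (a rough illustration with made-up numbers, not Keras parameters), a per-batch decay can be chosen so that it matches a per-epoch formula at every epoch boundary:

    lr_0 = 0.01
    decay_per_epoch = 0.1
    steps_per_epoch = 500  # number of batches in one epoch

    # Scale the decay down by the number of batches per epoch.
    decay_per_batch = decay_per_epoch / steps_per_epoch

    for epoch in range(3):
        iterations = epoch * steps_per_epoch  # batches seen at the start of this epoch
        lr = lr_0 / (1.0 + decay_per_batch * iterations)
        # Equals lr_0 / (1 + decay_per_epoch * epoch) at each epoch boundary.
        print(epoch, lr)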
It is written in on_epoch_begin, so the lr is updated once per epoch:

    import numpy as np
    from keras import backend as K
    from keras.callbacks import Callback


    class LearningRateScheduler(Callback):
        """Learning rate scheduler.

        schedule: a function that takes an epoch index as input
            (integer, indexed from 0) and current learning rate
            and returns a new learning rate as output (float).
        verbose: int. 0: quiet, 1: update messages.
        """

        def __init__(self, schedule, verbose=0):
            super(LearningRateScheduler, self).__init__()
            self.schedule = schedule
            self.verbose = verbose

        def on_epoch_begin(self, epoch, logs=None):
            if not hasattr(self.model.optimizer, 'lr'):
                raise ValueError('Optimizer must have a "lr" attribute.')
            lr = float(K.get_value(self.model.optimizer.lr))
            try:  # new API
                lr = self.schedule(epoch, lr=lr)
            except TypeError:  # old API for backward compatibility
                lr = self.schedule(epoch)
            if not isinstance(lr, (float, np.float32, np.float64)):
                raise ValueError('The output of the "schedule" function '
                                 'should be float.')
            K.set_value(self.model.optimizer.lr, lr)
            if self.verbose > 0:
                print('\nEpoch %05d: LearningRateScheduler reducing learning '
                      'rate to %s.' % (epoch + 1, lr))
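For example, the epoch-based formula from the original question can be plugged in as a schedule (a sketch; initial_lr, decay_rate, model, x_train, and y_train are assumed example names, not from the thread):

    initial_lr = 0.01
    decay_rate = 0.1

    def schedule(epoch, lr):
        # Recompute from the initial rate so the result depends only on the epoch index.
        return initial_lr * 1.0 / (1.0 + decay_rate * epoch)

    model.fit(x_train, y_train,
              epochs=20,
              callbacks=[LearningRateScheduler(schedule, verbose=1)])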
You are correct that LearningRateScheduler works that way. But the question was not about LearningRateScheduler; it was about the decay parameter that is part of essentially all of the Keras optimizers.
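Roughly, that decay argument is applied inside the optimizer's get_updates, along these lines (a paraphrase of the idea, not the exact Keras source):

    def effective_lr(base_lr, decay, iterations):
        # `iterations` counts processed batches, so with a non-zero `decay`
        # the effective learning rate shrinks a little on every batch.
        return base_lr * 1.0 / (1.0 + decay * iterations)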
@tchaton your understanding is wrong. LearningRateScheduler is just a callback: the model only adjusts the learning rate this way when you add a LearningRateScheduler instance to callbacks.
What I meant is that you can implement a decay per epoch if you want to.