I'm confused about the decay parameter in Keras's SGD() optimizer. How does it work concretely? For example, if I use a learning rate of lr=1.0 and set decay=1e-6, what exactly happens? Can anyone help me? Thanks in advance!
The learning rate (lr) is updated according to the following code:

```python
def get_updates(self, params, constraints, loss):
    grads = self.get_gradients(loss, params)
    # time-based decay: scale the base lr by the number of updates so far
    lr = self.lr * (1. / (1. + self.decay * self.iterations))
    # increment the iteration counter as part of the update step
    self.updates = [(self.iterations, self.iterations + 1.)]
```

Thank you, but I'm confused about self.iterations. Does it count a batch update or an epoch? For example, if I train an RNN on 20000 training samples with batch_size=20, nb_epoch=3, an initial learning rate of 0.1, and decay=1e-6, how will the decay be applied?
A batch update.
A batch update, meaning it decays the lr every batch (i.e., the decay is applied 1000 times in a single epoch)?
Yes, I believe so, since get_updates is called on each batch.
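To make this concrete with the numbers above, here is a minimal sketch (plain Python, independent of Keras) assuming the decay is applied once per batch exactly as in the get_updates snippet. The 1000 batches per epoch come from 20000 samples / batch_size=20:

```python
# Sketch of Keras-style time-based decay, applied once per batch update.
# Numbers follow the example above: 20000 samples, batch_size=20,
# nb_epoch=3, initial lr=0.1, decay=1e-6.
initial_lr = 0.1
decay = 1e-6
batches_per_epoch = 20000 // 20   # 1000 updates per epoch
nb_epoch = 3

iterations = 0
for epoch in range(nb_epoch):
    for batch in range(batches_per_epoch):
        # same formula as in get_updates above
        lr = initial_lr * (1. / (1. + decay * iterations))
        iterations += 1
    print("epoch %d ends with lr = %.10f" % (epoch + 1, lr))
```

With a decay this small, the learning rate only drops from 0.1 to roughly 0.0997 over the 3000 updates, which is why a tiny decay value has little visible effect over just a few epochs.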