Keras: Fix Adam Optimizer to Implement Paper Correctly

Created on 5 Jul 2018  ·  4 Comments  ·  Source: keras-team/keras

It was recently demonstrated that Adam is implemented incorrectly in several packages, including Keras. I propose to fix the optimizer using the method described here:

https://arxiv.org/pdf/1711.05101.pdf
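
For reference, the paper's proposed fix (AdamW, Algorithm 2) decouples the weight decay term from the gradient-based update instead of folding it into the gradient as an L2 penalty. A minimal NumPy sketch of a single step, assuming a fixed schedule multiplier of 1 (the function and names are illustrative, not Keras API):

```python
import numpy as np

def adamw_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One decoupled-weight-decay (AdamW) update; t starts at 1."""
    m = beta1 * m + (1 - beta1) * g        # first moment estimate
    v = beta2 * v + (1 - beta2) * g * g    # second moment estimate
    m_hat = m / (1 - beta1 ** t)           # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Decoupled decay: weight_decay multiplies w directly, so it is
    # NOT rescaled by the adaptive denominator sqrt(v_hat) + eps.
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v
```

With L2 regularization, the term weight_decay * w would instead be added to g before the moment estimates, and so would get rescaled by the adaptive denominator; that is exactly the inequivalence the paper describes.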

All 4 comments

  1. Keras doesn't call it weight decay; it is explicitly called L2 regularization (see https://keras.io/regularizers/), which is indeed accurate (see the sketch after this list).

From the paper: "while common implementations of these algorithms employ L2 regularization (often calling it 'weight decay' in what may be misleading due to the inequivalence we expose)."

Keras calls it L2 regularization.

  2. https://arxiv.org/pdf/1705.08292.pdf
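
On point 1, a minimal sketch of the documented keras.regularizers usage (assuming a standard Dense layer; the values are illustrative):

```python
from keras import regularizers
from keras.layers import Dense

# The documented way to add an L2 penalty in Keras: the gradient of
# 0.01 * sum(w ** 2) is added to the loss gradient, so any adaptive
# rescaling in the optimizer (e.g. Adam) applies to the penalty too.
layer = Dense(64, activation='relu',
              kernel_regularizer=regularizers.l2(0.01))
```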

I don't think that's right.

https://keras.io/optimizers/#adam

  • decay: float >= 0. Learning rate decay over each update.
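
For context, that `decay` argument rescales the learning rate as a function of the update count. Roughly what the Keras 2.x optimizers do internally, sketched in plain Python (the helper name is mine):

```python
def decayed_lr(lr, decay, iterations):
    # lr shrinks as training progresses; the weights themselves are
    # never directly shrunk, which is why this is not weight decay.
    return lr * (1.0 / (1.0 + decay * iterations))

print(decayed_lr(0.001, 1e-4, 0))      # 0.001 at the first update
print(decayed_lr(0.001, 1e-4, 10000))  # 0.0005 after 10k updates
```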

Regarding 2, I'm not sure what this has to do with the conversation. Did you mean to say that Adam doesn't generalize as well as SGD? That's beside the point.


Learning rate decay != weight decay.

They just aren't the same thing.
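
To spell out the distinction: for plain SGD, an L2 penalty and direct weight decay produce identical updates (when scaled consistently), but for Adam they do not, because the L2 term passes through the adaptive rescaling. A small sketch with illustrative values:

```python
import numpy as np

lr, lam = 0.1, 0.01
w = np.array([1.0])
g = np.array([0.2])  # gradient of the unregularized loss

# L2 regularization: the penalty's gradient lam * w is folded into g.
w_l2 = w - lr * (g + lam * w)

# Decoupled weight decay: shrink w directly, independent of g.
w_wd = (1 - lr * lam) * w - lr * g

print(np.allclose(w_l2, w_wd))  # True for plain SGD...
# ...but in Adam, g + lam * w is divided by sqrt(v_hat) + eps, so the
# decay term gets rescaled per-parameter and the two updates diverge.
```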

Is the optimizer fixed now?
