Keras: Gradient Noise

Created on 11 Apr 2016 · 11 Comments · Source: keras-team/keras

Add gradient noise to optimizers based on this paper: http://arxiv.org/abs/1511.06807
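For reference, the technique in the paper adds annealed Gaussian noise to each gradient before the update, with variance decaying over training step t:

g_t \leftarrow g_t + N(0, \sigma_t^2), \qquad \sigma_t^2 = \frac{\eta}{(1 + t)^{\gamma}}

where \eta and \gamma are hyperparameters (Figure 1 of the paper).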



All 11 comments

How about GaussianNoise (see https://keras.io/layers/noise/)? Have you tried that? It seems to me that you could add a noise layer instead of changing the optimizer to achieve the effect you desire.
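For reference, a minimal sketch of that layer-based approach (note it perturbs activations during training, not gradients, so it is a different technique than the one proposed here; the model shape is made up for illustration):

from keras.models import Sequential
from keras.layers import Dense, GaussianNoise

model = Sequential()
model.add(Dense(64, input_dim=100, activation='relu'))
model.add(GaussianNoise(0.1))  # zero-mean gaussian noise with stddev 0.1, active only during training
model.add(Dense(10, activation='softmax'))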

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

Update: I improved this code and converted it to a package: https://github.com/cpury/keras_gradient_noise

@jnphilipp I was also curious to try it out. Here's my implementation. I just tried it and it seems to improve training as expected:

import keras
from keras import backend as K


class NoisySGD(keras.optimizers.SGD):
    def __init__(self, noise_eta=0.01, noise_gamma=0.55, **kwargs):
        super(NoisySGD, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.noise_eta = K.variable(noise_eta, name='noise_eta')
            self.noise_gamma = K.variable(noise_gamma, name='noise_gamma')

    def get_gradients(self, loss, params):
        grads = super(NoisySGD, self).get_gradients(loss, params)

        # Add decayed gaussian noise
        t = K.cast(self.iterations, K.dtype(grads[0]))
        variance = self.noise_eta / ((1 + t) ** self.noise_gamma)

        grads = [
            grad + K.random_normal(
                grad.shape,
                mean=0.0,
                stddev=K.sqrt(variance),
                dtype=K.dtype(grads[0])
            )
            for grad in grads
        ]

        return grads

    def get_config(self):
        config = {'noise_eta': float(K.get_value(self.noise_eta)),
                  'noise_gamma': float(K.get_value(self.noise_gamma))}
        base_config = super(NoisySGD, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Put this somewhere and use it as your optimizer. If you want to combine it with another optimizer, e.g. Adam, just subclass that optimizer instead and add the gradient-noise part to its get_gradients(), as in the sketch below.
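For example, a minimal sketch of the same pattern applied to Adam, mirroring the NoisySGD code above (untested; get_config omitted for brevity):

import keras
from keras import backend as K


class NoisyAdam(keras.optimizers.Adam):
    def __init__(self, noise_eta=0.01, noise_gamma=0.55, **kwargs):
        super(NoisyAdam, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.noise_eta = K.variable(noise_eta, name='noise_eta')
            self.noise_gamma = K.variable(noise_gamma, name='noise_gamma')

    def get_gradients(self, loss, params):
        grads = super(NoisyAdam, self).get_gradients(loss, params)
        # Same decayed gaussian noise as in NoisySGD
        t = K.cast(self.iterations, K.dtype(grads[0]))
        variance = self.noise_eta / ((1 + t) ** self.noise_gamma)
        return [grad + K.random_normal(grad.shape, mean=0.0,
                                       stddev=K.sqrt(variance),
                                       dtype=K.dtype(grads[0]))
                for grad in grads]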

Thanks @cpury, but I can't get it to work. Does it work if I do
model.compile(loss='mse', optimizer=NoisySGD) ?
And if I want to use Nadam with it, do I just need to change it to
class NoisySGD(keras.optimizers.Nadam): ?
Finally, I also wanted to know if there is a way to change the amount of noise?

@edmondja

does it work if I do ... ?

I believe you forgot the parentheses after NoisySGD. Try this:

model.compile(loss='mse', optimizer=NoisySGD())

And if I want to use Nadam with it I just need to change to class NoisySGD(keras.optimizers.Nadam): ?

I looked at the Nadam code and it doesn't seem to override the get_gradients function, so it would probably work as you proposed. Feel free to try and report back.

a way to change the amount of noise

My NoisySGD implementation takes two additional keyword args: noise_eta and noise_gamma. They correspond to η and γ from Figure 1 in the paper. By default, they are set to 0.01 and 0.55 respectively. You can experiment with them, e.g. by setting:

model.compile(loss='mse', optimizer=NoisySGD(noise_eta=0.3, noise_gamma=0.8))

This works @cpury, thank you. I am testing it while tuning eta, and it seems to work even better than dropout.

@edmondja I had an error in my code, I just fixed it. Please update your get_gradients method accordingly. Sorry for the trouble!

I improved and released my code to a separate package: https://github.com/cpury/keras_gradient_noise

With this you can add Gradient Noise to any kind of optimizer.

Install via pip install keras_gradient_noise. Usage example:

from keras.optimizers import Adam
from keras_gradient_noise import add_gradient_noise

# ...

NoisyAdam = add_gradient_noise(Adam)

model.compile(optimizer=NoisyAdam(noise_eta=0.1, noise_gamma=0.4))

@fchollet if you're interested, I could convert this into a PR.
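Under the hood, a wrapper like this can be sketched as a class factory that dynamically subclasses whatever optimizer you pass in. This is a hypothetical reconstruction based on the NoisySGD code above, not necessarily the actual package code:

import keras
from keras import backend as K


def add_gradient_noise(BaseOptimizer):
    class NoisyOptimizer(BaseOptimizer):
        def __init__(self, noise_eta=0.01, noise_gamma=0.55, **kwargs):
            super(NoisyOptimizer, self).__init__(**kwargs)
            with K.name_scope(self.__class__.__name__):
                self.noise_eta = K.variable(noise_eta, name='noise_eta')
                self.noise_gamma = K.variable(noise_gamma, name='noise_gamma')

        def get_gradients(self, loss, params):
            grads = super(NoisyOptimizer, self).get_gradients(loss, params)
            # Decayed gaussian noise, as in NoisySGD above
            t = K.cast(self.iterations, K.dtype(grads[0]))
            variance = self.noise_eta / ((1 + t) ** self.noise_gamma)
            return [g + K.random_normal(K.shape(g), mean=0.0,
                                        stddev=K.sqrt(variance),
                                        dtype=K.dtype(g))
                    for g in grads]

    return NoisyOptimizer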

@cpury

I tried your package, but when running Keras with the TensorFlow backend, there was an error:

keras_gradient_noise/gradient_noise.py", line 43, in get_gradients
  for grad in grads
AttributeError: 'IndexedSlices' object has no attribute 'shape'

Is this a known issue?

@yoquankara I've never seen that before. But let's continue this elsewhere. Could you create an issue on my repository with some more details?

@cpury Thanks, I will post to your repo.
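Aside for readers hitting the same error: tf.IndexedSlices typically shows up as the gradient of sparse ops such as Embedding lookups, and it has no .shape attribute. A hypothetical workaround (not from the package) is to densify such gradients before adding noise, at some memory cost:

import tensorflow as tf


def densify(grad):
    # Gradients of sparse ops (e.g. Embedding) arrive as tf.IndexedSlices,
    # which lacks .shape; converting to a dense tensor restores it.
    if isinstance(grad, tf.IndexedSlices):
        return tf.convert_to_tensor(grad)
    return grad

# e.g. at the top of get_gradients:
# grads = [densify(g) for g in grads]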
