Keras: Gradient Noise

Created on 11 Apr 2016 · 11 Comments · Source: keras-team/keras

Add gradient noise to optimizers based on this paper: http://arxiv.org/abs/1511.06807
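For reference, the technique in the paper adds annealed Gaussian noise to each gradient before the update, with variance decaying over training step t:

g_t \leftarrow g_t + N(0, \sigma_t^2), \qquad \sigma_t^2 = \frac{\eta}{(1 + t)^{\gamma}}

where \eta and \gamma are hyperparameters (Figure 1 of the paper).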



All 11 comments

How about GaussianNoise (see https://keras.io/layers/noise/)? Have you tried that? It seems to me that you could add a noise layer instead of changing the optimizer to achieve the effect you desire.
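For reference, a minimal sketch of that layer-based approach (note it perturbs activations during training, not gradients, so it is a different technique than the one proposed here; the model shape is made up for illustration):

from keras.models import Sequential
from keras.layers import Dense, GaussianNoise

model = Sequential()
model.add(Dense(64, input_dim=100, activation='relu'))
model.add(GaussianNoise(0.1))  # zero-mean gaussian noise with stddev 0.1, active only during training
model.add(Dense(10, activation='softmax'))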

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

Update: I improved this code and converted it to a package: https://github.com/cpury/keras_gradient_noise

@jnphilipp I was also curious to try it out. Here's my implementation. I just tried it and it seems to improve training as expected:

import keras
from keras import backend as K


class NoisySGD(keras.optimizers.SGD):
    def __init__(self, noise_eta=0.01, noise_gamma=0.55, **kwargs):
        super(NoisySGD, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.noise_eta = K.variable(noise_eta, name='noise_eta')
            self.noise_gamma = K.variable(noise_gamma, name='noise_gamma')

    def get_gradients(self, loss, params):
        grads = super(NoisySGD, self).get_gradients(loss, params)

        # Add decayed gaussian noise
        t = K.cast(self.iterations, K.dtype(grads[0]))
        variance = self.noise_eta / ((1 + t) ** self.noise_gamma)

        grads = [
            grad + K.random_normal(
                grad.shape,
                mean=0.0,
                stddev=K.sqrt(variance),
                dtype=K.dtype(grads[0])
            )
            for grad in grads
        ]

        return grads

    def get_config(self):
        config = {'noise_eta': float(K.get_value(self.noise_eta)),
                  'noise_gamma': float(K.get_value(self.noise_gamma))}
        base_config = super(NoisySGD, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Put this somewhere and use it as your optimizer. If you want to combine it with another optimizer, e.g. Adam, just subclass that optimizer instead and add the gradient-noise part to its get_gradients(), as in the sketch below.
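For example, a minimal sketch of the same pattern applied to Adam, mirroring the NoisySGD code above (untested; get_config omitted for brevity):

import keras
from keras import backend as K


class NoisyAdam(keras.optimizers.Adam):
    def __init__(self, noise_eta=0.01, noise_gamma=0.55, **kwargs):
        super(NoisyAdam, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.noise_eta = K.variable(noise_eta, name='noise_eta')
            self.noise_gamma = K.variable(noise_gamma, name='noise_gamma')

    def get_gradients(self, loss, params):
        grads = super(NoisyAdam, self).get_gradients(loss, params)
        # Same decayed gaussian noise as in NoisySGD
        t = K.cast(self.iterations, K.dtype(grads[0]))
        variance = self.noise_eta / ((1 + t) ** self.noise_gamma)
        return [grad + K.random_normal(grad.shape, mean=0.0,
                                       stddev=K.sqrt(variance),
                                       dtype=K.dtype(grads[0]))
                for grad in grads]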

Thanks @cpury, but I can't get it to work. Does it work if I do
model.compile(loss='mse', optimizer=NoisySGD) ?
And if I want to use Nadam with it, do I just need to change it to
class NoisySGD(keras.optimizers.Nadam): ?
Finally, I also wanted to know if there is a way to change the amount of noise?

@edmondja

does it work if I do ... ?

I believe you forgot the parentheses after NoisySGD. Try this:

model.compile(loss='mse', optimizer=NoisySGD())

And if I want to use Nadam with it I just need to change to class NoisySGD(keras.optimizers.Nadam): ?

I looked at the Nadam code and it doesn't seem to override the get_gradients function, so it would probably work as you proposed. Feel free to try and report back.

a way to change the amount of noise

My NoisySGD implementation takes two additional keyword args: noise_eta and noise_gamma. They correspond to η and γ from Figure 1 in the paper. By default, they are set to 0.01 and 0.55 respectively. You can experiment with them, e.g. by setting:

model.compile(loss='mse', optimizer=NoisySGD(noise_eta=0.3, noise_gamma=0.8))

This works @cpury, thank you. I am testing it while tuning eta, and it seems to work even better than dropout.

@edmondja I had an error in my code, I just fixed it. Please update your get_gradients method accordingly. Sorry for the trouble!

I improved and released my code to a separate package: https://github.com/cpury/keras_gradient_noise

With this you can add Gradient Noise to any kind of optimizer.

Install via pip install keras_gradient_noise. Usage example:

from keras.optimizers import Adam
from keras_gradient_noise import add_gradient_noise

# ...

NoisyAdam = add_gradient_noise(Adam)

model.compile(optimizer=NoisyAdam(noise_eta=0.1, noise_gamma=0.4))

@fchollet if you're interested, I could convert this into a PR.
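Under the hood, a wrapper like this can be sketched as a class factory that dynamically subclasses whatever optimizer you pass in. This is a hypothetical reconstruction based on the NoisySGD code above, not necessarily the actual package code:

import keras
from keras import backend as K


def add_gradient_noise(BaseOptimizer):
    class NoisyOptimizer(BaseOptimizer):
        def __init__(self, noise_eta=0.01, noise_gamma=0.55, **kwargs):
            super(NoisyOptimizer, self).__init__(**kwargs)
            with K.name_scope(self.__class__.__name__):
                self.noise_eta = K.variable(noise_eta, name='noise_eta')
                self.noise_gamma = K.variable(noise_gamma, name='noise_gamma')

        def get_gradients(self, loss, params):
            grads = super(NoisyOptimizer, self).get_gradients(loss, params)
            # Decayed gaussian noise, as in NoisySGD above
            t = K.cast(self.iterations, K.dtype(grads[0]))
            variance = self.noise_eta / ((1 + t) ** self.noise_gamma)
            return [g + K.random_normal(K.shape(g), mean=0.0,
                                        stddev=K.sqrt(variance),
                                        dtype=K.dtype(g))
                    for g in grads]

    return NoisyOptimizer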

@cpury

I tried your package, but when running Keras with the TensorFlow backend, there was an error:

keras_gradient_noise/gradient_noise.py", line 43, in get_gradients
  for grad in grads
AttributeError: 'IndexedSlices' object has no attribute 'shape'

Is this a known issue?

@yoquankara I've never seen that before. But let's continue this elsewhere. Could you create an issue on my repository with some more details?

@cpury Thanks, I will post to your repo.
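Aside for readers hitting the same error: tf.IndexedSlices typically shows up as the gradient of sparse ops such as Embedding lookups, and it has no .shape attribute. A hypothetical workaround (not from the package) is to densify such gradients before adding noise, at some memory cost:

import tensorflow as tf


def densify(grad):
    # Gradients of sparse ops (e.g. Embedding) arrive as tf.IndexedSlices,
    # which lacks .shape; converting to a dense tensor restores it.
    if isinstance(grad, tf.IndexedSlices):
        return tf.convert_to_tensor(grad)
    return grad

# e.g. at the top of get_gradients:
# grads = [densify(g) for g in grads]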
