Keras: Add synthetic gradient training per "Decoupled Neural Interfaces using Synthetic Gradients"

Created on 19 Aug 2016 · 10 comments · Source: keras-team/keras

All 10 comments

yes please

Hi.
This is my experimental reproduction of "Decoupled Neural Interfaces using Synthetic Gradients"
https://github.com/rarilurelo/tensorflow-synthetic_gradient

There are two problems in my code that prevent using Keras as a complete wrapper (a rough sketch of both appears after this list).

  1. Initial gradient
    The initial gradient can be set in TensorFlow (the grad_ys argument of tf.gradients), but K.gradients has no argument for this.
    Does anyone know how to set an initial gradient in Theano?
  2. Control dependencies
    The module M, which synthesizes the gradient for the lower layer, is trained on the true error propagated down from the upper layer.
    The GIF on DeepMind's blog is very useful for understanding this flow (https://deepmind.com/blog/decoupled-neural-networks-using-synthetic-gradients/#gif-1).
    The true error must be computed before the layer is updated, so I need to use tf.control_dependencies.
    I think this can be implemented by modifying Model._make_train_function() and Model's constructor arguments.
    However, TensorFlow and Theano handle control dependencies differently, so the implementation would be difficult and ugly.
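To make the two points concrete, here is a rough TensorFlow 1.x graph-mode sketch of what I mean. The layer sizes, the linear form of M, and the plain SGD update are hypothetical illustrations, not the paper's exact setup and not an existing Keras API:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])

# Lower layer and its activation h.
W1 = tf.Variable(tf.random_normal([784, 256], stddev=0.05))
h = tf.nn.relu(tf.matmul(x, W1))

# M: a small module that synthesizes an estimate of dL/dh from h.
W_m = tf.Variable(tf.zeros([256, 256]))
synthetic_grad = tf.matmul(h, W_m)

# Upper layer and the true loss.
W2 = tf.Variable(tf.random_normal([256, 10], stddev=0.05))
logits = tf.matmul(h, W2)
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))

# (1) Initial gradient: grad_ys injects the synthetic gradient at h, so the
#     lower layer can be updated without waiting for the true backward pass.
#     K.gradients has no equivalent of grad_ys.
d_W1 = tf.gradients(h, [W1], grad_ys=synthetic_grad)[0]

# The true gradient dL/dh is the regression target for M.
true_grad = tf.gradients(loss, [h])[0]
m_loss = tf.reduce_mean(tf.square(synthetic_grad - tf.stop_gradient(true_grad)))
update_M = tf.train.GradientDescentOptimizer(0.01).minimize(m_loss, var_list=[W_m])

# (2) Control dependency: force the true gradient to be computed before W1 is
#     updated, so M is trained against the pre-update weights.
with tf.control_dependencies([true_grad]):
    update_W1 = tf.assign_sub(W1, 0.01 * d_W1)

train_step = tf.group(update_M, update_W1)
```

The control dependency in (2) is the part that is hard to express portably through the Keras backend, since Theano has no direct equivalent of tf.control_dependencies.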

Thanks.

@rarilurelo Initial grads can be set in Theano using the known_grads option: theano.gradient.grad(cost, wrt, consider_constant=None, disconnected_inputs='raise', add_names=True, known_grads=None, return_disconnected='zero', null_gradients='raise')
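For example, a minimal sketch of known_grads (the variable names and sizes are hypothetical, not taken from the repo above):

```python
import numpy as np
import theano
import theano.tensor as T

x = T.matrix('x')
W = theano.shared(np.random.randn(784, 256).astype('float32'), name='W')
h = T.nnet.relu(T.dot(x, W))

# Stand-in for the synthetic gradient dL/dh produced by the module M.
synthetic_grad = T.matrix('synthetic_grad')

# known_grads tells theano.grad to treat synthetic_grad as the gradient of the
# (unspecified) cost with respect to h, the analogue of tf.gradients' grad_ys.
d_W = theano.grad(cost=None, wrt=W, known_grads={h: synthetic_grad})

# SGD update of the lower layer driven purely by the synthetic gradient.
train_lower = theano.function([x, synthetic_grad], d_W,
                              updates=[(W, W - np.float32(0.01) * d_W)])
```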

@rarilurelo and @llSourcell Please post an example of an RNN (LSTM) using synthetic gradients if possible, because there is nothing about it anywhere on the internet, and I ran into lots of problems when I tried it myself.
And one question: how can we update the weights of a particular layer using a synthetic gradient, given that LSTM layers share weights, so updating the weights of one layer also updates the weights of the others?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

So was this implemented in Keras?

Any update?

nobody cares :(

@x0rb0t :(
