Keras: ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Created on 20 Mar 2019 · 10 comments · Source: keras-team/keras

I asked my question on StackOverflow. Link.

I am trying to make a custom layer in Keras. I only want to implement the following two lines of code in the call() function, and they should be trainable:

AV = K.dot(A, Vin)
Vout = K.dot(AV, W)

The dimensions of A, Vin, and W are (n, n), (?, n, c), and (c, f) respectively.
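A rough sketch of the layer I have in mind is below (the class name, the initializer, and the permute step for the batch axis are only illustrative, and I have not verified this exact code):

from keras import backend as K
from keras.layers import Layer

class CustomGraphLayer(Layer):                    # illustrative name
    def __init__(self, A, filters, **kwargs):
        super(CustomGraphLayer, self).__init__(**kwargs)
        self.A = K.constant(A)                    # fixed (n, n) matrix, not trainable
        self.filters = filters                    # f, the output channel dimension

    def build(self, input_shape):
        # input_shape is (?, n, c); W has shape (c, f) and should be trainable
        self.W = self.add_weight(name='W',
                                 shape=(int(input_shape[-1]), self.filters),
                                 initializer='glorot_uniform',
                                 trainable=True)
        super(CustomGraphLayer, self).build(input_shape)

    def call(self, Vin):
        # K.dot of the (n, n) matrix with a (?, n, c) tensor puts the batch
        # axis second (Theano-style dot), so permute it back to the front.
        AV = K.permute_dimensions(K.dot(self.A, Vin), (1, 0, 2))   # (?, n, c)
        Vout = K.dot(AV, self.W)                                    # (?, n, f)
        return Vout

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[1], self.filters)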
I would like to train my network on the MNIST or CIFAR-10 dataset.
Sharky said in her answer that it depends on the dataset and data shapes, but I don't understand exactly what the problem is here.
Please, someone, help me overcome this problem.
Thank you.

All 10 comments

Which version are you on? I am hitting this on some generic math manipulations in TF 2.0. I think it is a weird error message.

@cottrell same issue

@Utsav-Patel, @luozhouyang, @cottrell

Does your code have any weights that were defined but left unused? That may be the reason for the error. My guess is that since such a weight is not being used, its gradient cannot be computed with respect to the loss, so the gradient is None.

This is harder to spot if your layer inherits from another layer: calling the super constructor may add weights that you never use. In which case, don't call super().

I've written an example to show this in action (TF 1.13.1, Keras 2.2.4). Comment out the line

 v = v + K.dot(x, self.kernelB)       ### comment out this line to get the None-gradient error

inside call() to trigger the error: with that line commented out, self.kernelB is never used, and Keras raises the error.

from keras import backend as K
from keras.layers import Layer, Activation
from keras.engine.base_layer import InputSpec
import numpy as np
from keras.models import Sequential

class CustomDense(Layer):

    def __init__(self, units, bias_constraint=None, **kwargs):

        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] =  (kwargs.pop('input_dim'),)

        super(CustomDense, self).__init__(**kwargs)
        self.num_outputs = units
        self.input_spec = InputSpec(min_ndim=2)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernelA = self.add_weight(name='kernelA',
                                       shape=(input_shape[1], self.num_outputs),
                                       initializer='uniform')

        ## This weight is defined here, but its usage can
        ## be controlled by commenting out a line in call()
        self.kernelB = self.add_weight(name='kernelB',
                                       shape=(input_shape[1], self.num_outputs),
                                       initializer='uniform')

        self.built = True
        super(CustomDense, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        v = K.dot(x, self.kernelA)
        v = v + K.dot(x, self.kernelB)     ### comment out this line to get the None-gradient error
        return v

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.num_outputs)

if __name__ == '__main__':
    n_units = in_dim = 10
    test = np.random.random((100,in_dim))
    model = Sequential()
    layer = CustomDense(units=n_units, input_dim=in_dim)
    model.add(layer)
    model.add(Activation("elu"))
    model.compile("adam", "mae")
    model.fit(test, test)

@abaxi I could reproduce this error with another example similar to yours. You can find the example here: https://stackoverflow.com/a/58533503/3924118. Just remove the usage of shared_variable in the call() method.

@Utsav-Patel This error arises when some of the weights in your model are not used, so the loss is not differentiable with respect to them. Make sure you use all the weights in the model to overcome this error.

How do you ensure that all the weights in the model are used?

In my case I just used the leftover weights by multiplying them by 0, so that all the weights are covered. This solved the issue.
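Using the CustomDense example above, that workaround would look roughly like this (a sketch only; the zero factor keeps kernelB connected to the loss, so its gradient is defined and simply evaluates to zero):

    def call(self, x):
        v = K.dot(x, self.kernelA)
        # Touch the otherwise-unused weight with a zero-weighted term so it
        # stays in the computational graph instead of producing a None gradient.
        v = v + 0.0 * K.dot(x, self.kernelB)
        return v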

Thank you, sir.

@giridhar-pamisetty

Can you please suggest a way to check for unused weights?

I was using only part of the hidden-node weights to calculate the output, so after getting this error I multiplied the remaining hidden-node weights by zero, so that all the weights are covered.
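For a programmatic check, one rough approach (a sketch only, assuming TF 1.x graph mode and a compiled Keras 2.x model named model) is to ask the backend for the gradients of the loss with respect to every trainable weight and look for None entries:

from keras import backend as K

# After model.compile(...), model.total_loss is the symbolic training loss.
grads = K.gradients(model.total_loss, model.trainable_weights)
unused = [w.name for w, g in zip(model.trainable_weights, grads) if g is None]
print("Weights with no gradient (likely unused in call()):", unused)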
