Keras: ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Created on 20 Mar 2019 · 10 comments · Source: keras-team/keras

I asked my question on StackOverflow. Link.

I am trying to make a custom layer in Keras. I only want to implement the following two lines of code in the call() function, and they should be trainable:

AV = K.dot(A, Vin)
Vout = K.dot(AV, W)

The dimensions of A, Vin, and W are (n, n), (?, n, c), and (c, f) respectively.
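A rough sketch of the layer I have in mind is below (the class name, the initializer, and the permute step for the batch axis are only illustrative, and I have not verified this exact code):

from keras import backend as K
from keras.layers import Layer

class CustomGraphLayer(Layer):                    # illustrative name
    def __init__(self, A, filters, **kwargs):
        super(CustomGraphLayer, self).__init__(**kwargs)
        self.A = K.constant(A)                    # fixed (n, n) matrix, not trainable
        self.filters = filters                    # f, the output channel dimension

    def build(self, input_shape):
        # input_shape is (?, n, c); W has shape (c, f) and should be trainable
        self.W = self.add_weight(name='W',
                                 shape=(int(input_shape[-1]), self.filters),
                                 initializer='glorot_uniform',
                                 trainable=True)
        super(CustomGraphLayer, self).build(input_shape)

    def call(self, Vin):
        # K.dot of the (n, n) matrix with a (?, n, c) tensor puts the batch
        # axis second (Theano-style dot), so permute it back to the front.
        AV = K.permute_dimensions(K.dot(self.A, Vin), (1, 0, 2))   # (?, n, c)
        Vout = K.dot(AV, self.W)                                    # (?, n, f)
        return Vout

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[1], self.filters)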
I would like to train my network on the MNIST or CIFAR-10 dataset.
Sharky said in her answer that it depends on the dataset and data shapes, but I don't understand exactly what the problem is here.
Please, someone, help me overcome this problem.
Thank you.

All 10 comments

Which version are you on? I am hitting this on some generic math manipulations in TF 2.0. I think it is a weird error message.

@cottrell same issue

@Utsav-Patel, @luozhouyang, @cottrell

Does your code have any weights that were defined but left unused? That may be the reason for the error. My guess is that since such a weight is not being used, its gradient cannot be computed with respect to the loss, so the gradient is None.

This is harder to spot if your layer inherits from another layer: calling the super constructor may add weights that you never use. In which case, don't call super().

I've written an example to show this in action (TF 1.13.1, Keras 2.2.4). Comment out the line

 v = v + K.dot(x, self.kernelB)       ### comment out this line to get the None-gradient error

inside call() to trigger the error: with that line commented out, self.kernelB is never used, and Keras raises the error.

from keras import backend as K
from keras.layers import Layer, Activation
from keras.engine.base_layer import InputSpec
import numpy as np
from keras.models import Sequential

class CustomDense(Layer):

    def __init__(self, units, bias_constraint=None, **kwargs):

        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] =  (kwargs.pop('input_dim'),)

        super(CustomDense, self).__init__(**kwargs)
        self.num_outputs = units
        self.input_spec = InputSpec(min_ndim=2)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernelA = self.add_weight(name='kernelA',
                                       shape=(input_shape[1], self.num_outputs),
                                       initializer='uniform')

        ## This weight is defined here, but its usage can
        ## be controlled by commenting out a line in call()
        self.kernelB = self.add_weight(name='kernelB',
                                       shape=(input_shape[1], self.num_outputs),
                                       initializer='uniform')

        self.built = True
        super(CustomDense, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        v = K.dot(x, self.kernelA)
        v = v + K.dot(x, self.kernelB)     ### comment out this line to get the None-gradient error
        return v

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.num_outputs)

if __name__ == '__main__':
    n_units = in_dim = 10
    test = np.random.random((100,in_dim))
    model = Sequential()
    layer = CustomDense(units=n_units, input_dim=in_dim)
    model.add(layer)
    model.add(Activation("elu"))
    model.compile("adam", "mae")
    model.fit(test, test)

@abaxi I could reproduce this error with another example similar to yours. You can find the example here: https://stackoverflow.com/a/58533503/3924118. Just remove the usage of shared_variable in the call() method.

@Utsav-Patel This error arises when some of the weights in your model are not used, so the loss is not differentiable with respect to them. Make sure you use all the weights in the model to overcome this error.

How do you ensure that all the weights in the model are used?

In my case I just used the leftover weights by multiplying them by 0, so that all the weights are covered. This solved the issue.
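Using the CustomDense example above, that workaround would look roughly like this (a sketch only; the zero factor keeps kernelB connected to the loss, so its gradient is defined and simply evaluates to zero):

    def call(self, x):
        v = K.dot(x, self.kernelA)
        # Touch the otherwise-unused weight with a zero-weighted term so it
        # stays in the computational graph instead of producing a None gradient.
        v = v + 0.0 * K.dot(x, self.kernelB)
        return v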

Thank you, sir.

@giridhar-pamisetty

Can you please suggest a way to check for unused weights?

I was using only part of the hidden-node weights to calculate the output, so after getting this error I multiplied the remaining hidden-node weights by zero, so that all the weights are covered.
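For a programmatic check, one rough approach (a sketch only, assuming TF 1.x graph mode and a compiled Keras 2.x model named model) is to ask the backend for the gradients of the loss with respect to every trainable weight and look for None entries:

from keras import backend as K

# After model.compile(...), model.total_loss is the symbolic training loss.
grads = K.gradients(model.total_loss, model.trainable_weights)
unused = [w.name for w, g in zip(model.trainable_weights, grads) if g is None]
print("Weights with no gradient (likely unused in call()):", unused)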
