Training my network works fine, but both the training and validation loss stop decreasing after around 60 epochs. I wanted to visualize the gradients through the TensorBoard callback, but received this error message:
ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
The callback I used:
tbCallBack = callbacks.TensorBoard(
    log_dir=path + '/run_' + str(curr_run),
    histogram_freq=20,
    write_graph=False,
    write_images=False,
    batch_size=batch_size,
    write_grads=True,
)
My network is a three-layer CNN followed by an LSTM:
from keras.layers import (Input, TimeDistributed, Conv2D, MaxPooling2D,
                          BatchNormalization, Flatten, CuDNNLSTM, Dense,
                          concatenate)
from keras.models import Model

# NOTE: input_sample, stateful, batch_size and seq_len are defined
# elsewhere in the script (not shown here).

# Three TimeDistributed convolution blocks over the input frames
x = TimeDistributed(Conv2D(16, (1, 8), activation='relu', padding='same',
                           kernel_initializer='he_normal'))(input_sample)
x = TimeDistributed(MaxPooling2D((1, 4), padding='same'))(x)
x = TimeDistributed(BatchNormalization())(x)
x = TimeDistributed(Conv2D(32, (1, 4), activation='relu', padding='same',
                           kernel_initializer='he_normal'))(x)
x = TimeDistributed(MaxPooling2D((1, 4), padding='same'))(x)
x = TimeDistributed(BatchNormalization())(x)
x = TimeDistributed(Conv2D(64, (1, 4), activation='relu', padding='same',
                           kernel_initializer='he_normal'))(x)
x = TimeDistributed(MaxPooling2D((1, 4), padding='same'))(x)
x = TimeDistributed(Flatten())(x)

# Second input carrying the timestep index.
# Input() accepts shape OR batch_shape, not both, so only batch_shape
# is passed in the stateful case.
if stateful:
    input_timesteps = Input(batch_shape=(batch_size, seq_len, 1),
                            name='input_timesteps')
else:
    input_timesteps = Input(shape=(None, 1), name='input_timesteps')

x = BatchNormalization()(x)
x = CuDNNLSTM(64, stateful=stateful, name='lstm_layer1',
              return_sequences=True)(x)
x = BatchNormalization()(x)
x = concatenate([x, input_timesteps])
out = TimeDistributed(Dense(1, activation='linear',
                            kernel_initializer='he_normal'),
                      name='output_layer')(x)
model = Model(inputs=[input_sample, input_timesteps], outputs=[out])
I have no idea which operation is causing the gradient to be None. I'm running the latest version of Keras (2.2.2) and tensorflow-gpu/tensorboard (1.10.0).
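One way to narrow it down yourself: a minimal diagnostic sketch, assuming the model above has already been compiled (so model.total_loss exists), that lists every weight whose gradient with respect to the total loss is None. Unlike the callback, tf.gradients() returns None for such weights instead of raising.

import tensorflow as tf

# Print every weight the TensorBoard callback would choke on.
grads = tf.gradients(model.total_loss, model.weights)
for weight, grad in zip(model.weights, grads):
    if grad is None:
        print('No gradient for:', weight.name)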
I have exactly the same problem. Did you find any clue about the error? From a quick look at the Keras source code, all I can see is that the tf.gradients() call returns None.
Same problem. Oddly enough, removing histogram_freq=20 from the TensorBoard constructor makes the error go away.
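A minimal sketch of that workaround, reusing path, curr_run and batch_size from the original callback. With histogram_freq=0 (the default) Keras never builds the gradient/histogram ops, so the None-gradient check is never reached; the trade-off is that you lose weight and gradient histograms in TensorBoard.

from keras import callbacks

tbCallBack = callbacks.TensorBoard(
    log_dir=path + '/run_' + str(curr_run),
    histogram_freq=0,   # histogram (and gradient) logging disabled
    write_graph=False,
    write_images=False,
    batch_size=batch_size,
)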
Same.
I have implemented a ResNet architecture and hit the same error. It comes from a bug in the released version 2.2.4 at this line.

Line on master:

if self.write_grads and weight in layer.trainable_weights:

Line in the 2.2.4 distribution:

if self.write_grads:

The moving mean of the BatchNormalization layer is not trainable and thus has no gradient, but the 2.2.4 check tries to fetch a gradient for it anyway.
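A small standalone sketch (not the callback code itself) illustrating the distinction the master fix relies on: for a BatchNormalization layer, gamma and beta are trainable, while moving_mean and moving_variance are not, so only the former should be passed to the gradient lookup.

from keras.layers import BatchNormalization, Input
from keras.models import Model

inp = Input(shape=(8,))
bn = BatchNormalization()
m = Model(inp, bn(inp))

# gamma/beta print True; moving_mean/moving_variance print False
for weight in bn.weights:
    print(weight.name, weight in bn.trainable_weights)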
I am facing the same issue. Does anybody know whether any released version includes a fix?