Keras: Cannot visualize gradients with tensorboard

Created on 10 Aug 2018 · 5 comments · Source: keras-team/keras

Training my network works fine, but the training and validation losses stop decreasing after around 60 epochs. I wanted to visualize the gradients through the TensorBoard callback, but got this error message:

ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

And the callback used:

tbCallBack = callbacks.TensorBoard(log_dir=path + '/run_' + str(curr_run),
                                   histogram_freq=20, write_graph=False,
                                   write_images=False, batch_size=batch_size,
                                   write_grads=True)

My network is a three-layer CNN followed by an LSTM:

from keras.layers import (Input, TimeDistributed, Conv2D, MaxPooling2D,
                          BatchNormalization, Flatten, CuDNNLSTM, Dense,
                          concatenate)
from keras.models import Model

# input_sample (the image-sequence Input) is defined elsewhere in the
# original script; its shape is not shown here.
x = TimeDistributed(Conv2D(16, (1, 8), activation='relu', padding='same',
                           kernel_initializer='he_normal'))(input_sample)
x = TimeDistributed(MaxPooling2D((1, 4), padding='same'))(x)
x = TimeDistributed(BatchNormalization())(x)

x = TimeDistributed(Conv2D(32, (1, 4), activation='relu', padding='same',
                           kernel_initializer='he_normal'))(x)
x = TimeDistributed(MaxPooling2D((1, 4), padding='same'))(x)
x = TimeDistributed(BatchNormalization())(x)

x = TimeDistributed(Conv2D(64, (1, 4), activation='relu', padding='same',
                           kernel_initializer='he_normal'))(x)
x = TimeDistributed(MaxPooling2D((1, 4), padding='same'))(x)
x = TimeDistributed(Flatten())(x)

if stateful:
    # batch_shape takes precedence over shape, so only batch_shape is needed
    input_timesteps = Input(batch_shape=(batch_size, seq_len, 1), name='input_timesteps')
else:
    input_timesteps = Input(shape=(None, 1), name='input_timesteps')

x = BatchNormalization()(x)
x = CuDNNLSTM(64, stateful=stateful, name='lstm_layer1', return_sequences=True)(x)
x = BatchNormalization()(x)
x = concatenate([x, input_timesteps])
out = TimeDistributed(Dense(1, activation='linear', kernel_initializer='he_normal'),
                      name='output_layer')(x)

model = Model(inputs=[input_sample, input_timesteps], outputs=[out])

I have no idea which operation is causing the gradient to be None. I'm running the latest versions of Keras (2.2.2) and tensorflow-gpu/tensorboard (1.10.0).
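A quick way to narrow this down (a diagnostic sketch, not from the original report; it assumes the compiled model above on the TensorFlow backend) is to ask TensorFlow directly for the gradient of every weight and print the ones that come back as None:

import tensorflow as tf

# After model.compile(...): list every weight whose gradient with respect
# to the total loss is None. These are the weights that make the
# TensorBoard callback raise the ValueError above.
grads = tf.gradients(model.total_loss, model.weights)
for weight, grad in zip(model.weights, grads):
    if grad is None:
        print('No gradient for:', weight.name)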

To investigate


All 5 comments

I have exactly the same problem. Did you find any clue about the error? From a quick look at the Keras source code, all I can see is that the tf.gradients() call returns None.

Same problem. Oddly enough, removing histogram_freq=20 from the TensorBoard callback makes the error go away.
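For reference, that workaround applied to the callback from the original post looks like the sketch below (assuming you can live without gradient and histogram summaries; with histogram_freq at its default of 0, the gradient-summary code path is never executed):

tbCallBack = callbacks.TensorBoard(log_dir=path + '/run_' + str(curr_run),
                                   write_graph=False, write_images=False,
                                   batch_size=batch_size)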

Same.

I have implemented a ResNet architecture and get the same error. It comes from a bug in the released 2.2.4 build of the TensorBoard callback, at the line below.

Line on master:

if self.write_grads and weight in layer.trainable_weights:

Line in the 2.2.4 distribution:

if self.write_grads:

The moving mean of the BatchNormalization layer is not trainable and thus has no gradient.
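Until a release ships that guard, the sketch below (assuming a compiled Keras 2.2.x model) lists the weights the released callback chokes on and the master version would skip; alternatively, installing Keras from master (pip install git+https://github.com/keras-team/keras.git) picks up the fix:

# The released 2.2.4 callback requests gradients for *every* weight; master
# skips non-trainable ones such as BatchNormalization's moving_mean and
# moving_variance. These are the weights that trigger the ValueError:
for layer in model.layers:
    for weight in layer.weights:
        if weight not in layer.trainable_weights:
            print('Non-trainable, skipped on master:', weight.name)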

I am facing the same issue. Does anybody know whether a fix has landed in any released version?
