Keras: CNN-LSTM with video frame sequence: InvalidArgumentError: You must feed a value for placeholder tensor

Created on 22 Mar 2017 · 21 comments · Source: keras-team/keras

I'm building a CNN-LSTM network in Keras (v2.02) + Tensorflow (v1.0.1) using video frames as input. I'm setting up the network as shown below:

import numpy as np
import tensorflow as tf
import keras
import cv2

video = keras.layers.Input(shape=(None, 299,299,3),name='video_input')

cnn = keras.applications.InceptionV3(weights='imagenet',
                                 include_top='False',
                                 pooling='avg')

cnn.trainable = False
encoded_frame = keras.layers.TimeDistributed(cnn)(video)
encoded_vid = keras.layers.LSTM(256)(encoded_frame)
outputs = keras.layers.Dense(128, activation='relu')(encoded_vid)

Some of the tensor properties are below:

video
<tf.Tensor 'video_input:0' shape=(?, ?, 299, 299, 3) dtype=float32>

cnn.input
<tf.Tensor 'input_1:0' shape=(?, 299, 299, 3) dtype=float32>

cnn.output
<tf.Tensor 'predictions/Softmax:0' shape=(?, 1000) dtype=float32>    

encoded_frame
<tf.Tensor 'time_distributed_1/Reshape_1:0' shape=(?, ?, 1000) dtype=float32>

encoded_vid
<tf.Tensor 'lstm_1/TensorArrayReadV3:0' shape=(?, 256) dtype=float32>

outputs
<tf.Tensor 'dense_1/Relu:0' shape=(?, 128) dtype=float32>
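
A side note on the listing above: cnn.output is the 1000-way predictions/Softmax tensor, which suggests the classification head is still attached even though include_top was meant to be disabled. A likely cause is that include_top='False' passes a non-empty string, and non-empty strings are truthy in Python. The sketch below uses a hypothetical build_cnn stand-in (not the Keras function itself) to show the effect. Nothing in this thread indicates that this is what triggers the placeholder error.

```python
def build_cnn(include_top=True):
    """Hypothetical stand-in for a model factory that branches on include_top."""
    if include_top:
        return "predictions/Softmax (1000 classes)"
    return "pooled feature vector"


# A non-empty string is truthy, so the "top" branch is still taken.
as_string = build_cnn(include_top='False')

# A real boolean takes the intended branch.
as_bool = build_cnn(include_top=False)

print(as_string)
print(as_bool)
```

Passing the boolean False rather than the string 'False' is the usual way to drop the classification head.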

Now I build the model and fit the data:

model = keras.models.Model(inputs=[video],outputs=outputs)
model.compile(optimizer='adam',
          loss='mean_squared_logarithmic_error')
# Generate random targets
y = np.random.random(size=(128,)) 
y = np.reshape(y,(-1,128))
model.fit(x=frame_sequence, y=y, validation_split=0.0,shuffle=False, batch_size=1)

where frame_sequence is a sequence of video frames from one video:

frame_sequence.shape
(1, 48, 299, 299, 3)
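
For readers unfamiliar with TimeDistributed, the following is a rough NumPy sketch of what the wrapper does conceptually with an input of this layout: it folds the time axis into the batch axis, applies the wrapped model once, and unfolds the result. The names time_distributed and toy_cnn are made up for illustration, the frames are shrunk to 8x8 to keep the example light, and the real Keras implementation differs in its details.

```python
import numpy as np


def time_distributed(fn, x):
    """Apply fn, which maps (N, ...) to (N, features), to every timestep of x."""
    batch, timesteps = x.shape[:2]
    folded = x.reshape((batch * timesteps,) + x.shape[2:])   # (batch*time, ...)
    out = fn(folded)                                         # (batch*time, features)
    return out.reshape((batch, timesteps) + out.shape[1:])   # (batch, time, features)


def toy_cnn(frames):
    """Stand-in for the CNN: average-pool each frame to one value per channel."""
    return frames.mean(axis=(1, 2))


# Same layout as frame_sequence: (videos, frames, height, width, channels).
frame_sequence = np.zeros((1, 48, 8, 8, 3), dtype=np.float32)
encoded = time_distributed(toy_cnn, frame_sequence)
print(encoded.shape)
```

The encoded array has one feature vector per frame, which is the shape the LSTM then consumes.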

All seems well up to the training step model.fit, where I get an error attributed to the input_1 placeholder in the InceptionV3 model input:

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1' with dtype float
 [[Node: input_1 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Training works without error if I build my CNN from scratch instead of loading InceptionV3. For example, replacing InceptionV3 with:

cnn = Sequential()
cnn.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(299, 299, 3)))
cnn.add(Conv2D(64, (3, 3), activation='relu'))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
cnn.add(Conv2D(128, (3, 3), activation='relu'))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
cnn.add(Conv2D(256, (3, 3), activation='relu'))
cnn.add(Conv2D(256, (3, 3), activation='relu'))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Flatten())

Here is some minimal code to reproduce the issue.

Feynman27

👍8

Most helpful comment

@Feynman27 Potential workaround: set learning phase to 0 and pass Lambda(lambda x: cnn(x)) to TimeDistributed.
Here's the full example: https://gist.github.com/alfiya400/9d3bf303966f87a3c2aa92a0a0a54662

I also checked that output from TimeDistributed and cnn.predict match each other.

The drawback of this approach: you can't use Dropouts and there might be other restrictions when learning_phase is set to 0.

alfiya400 on 23 Mar 2017

👍5 🎉1

All 21 comments

I'm seeing the same error, though in a very different context. I have an implementation of Bidirectional Attention Flow (BiDAF) in Keras. I want to load that model, similar to how @Feynman27 is loading InceptionV3, then pull parts out of it for use in a model on a different, but similar task (see code here). This worked in Keras 1, but is breaking when porting our code to Keras 2. You can see a trace of the failure here. I spent several hours trying to figure out what is going on here, and I'm at a wall. I thought that the __call__ method might not be hooking up the inputs correctly, because there's an extra placeholder still lying around, but I made a minimal example trying to show that something is broken, and it actually works. I'm at a total loss for why my example doesn't have the same crash that my real model has.

I very much want this bug to be fixed - I'm happy to help debug, if anyone has suggestions on what to do. I've run out of ideas.

matt-gardner on 22 Mar 2017

Why do you define the layers as keras.layers.Input, etc.?

Why don't you use this style:

from keras.models import Model
from keras.layers import Input, Dense

a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(inputs=a, outputs=b)

Could this be the cause of the error?

I am trying to build a similar model:

input ( video frames ) > Conv > tLSTM

The conv network alone works, but the LSTM does not.

MuOtb on 23 Mar 2017

@MuOtb they're functionally identical, some people just don't like having multiple imports at the top of their file.

nelson-liu on 23 Mar 2017

One of my colleagues referred me to this issue encountered when using BN in a sub-model applied to a time-distributed layer. Looks like a similar issue, but at the moment it's still unclear to me how to apply that recommendation to the case above with Inception.

Feynman27 on 23 Mar 2017

@alfiya400, do you have any idea _why_ that workaround works for the CNN? I tried those in my model, and it does not solve the issue.

matt-gardner on 23 Mar 2017

@alfiya400 How do you try this out? I'm going to try your method and hope it works out for me.

buptss on 24 Mar 2017

@buptss Just follow her gist link above. It worked for me but not @matt-gardner. We're trying to figure out why, but my suspicion is that it has something to do with the batch normalization, and the Lambda function is instantiating a new BN instance for each CNN output.

Feynman27 on 24 Mar 2017

@matt-gardner I think there are two problems here (in the TimeDistributed over InceptionV3 problem):

First, if I use TimeDistributed(Lambda(lambda x: cnn(x))) but don't set K.set_learning_phase(0), I get an error like You must feed a value for placeholder tensor 'batch_normalization_1/keras_learning_phase'.

This happens because the uses_learning_phase attribute is not the same for cnn and model:

model.uses_learning_phase=False
cnn.uses_learning_phase=True  # (because of the BatchNorm layer)

When you call model.fit, it first builds the list of inputs using this code and fails to add K.learning_phase to that list because model.uses_learning_phase=False. To fix that, you could set learning_phase to 0 or use Dropout/BatchNorm in all your models (using Dropout(0.0001) could probably be a workaround).

Second, if I use TimeDistributed(cnn), I get the You must feed a value for placeholder tensor 'input_1' error. Using Lambda(lambda x: cnn(x)) helps, but I have no idea why. In your case, you should probably try wrapping your model in a Lambda.

alfiya400 on 24 Mar 2017
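
The first problem described in the comment above can be pictured with a toy model in plain Python. This is only a sketch under stated assumptions: ToyGraph and build_feed are hypothetical names, not Keras or TensorFlow APIs, and the only behaviour modelled is "every placeholder in the graph must be fed, and the learning phase is fed only when the outer model's flag says so".

```python
class ToyGraph:
    """Toy stand-in for a TF1-style graph: every placeholder must be fed."""

    def __init__(self, placeholders):
        self.placeholders = set(placeholders)

    def run(self, feed_dict):
        missing = sorted(self.placeholders - set(feed_dict))
        if missing:
            raise ValueError(
                "You must feed a value for placeholder tensor %r" % missing[0])
        return "ok"


def build_feed(inputs, uses_learning_phase, training=True):
    """Mimic the described behaviour: add the learning phase only if flagged."""
    feed = dict(inputs)
    if uses_learning_phase:
        feed["keras_learning_phase"] = training
    return feed


# The inner CNN (with BatchNorm) put a learning-phase placeholder in the graph.
graph = ToyGraph(["video_input", "keras_learning_phase"])
inputs = {"video_input": "frames"}

# The outer model's flag is False, so the placeholder is never fed.
try:
    outcome = graph.run(build_feed(inputs, uses_learning_phase=False))
except ValueError as err:
    outcome = str(err)

# With the flag set, the feed is complete and the run succeeds.
fixed = graph.run(build_feed(inputs, uses_learning_phase=True))

print(outcome)
print(fixed)
```

In this toy picture, fixing the learning phase to a constant removes the placeholder from the graph entirely, which would be why K.set_learning_phase(0) sidesteps the error.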

Yeah, I tried wrapping the TimeDistributed part of my model in a Lambda after your first comment, but it didn't work. I just now tried also wrapping the other Model that I use in a Lambda, and that didn't work, either. I still get the missing placeholder error (on an input tensor, not a batch norm tensor, so it's the second issue you mention, not the first). If I knew _why_ adding the Lambda helps in your case, maybe I could figure out what's going wrong in my case, because I'm pretty sure they're related...

matt-gardner on 24 Mar 2017

I ran into the same error when applying TimeDistributed to InceptionV3. I also think it is due to an incompatibility between TimeDistributed and BatchNormalization, because I didn't encounter it when using TimeDistributed to wrap VGG16, which does not have a BN layer.
@alfiya400 Thanks for your solution! It works, at least for now, in my project.

Wenbo93 on 26 Mar 2017

I have the same problem.
I agree with @Wenbo93. I think it is due to an incompatibility between TimeDistributed and BatchNormalization. This is my code; I used BN in a TimeDistributed CNN.

    convs= Sequential()
    convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-1))), 1, 1,input_shape=shape[1:], border_mode="same", bias=False,activation='relu'))
    convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-1))), 3, 3, border_mode="same", bias=False,activation='relu'))
    convs.add(BatchNormalization(axis=3))
    convs.add(MaxPooling2D((2,2),border_mode='same'))
    for l_cnn in range(1,nb_conv_layers):
        convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 1, 1, border_mode="same", bias=False,activation='relu'))
        convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 3, 3, border_mode="same", bias=False,activation='relu'))
        convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 3, 3, border_mode="same", bias=False,activation='relu'))
        convs.add(BatchNormalization(axis=3))
        convs.add(MaxPooling2D((2,2),border_mode='same'))
    convs.add(Flatten())
    #Wrap the CNN and connect it to an RNN
    out=TimeDistributed(convs)(inputs)
    for l_rnn in range(nb_rnn_layers-1):
        out=LSTM(512,return_sequences=True,activation='relu',stateful=stateful)(out)
    out=LSTM(512,return_sequences=False,activation='relu',stateful=stateful)(out)
    out=Dropout(0.2)(out)
    out=Dense(1024,activation='relu')(out)
    out=Dropout(0.2)(out)
    out=Dense(1,activation=activation)(out)
    tdcnn=Model(input=[inputs],output=[out])

QuantumLiu on 26 Mar 2017

I have a small example that reproduces the problem.

nb_samples = 50
input_a_len = 50
X = np.ones((nb_samples, 2, input_a_len), dtype=np.float32)
Y = np.ones((nb_samples, 2, 1), dtype=np.float32)
input_a = Input(shape=(2, input_a_len), name='input_a', dtype='float32')
input_a_reshaped = Reshape((2, input_a_len, 1))(input_a)
pred = TimeDistributed(LSTM(1, recurrent_dropout=0.1))(input_a_reshaped)
model = Model([input_a], pred)
model.compile(loss='binary_crossentropy', optimizer='sgd')
hist = model.fit(x=X, y=Y)

This produces the error:

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_1/keras_learning_phase' with dtype bool
         [[Node: time_distributed_1/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

In this case, any of the following solves the problem:

adding K.set_learning_phase(1) explicitly,
removing the TimeDistributed (with dimensional adjustments), or
removing the recurrent_dropout option.

However, none of these workarounds seems an acceptable solution.

StefPac on 19 May 2017

@StefPac Yeah, I have the same error. Adding set_learning_phase(1) solves it. What issues or consequences could arise from hard-coding this value when training a model (with validation)?

avn3r on 21 Jun 2017

@abnera in that case Dropout will also drop out neurons during validation, for example.

gewoonrik on 27 Jun 2017
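
To illustrate the point above, here is a minimal NumPy sketch of inverted dropout. It assumes the usual formulation (random zeroing plus rescaling while training, identity otherwise); the dropout function below is written for this example and is not the Keras layer.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout: zero units at random in training, identity otherwise."""
    if not training:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


rng = np.random.default_rng(0)
x = np.ones(1000)

# Validation with the learning phase correctly at 0: activations pass through.
val_correct = dropout(x, rate=0.5, training=False, rng=rng)

# Validation with the learning phase hard-coded to 1: units are still dropped.
val_forced = dropout(x, rate=0.5, training=True, rng=rng)

print(np.array_equal(val_correct, x))
print(int((val_forced == 0).sum()), "of", x.size, "units zeroed")
```

With the phase forced to 1, validation predictions stay noisy, which would make validation metrics look worse than the model really is.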

@gewoonrik Thanks for the explanation. Yeah, I am getting very poor validation results by hard-coding the learning_phase: set_learning_phase(1).

avn3r on 27 Jun 2017

@QuantumLiu did you manage to find a workaround for the problem with the batchnorm layer? I am also encountering problems when trying to use TimeDistributed(BatchNormalization())(input), which gives

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_1/keras_learning_phase' with dtype bool
     [[Node: time_distributed_1/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: Mean_3/_33 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1296_Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

redsphinx on 4 Jul 2017

@gewoonrik I've circumvented this behavior by recreating the model without the dropout layers and reloading the weights into it - it loads and the predictions are stable indicating that dropout is not applied. Batch norm layer is not so easy though - it has weights so if you drop the layers the weights won't be loaded due to layer mismatch between "training model" and "predictive" one. I am thinking this could be circumvented by creating another layer that follows the same structure as Batch Norm but returns the same value when in_train_phase is called.

tRosenflanz on 19 Jul 2017

I ran into the same issue in a different context. I was trying to build a model that learns adaptively from Chinese character vectors to word vectors, and further to word properties.

def HBLSTM4POS(maxword_per_sen=20,maxchar_per_word=8,word_vec_dim=52,pos_num=26):
    InputLayers = Input(shape=(maxword_per_sen,maxchar_per_word,word_vec_dim),name='InputTensor')
    Posmask = TimeDistributed(Masking(mask_value=0.0,input_shape=(8,52)),input_shape=(20,8,52))(InputLayers)
    WordLayer = TimeDistributed(Bidirectional(LSTM(52,return_sequences=False,dropout=0.1,input_shape=(8,52),name='WordVector')),input_shape=(20,8,52))(Posmask)
    POS_LSTM1 = Bidirectional(LSTM(52,return_sequences=True))(WordLayer)
    POS_LSTM2 = Bidirectional(LSTM(52,return_sequences=True))(POS_LSTM1)
    Dense1 = TimeDistributed(Dense(pos_num*3,activation='relu'))(POS_LSTM2)
    Dense2 = TimeDistributed(Dense(pos_num,activation='softmax',name='POS_Output'))(Dense1)
    model = Model(inputs=InputLayers, outputs=Dense2)
    model.compile(optimizer='adam', loss='binary_crossentropy',metrics=['accuracy'])
    return model

train_data = np.random.rand(100,20,8,52)
train_y = np.random.randint(26,size=(100,20,26))
model = HBLSTM4POS()
model.fit(train_data,train_y,batch_size=10,epochs=2,validation_split=0.1)

Full error message:

2017-09-10 20:42:29.012323: W tensorflow/core/framework/op_kernel.cc:1158] Invalid argument: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
    return fn(*args)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "integrated_model.py", line 36, in <module>
    model.fit(train_data,train_y,batch_size=10,epochs=2,validation_split=0.1)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1507, in fit
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1156, in _fit_loop
    outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2269, in __call__
    **self.session_kwargs)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op 'time_distributed_2/keras_learning_phase', defined at:
  File "integrated_model.py", line 35, in <module>
    model = HBLSTM4POS()
  File "integrated_model.py", line 15, in HBLSTM4POS
    WordLayer = TimeDistributed(Bidirectional(LSTM(52,return_sequences=False,dropout=0.1,input_shape=(8,52),name='WordVector')),input_shape=(20,8,52))(Posmask)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 596, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/wrappers.py", line 177, in call
    y = self.layer.call(inputs)  # (num_samples * timesteps, ...)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/wrappers.py", line 263, in call
    y = self.forward_layer.call(inputs, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 333, in call
    preprocessed_input = self.preprocess_input(inputs, training=None)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 1077, in preprocess_input
    timesteps, training=training)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 46, in _time_distributed_dense
    x = K.in_train_phase(x * expanded_dropout_matrix, x, training=training)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2602, in in_train_phase
    training = learning_phase()
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 115, in learning_phase
    name='keras_learning_phase')
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
    name=name)
  File "/home/ht/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/ht/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

When I removed the built-in dropout parameter from the LSTM layer, the error disappeared. So I suspect the TimeDistributed wrapper still has trouble wrapping dropout.

elternativeht on 10 Sep 2017

Instead of
encoded_frame = keras.layers.TimeDistributed(cnn)(video)
try
encoded_frame = keras.layers.TimeDistributed(cnn.outputs[0])(video)

creotiv on 31 Oct 2017

raise ValueError("Tensor %s is not an element of this graph." % obj)

ValueError: Tensor Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32) is not an element of this graph.
in keras VGG16 model

pawanvirsingh on 22 Dec 2017
