Keras: Stateful LSTMs - error despite using "batch_input_shape"

Created on 22 Mar 2016 · 15 Comments · Source: keras-team/keras

When I add 'stateful' to an LSTM, I get the following exception: If a RNN is stateful, a complete input_shape must be provided (including batch size).
Based on other threads (#1125, #1130), I am using the "batch_input_shape" option, yet I still get the error.
I raised the same question in the forum (https://groups.google.com/forum/#!topic/keras-users/nwB3ilYY4ZQ) but got no response.

You can find my complete code here:
https://github.com/anujgupta82/DeepNets/blob/master/LSTM/IMDB_Embedding_w2v_LSTM_3.ipynb


All 15 comments

batch_input_shape must be passed to the first layer of the network.

Same problem here:
model = Sequential()
model.add(GRU(100, activation='relu', stateful=True, return_sequences=True,
              batch_input_shape=(batch_size, X_train.shape[-2], X_train.shape[-1])))
...

What is your X_train.shape[0]? Do all your batches have the same number of samples? That is a must when using stateful RNNs.

X_train.shape[0] is the number of samples.

I have just tried using a batch_size that is a factor of the number of samples (so all batches have exactly the same number of samples) and it works, thanks!

A note mentioning this might be helpful in the part of the documentation that covers statefulness.
Edit: it's already there, my bad.
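
For readers who hit the same constraint, here is a minimal sketch of the "batch size must divide the number of samples" check described above. The helper names (usable_sample_count, valid_batch_sizes) and the sample counts are hypothetical and are not part of Keras; the idea is only to pick a batch size that is a factor of the sample count, or to trim the data so every batch is full.

```python
def usable_sample_count(n_samples: int, batch_size: int) -> int:
    """Return the largest sample count that splits into full batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return (n_samples // batch_size) * batch_size


def valid_batch_sizes(n_samples: int, max_batch_size: int) -> list:
    """Return every batch size up to max_batch_size that divides n_samples."""
    return [b for b in range(1, max_batch_size + 1) if n_samples % b == 0]


# Hypothetical numbers: 1000 training samples, desired batch size of 32.
n_samples, batch_size = 1000, 32

# Option 1: trim the data so that only full batches remain.
keep = usable_sample_count(n_samples, batch_size)
# X_train, y_train = X_train[:keep], y_train[:keep]  # apply before fit()

# Option 2: choose a batch size that is a factor of the sample count.
candidates = valid_batch_sizes(n_samples, max_batch_size=32)
```

With these numbers, trimming keeps 992 samples (31 full batches), and the exact divisors available up to 32 are 1, 2, 4, 5, 8, 10, 20 and 25.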

If batch_input_shape must be specified in the first layer of a stateful network, how is this done when using the functional API? The Input() layer will not allow it. I have tried everything I can think of but am still receiving this same exception ("complete input_shape must be provided (including batch size)"), even with batch size 1. I am trying to make an LRCN using TimeDistributed CNN layers, followed by a couple dense layers, followed by LSTM:

inputs = Input(shape=(1,3,227,227))

conv_1 = TimeDistributed(Convolution2D(96, 11, 11,subsample=(4,4),activation='relu',
                       name='conv_1'))(inputs)

conv_2 = TimeDistributed(MaxPooling2D((3, 3), strides=(2,2)))(conv_1)
conv_2 = TimeDistributed(LRN(name="convpool_1"))(conv_2)
conv_2 = TimeDistributed(ZeroPadding2D((2,2)))(conv_2)
conv_2 = TimeDistributed(Convolution2D(128,5,5,activation="relu",name="conv_2"))(conv_2)

...skipping similar conv layers...

dense_1 = TimeDistributed(MaxPooling2D((3, 3), strides=(2,2),name="convpool_5"))(conv_5)
dense_1 = TimeDistributed(Flatten(name="flatten"))(dense_1)
dense_1 = TimeDistributed(Dense(4096, activation='relu',name='dense_1'))(dense_1)
dense_2 = Dropout(0.5)(dense_1)
dense_2 = TimeDistributed(Dense(4096, activation='relu',name='dense_2'))(dense_2)

lstm_1 = Dropout(0.5)(dense_2)
lstm_1 = LSTM(100,
              batch_input_shape=(1,1,4096), #(batch size,timesteps,feature shape)
              return_sequences=False,
              stateful=True)(lstm_1)

dense_3 = Dense(6,name='dense_out')(lstm_1)
prediction = Activation("tanh",name="tanh")(dense_3)

(This is a regression problem; I am trying to predict six values at each time step based on an image sequence.)

The same network using the other API does not produce the same exception, but I am hoping to take advantage of the functional API, so I'd like to figure out what I'm doing wrong.

Figured it out from topology.py - the error message is misleading. The "Input" function takes argument "batch_shape", not "batch_input_shape".

@cmgladding
I tried "batch_shape", but it was not recognized by Keras. I don't know why. The issue persists for me no matter which keywords I use.

I also have this issue with a functional model.

Here is my example for those who get stuck. Indeed, the error message is misleading. I had to change Input(shape=()) to Input(batch_shape=()) in order for it to work.

Version that raises the error:

frame_sequence = Input(shape=(TIME_STEPS, HEIGHT, WIDTH, CHANNELS))
...
net = TimeDistributed(self.vision_model)(frame_sequence)
net = LSTM(HIDDEN_UNITS, stateful=True, return_sequences=False)(net)

Correct version:

frame_sequence = Input(batch_shape=(BATCH_SIZE, TIME_STEPS, HEIGHT, WIDTH, CHANNELS))
...
net = TimeDistributed(self.vision_model)(frame_sequence)
net = LSTM(HIDDEN_UNITS, stateful=True, return_sequences=False)(net)
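
The fragments above omit imports and the vision model, so here is a self-contained sketch of the same fix that should run as-is. It assumes the Keras bundled with TensorFlow 2.x; the sizes (BATCH_SIZE, TIME_STEPS, FEATURES), the 16 LSTM units and the final Dense(6, tanh) layer are illustrative choices, not values taken from any poster's model.

```python
import numpy as np
from tensorflow import keras  # assumption: TensorFlow 2.x is installed

# Illustrative sizes only.
BATCH_SIZE, TIME_STEPS, FEATURES = 4, 10, 8

# batch_shape (not shape) fixes the batch dimension, which a stateful RNN needs.
inputs = keras.Input(batch_shape=(BATCH_SIZE, TIME_STEPS, FEATURES))
hidden = keras.layers.LSTM(16, stateful=True, return_sequences=False)(inputs)
outputs = keras.layers.Dense(6, activation="tanh")(hidden)
model = keras.Model(inputs, outputs)

# Every call must supply exactly BATCH_SIZE samples.
x = np.zeros((BATCH_SIZE, TIME_STEPS, FEATURES), dtype="float32")
y = model.predict(x, batch_size=BATCH_SIZE, verbose=0)
print(y.shape)  # (4, 6)
```

Replacing batch_shape=(BATCH_SIZE, TIME_STEPS, FEATURES) with shape=(TIME_STEPS, FEATURES) leaves the batch dimension undefined, which is what triggers the "complete input_shape must be provided" exception for the stateful layer.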

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.

Hi,
I am facing almost the same issue with a shared convnet: two features are concatenated, fed to a Dense layer, and then passed into an LSTM. The model is like Fig. 1 in https://arxiv.org/pdf/1705.06368.pdf

I have been stuck on this issue for a long time; any help is much appreciated. @farizrahman4u @dat-ai
`# First, define the vision modules
input_dim = (224, 224, 3)
image_input = Input(shape=input_dim)

vision_model = Conv2D(64, (3, 3), activation='relu', padding='same')(image_input)
vision_model = Conv2D(64, (3, 3), activation='relu')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
vision_model = Conv2D(128, (3, 3), activation='relu', padding='same')(vision_model)
vision_model = Conv2D(128, (3, 3), activation='relu')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
vision_model = Conv2D(256, (3, 3), activation='relu', padding='same')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
out = Flatten()(vision_model)

model = Model(image_input,out)

digit_a = Input(shape=input_dim)
digit_b = Input(shape=input_dim)

# The vision model will be shared, weights and all

out_a = model(digit_a)
out_b = model(digit_b)

concatenated = concatenate([out_a, out_b])
out = Dense(2048, activation='relu')(concatenated)

concat_model = Model([digit_a, digit_b], out)

# batch size = 32, number of unroll steps = 2; I am not sure how to pass a multi-input sequence as input

frame_sequence = Input(batch_shape=(32, 2,224,224,3))
unroll_feature = TimeDistributed(concat_model)(frame_sequence)`

And the error message I got

Using TensorFlow backend.
Traceback (most recent call last):
  File "/home/rajat/Downloads/pycharm-2017.2.3/helpers/pydev/pydevd.py", line 1599, in
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/rajat/Downloads/pycharm-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/rajat/Downloads/re3-tensorflow-master/keras_training/vision_model.py", line 52, in
    unroll_feature = TimeDistributed(concat_model)(frame_sequence)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 602, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/layers/wrappers.py", line 188, in call
    unroll=False)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2467, in rnn
    outputs, _ = step_function(inputs[0], initial_states + constants)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/layers/wrappers.py", line 179, in step
    output = self.layer.call(x, **kwargs)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2058, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2262, in run_internal_graph
    assert str(id(x)) in tensor_map, 'Could not compute output ' + str(x)
AssertionError: Could not compute output Tensor("dense_1/Relu:0", shape=(?, 2048), dtype=float32)

Hi @rajatkoner08,
Please let me know if you were able to solve this problem and how to fix it. Thanks in advance. Madhu

The layer needs to be built explicitly, like so:

self.lstm_custom_1 = keras.layers.LSTM(128,batch_input_shape=batch_input_shape, return_sequences=False, stateful=True)

self.lstm_custom_1.build(batch_input_shape)

Using Keras for R with the functional API, I am observing a similar problem that I can't resolve with the advice given above, since the cases above refer to Keras for Python and are (for me) not easily transferred to Keras for R.

Since this thread was closed long ago, I have raised the topic anew under issue #13262. Hopefully there will be replies with respect to Keras for R.
