When I add 'stateful' to an LSTM, I get the following exception: If a RNN is stateful, a complete input_shape must be provided (including batch size).
Based on other threads #1125 #1130 I am using the "batch_input_shape" option, yet I am still getting the error.
I raised the same question on the forum https://groups.google.com/forum/#!topic/keras-users/nwB3ilYY4ZQ
but got no response.
You can find my complete code here:
https://github.com/anujgupta82/DeepNets/blob/master/LSTM/IMDB_Embedding_w2v_LSTM_3.ipynb
batch_input_shape must be passed to the first layer of the network.
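For reference, a minimal sketch of what that looks like (the layer sizes and shape values below are illustrative placeholders, not taken from the code above):

from keras.models import Sequential
from keras.layers import LSTM, Dense

batch_size, timesteps, features = 32, 10, 8  # illustrative values

model = Sequential()
# Stateful RNNs need the full batch shape, batch size included,
# on the first layer of the network.
model.add(LSTM(64, stateful=True,
               batch_input_shape=(batch_size, timesteps, features)))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')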
Same problem here:
model = Sequential()
model.add(GRU(100, activation='relu', stateful=True, return_sequences=True,
              batch_input_shape=(batch_size, X_train.shape[-2], X_train.shape[-1])))
...
What is your X_train.shape[0]? Do all your batches have the same number of samples? That is a must when using stateful RNNs.
X_train.shape[0] is the number of samples.
I have just tried using a batch_size that is a factor of the number of samples (so all batches have exactly the same number of samples) and it works, thanks!
A note mentioning this in the documentation section on statefulness might be helpful.
Edit: it's already there, my bad.
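To make the constraint concrete, here is a minimal sketch (the variable names follow the snippet above; the batch size value is an illustrative placeholder):

# Every batch must contain exactly batch_size samples, so the sample
# count must be divisible by the batch size when stateful=True.
n_samples = X_train.shape[0]
batch_size = 25  # pick any exact divisor of n_samples
assert n_samples % batch_size == 0, 'batch_size must divide n_samples'

# shuffle=False keeps samples in order so carried-over state stays meaningful
model.fit(X_train, y_train, batch_size=batch_size, epochs=10, shuffle=False)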
Have a look at this: http://philipperemy.github.io/keras-stateful-lstm/
If batch_input_shape must be specified in the first layer of a stateful network, how is this done when using the functional API? The Input() layer will not accept it. I have tried everything I can think of but am still receiving the same exception ("complete input_shape must be provided (including batch size)"), even with a batch size of 1. I am trying to build an LRCN using TimeDistributed CNN layers, followed by a couple of dense layers, followed by an LSTM:
inputs = Input(shape=(1, 3, 227, 227))
conv_1 = TimeDistributed(Convolution2D(96, 11, 11, subsample=(4, 4), activation='relu',
                                       name='conv_1'))(inputs)
conv_2 = TimeDistributed(MaxPooling2D((3, 3), strides=(2, 2)))(conv_1)
conv_2 = TimeDistributed(LRN(name='convpool_1'))(conv_2)
conv_2 = TimeDistributed(ZeroPadding2D((2, 2)))(conv_2)
conv_2 = TimeDistributed(Convolution2D(128, 5, 5, activation='relu', name='conv_2'))(conv_2)
...skipping similar conv layers...
dense_1 = TimeDistributed(MaxPooling2D((3, 3), strides=(2, 2), name='convpool_5'))(conv_5)
dense_1 = TimeDistributed(Flatten(name='flatten'))(dense_1)
dense_1 = TimeDistributed(Dense(4096, activation='relu', name='dense_1'))(dense_1)
dense_2 = Dropout(0.5)(dense_1)
dense_2 = TimeDistributed(Dense(4096, activation='relu', name='dense_2'))(dense_2)
lstm_1 = Dropout(0.5)(dense_2)
lstm_1 = LSTM(100,
              batch_input_shape=(1, 1, 4096),  # (batch size, timesteps, feature shape)
              return_sequences=False,
              stateful=True)(lstm_1)
dense_3 = Dense(6, name='dense_out')(lstm_1)
prediction = Activation('tanh', name='tanh')(dense_3)
(This is a regression problem; I am trying to predict six values at each time step based on an image sequence.)
The same network built with the Sequential API does not produce this exception, but I am hoping to take advantage of the functional API, so I'd like to figure out what I'm doing wrong.
Figured it out from topology.py - the error message is misleading. The "Input" function takes argument "batch_shape", not "batch_input_shape".
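Applied to the snippet above, the fix is a one-line change (the batch size of 1 matches the batch_input_shape the LSTM was given):

# Raises the stateful-RNN exception: no batch size is fixed
inputs = Input(shape=(1, 3, 227, 227))

# Works: batch_shape pins the batch size for the whole graph, so the
# stateful LSTM downstream no longer needs its own batch_input_shape
inputs = Input(batch_shape=(1, 1, 3, 227, 227))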
@cmgladding
I tried "batch_shape", but it was not recognized by Keras. I don't know why; the issue persists for me no matter what keyword I use.
I also have this issue with a functional model.
Here is my example for those who get stuck. Indeed, the error message is misleading. I had to change Input(shape=()) to Input(batch_shape=()) in order for it to work.
Incorrect version:
frame_sequence = Input(shape=(TIME_STEPS, HEIGHT, WIDTH, CHANNELS))
...
net = TimeDistributed(self.vision_model)(frame_sequence)
net = LSTM(HIDDEN_UNITS, stateful=True, return_sequences=False)(net)
Correct version:
frame_sequence = Input(batch_shape=(BATCH_SIZE, TIME_STEPS, HEIGHT, WIDTH, CHANNELS))
...
net = TimeDistributed(self.vision_model)(frame_sequence)
net = LSTM(HIDDEN_UNITS, stateful=True, return_sequences=False)(net)
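One usage note on the design: with stateful=True, Keras carries the LSTM state across batches instead of resetting it, so it becomes the caller's job to reset at sequence boundaries, typically once per epoch. A sketch (the loop and variable names are illustrative):

for epoch in range(num_epochs):
    model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=1, shuffle=False)
    model.reset_states()  # clear the carried-over state before the next pass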
Hi,
I am facing almost the same issue with a shared convnet: two feature outputs are concatenated, fed to a Dense layer, and then into an LSTM. The model is like Fig. 1 of https://arxiv.org/pdf/1705.06368.pdf
I have been stuck on this issue for a long time; any help is much appreciated. @farizrahman4u @dat-ai
# First, define the vision modules
input_dim = (224, 224, 3)
image_input = Input(shape=input_dim)
vision_model = Conv2D(64, (3, 3), activation='relu', padding='same')(image_input)
vision_model = Conv2D(64, (3, 3), activation='relu')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
vision_model = Conv2D(128, (3, 3), activation='relu', padding='same')(vision_model)
vision_model = Conv2D(128, (3, 3), activation='relu')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
vision_model = Conv2D(256, (3, 3), activation='relu', padding='same')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
out = Flatten()(vision_model)
model = Model(image_input, out)

digit_a = Input(shape=input_dim)
digit_b = Input(shape=input_dim)
out_a = model(digit_a)
out_b = model(digit_b)

concatenated = concatenate([out_a, out_b])
out = Dense(2048, activation='relu')(concatenated)
concat_model = Model([digit_a, digit_b], out)

frame_sequence = Input(batch_shape=(32, 2, 224, 224, 3))
unroll_feature = TimeDistributed(concat_model)(frame_sequence)
And this is the error message I got:
Using TensorFlow backend.
Traceback (most recent call last):
  File "/home/rajat/Downloads/pycharm-2017.2.3/helpers/pydev/pydevd.py", line 1599, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/rajat/Downloads/pycharm-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/rajat/Downloads/re3-tensorflow-master/keras_training/vision_model.py", line 52, in <module>
    unroll_feature = TimeDistributed(concat_model)(frame_sequence)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 602, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/layers/wrappers.py", line 188, in call
    unroll=False)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2467, in rnn
    outputs, _ = step_function(inputs[0], initial_states + constants)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/layers/wrappers.py", line 179, in step
    output = self.layer.call(x, **kwargs)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2058, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2262, in run_internal_graph
    assert str(id(x)) in tensor_map, 'Could not compute output ' + str(x)
AssertionError: Could not compute output Tensor("dense_1/Relu:0", shape=(?, 2048), dtype=float32)
Hi @rajatkoner08,
Please let me know if you were able to solve this problem and how you fixed it. Thanks in advance, Madhu.
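The likely cause of the assertion above: concat_model takes two inputs (digit_a and digit_b), but TimeDistributed hands it only the single tensor frame_sequence, so the internal graph cannot compute the Dense output. TimeDistributed only supports single-input models. One possible workaround is to time-distribute the shared single-input vision model per stream and concatenate afterwards; a sketch reusing the names above (the per-timestep Dense is an assumption about the intended architecture):

# Two sequence inputs, one per stream, each fed through the shared convnet
seq_a = Input(batch_shape=(32, 2, 224, 224, 3))
seq_b = Input(batch_shape=(32, 2, 224, 224, 3))

feat_a = TimeDistributed(model)(seq_a)  # model is the single-input convnet above
feat_b = TimeDistributed(model)(seq_b)

# Concatenate per time step, then apply the Dense layer per time step
merged = concatenate([feat_a, feat_b])
fused = TimeDistributed(Dense(2048, activation='relu'))(merged)
two_stream_model = Model([seq_a, seq_b], fused)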
Returning to the original question: the stateful LSTM needs to be built explicitly, like so:
self.lstm_custom_1 = keras.layers.LSTM(128, batch_input_shape=batch_input_shape,
                                       return_sequences=False, stateful=True)
self.lstm_custom_1.build(batch_input_shape)
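In context, a minimal sketch of that pattern inside a subclassed model (tf.keras style; the class name and shape values are illustrative assumptions):

import tensorflow as tf

class StatefulNet(tf.keras.Model):
    def __init__(self, batch_input_shape=(32, 10, 8)):
        super(StatefulNet, self).__init__()
        self.lstm_custom_1 = tf.keras.layers.LSTM(
            128, batch_input_shape=batch_input_shape,
            return_sequences=False, stateful=True)
        # Stateful layers need the full batch shape at build time,
        # so build explicitly rather than waiting for the first call
        self.lstm_custom_1.build(batch_input_shape)

    def call(self, inputs):
        return self.lstm_custom_1(inputs)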
Using Keras for R with the functional API, I am observing a similar problem which I can't resolve with the advice given above, since the cases above refer to Keras for Python and are (for me) not easily transferred to Keras for R.
Since this thread was closed long ago, I have raised this topic anew under issue #13262 - hopefully there will be replies with respect to Keras for R.