Keras: Stacking multiple LSTM layers yields an error

Created on 26 May 2015 · 6 comments · Source: keras-team/keras

I tried to create a network with multiple LSTM layers. No matter what I try, this and similar attempts yield an error:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM

nb_chars = 26 # a..z
nb_nodes = 50

model = Sequential()
model.add(Embedding(nb_chars, nb_chars))
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))
model.add(Dropout(0.5))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')

Error:

Traceback (most recent call last):
  File "test.py", line 21, in <module>
    model.compile(loss='binary_crossentropy', optimizer='rmsprop')
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 71, in compile
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 155, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 140, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 233, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/recurrent.py", line 338, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/var.py", line 341, in dimshuffle
    pattern)
  File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/elemwise.py", line 141, in __init__
    (i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.

If I replace one of the LSTM layers with, say, a Dense layer, it works. I cannot figure out why; according to the documentation, the inputs and outputs of both should match.

Most helpful comment

An LSTM layer, per the docs, returns only the last output vector by default rather than the entire sequence. To return the entire sequence (which is necessary for stacking LSTM layers), use the constructor argument return_sequences=True.

All 6 comments

Maybe try this: Dense -> TimeDistributedDense.
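For example, if every LSTM in the stack keeps return_sequences=True (so the output stays 3D), the final Dense does need to become TimeDistributedDense so the same weights are applied at each timestep. A minimal sketch in the same 0.x-style API as the report (the dimensions are illustrative):

from keras.models import Sequential
from keras.layers.core import TimeDistributedDense, Activation
from keras.layers.recurrent import LSTM

nb_chars = 26
nb_nodes = 50

model = Sequential()
# 3D output: (samples, timesteps, nb_nodes)
model.add(LSTM(nb_chars, nb_nodes, return_sequences=True))
# applies the same Dense weights independently at every timestep
model.add(TimeDistributedDense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))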

An LSTM layer, per the docs, returns only the last output vector by default rather than the entire sequence. To return the entire sequence (which is necessary to be able to stack LSTM layers), use the constructor argument return_sequences=True.

I don't want the entire sequence. My original question was not about TimeDistributedDense. Reading the docs, it should be possible to stack two LSTM layers.

And possible it is. Again: just use the constructor argument return_sequences=True in your intermediate LSTM(s).
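For reference, here is a minimal corrected version of the original script (a sketch only, using the same 0.x-era API as the report): the first LSTM returns the full sequence, so the second LSTM receives the 3D input it expects, and the second LSTM returns only its final vector for the Dense layer.

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM

nb_chars = 26  # a..z
nb_nodes = 50

model = Sequential()
model.add(Embedding(nb_chars, nb_chars))
# return_sequences=True keeps the output 3D (samples, timesteps, features),
# which is what the next LSTM needs as input
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid',
               inner_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.5))
# the last LSTM returns only the final output vector (2D), ready for Dense
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid',
               inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')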

Hello fchollet,
I ran into the same problem as Tener. I added return_sequences=True to my code, but the error still occurs.

Here is my code:

model = Sequential()

model.add(Convolution2D(100, 1, 2, 5, border_mode='valid'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(49600,peptidemaxLen, init='normal'))
model.add(Activation('softmax'))

model.add(LSTM(peptidemaxLen, peptidemaxLen, return_sequences=True))
model.add(Dropout(0.2))

Before adding the LSTM layer, the code runs successfully. After adding it, however, I get the error:
"ValueError: new_order[2] is 2, but the input only has 2 axes."

Thank you so much for your great help!!

LSTM layers require 3D input of shape [samples, timesteps, features]. After your Flatten and Dense layers the tensor is only 2D, [samples, features], which is why the error occurs.
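For instance, one way to bridge that gap in the code above is to repeat the 2D Dense output along a new time axis before the LSTM. This is a sketch only: RepeatVector is assumed to be available in this Keras version, the dimensions are illustrative, and peptidemaxLen is given a hypothetical value since the original post does not define it.

from keras.models import Sequential
from keras.layers.core import Dense, Activation, RepeatVector
from keras.layers.recurrent import LSTM

peptidemaxLen = 9  # hypothetical value; not defined in the original post

model = Sequential()
model.add(Dense(49600, peptidemaxLen, init='normal'))  # 2D output: (samples, peptidemaxLen)
model.add(Activation('softmax'))
model.add(RepeatVector(peptidemaxLen))  # 3D output: (samples, peptidemaxLen, peptidemaxLen)
model.add(LSTM(peptidemaxLen, peptidemaxLen, return_sequences=True))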
