Keras: Stacking multiple LSTM layers yields an error

Created on 26 May 2015 · 6 comments · Source: keras-team/keras

I tried to create a network with multiple LSTM layers. No matter what I try, this and similar attempts yield an error:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM

nb_chars = 26 # a..z
nb_nodes = 50

model = Sequential()
model.add(Embedding(nb_chars, nb_chars))
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))
model.add(Dropout(0.5))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')

Error:

Traceback (most recent call last):
  File "test.py", line 21, in <module>
    model.compile(loss='binary_crossentropy', optimizer='rmsprop')
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 71, in compile
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 155, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 140, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 233, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/recurrent.py", line 338, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/var.py", line 341, in dimshuffle
    pattern)
  File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/elemwise.py", line 141, in __init__
    (i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.

If I replace one of the LSTM layers with, say, a Dense layer, it works. I cannot figure out why; according to the documentation, the inputs and outputs of both should match.

Most helpful comment

An LSTM layer, per the docs, returns only the last output vector by default rather than the entire sequence. To return the entire sequence (which is necessary for stacking LSTM layers), use the constructor argument return_sequences=True.

All 6 comments

Maybe try this: Dense -> TimeDistributedDense.
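For example, if every LSTM in the stack keeps return_sequences=True (so the output stays 3D), the final Dense does need to become TimeDistributedDense so the same weights are applied at each timestep. A minimal sketch in the same 0.x-style API as the report (the dimensions are illustrative):

from keras.models import Sequential
from keras.layers.core import TimeDistributedDense, Activation
from keras.layers.recurrent import LSTM

nb_chars = 26
nb_nodes = 50

model = Sequential()
# 3D output: (samples, timesteps, nb_nodes)
model.add(LSTM(nb_chars, nb_nodes, return_sequences=True))
# applies the same Dense weights independently at every timestep
model.add(TimeDistributedDense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))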

An LSTM layer, per the docs, returns only the last output vector by default rather than the entire sequence. To return the entire sequence (which is necessary to be able to stack LSTM layers), use the constructor argument return_sequences=True.

I don't want the entire sequence. My original question was not about TimeDistributedDense. Reading the docs, it should be possible to stack two LSTM layers.

And possible it is. Again: just use the constructor argument return_sequences=True in your intermediate LSTM(s).
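For reference, here is a minimal corrected version of the original script (a sketch only, using the same 0.x-era API as the report): the first LSTM returns the full sequence, so the second LSTM receives the 3D input it expects, and the second LSTM returns only its final vector for the Dense layer.

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM

nb_chars = 26  # a..z
nb_nodes = 50

model = Sequential()
model.add(Embedding(nb_chars, nb_chars))
# return_sequences=True keeps the output 3D (samples, timesteps, features),
# which is what the next LSTM needs as input
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid',
               inner_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.5))
# the last LSTM returns only the final output vector (2D), ready for Dense
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid',
               inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')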

Hello fchollet,
I ran into the same problem as Tener. I added return_sequences=True to my code, but the error still occurs.

Here is my code:

model = Sequential()

model.add(Convolution2D(100, 1, 2, 5, border_mode='valid'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(49600,peptidemaxLen, init='normal'))
model.add(Activation('softmax'))

model.add(LSTM(peptidemaxLen, peptidemaxLen, return_sequences=True))
model.add(Dropout(0.2))

Before adding the LSTM layer, the code runs successfully. After adding it, however, I get the error:
"ValueError: new_order[2] is 2, but the input only has 2 axes."

Thank you so much for your great help!!

LSTM layers require 3D input of shape [samples, timesteps, features]. After your Flatten and Dense layers the tensor is only 2D, [samples, features], which is why the error occurs.
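For instance, one way to bridge that gap in the code above is to repeat the 2D Dense output along a new time axis before the LSTM. This is a sketch only: RepeatVector is assumed to be available in this Keras version, the dimensions are illustrative, and peptidemaxLen is given a hypothetical value since the original post does not define it.

from keras.models import Sequential
from keras.layers.core import Dense, Activation, RepeatVector
from keras.layers.recurrent import LSTM

peptidemaxLen = 9  # hypothetical value; not defined in the original post

model = Sequential()
model.add(Dense(49600, peptidemaxLen, init='normal'))  # 2D output: (samples, peptidemaxLen)
model.add(Activation('softmax'))
model.add(RepeatVector(peptidemaxLen))  # 3D output: (samples, peptidemaxLen, peptidemaxLen)
model.add(LSTM(peptidemaxLen, peptidemaxLen, return_sequences=True))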
