Keras: How to build a deep learning LSTM RNN in python Keras?

Created on 27 Jan 2018 · 4 Comments · Source: keras-team/keras

I am trying to build a deep learning network based on an LSTM RNN. Here is what I have tried:

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
import numpy as np

train = np.loadtxt("TrainDatasetFinal.txt", delimiter=",")
test = np.loadtxt("testDatasetFinal.txt", delimiter=",")

y_train = train[:,7]
y_test = test[:,7]

train_spec = train[:,6]
test_spec = test[:,6]


model = Sequential()
model.add(LSTM(32, input_shape=(1415684, 8)))
model.add(LSTM(64, input_dim=1, input_length=1415684, return_sequences=True))

model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')

model.fit(train_spec, y_train, batch_size=2000, nb_epoch=11)
score = model.evaluate(test_spec, y_test, batch_size=2000)

But it gives me the following error:
ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2

Here is a sample from the dataset

(patient number, time in milliseconds, accelerometer x-axis, y-axis, z-axis, magnitude, spectrogram, label (0 or 1))

1,15,70,39,-970,947321,596768455815000,0
1,31,70,39,-970,947321,612882670787000,0
1,46,60,49,-960,927601,602179976392000,0
1,62,60,49,-960,927601,808020878060000,0
1,78,50,39,-960,925621,726154800929000,0

In the dataset I am using only the spectrogram as the input feature and the label (0 or 1) as the output. The total number of training samples is 1,415,684.

deerdodo

Most helpful comment

@deerdodo In order to process your data set with an LSTM model, you need to create sequences of measurements. Assuming that your training data are:

X_train (5 x 1 matrix)
array([[596768455815000],
       [612882670787000],
       [602179976392000],
       [808020878060000],
       [726154800929000]], dtype=int64)

Y_train (5 x 1 matrix)
array([[0],
       [0],
       [0],
       [0],
       [0]])

you need to create sequences of length L. For example, with L = 3 time steps:

n = X_train.shape[0]
L = 3
X_train_seq = []
Y_train_seq = []
for k in range(n - L + 1):
    X_train_seq.append(X_train[k : k + L])
    Y_train_seq.append(Y_train[k : k + L])

X_train_seq = np.array(X_train_seq)
Y_train_seq = np.array(Y_train_seq)

X_train_seq (3 x 3 x 1 matrix)
array([[[596768455815000],
        [612882670787000],
        [602179976392000]],
       [[612882670787000],
        [602179976392000],
        [808020878060000]],
       [[602179976392000],
        [808020878060000],
        [726154800929000]]], dtype=int64)
Y_train_seq (3 x 3 x 1 matrix)
array([[[0],
        [0],
        [0]],
       [[0],
        [0],
        [0]],
       [[0],
        [0],
        [0]]])
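
For a data set with over a million rows, the Python loop above can be slow. As a minimal sketch, the same windows can be built with NumPy's `sliding_window_view` (available from NumPy 1.20). The helper name `make_windows` is hypothetical and not part of the original answer; the result is a read-only view, so copy it if it needs to be modified.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20


def make_windows(a, L):
    """Return all length-L windows of an (n, features) array
    as an array of shape (n - L + 1, L, features)."""
    # sliding_window_view appends the window axis last: (n - L + 1, features, L)
    windows = sliding_window_view(a, window_shape=L, axis=0)
    # Move the window axis to the middle: (samples, time steps, features)
    return windows.transpose(0, 2, 1)


X_train = np.array([[596768455815000],
                    [612882670787000],
                    [602179976392000],
                    [808020878060000],
                    [726154800929000]], dtype=np.int64)
Y_train = np.zeros((5, 1), dtype=np.int64)

X_train_seq = make_windows(X_train, L=3)
Y_train_seq = make_windows(Y_train, L=3)
print(X_train_seq.shape, Y_train_seq.shape)  # (3, 3, 1) (3, 3, 1)
```

The output matches the loop-based arrays shown above, element for element.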

Now define a model that takes sequences of length 3 as input and produces a prediction at each time step:

model = Sequential()
model.add(LSTM(32, input_shape=(3, 1), return_sequences=True))
model.add(LSTM(1, return_sequences=True))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.summary()

and you are ready to train it with model.fit(X_train_seq, Y_train_seq). You can also train a recurrent network to predict only one label per sequence (many-to-one) instead of a label for each time step (many-to-many). Take a look at the first image here: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
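
For the many-to-one case, only the targets change: each window gets a single label instead of one per time step. The sketch below assumes that a window's label is the label of its last time step. The thread does not specify this choice, and the helper name `make_many_to_one` is hypothetical.

```python
import numpy as np


def make_many_to_one(X, Y, L):
    """Build (samples, L, features) inputs and one label per window.

    Assumption: a window's label is the label of its last time step.
    """
    n = X.shape[0]
    X_seq = np.array([X[k:k + L] for k in range(n - L + 1)])
    Y_last = Y[L - 1:]  # label at the final step of each window
    return X_seq, Y_last


# Toy data, not taken from the question's data set
X = np.arange(5, dtype=np.float64).reshape(5, 1)
Y = np.array([[0], [0], [1], [1], [0]])

X_seq, Y_last = make_many_to_one(X, Y, L=3)
print(X_seq.shape, Y_last.shape)  # (3, 3, 1) (3, 1)
```

With targets shaped like this, the last recurrent layer would keep return_sequences at its default of False, so the model emits one output per window.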

tomastheod-ITI on 31 Jan 2018

👍3

All 4 comments

The time dimension is getting dropped from the first LSTM layer. Try returning the full output sequence like so: model.add(LSTM(32, input_shape=(1415684, 8), return_sequences=True))

pavithrasv on 27 Jan 2018

I tried, but it gives me a new error:

ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (1415684, 1)

deerdodo on 27 Jan 2018

👍2
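
The error above comes from the shape of the data rather than from the layers: an LSTM layer expects a 3-D array of shape (samples, time steps, features), while a single column selected from the loaded file is 1-D. A minimal sketch of the mismatch, using the five sample rows from the question in place of the real file. Both reshapes are illustrative assumptions about how the data could be arranged, not recommendations.

```python
import io
import numpy as np

# The five sample rows from the question, standing in for TrainDatasetFinal.txt
sample = io.StringIO(
    "1,15,70,39,-970,947321,596768455815000,0\n"
    "1,31,70,39,-970,947321,612882670787000,0\n"
    "1,46,60,49,-960,927601,602179976392000,0\n"
    "1,62,60,49,-960,927601,808020878060000,0\n"
    "1,78,50,39,-960,925621,726154800929000,0\n"
)
train = np.loadtxt(sample, delimiter=",")

train_spec = train[:, 6]  # a single column is 1-D
print(train_spec.shape)   # (5,)

# Option A (assumption): treat the whole recording as one long sequence
as_one_sequence = train_spec.reshape(1, -1, 1)
print(as_one_sequence.shape)  # (1, 5, 1)

# Option B (assumption): treat every row as its own one-step sequence
as_single_steps = train_spec.reshape(-1, 1, 1)
print(as_single_steps.shape)  # (5, 1, 1)
```

Either reshape passes the dimension check, since input_shape describes only (time steps, features) and excludes the samples axis. Option A yields just one training sample, and option B gives the LSTM no temporal context, so neither is necessarily a sensible model.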

@tomastheod-ITI I am having a similar issue with a multivariate LSTM RNN model.
I posted a question about it on Stack Overflow here, comparing an NN with an RNN. I believe my LSTM should work with return_sequences set to True, since I want the LSTM to treat the input as a time series of multiple variables.
However, like @deerdodo, I am getting several errors about the dimensions.
I tried the following, and similar variations, but it is still not working.

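from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

def Model_RNN_LSTM_2_keras(input_features, window_size, output_features):
    hidden_neurons = 300
    model = Sequential()  # the model has to be created before layers are added
    # return_sequences=True keeps one output per time step of the window
    model.add(LSTM(hidden_neurons, return_sequences=True, input_shape=(window_size, input_features)))
    model.add(Dense(output_features))
    model.add(Activation("linear"))
    model.compile(loss="mean_squared_error", optimizer="rmsprop")
    model.summary()  # prints the summary itself
    return model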

model = Model_RNN_LSTM_2_keras(X_train.shape[2], X_train.shape[1], y_train.shape[1])

*See the link above for how the X_train and y_train matrices are created.
In summary, it is a dataset with 4 features + 1 target, where the 60 previous steps are used as input to predict the following 20 steps, using the features of those 20 steps (to predict the target at each step). 720 samples are used for training and 120 for testing.

Any idea how to solve the problem and improve the accuracy with the RNN?