Dear all, I have been struggling to build an LSTM model with multiple (2) hidden layers of size 512 and a fixed time step of 10 (subject to change). However, I have only succeeded in building a single-layer LSTM network, whose performance was far poorer than I expected. My model as of right now is as follows:
model = Sequential()
model.add(LSTM(512, input_shape=(10, 13)))
model.add(Dense(1201))
As shown in the code, I am feeding 3D inputs of dim: (batch_size, time_step, input_dim) = (batch_size, 10, 13). How should I add another layer to this network? The Keras documentation states that
model.add(LSTM(512))
should work, since the model infers the input shape of a hidden layer automatically, but this gives the following error:
Exception: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2
To clarify, I want each sequence of 10 inputs to output one label, instead of a sequence of 10 labels. Any guidance on building this model would be much appreciated.
Thank you all very much for your help in advance!
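from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# return_sequences=True passes all 10 timesteps on to the second LSTM
model.add(LSTM(512, return_sequences=True, input_shape=(10, 13)))
model.add(LSTM(512))  # returns only the output at the last timestep
model.add(Dense(1201))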
Why do I need return_sequences=True for the first layer?
I only want the label at the end of each sequence of size 10
Then you can't stack another LSTM on top of it.
@msyim LSTM accepts input of shape (n_samples, n_timestamps, ...). Specifying return_sequences=True makes the LSTM layer return the full history, including the outputs at all times (i.e. the shape of the output is (n_samples, n_timestamps, n_outdims)). Otherwise, the return value contains only the output at the last timestamp (i.e. the shape will be (n_samples, n_outdims)), which is invalid as the input of the next LSTM layer.
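The shape rule described above can be traced by hand. The following is a minimal sketch in plain Python that only models the shape bookkeeping; it does not call Keras, and the helper names `lstm_output_shape` and `dense_output_shape` are hypothetical, introduced here for illustration. It assumes the (batch_size, 10, 13) input from the question, with None standing in for the unknown batch size.

```python
# Hypothetical helpers: they model the shape rules only and do not call Keras.

def lstm_output_shape(input_shape, units, return_sequences=False):
    """Output shape of an LSTM layer, given a 3D input shape."""
    if len(input_shape) != 3:
        # Mirrors the kind of error reported in the question.
        raise ValueError(f"expected ndim=3, found ndim={len(input_shape)}")
    n_samples, n_timestamps, _ = input_shape
    if return_sequences:
        # Full history: one output vector per timestep.
        return (n_samples, n_timestamps, units)
    # Only the output at the last timestep.
    return (n_samples, units)


def dense_output_shape(input_shape, units):
    """A Dense layer replaces the last axis with `units`."""
    return input_shape[:-1] + (units,)


x = (None, 10, 13)  # (batch_size, time_step, input_dim); None = any batch size

# Stacked correctly: the first LSTM keeps the time axis for the second one.
h1 = lstm_output_shape(x, 512, return_sequences=True)
h2 = lstm_output_shape(h1, 512)
y = dense_output_shape(h2, 1201)
print(h1, h2, y)

# Without return_sequences=True the first LSTM emits a 2D tensor,
# so a second LSTM on top of it is rejected.
flat = lstm_output_shape(x, 512)
try:
    lstm_output_shape(flat, 512)
except ValueError as err:
    print("second LSTM rejected:", err)
```

Because the second LSTM uses the default return_sequences=False, the network still emits one vector per 10-step sequence, which is what the original question asks for.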
So we can set return_sequences=True on the 1st layer and return_sequences=False on the 2nd.
I know this works, because I've done so. But the problem I now encounter is that the output layer (Dense) raises the error "ValueError: Input 0 is incompatible with layer bidirectional_1: expected ndim=3, found ndim=4"
Any advice?