Hi there,
I'm building an RNN to assign an output label to each input element in the sequence, for activity recognition based on location. In this toy model, the shape of each input location is 4x1 and the shape of each output activity is 3x1. There are two hidden layers; the shape of each hidden component is 3x1.

My question is: how should I construct the model? Do I need to use an Embedding layer? Should I use two TimeDistributedDense layers or two GRU/LSTM layers for my two hidden layers?
Please help, and I hope I can contribute an example to the repo :)
My code snippet is shown below.
from keras.models import Sequential

input_dim = 4
output_dim = 3
hidden_dim = 3
print('Build model...')
model = Sequential()
# TODO: add layers to model
print('Compile model...')
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.fit(X_train, Y_train, batch_size=1, nb_epoch=10)
print('Done')
Embedding layers are used for text vectorization. This is not your use case.
You could use one of these networks:
model = Sequential() # input has shape (samples, timesteps, locations)
model.add(LSTM(input_dim, output_dim, return_sequences=True))
model.add(Activation('time_distributed_softmax')) # output has shape (samples, timesteps, activities)

or:

model = Sequential()
model.add(LSTM(input_dim, hidden_dim, return_sequences=True))
model.add(TimeDistributedDense(hidden_dim, output_dim))
model.add(Activation('time_distributed_softmax')) # output has shape (samples, timesteps, activities)
You can try replacing LSTM with GRU; if your data is simple (it seems to be), chances are it will work better.
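For example, here is the second network with GRU swapped in, a sketch using the same old-style constructor arguments as the snippets above:

# GRU variant of the LSTM + TimeDistributedDense network above
model = Sequential()
model.add(GRU(input_dim, hidden_dim, return_sequences=True))
model.add(TimeDistributedDense(hidden_dim, output_dim))
model.add(Activation('time_distributed_softmax')) # output has shape (samples, timesteps, activities)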
Thanks for your answer. That was very helpful.
I just want to make sure I'm getting it right. I have two more questions:
model = Sequential() # input has shape (samples, timesteps, locations)
model.add(LSTM(input_dim, hidden_dim, return_sequences=True))
model.add(LSTM(hidden_dim, hidden_dim, return_sequences=True))
model.add(LSTM(hidden_dim, hidden_dim, return_sequences=True))
model.add(LSTM(hidden_dim, output_dim, return_sequences=True))
model.add(Activation('time_distributed_softmax')) # output has shape (samples, timesteps, activities)
Thanks!
Thanks for your help! I just found this paper: Gated Feedback Recurrent Neural Networks (http://arxiv.org/pdf/1502.02367.pdf)
Is the current implementation of TimeDistributedDense the same as the concept of Gated Feedback RNN in the paper?
No, TimeDistributedDense is exactly what it sounds like: simply a Dense layer applied independently at every timestep of its input. The distinction between Dense and TimeDistributedDense is simply that a Dense layer expects 2D input (batch_size, sample_size), whereas TimeDistributedDense expects 3D input (batch_size, time_steps, sample_size). It should be used in conjunction with TimeDistributedSoftmax for the same reason (2D vs. 3D expected input).
There is a GRU layer, however: https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L156-253
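To make the 2D vs. 3D distinction concrete, here is a minimal shape sketch (the layer sizes are placeholders, not a recommendation):

# Dense: (batch_size, sample_size) -> (batch_size, output_dim)
# TimeDistributedDense: (batch_size, time_steps, sample_size) -> (batch_size, time_steps, output_dim)
model = Sequential()
model.add(LSTM(input_dim, hidden_dim, return_sequences=True)) # 3D output: (batch_size, time_steps, hidden_dim)
model.add(TimeDistributedDense(hidden_dim, output_dim)) # same Dense weights applied at every timestep
model.add(Activation('time_distributed_softmax')) # softmax applied per timestep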
@zxcvbn97, @fchollet
I'm working on almost the same problem with sentence labeling,
A simple LSTM + TimeDistributedDense network shows 95% accuracy on the test dataset during training, but when I try to predict new sentences with the model.predict(X_i) method, almost all elements of the sequence are classified wrong; it seems like the network just learned some mapping. Do you have any ideas why this happens? Thank you.
@zxcvbn97 @fchollet
May I ask how to set input_dim, hidden_dim, and output_dim? Suppose my training data has shape (10000, 50, 40) (samples, timesteps, features), and I need an output for each timestep with categorical labels (11 categories), i.e. shape (10000, 50, 11) (samples, timesteps, categories).
I tried setting it like this:
model = Sequential()
model.add(LSTM(input_dim=(50,40),output_dim=(128,1),return_sequences=True))
model.add(LSTM(input_dim=(128,1), output_dim=(50,11), return_sequences=True))
model.add(Activation('time_distributed_softmax'))
Unfortunately it does not work, but I don't know how to fix it.
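Based on the earlier examples in this thread, I'd guess the dimensions should be per-timestep sizes rather than (timesteps, features) tuples, something like the sketch below (128 hidden units is just an arbitrary choice), but I'm not sure:

model = Sequential()
model.add(LSTM(40, 128, return_sequences=True)) # (samples, 50, 40) -> (samples, 50, 128)
model.add(TimeDistributedDense(128, 11)) # (samples, 50, 128) -> (samples, 50, 11)
model.add(Activation('time_distributed_softmax')) # per-timestep softmax over the 11 categories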
Besides, I'm wondering how to do pre-training. Thanks a lot!
@zxcvbn97 I think you are using the default accuracy for evaluation. Use metrics.categorical_accuracy instead to get the real accuracy for your case, since it is a multiclass problem.
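For example, in a Keras version whose compile() accepts a metrics argument (newer than the snippets earlier in this thread), that would look something like:

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['categorical_accuracy']) # report categorical accuracy instead of the default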