Hey there,
I've been trying to connect a Dense layer to an LSTM that comes after it. The code below shows what I mean more concretely:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout
from keras.layers.recurrent import LSTM

qrnn = Sequential()
qrnn.add(Dense(512, input_dim=X.shape[1]))
qrnn.add(Dropout(0.5))
qrnn.add(LSTM(512, return_sequences=True))
qrnn.add(Dropout(0.2))
qrnn.add(LSTM(512, return_sequences=False))
But I get this error
Incompatible shapes: layer expected input with ndim=3 but previous layer has output_shape (None, 512)
LSTMs expect a sequence of vectors per sample, i.e. a 3D input, whereas a Dense layer outputs a single vector per sample. You need a TimeDistributedDense in front there instead of a Dense, or an Embedding layer if your inputs are sequences of categorical integers.
I just tried adding TimeDistributedDense (Embedding works in other scenarios I've tested, but this one doesn't have categorical integers), and I still get the exact same error. X is a large matrix of roughly 1,000,000 samples, each a sequence of 119 values; they are just plain numbers, e.g. 1.1, 1.2, 1.3, etc., and I want to predict the next one in each sequence as the output. Could you give an example use case for the suggested layer type?
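For reference, X and the targets are shaped roughly like this (random numbers here, just to show the shapes I'm working with):

import numpy as np

# simplified illustration of my data: ~1,000,000 samples,
# each a sequence of 119 values; the target is the next value
X = np.random.random((1000000, 119)).astype('float32')  # 2D: (nb_samples, 119)
y = np.random.random((1000000,)).astype('float32')      # next value for each sequence
print(X.shape, y.shape)                                  # (1000000, 119) (1000000,)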
The following compiles for me:
from keras.models import Sequential
from keras.layers.core import TimeDistributedDense
from keras.layers.recurrent import LSTM
from keras.optimizers import RMSprop

# input shape: (nb_samples, timesteps, 10)
model = Sequential()
model.add(TimeDistributedDense(10, input_dim=10))  # output shape: (nb_samples, timesteps, 10)
model.add(LSTM(10, return_sequences=True))         # output shape: (nb_samples, timesteps, 10)
model.add(TimeDistributedDense(5))                 # output shape: (nb_samples, timesteps, 5)
model.add(LSTM(10, return_sequences=False))        # output shape: (nb_samples, 10)
optimizer = RMSprop(lr=0.001, clipnorm=10)
model.compile(optimizer=optimizer, loss='mse')
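And to show how data would be fed in, here is a sketch with random arrays shaped to match the comments above (the batch size and epoch count are just placeholders):

import numpy as np

# random data shaped to match the model above (illustrative only)
nb_samples, timesteps = 1000, 20
X_train = np.random.random((nb_samples, timesteps, 10))  # 3D input: (nb_samples, timesteps, 10)
y_train = np.random.random((nb_samples, 10))             # the last LSTM outputs (nb_samples, 10)

model.fit(X_train, y_train, batch_size=32, nb_epoch=2)

For your data you would use timesteps=119 with a feature dimension of 1, i.e. reshape X to (nb_samples, 119, 1) and set input_dim=1 on the first layer, adjusting the layer sizes as needed.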
Thanks for your time. I will give it a try and let you know.
Good luck!
I wrote this Transform layer to create input for an LSTM, and it can also unroll LSTM output for a Dense layer.
There is only one issue: you must take into account that the nb_samples of the input is not the same as the nb_samples of the output, i.e. if you create sequences of length 20, then the nb_samples of the output is divided by 20.
I am not clear on how Keras infers the nb_samples of the output, but it would be great if nb_samples were allowed to change after feeding input into the network.
P.S.: This layer can also act as the first input layer and do some reshaping of your input.
import theano.tensor as T

from keras.layers.core import Layer


class Transform(Layer):
    '''This layer can be used as an ``Input Layer`` or a shape-transform layer.

    Example:
        input_shape: (128,)
        transform_shape: (5, 128) or (5, None)
        => (100, 128) -> (20, 5, 128)

        input_shape: (5, 128)
        transform_shape: (128,)
        => (20, 5, 128) -> (100, 128)  <=> Flatten

    Note: dimshuffle(0, 2, 1) != reshape(-1, 2, 1)
    '''
    def __init__(self, transform_shape=None, input_shape=None, **kwargs):
        # normalize transform_shape to a tuple
        if not hasattr(transform_shape, '__len__'):
            transform_shape = (transform_shape,)
        self.transform_shape = transform_shape
        if input_shape:
            # normalize a scalar input_shape to a tuple as well
            if not hasattr(input_shape, '__len__'):
                input_shape = (input_shape,)
                self.input_shape = input_shape
            self.input_ndim = len(input_shape) + 1
            kwargs['input_shape'] = input_shape
        super(Transform, self).__init__(**kwargs)

    @property
    def output_shape(self):
        nb_samples = None
        # nb_samples can only be inferred when the input shape is fully known
        if (hasattr(self.input_shape, '__len__') and None not in self.input_shape) or \
                not hasattr(self.input_shape, '__len__'):
            nb_samples = T.prod(self.input_shape) / T.prod(self.transform_shape)
        # a trailing None in transform_shape means "keep the last input dimension"
        if len(self.transform_shape) == 2 and self.transform_shape[1] is None:
            self.transform_shape = (self.transform_shape[0], self.input_shape[1])
        return (nb_samples,) + self.transform_shape

    def get_output(self, train=False):
        X = self.get_input(train)
        # convert 2D data to 3D (or back); -1 lets Theano infer the new nb_samples
        if len(self.transform_shape) == 2 and self.transform_shape[1] is None:
            self.transform_shape = (self.transform_shape[0], X.shape[1])
        return T.reshape(X, (-1,) + self.transform_shape)

    def get_config(self):
        config = {"name": self.__class__.__name__,
                  "transform_shape": self.transform_shape}
        base_config = super(Transform, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
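Roughly how it can be dropped into a model (a sketch with placeholder layer sizes, not my actual network):

from keras.models import Sequential
from keras.layers.core import Dense
from keras.layers.recurrent import LSTM

model = Sequential()
# reshape flat 2D input (nb_samples, 128) into sequences of length 5,
# e.g. (100, 128) -> (20, 5, 128)
model.add(Transform(transform_shape=(5, 128), input_shape=(128,)))
model.add(LSTM(64, return_sequences=True))
# unroll the LSTM output back to 2D for the Dense layer: (20, 5, 64) -> (100, 64)
model.add(Transform(transform_shape=(64,)))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse')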