from keras.layers import Input, Dense, LSTM, Bidirectional
from keras.models import Model

nb_samples = 1
nb_timesteps = 1
nb_features = 1
nb_hidden = 1
nb_classes = 1  # placeholder; not defined in the original snippet

i = Input(batch_shape=(nb_samples, nb_timesteps, nb_features))
o = Bidirectional(LSTM(nb_hidden, stateful=True))(i)  # raises the TypeError below
o = Dense(nb_classes, activation='softmax')(o)
model = Model(i, o)
TypeError Traceback (most recent call last)
<ipython-input-14-f240ae219281> in <module>()
5
6 i = Input(batch_shape=(nb_samples, nb_timesteps, nb_features))
----> 7 o = Bidirectional(LSTM(nb_hidden, stateful=True))(i)
8 o = Dense(nb_classes, activation='softmax')(o)
9 model = Model(i, o)
/home/carl/anaconda3/lib/python3.5/site-packages/keras/layers/wrappers.py in __init__(self, layer, merge_mode, weights, **kwargs)
164 config = layer.get_config()
165 config['go_backwards'] = not config['go_backwards']
--> 166 self.backward_layer = layer.__class__.from_config(config)
167 self.forward_layer.name = 'forward_' + self.forward_layer.name
168 self.backward_layer.name = 'backward_' + self.backward_layer.name
/home/carl/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py in from_config(cls, config)
869 output of get_config.
870 '''
--> 871 return cls(**config)
872
873 def count_params(self):
/home/carl/anaconda3/lib/python3.5/site-packages/keras/layers/recurrent.py in __init__(self, output_dim, init, inner_init, forget_bias_init, activation, inner_activation, W_regularizer, U_regularizer, b_regularizer, dropout_W, dropout_U, **kwargs)
675 if self.dropout_W or self.dropout_U:
676 self.uses_learning_phase = True
--> 677 super(LSTM, self).__init__(**kwargs)
678
679 def build(self, input_shape):
/home/carl/anaconda3/lib/python3.5/site-packages/keras/layers/recurrent.py in __init__(self, weights, return_sequences, go_backwards, stateful, unroll, consume_less, input_dim, input_length, **kwargs)
163 if self.input_dim:
164 kwargs['input_shape'] = (self.input_length, self.input_dim)
--> 165 super(Recurrent, self).__init__(**kwargs)
166
167 def get_output_shape_for(self, input_shape):
/home/carl/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py in __init__(self, **kwargs)
323 # to insert before the current layer
324 if 'batch_input_shape' in kwargs:
--> 325 batch_input_shape = tuple(kwargs['batch_input_shape'])
326 elif 'input_shape' in kwargs:
327 batch_input_shape = (None,) + tuple(kwargs['input_shape'])
TypeError: 'NoneType' object is not iterable
This seems to solve the problem. https://github.com/yukoba/keras/commit/1c8ea3171ba7841b09a6e89a0be9028b1b2a8d70
Nice!
It seems to work.
I am getting a similar error for the newly implemented ConvLSTM2D class when applying it statefully, wrapped in the Bidirectional wrapper. The earlier bug fix was applied to the get_config() method of the Recurrent class, but ConvLSTM2D inherits from ConvRecurrent2D instead, which re-implements get_config().
Could the same fix be applied to ConvRecurrent2D's get_config() method?
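I haven't checked the linked commit against ConvRecurrent2D, but the failure mode looks the same: for a stateful layer that has not been built yet, the config handed to from_config() inside the Bidirectional wrapper carries batch_input_shape=None, and tuple(None) then raises the TypeError above. As a stopgap, something along these lines might help; the subclass and the config pruning are my own sketch, not the actual fix, and the same idea would apply to ConvLSTM2D:

from keras.layers import LSTM

class BidirectionalSafeLSTM(LSTM):
    # Hypothetical workaround: drop a batch_input_shape of None so that
    # Bidirectional can rebuild its backward copy via from_config()
    # without hitting tuple(None) in Layer.__init__.
    def get_config(self):
        config = super(BidirectionalSafeLSTM, self).get_config()
        if config.get('batch_input_shape') is None:
            config.pop('batch_input_shape', None)
        return config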
Does a stateful model make sense if we use a bidirectional LSTM?
If I understand correctly, in stateful models we carry the states across different chunks of data, but I don't quite see how that would work for bidirectional models. Could anyone elaborate?
Thanks!
I don't think a stateful model makes sense in the case of a bidirectional LSTM. Stateful models work because we can 'step' forward through the data in a Markovian sense. A bidirectional model would also require us to step backwards, so the same 'chunk' of data could not be applied to the 'backward' network, as it would have seen the history and not the future. You could probably run both of the networks statefully and independently and then concatenate the results, though.
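A rough sketch of that last suggestion, using the Keras 1.x-style functional API from the traceback above (all sizes are placeholders, and merge would be concatenate in newer Keras). Whether the carried state of the backward network means anything for your data is exactly the open question here:

from keras.layers import Input, Dense, LSTM, merge
from keras.models import Model

nb_samples, nb_timesteps, nb_features = 32, 10, 8  # placeholder sizes
nb_hidden, nb_classes = 16, 3                      # placeholder sizes

i = Input(batch_shape=(nb_samples, nb_timesteps, nb_features))

# Two separate stateful LSTMs: one reads each chunk forwards, the other
# backwards. Each carries its own state across batches, independently.
forward = LSTM(nb_hidden, stateful=True)(i)
backward = LSTM(nb_hidden, stateful=True, go_backwards=True)(i)

o = merge([forward, backward], mode='concat')  # concatenate the two summaries
o = Dense(nb_classes, activation='softmax')(o)
model = Model(i, o)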
@Russ09 I agree, so I was wondering what Keras is doing here: is it just taking the output of the backward network and using it as input for the next step?
I haven't revisited this recently, but this issue suggests that Keras doesn't handle it and crashes, as expected. However, a warning that stateful bidirectional doesn't work would maybe be more appropriate?
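Something like the following could express that; it is purely an illustration (the subclass and the message are made up, not part of Keras):

import warnings
from keras.layers import Bidirectional

class WarningBidirectional(Bidirectional):
    # Hypothetical wrapper that warns when given a stateful inner layer
    # instead of accepting it silently.
    def __init__(self, layer, **kwargs):
        if getattr(layer, 'stateful', False):
            warnings.warn('Stateful bidirectional RNNs are not well defined: '
                          'the backward pass cannot see future chunks, so the '
                          'carried state may not mean what you expect.')
        super(WarningBidirectional, self).__init__(layer, **kwargs)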
Does anyone understand what is happening for a stateful bidirectional layer? It doesn't crash now, but I'm not sure I understand how the output would make any sense.