Keras: Bidirectional(LSTM(..., stateful=True)) crashes

Created on 18 Nov 2016 · 9 comments · Source: keras-team/keras

from keras.layers import Input, Dense, LSTM, Bidirectional
from keras.models import Model

nb_samples = 1
nb_timesteps = 1
nb_features = 1
nb_hidden = 1
nb_classes = 2  # not defined in the original report; assumed here so the snippet is complete

i = Input(batch_shape=(nb_samples, nb_timesteps, nb_features))
o = Bidirectional(LSTM(nb_hidden, stateful=True))(i)
o = Dense(nb_classes, activation='softmax')(o)
model = Model(i, o)
TypeError                                 Traceback (most recent call last)
<ipython-input-14-f240ae219281> in <module>()
      5 
      6 i = Input(batch_shape=(nb_samples, nb_timesteps, nb_features))
----> 7 o = Bidirectional(LSTM(nb_hidden, stateful=True))(i)
      8 o = Dense(nb_classes, activation='softmax')(o)
      9 model = Model(i, o)

/home/carl/anaconda3/lib/python3.5/site-packages/keras/layers/wrappers.py in __init__(self, layer, merge_mode, weights, **kwargs)
    164         config = layer.get_config()
    165         config['go_backwards'] = not config['go_backwards']
--> 166         self.backward_layer = layer.__class__.from_config(config)
    167         self.forward_layer.name = 'forward_' + self.forward_layer.name
    168         self.backward_layer.name = 'backward_' + self.backward_layer.name

/home/carl/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py in from_config(cls, config)
    869                 output of get_config.
    870         '''
--> 871         return cls(**config)
    872 
    873     def count_params(self):

/home/carl/anaconda3/lib/python3.5/site-packages/keras/layers/recurrent.py in __init__(self, output_dim, init, inner_init, forget_bias_init, activation, inner_activation, W_regularizer, U_regularizer, b_regularizer, dropout_W, dropout_U, **kwargs)
    675         if self.dropout_W or self.dropout_U:
    676             self.uses_learning_phase = True
--> 677         super(LSTM, self).__init__(**kwargs)
    678 
    679     def build(self, input_shape):

/home/carl/anaconda3/lib/python3.5/site-packages/keras/layers/recurrent.py in __init__(self, weights, return_sequences, go_backwards, stateful, unroll, consume_less, input_dim, input_length, **kwargs)
    163         if self.input_dim:
    164             kwargs['input_shape'] = (self.input_length, self.input_dim)
--> 165         super(Recurrent, self).__init__(**kwargs)
    166 
    167     def get_output_shape_for(self, input_shape):

/home/carl/anaconda3/lib/python3.5/site-packages/keras/engine/topology.py in __init__(self, **kwargs)
    323             # to insert before the current layer
    324             if 'batch_input_shape' in kwargs:
--> 325                 batch_input_shape = tuple(kwargs['batch_input_shape'])
    326             elif 'input_shape' in kwargs:
    327                 batch_input_shape = (None,) + tuple(kwargs['input_shape'])

TypeError: 'NoneType' object is not iterable


All 9 comments

Nice!

It seems to work.

I am getting a similar error with the newly implemented ConvLSTM2D class when using it statefully wrapped in the Bidirectional wrapper. The earlier bug fix was applied to the get_config() method of the Recurrent class, but ConvLSTM2D inherits from ConvRecurrent2D instead, which re-implements get_config().

Could the same fix be applied to the ConvRecurrent2D get_config() method?
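
For context, this is roughly how the Bidirectional wrapper builds its backward copy, paraphrased from the wrappers.py frames in the traceback above (not a fix, just the mechanism):

# The wrapped layer is round-tripped through get_config()/from_config()
# to create the backward copy.
config = layer.get_config()
config['go_backwards'] = not config['go_backwards']
backward_layer = layer.__class__.from_config(config)
# The traceback shows the cloned config carried 'batch_input_shape': None,
# so from_config() ends up calling tuple(None) and raises the TypeError.
# Presumably ConvRecurrent2D's own get_config() still produces such a config,
# which is why the same crash reappears for ConvLSTM2D.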

Does a stateful model make sense if we use a bidirectional LSTM?
If I understand correctly, in stateful models we carry the states across different chunks of data, but I don't quite see how that works for bidirectional models. Could anyone elaborate?
Thanks!

I don't think a stateful model makes sense in the case of a bidirectional LSTM. Stateful models work because we can 'step' forward through the data in a Markovian sense. A bidirectional model would require us to also step backwards, so the same 'chunk' of data could not be applied to the 'backward' network, as it would have seen the history and not the future. You could probably run both networks statefully and independently and then concatenate the results, though (see the sketch below).
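
A minimal sketch of that workaround, assuming the Keras 1.x functional API used in the original report (the nb_classes value is assumed; whether the concatenated output is actually meaningful is exactly the question raised here):

from keras.layers import Input, Dense, LSTM, merge
from keras.models import Model

nb_samples, nb_timesteps, nb_features, nb_hidden, nb_classes = 1, 1, 1, 1, 2

i = Input(batch_shape=(nb_samples, nb_timesteps, nb_features))
# Two independent stateful LSTMs: one reads each chunk forwards,
# the other reads the same chunk backwards.
fwd = LSTM(nb_hidden, stateful=True)(i)
bwd = LSTM(nb_hidden, stateful=True, go_backwards=True)(i)
# Concatenate the two summaries (Keras 1.x merge; use concatenate() in Keras 2).
o = merge([fwd, bwd], mode='concat')
o = Dense(nb_classes, activation='softmax')(o)
model = Model(i, o)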

@Russ09 I agree, so I was wondering what Keras is doing here: is it just taking the output of the backward network and using it as input for the next step?

I haven't revisited this recently, but this issue suggests that Keras doesn't handle it and simply crashes, as expected. Perhaps a warning that stateful bidirectional layers aren't supported would be more appropriate?

Does anyone understand what is happening for a stateful bidirectional layer? It doesn't crash now, but I'm not sure I understand how the output would make any sense.
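
If the wrapper no longer crashes, one way to see what it actually does is to inspect the states it carries between calls. A rough sketch, assuming a Keras 2-era API where Bidirectional exposes forward_layer / backward_layer and stateful RNNs expose .states (attribute names may differ between versions):

import numpy as np
from keras import backend as K
from keras.layers import Input, Dense, LSTM, Bidirectional
from keras.models import Model

i = Input(batch_shape=(1, 1, 1))
bi = Bidirectional(LSTM(4, stateful=True))
o = Dense(2, activation='softmax')(bi(i))
model = Model(i, o)

model.predict(np.zeros((1, 1, 1)))
# Each direction keeps its own states and carries them into the next call,
# i.e. the backward LSTM's initial state for chunk t+1 comes from reading
# chunk t backwards -- the behaviour the comments above question.
print([K.get_value(s) for s in bi.forward_layer.states])
print([K.get_value(s) for s in bi.backward_layer.states])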
