Hopefully this is not a stupid question. In the TensorFlow tutorial, the state of the RNN is updated after each time step and used as the state for the next time step:
output, state = lstm(current_batch_of_words, state)
In yours, the state of the bidirectional RNN seems to just be discarded. Is that the case, or am I missing something? If so, why?
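For context, this is roughly the pattern from the tutorial I mean; the sizes and placeholder names here are just illustrative, and the cell/variable-scope details follow the 0.x-era API as far as I remember it:

import tensorflow as tf

n_hidden, batch_size, n_steps, n_input = 128, 32, 10, 28  # illustrative sizes
# one placeholder per time step, as in the tutorial
words = [tf.placeholder(tf.float32, [batch_size, n_input]) for _ in range(n_steps)]

lstm = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
state = lstm.zero_state(batch_size, tf.float32)

outputs = []
with tf.variable_scope("RNN"):
    for t, current_batch_of_words in enumerate(words):
        if t > 0:
            tf.get_variable_scope().reuse_variables()
        # the state produced at step t is fed back in as the state for step t+1
        output, state = lstm(current_batch_of_words, state)
        outputs.append(output)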
Also, a small detail: you use outputs[-1] in
return tf.matmul(outputs[-1], _weights['out']) + _biases['out']
but according to the TensorFlow API the return value unpacks as
(outputs, output_state_fw, output_state_bw) = outputs
So shouldn't this be
return tf.matmul(outputs[0], _weights['out']) + _biases['out']
Instead?
Older versions of TensorFlow didn't return the states, so outputs[-1] is fine (see https://github.com/tensorflow/tensorflow/blob/r0.7/tensorflow/python/ops/rnn.py#L323).
However, in newer versions it returns a tuple instead, so as you said, you need to change
outputs = rnn.bidirectional_rnn(...
to outputs, _, _ = rnn.bidirectional_rnn(...
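Something like this (a rough sketch assuming the tf.nn namespace of that era; the cell construction, sizes, and variable names are illustrative, not copied from the example):

import tensorflow as tf

n_hidden, n_classes, n_input, n_steps = 128, 10, 28, 28  # illustrative sizes
# input as a list of (batch_size, n_input) tensors, one per time step
x = [tf.placeholder(tf.float32, [None, n_input]) for _ in range(n_steps)]
_weights = {'out': tf.Variable(tf.random_normal([2 * n_hidden, n_classes]))}
_biases = {'out': tf.Variable(tf.random_normal([n_classes]))}

lstm_fw_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)
lstm_bw_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)

# old API: `outputs` was the only return value
# outputs = tf.nn.bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x, dtype=tf.float32)

# newer API: an (outputs, output_state_fw, output_state_bw) tuple is returned,
# so the final forward/backward states are unpacked (and discarded) explicitly
outputs, _, _ = tf.nn.bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                        dtype=tf.float32)

# `outputs` is still a list with one tensor per time step (fw and bw concatenated),
# so outputs[-1] is the last time step and has size 2 * n_hidden
pred = tf.matmul(outputs[-1], _weights['out']) + _biases['out']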
@aymericdamien Speaking of the BDLSTM example, is there any particular reason for n_hidden to be 128? I'm hesitant to open an issue for this question since it's likely a triviality, but the number puzzles me.
I am also interested in this question. Maybe it's based on experience? @zandaqo
There is no particular reason; it depends on your data. A recommendation is to use a multiple of 32, as it may speed up the computation a little when using float32.
ImportError: This module is deprecated. Use tf.nn.rnn_* instead.
It has been fixed.
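For anyone hitting the same error, the change is along these lines (a sketch, assuming the error comes from the deprecated tensorflow.models.rnn module; the exact fix applied in the repo may differ):

# Old, deprecated import that raises the ImportError above:
# from tensorflow.models.rnn import rnn, rnn_cell

# Replacement, using the tf.nn namespace directly:
import tensorflow as tf

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(128)                  # instead of rnn_cell.BasicLSTMCell(128)
# outputs, states = tf.nn.rnn(lstm_cell, x, dtype=tf.float32)  # instead of rnn.rnn(...)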