Keras: Does the LSTM in Keras add peephole connections? Which paper does this code refer to?

Created on 14 Feb 2016  ·  13 Comments  ·  Source: keras-team/keras

In Keras, the LSTM network is effective for text classification. However, I can't understand the details of the code, and I don't know whether the LSTM adds peephole connections. Please give me some reference documents. Opinions/views would be highly appreciated!


All 13 comments

@EderSantana @fchollet @dbonadiman Could you spare some of your valuable time to answer this question? Thanks a lot.

So here is how we calculate the activations of an LSTM https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L443

i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i))

If I'm not wrong, the peephole should take a peek at the cell content and do something like this (with a separate peephole weight P_i rather than reusing U_i):

i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i) + K.dot(c_tm1, self.P_i))

If I'm correct, the following class should do what you need:
Gist is here: https://gist.github.com/EderSantana/f07fa7a0371d0e1c4ef1

from keras import backend as K
from keras.layers.recurrent import LSTM

class LSTMpeephole(LSTM):
    def __init__(self, **kwargs):
        super(LSTMpeephole, self).__init__(**kwargs)

    def build(self):
        super(LSTMpeephole, self).build()
        # Peephole weights; full matrices here, although the original
        # papers use diagonal (per-unit) peephole connections
        self.P_i = self.inner_init((self.output_dim, self.output_dim))
        self.P_f = self.inner_init((self.output_dim, self.output_dim))
        self.P_c = self.inner_init((self.output_dim, self.output_dim))
        self.P_o = self.inner_init((self.output_dim, self.output_dim))
        # P_c must be registered as well, otherwise it is never trained
        self.trainable_weights += [self.P_i, self.P_f, self.P_c, self.P_o]

    def step(self, x, states):
        assert len(states) == 2
        h_tm1 = states[0]  # previous hidden state
        c_tm1 = states[1]  # previous cell state

        # Input contributions to the four gates
        x_i = K.dot(x, self.W_i) + self.b_i
        x_f = K.dot(x, self.W_f) + self.b_f
        x_c = K.dot(x, self.W_c) + self.b_c
        x_o = K.dot(x, self.W_o) + self.b_o

        # Input and forget gates peek at the previous cell state
        i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i) + K.dot(c_tm1, self.P_i))
        f = self.inner_activation(x_f + K.dot(h_tm1, self.U_f) + K.dot(c_tm1, self.P_f))
        # Note: Gers et al. put no peephole on the block input, so the
        # K.dot(c_tm1, self.P_c) term is an extra here (see the discussion below)
        c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c) + K.dot(c_tm1, self.P_c))
        # Note: in Gers et al. the output gate peeks at the new cell state c,
        # not at c_tm1
        o = self.inner_activation(x_o + K.dot(h_tm1, self.U_o) + K.dot(c_tm1, self.P_o))
        h = o * self.activation(c)
        return h, [h, c]
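A minimal usage sketch for the class above, assuming the Keras 0.x-era Sequential API that this snippet targets (the sequence length, feature count and layer sizes are placeholders, not values from the thread):

from keras.models import Sequential
from keras.layers.core import Dense

# Hypothetical shapes: sequences of length 50 with 100 features each
model = Sequential()
model.add(LSTMpeephole(output_dim=128, input_shape=(50, 100)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')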

In addition, you could take a look at this paper and this Ph.D. thesis.

@EderSantana Thanks a lot. By the way, does the LSTM in Keras stem from (Hochreiter and Schmidhuber, 1997) or the variant in (Graves, 2013)?

@Imorton-zd We are using Graves 2013 (with the forget-gate bias initialized to 1).
BTW, I don't see a lot of recent work talking about peephole connections. Did you read about them in a recent paper?

@EderSantana
In c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c) + K.dot(c_tm1, self.P_c)), do we actually need the K.dot(c_tm1, self.P_c) term?
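For reference, the Gers et al. formulation has no peephole on the block input, so under that formulation the cell update would indeed drop the P_c term:

# Cell update without a peephole on the block input (Gers et al. 2002)
c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c))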

@dinghaoyang Have you made this peephole structure work? Or has anyone else made it work?

Here is a paper referring to peepholes: Gers, Schraudolph, and Schmidhuber, "Learning Precise Timing with LSTM Recurrent Networks," JMLR, 2002.
http://www.jmlr.org/papers/volume3/gers02a/gers02a.pdf
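For reference, in modern notation the peephole LSTM equations from that paper use diagonal peephole vectors $p_i$, $p_f$, $p_o$ (with $\odot$ denoting element-wise multiplication); note that the output gate peeks at the new cell state $c_t$ and the block input has no peephole:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + p_i \odot c_{t-1} + b_i)$$
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + p_f \odot c_{t-1} + b_f)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + p_o \odot c_t + b_o)$$
$$h_t = o_t \odot \tanh(c_t)$$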

Hi, has anyone been able to find a solution for this? As far as I understand, the Keras LSTM cell is similar to TensorFlow's

BasicLSTMCell(RNNCell)

However, I am looking for a Keras implementation of TensorFlow's

LSTMCell(RNNCell)

(LSTMCell: the class uses optional peephole connections, optional cell clipping, and an optional projection layer): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/rnn_cell_impl.py

Haşim Sak, Andrew Senior, and Françoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014. https://research.google.com/pubs/archive/43905.pdf

I am wondering whether, without writing a custom layer, there is any way to use this TensorFlow LSTMCell in Keras, or whether it is already available?
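For context, the options mentioned above correspond to real constructor arguments of the TF 1.x cell; a minimal graph-mode sketch (the sizes are placeholders):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 50, 100])  # (batch, time, features)
cell = tf.nn.rnn_cell.LSTMCell(
    num_units=128,
    use_peepholes=True,  # diagonal peephole connections
    cell_clip=3.0,       # optional cell clipping
    num_proj=64)         # optional projection of the output/hidden state
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)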

I have the same question:

I am wondering whether, without writing a custom layer, there is any way to use this TensorFlow LSTMCell in Keras, or whether it is already available?

And it seems there's a peephole implementation in tensorflow.keras, but not in keras.

@zixia I don't see a peephole implementation in tensorflow.keras either. So is there no built-in support in Keras for peephole connections in an LSTM?

I guess the LSTM in Keras doesn't implement peephole connections. According to the Keras documentation, the LSTM layer is based on the paper "Long Short-Term Memory" (Hochreiter, 1997), whereas Lasagne's documentation says its LSTM layer is based on Graves, Alex, "Generating sequences with recurrent neural networks," arXiv preprint arXiv:1308.0850, where peephole connections are added to the original model. And if you print the number of parameters of both implementations, you will find that the Keras layer has fewer parameters than the Lasagne one.
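The parameter-count difference is easy to check by hand; a quick sketch (d and n are placeholder sizes):

# Vanilla LSTM (as in Keras): 4 gates, each with input, recurrent and bias weights
d, n = 100, 128                    # input dim and hidden size (placeholders)
vanilla = 4 * (d * n + n * n + n)
# Lasagne-style peepholes add one diagonal vector per gate (i, f, o)
peephole = vanilla + 3 * n
print(vanilla, peephole)           # the peephole variant has 3*n extra parameters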

The Unify RNN Interface RFC confirms that tf.keras's LSTMCell is equivalent to TensorFlow's BasicLSTMCell; the comment there says "No peephole, clipping, projection. Keras allows kernel_activation to be customized (default=hard_sigmoid)".
There is no equivalent of TensorFlow 1's LSTMCell class, whose comment says "Support peephole, clipping and projection".

So if you want peepholes with tf.keras, you have to create a custom cell (or wrap TF1's LSTMCell). In TensorFlow 2.0, that class was removed (or more precisely, it was moved to tf.compat.v1.nn.rnn_cell.LSTMCell, so technically you can still use it, although that's not ideal).
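As a starting point for such a custom cell, here is a hedged sketch of a Gers-style peephole cell built on the tf.keras custom-cell interface (illustrative only; the class name and sizes are made up, and this is not an official tf.keras implementation):

import tensorflow as tf

class PeepholeLSTMCell(tf.keras.layers.Layer):
    """Diagonal peepholes on the three gates, as in Gers et al. (2002)."""

    def __init__(self, units, **kwargs):
        super(PeepholeLSTMCell, self).__init__(**kwargs)
        self.units = units
        self.state_size = [units, units]  # [h, c], as tf.keras.layers.RNN expects
        self.output_size = units

    def build(self, input_shape):
        d = input_shape[-1]
        # Fused input/recurrent/bias weights for the 4 gates (i, f, c, o)
        self.kernel = self.add_weight(name='kernel', shape=(d, 4 * self.units))
        self.recurrent = self.add_weight(name='recurrent', shape=(self.units, 4 * self.units))
        self.bias = self.add_weight(name='bias', shape=(4 * self.units,), initializer='zeros')
        # Diagonal peephole vectors for the input, forget and output gates
        self.p_i = self.add_weight(name='p_i', shape=(self.units,))
        self.p_f = self.add_weight(name='p_f', shape=(self.units,))
        self.p_o = self.add_weight(name='p_o', shape=(self.units,))

    def call(self, inputs, states):
        h_tm1, c_tm1 = states
        z = tf.matmul(inputs, self.kernel) + tf.matmul(h_tm1, self.recurrent) + self.bias
        z_i, z_f, z_c, z_o = tf.split(z, 4, axis=-1)
        i = tf.sigmoid(z_i + self.p_i * c_tm1)  # peek at the previous cell state
        f = tf.sigmoid(z_f + self.p_f * c_tm1)
        c = f * c_tm1 + i * tf.tanh(z_c)        # no peephole on the block input
        o = tf.sigmoid(z_o + self.p_o * c)      # peek at the new cell state
        h = o * tf.tanh(c)
        return h, [h, c]

# Usage: wrap the cell in the generic RNN layer
layer = tf.keras.layers.RNN(PeepholeLSTMCell(128))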
