Keras: Does the LSTM in Keras add peephole connections? Which paper does this code refer to?

Created on 14 Feb 2016  ·  13 Comments  ·  Source: keras-team/keras

In Keras, the LSTM network is effective for text classification. However, I can't understand the details of the code, and I don't know whether the LSTM adds peephole connections. Please give me some reference documents. Opinions/views would be highly appreciated!


All 13 comments

@EderSantana @fchollet @dbonadiman Could you spare some of your valuable time to answer this question? Thanks a lot.

So here is how we calculate the activations of an LSTM https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L443

i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i))

If I'm not wrong, the peephole should take a peek at the cell content and do something like this (with a separate peephole weight P_i rather than reusing U_i):

i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i) + K.dot(c_tm1, self.P_i))

If I'm correct, the following class should do what you need:
Gist is here: https://gist.github.com/EderSantana/f07fa7a0371d0e1c4ef1

from keras import backend as K
from keras.layers.recurrent import LSTM

class LSTMpeephole(LSTM):
    def __init__(self, **kwargs):
        super(LSTMpeephole, self).__init__(**kwargs)

    def build(self):
        super(LSTMpeephole, self).build()
        # Peephole weights; full matrices here, although the original
        # papers use diagonal (per-unit) peephole connections
        self.P_i = self.inner_init((self.output_dim, self.output_dim))
        self.P_f = self.inner_init((self.output_dim, self.output_dim))
        self.P_c = self.inner_init((self.output_dim, self.output_dim))
        self.P_o = self.inner_init((self.output_dim, self.output_dim))
        # P_c must be registered as well, otherwise it is never trained
        self.trainable_weights += [self.P_i, self.P_f, self.P_c, self.P_o]

    def step(self, x, states):
        assert len(states) == 2
        h_tm1 = states[0]  # previous hidden state
        c_tm1 = states[1]  # previous cell state

        # Input contributions to the four gates
        x_i = K.dot(x, self.W_i) + self.b_i
        x_f = K.dot(x, self.W_f) + self.b_f
        x_c = K.dot(x, self.W_c) + self.b_c
        x_o = K.dot(x, self.W_o) + self.b_o

        # Input and forget gates peek at the previous cell state
        i = self.inner_activation(x_i + K.dot(h_tm1, self.U_i) + K.dot(c_tm1, self.P_i))
        f = self.inner_activation(x_f + K.dot(h_tm1, self.U_f) + K.dot(c_tm1, self.P_f))
        # Note: Gers et al. put no peephole on the block input, so the
        # K.dot(c_tm1, self.P_c) term is an extra here (see the discussion below)
        c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c) + K.dot(c_tm1, self.P_c))
        # Note: in Gers et al. the output gate peeks at the new cell state c,
        # not at c_tm1
        o = self.inner_activation(x_o + K.dot(h_tm1, self.U_o) + K.dot(c_tm1, self.P_o))
        h = o * self.activation(c)
        return h, [h, c]
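A minimal usage sketch for the class above, assuming the Keras 0.x-era Sequential API that this snippet targets (the sequence length, feature count and layer sizes are placeholders, not values from the thread):

from keras.models import Sequential
from keras.layers.core import Dense

# Hypothetical shapes: sequences of length 50 with 100 features each
model = Sequential()
model.add(LSTMpeephole(output_dim=128, input_shape=(50, 100)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')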

In addition, you could take a look at this paper and this Ph.D. thesis.

@EderSantana Thanks a lot. By the way, does the LSTM in Keras stem from (Hochreiter and Schmidhuber, 1997) or the variant in (Graves, 2013)?

@Imorton-zd We are using Graves 2013 (with the forget-gate bias initialized to 1).
BTW, I don't see a lot of recent work talking about peephole connections. Did you read about them in a recent paper?

@EderSantana
In c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c) + K.dot(c_tm1, self.P_c)), do we actually need the K.dot(c_tm1, self.P_c) term?
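For reference, the Gers et al. formulation has no peephole on the block input, so under that formulation the cell update would indeed drop the P_c term:

# Cell update without a peephole on the block input (Gers et al. 2002)
c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1, self.U_c))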

@dinghaoyang Have you made this peephole structure work? Or has anyone else made it work?

Here is a paper referring to peepholes: Gers, Schraudolph, and Schmidhuber, "Learning Precise Timing with LSTM Recurrent Networks," JMLR, 2002.
http://www.jmlr.org/papers/volume3/gers02a/gers02a.pdf
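For reference, in modern notation the peephole LSTM equations from that paper use diagonal peephole vectors $p_i$, $p_f$, $p_o$ (with $\odot$ denoting element-wise multiplication); note that the output gate peeks at the new cell state $c_t$ and the block input has no peephole:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + p_i \odot c_{t-1} + b_i)$$
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + p_f \odot c_{t-1} + b_f)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + p_o \odot c_t + b_o)$$
$$h_t = o_t \odot \tanh(c_t)$$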

Hi, has anyone been able to find a solution for this? As far as I understand, the Keras LSTM cell is similar to TensorFlow's

BasicLSTMCell(RNNCell)

However, I am looking for a Keras implementation of TensorFlow's

LSTMCell(RNNCell)

(LSTMCell: the class uses optional peephole connections, optional cell clipping, and an optional projection layer): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/rnn_cell_impl.py

Haşim Sak, Andrew Senior, and Françoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014. https://research.google.com/pubs/archive/43905.pdf

I am wondering whether, without writing a custom layer, there is any way to use this TensorFlow LSTMCell in Keras, or whether it is already available?
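For context, the options mentioned above correspond to real constructor arguments of the TF 1.x cell; a minimal graph-mode sketch (the sizes are placeholders):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 50, 100])  # (batch, time, features)
cell = tf.nn.rnn_cell.LSTMCell(
    num_units=128,
    use_peepholes=True,  # diagonal peephole connections
    cell_clip=3.0,       # optional cell clipping
    num_proj=64)         # optional projection of the output/hidden state
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)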

I have the same question:

I am wondering whether, without writing a custom layer, there is any way to use this TensorFlow LSTMCell in Keras, or whether it is already available?

And it seems there's a peephole implementation in tensorflow.keras, but not in keras.

@zixia I don't see a peephole implementation in tensorflow.keras either. So is there no built-in support in Keras for peephole connections in an LSTM?

I guess the LSTM in Keras doesn't implement peephole connections. According to the Keras documentation, the LSTM layer is based on the paper "Long Short-Term Memory" (Hochreiter, 1997), whereas Lasagne's documentation says its LSTM layer is based on Graves, Alex, "Generating sequences with recurrent neural networks," arXiv preprint arXiv:1308.0850, where peephole connections are added to the original model. And if you print the number of parameters of both implementations, you will find that the Keras layer has fewer parameters than the Lasagne one.
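The parameter-count difference is easy to check by hand; a quick sketch (d and n are placeholder sizes):

# Vanilla LSTM (as in Keras): 4 gates, each with input, recurrent and bias weights
d, n = 100, 128                    # input dim and hidden size (placeholders)
vanilla = 4 * (d * n + n * n + n)
# Lasagne-style peepholes add one diagonal vector per gate (i, f, o)
peephole = vanilla + 3 * n
print(vanilla, peephole)           # the peephole variant has 3*n extra parameters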

The Unify RNN Interface RFC confirms that tf.keras's LSTMCell is equivalent to TensorFlow's BasicLSTMCell; the comment there says "No peephole, clipping, projection. Keras allows kernel_activation to be customized (default=hard_sigmoid)".
There is no equivalent of TensorFlow 1's LSTMCell class, whose comment says "Support peephole, clipping and projection".

So if you want peepholes with tf.keras, you have to create a custom cell (or wrap TF1's LSTMCell). In TensorFlow 2.0, that class was removed (or more precisely, it was moved to tf.compat.v1.nn.rnn_cell.LSTMCell, so technically you can still use it, although that's not ideal).
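As a starting point for such a custom cell, here is a hedged sketch of a Gers-style peephole cell built on the tf.keras custom-cell interface (illustrative only; the class name and sizes are made up, and this is not an official tf.keras implementation):

import tensorflow as tf

class PeepholeLSTMCell(tf.keras.layers.Layer):
    """Diagonal peepholes on the three gates, as in Gers et al. (2002)."""

    def __init__(self, units, **kwargs):
        super(PeepholeLSTMCell, self).__init__(**kwargs)
        self.units = units
        self.state_size = [units, units]  # [h, c], as tf.keras.layers.RNN expects
        self.output_size = units

    def build(self, input_shape):
        d = input_shape[-1]
        # Fused input/recurrent/bias weights for the 4 gates (i, f, c, o)
        self.kernel = self.add_weight(name='kernel', shape=(d, 4 * self.units))
        self.recurrent = self.add_weight(name='recurrent', shape=(self.units, 4 * self.units))
        self.bias = self.add_weight(name='bias', shape=(4 * self.units,), initializer='zeros')
        # Diagonal peephole vectors for the input, forget and output gates
        self.p_i = self.add_weight(name='p_i', shape=(self.units,))
        self.p_f = self.add_weight(name='p_f', shape=(self.units,))
        self.p_o = self.add_weight(name='p_o', shape=(self.units,))

    def call(self, inputs, states):
        h_tm1, c_tm1 = states
        z = tf.matmul(inputs, self.kernel) + tf.matmul(h_tm1, self.recurrent) + self.bias
        z_i, z_f, z_c, z_o = tf.split(z, 4, axis=-1)
        i = tf.sigmoid(z_i + self.p_i * c_tm1)  # peek at the previous cell state
        f = tf.sigmoid(z_f + self.p_f * c_tm1)
        c = f * c_tm1 + i * tf.tanh(z_c)        # no peephole on the block input
        o = tf.sigmoid(z_o + self.p_o * c)      # peek at the new cell state
        h = o * tf.tanh(c)
        return h, [h, c]

# Usage: wrap the cell in the generic RNN layer
layer = tf.keras.layers.RNN(PeepholeLSTMCell(128))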
