Keras: Order of weights in LSTM

Created on 28 Jun 2016 · 5 comments · Source: keras-team/keras

I'm trying to export an LSTM layer from Keras to a portable C implementation. I accept the possibility there is a bug in my code, but assuming there isn't, I can't figure out the order of the weights / gates in the LSTM layer.

model = Sequential()
model.add(LSTM(4,input_dim=5,input_length=N,return_sequences=True))
shapes = [x.shape for x in model.get_weights()]
print shapes

[(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,)]

What I see is

  • Weights that handle inputs
  • Weights that handle recurrent / hidden outputs
  • bias
  • repeat

But which weight set goes to which gate? The third set of weights has its biases initialized to 1.0, so I'm assuming that's the forget gate.
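One way to check that assumption is to see which of the four bias vectors is initialized to all ones, for example with a quick NumPy check on the model built above:

import numpy as np

# Every third array returned by get_weights() is a bias vector (see the shapes above).
# The gate whose bias is all ones should be the forget gate.
biases = model.get_weights()[2::3]
for idx, b in enumerate(biases):
    print('bias set %d is all ones: %s' % (idx, np.allclose(b, 1.0)))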

Looking in recurrent.py, I see something like this:
i = self.inner_activation(z0)
f = self.inner_activation(z1)
c = f * c_tm1 + i * self.activation(z2)
o = self.inner_activation(z3)

But i, f, c, o cannot be the order, given which biases are set to 1.0. So I'm kind of confused and would appreciate some help.



All 5 comments

I don't know which version you are using now.
In keras/layers/recurrent.py you can check the trainable_weights:
https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L723
These weights are appended to your model in this order.

model = Sequential()
model.add(LSTM(4,input_dim=5,input_length=N,return_sequences=True))
for e in zip(model.layers[0].trainable_weights, model.layers[0].get_weights()):
    print('Param %s:\n%s' % (e[0],e[1]))
Param lstm_3_W_i:
[[ 0.00069305, ...]]
Param lstm_3_U_i:
[[ 1.10000002, ...]]
Param lstm_3_b_i:
[ 0., ...]
Param lstm_3_W_c:
[[-1.38370085, ...]]
...

@ChristianThomae the above solution no longer gives the same details. Rather, it simply prints lstm_1/kernel:0, lstm_1/recurrent_kernel:0 and so on. Is there any way to display or find out the exact order of the output of get_weights, especially the order of i, f, c, o? I need to extract some specific values from these for further processing.

If I understand https://github.com/keras-team/keras/blob/master/keras/layers/recurrent.py#L1863 correctly, I would say the order is i, f, c, o for the kernel, recurrent_kernel and bias respectively.

If you are using Keras 2.2.0

When you print

print(model.layers[0].trainable_weights)

you should see three tensors: lstm_1/kernel:0, lstm_1/recurrent_kernel:0 and lstm_1/bias:0.
The last dimension of each of these tensors equals

4 * number_of_units

where number_of_units is your number of neurons. Try:

units = int(int(model.layers[0].trainable_weights[0].shape[1])/4)
print("No units: ", units)

That is because each tensor contains the weights for the four LSTM gates, concatenated in this order:

i (input), f (forget), c (cell candidate) and o (output)

Therefore, to extract the per-gate weights, you can simply use the slice operator:

W = model.layers[0].get_weights()[0]
U = model.layers[0].get_weights()[1]
b = model.layers[0].get_weights()[2]

W_i = W[:, :units]
W_f = W[:, units: units * 2]
W_c = W[:, units * 2: units * 3]
W_o = W[:, units * 3:]

U_i = U[:, :units]
U_f = U[:, units: units * 2]
U_c = U[:, units * 2: units * 3]
U_o = U[:, units * 3:]

b_i = b[:units]
b_f = b[units: units * 2]
b_c = b[units * 2: units * 3]
b_o = b[units * 3:]

Source: keras code
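To sanity-check the extracted weights (e.g. for a C port), here is a minimal NumPy sketch of a single LSTM step, assuming the default Keras activations (tanh for the cell and output, hard sigmoid for the gates) and no dropout or masking; its per-step outputs should match model.predict up to floating-point tolerance:

import numpy as np

def hard_sigmoid(x):
    # Keras's default recurrent (gate) activation
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

def lstm_step(x_t, h_tm1, c_tm1):
    # One LSTM time step using the per-gate weights sliced above
    i = hard_sigmoid(np.dot(x_t, W_i) + np.dot(h_tm1, U_i) + b_i)  # input gate
    f = hard_sigmoid(np.dot(x_t, W_f) + np.dot(h_tm1, U_f) + b_f)  # forget gate
    g = np.tanh(np.dot(x_t, W_c) + np.dot(h_tm1, U_c) + b_c)       # cell candidate
    o = hard_sigmoid(np.dot(x_t, W_o) + np.dot(h_tm1, U_o) + b_o)  # output gate
    c = f * c_tm1 + i * g
    h = o * np.tanh(c)
    return h, c

# x is assumed to be one input sequence of shape (timesteps, input_dim);
# with return_sequences=True, each h should match the corresponding row of
# model.predict(x[np.newaxis])[0].
h = np.zeros(units)
c = np.zeros(units)
for x_t in x:
    h, c = lstm_step(x_t, h, c)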

