Keras: Order of weights in LSTM

Created on 28 Jun 2016 · 5 comments · Source: keras-team/keras

I'm trying to export an LSTM layer from Keras to a portable C implementation. I accept the possibility there is a bug in my code, but assuming there isn't, I can't figure out the order of the weights / gates in the LSTM layer.

model = Sequential()
model.add(LSTM(4,input_dim=5,input_length=N,return_sequences=True))
shapes = [x.shape for x in model.get_weights()]
print shapes

[(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,)]

What I see is

  • Weights that handle inputs
  • Weights that handle recurrent / hidden outputs
  • bias
  • repeat

But which weight set goes to which gate? The third set of weights has its biases initialized to 1.0, so I'm assuming that's the forget gate.
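One way to check that assumption is to see which of the four bias vectors is initialized to all ones, for example with a quick NumPy check on the model built above:

import numpy as np

# Every third array returned by get_weights() is a bias vector (see the shapes above).
# The gate whose bias is all ones should be the forget gate.
biases = model.get_weights()[2::3]
for idx, b in enumerate(biases):
    print('bias set %d is all ones: %s' % (idx, np.allclose(b, 1.0)))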

Looking in recurrent.py, I see something like this:
i = self.inner_activation(z0)
f = self.inner_activation(z1)
c = f * c_tm1 + i * self.activation(z2)
o = self.inner_activation(z3)

But i, f, c, o cannot be the order, given which biases are set to 1.0. So I'm kind of confused and would appreciate some help.



All 5 comments

I don't know which version you are using now.
In keras/layers/recurrent.py you can check the trainable_weights:
https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L723
These weights are appended to your model in this order.

model = Sequential()
model.add(LSTM(4,input_dim=5,input_length=N,return_sequences=True))
for e in zip(model.layers[0].trainable_weights, model.layers[0].get_weights()):
    print('Param %s:\n%s' % (e[0],e[1]))
Param lstm_3_W_i:
[[ 0.00069305, ...]]
Param lstm_3_U_i:
[[ 1.10000002, ...]]
Param lstm_3_b_i:
[ 0., ...]
Param lstm_3_W_c:
[[-1.38370085, ...]]
...

@ChristianThomae the above solution no longer gives the same details. Rather, it simply prints lstm_1/kernel:0, lstm_1/recurrent_kernel:0 and so on. Is there any way to display or find out the exact order of the output of get_weights, especially the order of i, f, c, o? I need to extract some specific values from these for further processing.

If I understand https://github.com/keras-team/keras/blob/master/keras/layers/recurrent.py#L1863 correctly, I would say the order is i, f, c, o for the kernel, recurrent_kernel and bias respectively.

If you are using Keras 2.2.0

When you print

print(model.layers[0].trainable_weights)

you should see three tensors: lstm_1/kernel:0, lstm_1/recurrent_kernel:0 and lstm_1/bias:0.
The last dimension of each of these tensors equals

4 * number_of_units

where number_of_units is your number of neurons. Try:

units = int(int(model.layers[0].trainable_weights[0].shape[1])/4)
print("No units: ", units)

That is because each tensor contains the weights for the four LSTM gates, concatenated in this order:

i (input), f (forget), c (cell candidate) and o (output)

Therefore, to extract the per-gate weights, you can simply use the slice operator:

W = model.layers[0].get_weights()[0]
U = model.layers[0].get_weights()[1]
b = model.layers[0].get_weights()[2]

W_i = W[:, :units]
W_f = W[:, units: units * 2]
W_c = W[:, units * 2: units * 3]
W_o = W[:, units * 3:]

U_i = U[:, :units]
U_f = U[:, units: units * 2]
U_c = U[:, units * 2: units * 3]
U_o = U[:, units * 3:]

b_i = b[:units]
b_f = b[units: units * 2]
b_c = b[units * 2: units * 3]
b_o = b[units * 3:]

Source: keras code
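To sanity-check the extracted weights (e.g. for a C port), here is a minimal NumPy sketch of a single LSTM step, assuming the default Keras activations (tanh for the cell and output, hard sigmoid for the gates) and no dropout or masking; its per-step outputs should match model.predict up to floating-point tolerance:

import numpy as np

def hard_sigmoid(x):
    # Keras's default recurrent (gate) activation
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

def lstm_step(x_t, h_tm1, c_tm1):
    # One LSTM time step using the per-gate weights sliced above
    i = hard_sigmoid(np.dot(x_t, W_i) + np.dot(h_tm1, U_i) + b_i)  # input gate
    f = hard_sigmoid(np.dot(x_t, W_f) + np.dot(h_tm1, U_f) + b_f)  # forget gate
    g = np.tanh(np.dot(x_t, W_c) + np.dot(h_tm1, U_c) + b_c)       # cell candidate
    o = hard_sigmoid(np.dot(x_t, W_o) + np.dot(h_tm1, U_o) + b_o)  # output gate
    c = f * c_tm1 + i * g
    h = o * np.tanh(c)
    return h, c

# x is assumed to be one input sequence of shape (timesteps, input_dim);
# with return_sequences=True, each h should match the corresponding row of
# model.predict(x[np.newaxis])[0].
h = np.zeros(units)
c = np.zeros(units)
for x_t in x:
    h, c = lstm_step(x_t, h, c)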

