I'm trying to export an LSTM layer from Keras to a portable C implementation. I accept the possibility there is a bug in my code, but assuming there isn't, I can't figure out the order of the weights / gates in the LSTM layer.
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(4, input_dim=5, input_length=N, return_sequences=True))
shapes = [x.shape for x in model.get_weights()]
print(shapes)
[(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,),
(5, 4),
(4, 4),
(4,)]
What I see is four (W, U, b) triples. But which weight set goes to which gate? The third set's bias is initialized to 1.0, so I'm assuming that's the forget gate.
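A quick way to test that assumption (a minimal sketch; it relies on Keras initializing the forget-gate bias to ones, so it only works on a freshly constructed model):

import numpy as np

# Scan the freshly initialized weights for the all-ones bias vector;
# its position should mark the forget gate's slot in get_weights().
for idx, w in enumerate(model.get_weights()):
    if w.ndim == 1 and np.all(w == 1.0):
        print('all-ones bias at index', idx)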
Looking in recurrent.py, I see something like this:
i = self.inner_activation(z0)
f = self.inner_activation(z1)
c = f * c_tm1 + i * self.activation(z2)
o = self.inner_activation(z3)
But i, f, c, o can't be the order of the weight sets: if it were, the second bias would be the one initialized to 1.0, not the third. So I'm kind of confused, and would appreciate the help.
I don't know which version you are using now.
In keras/layers/recurrent.py you can check trainable_weights:
https://github.com/fchollet/keras/blob/master/keras/layers/recurrent.py#L723
These weights are appended to your model in sequence.
model = Sequential()
model.add(LSTM(4,input_dim=5,input_length=N,return_sequences=True))
for e in zip(model.layers[0].trainable_weights, model.layers[0].get_weights()):
    print('Param %s:\n%s' % (e[0], e[1]))
Param lstm_3_W_i:
[[ 0.00069305, ...]]
Param lstm_3_U_i:
[[ 1.10000002, ...]]
Param lstm_3_b_i:
[ 0., ...]
Param lstm_3_W_c:
[[-1.38370085, ...]]
...
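If you'd rather not rely on a hard-coded order at all, you can key the arrays by parameter name instead (a sketch; with the TensorFlow backend the printed names may carry a ':0' suffix):

# Map each parameter name to its numpy array, so gate lookup is by
# name (e.g. 'lstm_3_W_f') rather than by position.
params = dict(zip((w.name for w in model.layers[0].trainable_weights),
                  model.layers[0].get_weights()))
for name, value in sorted(params.items()):
    print(name, value.shape)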
@ChristianThomae the above solution no longer gives the same details. Instead it simply gives lstm_1/kernel:0, lstm_1/recurrent_kernel:0 and so on. Is there any way to display or find out the exact order of the output of get_weights, especially the order of i, f, c, o? I need specific values extracted from these for further processing as well.
If I understand https://github.com/keras-team/keras/blob/master/keras/layers/recurrent.py#L1863 correctly, I would say the order is i, f, c, o within the kernel, recurrent_kernel and bias respectively.
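One way to confirm that reading empirically (a sketch, assuming Keras 2's default unit_forget_bias=True, which initializes only the forget-gate slice of the bias to ones; the input shape here is made up):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

units = 4
model = Sequential()
model.add(LSTM(units, input_shape=(10, 5)))  # hypothetical shapes

b = model.layers[0].get_weights()[2]  # fused bias, shape (4 * units,)
print(b.reshape(4, units))
# Only the second row should be all ones, which matches f being
# second in the i, f, c, o layout.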
If you are using Keras 2.2.0:
When you print
print(model.layers[0].trainable_weights)
you should see three tensors: lstm_1/kernel:0, lstm_1/recurrent_kernel:0 and lstm_1/bias:0.
One dimension of each tensor should be
4 * number_of_units
where number_of_units is the number of neurons. Try:
units = int(int(model.layers[0].trainable_weights[0].shape[1])/4)
print("No units: ", units)
That is because each tensor holds the weights for the four LSTM gates, in this order:
i (input), f (forget), c (cell state) and o (output)
Therefore, to extract the weights you can simply use the slice operator:
W = model.layers[0].get_weights()[0]
U = model.layers[0].get_weights()[1]
b = model.layers[0].get_weights()[2]
W_i = W[:, :units]
W_f = W[:, units: units * 2]
W_c = W[:, units * 2: units * 3]
W_o = W[:, units * 3:]
U_i = U[:, :units]
U_f = U[:, units: units * 2]
U_c = U[:, units * 2: units * 3]
U_o = U[:, units * 3:]
b_i = b[:units]
b_f = b[units: units * 2]
b_c = b[units * 2: units * 3]
b_o = b[units * 3:]
Source: keras code
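With those slices you have everything needed for a standalone forward pass, which is handy for validating a C port against Keras. A minimal NumPy sketch of one time step, assuming the Keras 2.2.0 defaults (activation='tanh', recurrent_activation='hard_sigmoid'):

import numpy as np

def hard_sigmoid(x):
    # Keras's hard_sigmoid: piecewise-linear approximation of the sigmoid
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

def lstm_step(x_t, h_tm1, c_tm1):
    # One LSTM time step built from the W_*, U_*, b_* slices above
    i = hard_sigmoid(x_t @ W_i + h_tm1 @ U_i + b_i)
    f = hard_sigmoid(x_t @ W_f + h_tm1 @ U_f + b_f)
    c = f * c_tm1 + i * np.tanh(x_t @ W_c + h_tm1 @ U_c + b_c)
    o = hard_sigmoid(x_t @ W_o + h_tm1 @ U_o + b_o)
    return o * np.tanh(c), c

# Start from zero states, feed the sequence one step at a time, and
# compare the resulting h values against model.predict on the same input.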