Hi there,
I have a problem with LSTMs: I need to get the cell states out of an LSTM for each time step. Unfortunately, it is only possible to get the output for each time step with return_sequences=True; return_state=True only gives me the cell state for the last time step...
Is there any hack/modification to get the cell states for each time step?
Greetings
During prediction you can get the states for each time step by unrolling the RNN - basically you do a for loop over the LSTMCell instead of using the TF/Theano scan ops that are called by K.rnn.
from keras.layers import Input, LSTM, Lambda
import keras.backend as K

maxlen = 10
input_dim = 10
units = 5

inputs = Input((maxlen, input_dim))
rnn = LSTM(units, return_state=True)

states = []  # list of (h, c) tuples
outputs = []
state = None

def get_indexer(t):
    return Lambda(lambda x, t: x[:, t, :], arguments={'t': t},
                  output_shape=lambda s: (s[0], s[2]))

def expand(x):
    return K.expand_dims(x, 1)

expand_layer = Lambda(expand, output_shape=lambda s: (s[0], 1, s[1]))

for t in range(maxlen):
    input_t = get_indexer(t)(inputs)  # basically input_t = inputs[:, t, :]
    input_t = expand_layer(input_t)
    output_t, h, c = rnn(input_t, initial_state=state)
    state = [h, c]
    states.append(state)
    outputs.append(output_t)
Caveat - Ignores masking.
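To make the unrolling idea concrete, here is a minimal end-to-end sketch (written against tf.keras rather than standalone Keras, and with made-up small dimensions): the shared LSTM layer is called once per time step, the per-step cell states are collected, and a Model exposes them so predict returns one cell-state array per step.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Lambda
from tensorflow.keras.models import Model

maxlen, input_dim, units = 4, 3, 5

inputs = Input((maxlen, input_dim))
rnn = LSTM(units, return_state=True)

cell_states = []
state = None
for t in range(maxlen):
    # slice out step t, keeping the time axis: (batch, 1, input_dim)
    input_t = Lambda(lambda x, t=t: x[:, t:t + 1, :])(inputs)
    output_t, h, c = rnn(input_t, initial_state=state)
    state = [h, c]           # feed this step's states into the next step
    cell_states.append(c)    # record the cell state c at every step

model = Model(inputs, cell_states)  # one cell-state tensor per time step
x = np.random.rand(2, maxlen, input_dim).astype('float32')
all_c = model.predict(x)  # list of maxlen arrays, each (batch, units)
```

Because the same rnn layer object is called at every step, the weights are shared exactly as in a normal LSTM; only the scan op is replaced by a Python loop.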
Thanks for your answer, I think this will help me. Did you test your code? Unfortunately, it doesn't work: output_t, (h, c) = cell(input_t, (h, c))
TypeError: __call__() takes 2 positional arguments but 3 were given.
updated
Again, thank you so much! Now there's
File "...\engine\topology.py", line 717, in _add_inbound_node
output_tensors[i]._keras_shape = output_shapes[i]
AttributeError: 'tuple' object has no attribute '_keras_shape'
Here is some example code for accessing all of the states after each time step:
import keras.backend as K

statesAll = []
for layer in model.layers:
    if getattr(layer, 'stateful', False):
        if hasattr(layer, 'states'):
            for state in layer.states:
                statesAll.append(K.get_value(state))
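For that snippet to find anything, the layer has to be built with stateful=True (which requires a fixed batch size); layer.states then holds the current state variables, refreshed after each predict call. A hedged sketch of the assumed setup, using tf.keras and invented dimensions:

```python
import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

batch, steps, dim, units = 2, 3, 4, 5

# stateful layers need a fully specified batch shape
inp = Input(batch_shape=(batch, steps, dim))
out = LSTM(units, stateful=True)(inp)
model = Model(inp, out)

# after a predict call, layer.states holds the updated [h, c] variables
model.predict(np.random.rand(batch, steps, dim).astype('float32'))

states_all = []
for layer in model.layers:
    if getattr(layer, 'stateful', False) and hasattr(layer, 'states'):
        for s in layer.states:
            states_all.append(K.get_value(s))  # numpy copies of h and c
```

Note this reads the states left over after the whole batch, so to see them "after each time step" you would feed one step at a time and read between calls.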
Hmm... it seems Cells can't be called like that; I have updated the code to use the LSTM layer instead. Can you try now?
Thank you so much for your patience and support! It doesn't return any error, so I will try to modify it for my problem. Thank you! :)
@brain1995 @farizrahman4u Hi, I have another related question: how can I compute the output of the LSTMCell for each word of a sentence, and finally optimize on a mini-batch? Specifically, output_t currently has the batch dimension, but at each time step I only want the output of a single LSTMCell, because I want to control whether each word participates in the LSTM update. Finally I want to add up all the losses of a batch and optimize. Just like the pseudo-code below:
maxlen = 10
input_dim = 10
units = 5
batch_size = 32

inputs = Input((maxlen, input_dim))
rnn = LSTM(units, return_state=True)

states = []  # list of (h, c) tuples
outputs = []
state = None

def get_indexer(t):
    return Lambda(lambda x, t: x[:, t, :], arguments={'t': t},
                  output_shape=lambda s: (s[0], s[2]))

def expand(x):
    return K.expand_dims(x, 1)

def decision(x):  # just an example; may be more complex in a real application
    return np.random.choice([0, 1])

expand_layer = Lambda(expand, output_shape=lambda s: (s[0], 1, s[1]))

for i in range(batch_size):  # here I want to compute the LSTMCell for each sample
    for t in range(maxlen):
        input_t = get_indexer(t)(inputs)  # my hope: input_t = inputs[i, t, :]
        input_t = expand_layer(input_t)
        des = decision(input_t)
        if des == 0:
            continue  # if `decision` says the word contributes nothing, skip the LSTM update
        output_t, h, c = rnn(input_t, initial_state=state)
        state = h, c
        states.append(state)
        outputs.append(output_t)
So where should I modify the example code above? Thanks!
@farizrahman4u is _state_ supposed to be a list? Also, where do you provide the desired training/testing set to the model? Thanks for the help!
What does the "Lambda" function above do?
I knew it, thanks.
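For other readers with the same question: Lambda wraps an arbitrary tensor expression as a Keras layer, which is what lets plain slicing like x[:, t, :] participate in the functional-API graph. A tiny sketch (tf.keras, invented shapes):

```python
import numpy as np
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model

inp = Input((4, 3))  # (batch, time, features)
# take time step 1 from every sample: output shape (batch, 3)
take_t1 = Lambda(lambda x: x[:, 1, :])(inp)
m = Model(inp, take_t1)

x = np.arange(24, dtype='float32').reshape(2, 4, 3)
out = m.predict(x)  # same values as x[:, 1, :]
```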
@farizrahman4u Could you help me translate your chunk of code into TensorFlow? I tried several options but still get an error:
import keras
from keras.layers import Input, LSTM, Lambda
import tensorflow as tf

maxlen = 10
input_dim = 10
units = 5

inputs = Input((maxlen, input_dim), dtype=tf.float32)
rnn = LSTM(units, return_state=True)

def get_indexer(t):
    return Lambda(lambda x, t: x[:, t, :], arguments={'t': t},
                  output_shape=lambda s: (s[0], s[2]))

def expand(x):
    return keras.backend.expand_dims(x, 1)

expand_layer = Lambda(expand, output_shape=lambda s: (s[0], 1, s[1]))

state = tf.Variable(tf.zeros([10]))
states = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True, name='states')
iters = tf.constant(10, name='iters')

def cond(i, iters, states):
    return tf.less(i, iters)

def body(i, iters, states):
    input_t = get_indexer(i)(inputs)  # basically input_t = inputs[:, t, :]
    input_t = expand_layer(input_t)
    output_t, h, c = rnn(input_t, initial_state=state)
    temp_state = h, c
    assign_op = tf.assign(state, temp_state)
    states = states.write(i, state)
    return i + 1, iters, states

states = tf.while_loop(cond, body, [tf.constant(0), iters, states])
ValueError: Initializer for variable while_18/lstm_6/kernel/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer.
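That error comes from the LSTM creating its weight variables on its first call, which happens inside the while_loop body. One way around it (a sketch in current TF, not a fix for the exact versions above: it builds the cell's variables once outside the loop, then drives a plain LSTMCell with tf.while_loop and a TensorArray):

```python
import tensorflow as tf

maxlen, input_dim, units, batch = 5, 3, 4, 2

cell = tf.keras.layers.LSTMCell(units)
cell.build((batch, input_dim))  # create the variables OUTSIDE the loop

x = tf.random.normal((batch, maxlen, input_dim))
h = tf.zeros((batch, units))
c = tf.zeros((batch, units))
ta = tf.TensorArray(tf.float32, size=maxlen)

def cond(t, h, c, ta):
    return t < maxlen

def body(t, h, c, ta):
    out, (h, c) = cell(x[:, t, :], [h, c])
    return t + 1, h, c, ta.write(t, c)  # record the cell state each step

_, h, c, ta = tf.while_loop(cond, body, [0, h, c, ta])
all_c = ta.stack()  # (maxlen, batch, units), cell state at every step
```

Threading h and c through as loop variables also avoids the tf.assign on a fixed-shape Variable, which would fail anyway since (h, c) is a pair of (batch, units) tensors.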
I have exactly the same question!
And I don't know how to use K.rnn to do this. Can you give me an example?
Please tell me how to use K.rnn to get those states from Lambda layer
a burning question for a newbie
or how to input actual values into this Lambda layer
Are the results I get from model.predict below the states of each time step?
time_steps = 11
input_dim = 17
units = 128

inputs = Input((time_steps, input_dim))
rnn = GRU(units, return_state=True)

states = []  # list of h states (a GRU has no separate cell state)
outputs = []
state = None

def get_indexer(t):
    return Lambda(lambda x, t: x[:, t, :],
                  arguments={'t': t},
                  output_shape=lambda s: (s[0], s[2]))

def expand(x):
    return K.expand_dims(x, 1)

expand_layer = Lambda(expand, output_shape=lambda s: (s[0], 1, s[1]))

for t in range(time_steps):
    input_t = get_indexer(t)(inputs)  # basically input_t = inputs[:, t, :]
    input_t = expand_layer(input_t)
    output_t, h = rnn(input_t, initial_state=state)
    state = h
    states.append(state)
    outputs.append(output_t)

modelX = Model(inputs, states)
every_time_step_states = modelX.predict(My_real_data_input)
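Worth noting for the GRU case: the GRU's state h is the same tensor as its output, so the per-step states collected above should match what return_sequences=True gives directly, no unrolling needed. A quick check sketch (tf.keras, small invented shapes):

```python
import numpy as np
from tensorflow.keras.layers import Input, GRU
from tensorflow.keras.models import Model

steps, dim, units = 3, 2, 4

inp = Input((steps, dim))
gru = GRU(units, return_sequences=True, return_state=True)
seq, last_h = gru(inp)          # full output sequence + final state
m = Model(inp, [seq, last_h])

x = np.random.rand(1, steps, dim).astype('float32')
seq_v, last_h_v = m.predict(x)
# for a GRU the final state equals the last entry of the output sequence
```

For an LSTM this shortcut does not apply: return_sequences gives the h sequence, while the cell state c is only available per step via unrolling as above.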