Keras: How to use TimeDistributed if I have multiple inputs

Created on 24 Jun 2016 · 23 comments · Source: keras-team/keras

TimeDistributed works fine when there is only one input, as in the example at the bottom of the page. But when there are multiple inputs, TimeDistributed does not seem to work.

Say, if my model has 3 inputs,
seq_inputs = [Input(shape=(TIME_STEPS, FEATURE_LENGTH)) for i in range(3)]
outputs = TimeDistributed(model)(seq_inputs)
the reported error is: TypeError: can only concatenate tuple (not "list") to tuple

So I changed the last line to outputs=TimeDistributed(model)(*seq_inputs), but there is still an error: TypeError: call() takes at most 3 arguments (4 given)

################# below is my code

import pdb

from keras.models import Sequential, Model, Graph
from keras.layers import Input, Convolution2D, MaxPooling2D, LSTM, Dense, BatchNormalization, ZeroPadding2D, Flatten, merge, Masking, Dropout, TimeDistributed, Reshape, Lambda, Embedding
from keras import backend

NUM_INPUTS=3
TIME_STEPS=20

# inner model, shared across the three inputs
model = Sequential()
model.add(Dense(32, input_dim=32))  # input_dim matches the 32-dim inputs below

# apply the shared inner model to each input and concatenate the results
inputs = [Input(shape=(32,)) for i in range(NUM_INPUTS)]
temps = [model(x) for x in inputs]
merged = merge(temps, mode='concat')

merged_model = Model(input=inputs, output=merged)

merged_model(inputs)

pdb.set_trace()

seq_inputs = [Input(shape=(TIME_STEPS, 32)) for i in range(NUM_INPUTS)]
outputs = TimeDistributed(merged_model)(*seq_inputs)  # fails: call() takes at most 3 arguments (4 given)


Most helpful comment

If you guys need it, I could enable multi input support to the TimeDistributed wrapper.

All 23 comments

Have you found a solution yet?

num_inputs = 3
input_dim = 784
input_length = 20
output_dim = 32

model = Sequential()
model.add(Dense(output_dim, input_dim=input_dim))

merged_input = Input((num_inputs, input_dim))
temps = [model(merged_input[:, x, :]) for x in range(num_inputs)]
merged = merge(temps, 'concat')
merged_model = Model(input=merged_input, output=merged)

seq_inputs = [Input((input_length, input_dim)) for x in range(num_inputs)]
seq = [Reshape((input_length, 1, input_dim))(s) for s in seq_inputs]
seq = merge(seq, 'concat', concat_axis=2)
outputs = TimeDistributed(merged_model)(seq)

@farizrahman4u
Thanks, my solution looks similar.
I merge the inputs and then use Lambda layers (for slicing) to retrieve all the parts.
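
For readers who land here, a minimal sketch of this merge-then-slice workaround, written with the Keras 2 functional API (layer sizes and variable names below are illustrative assumptions, not taken from the posts above):

from keras.models import Model
from keras.layers import Input, Dense, Lambda, TimeDistributed, concatenate

TIME_STEPS, DIM_A, DIM_B = 20, 32, 16

# single-input inner model: slice the packed feature vector back into its parts
packed = Input(shape=(DIM_A + DIM_B,))
part_a = Lambda(lambda t: t[:, :DIM_A])(packed)   # first DIM_A features
part_b = Lambda(lambda t: t[:, DIM_A:])(packed)   # remaining DIM_B features
inner_out = Dense(8)(concatenate([Dense(16)(part_a), Dense(16)(part_b)]))
inner_model = Model(packed, inner_out)

# pack the two sequence inputs along the feature axis, then distribute over time
seq_a = Input(shape=(TIME_STEPS, DIM_A))
seq_b = Input(shape=(TIME_STEPS, DIM_B))
packed_seq = concatenate([seq_a, seq_b], axis=-1)  # (batch, TIME_STEPS, DIM_A + DIM_B)
outputs = TimeDistributed(inner_model)(packed_seq)
full_model = Model([seq_a, seq_b], outputs)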

If you guys need it, I could enable multi input support to the TimeDistributed wrapper.

I think it is not necessary, as everything needed for this already exists. Also, this thread easily pops up when searching the internet.

#3432

@farizrahman4u I believe the trick of merging multiple input tensors would not work if the shapes of the input tensors differ from each other. It would be nice to have native, robust support for using multiple input tensors with TimeDistributed.

They should have the same number of timesteps either way.

@farizrahman4u I think that would be useful, also for the sake of consistency.
Or at least it should be mentioned in the TimeDistributed documentation/error message that it doesn't support multiple inputs. This thread is easy to find once you figure out what the cause of the problem is, but it's not obvious right away that there should be this problem. It is easy to assume that you can pass multiple inputs here as anywhere else.

Is multi-input support planned? I'm currently packing multiple inputs into a single input, which is not an ideal design.

@farizrahman4u
Thank you for your solution. I tried it, but it fails.


With the Theano backend, I had

TypeError: ('Not a Keras tensor:', Subtensor{::, int64, ::}.0)

because of line 13

...
temps = [model(merged_input[:, x, :]) for x in range(num_inputs)]
...

With the TensorFlow backend, I had

AttributeError: 'NoneType' object has no attribute 'inbound_nodes'

because of line 15

...
merged_model = Model(input=merged_input, output=merged)
...

I am new to Keras and am using:

keras: 2.0.2
theano 0.10.0dev1.dev-RELEASE
tensorflow 1.1.0
ubuntu 16.04

Any update on multi-input support for TimeDistributed?
I would like to implement a hierarchical model to classify each sentence based on the context information provided by the whole document. Thus, I designed a model where the first recurrent layer works at sentence level and the second one at document level. I want to include further sentence-level information (e.g. sentence type or category), concatenate it to the output of the first layer and use the resulting augmented tensor to feed the second recurrent layer. Here is my code:

# input sentence
in_sentence = Input(shape=(MAX_LENGTH,), dtype='int32')
# additional sentence-level information (fixed length array - already "embedded")
in_info = Input(shape=(INFO_LENGTH,), dtype='float32')
# first layer (sentence-level)
# word embedding
embedded_sentence = Embedding(len(vocab) + 1,
                              EMBEDDING_DIM,
                              weights=[embedding_matrix],
                              trainable=False)(in_sentence)

recurrent_sentence = GRU(hidden_dim_1)(embedded_sentence)
recurrent_sentence_and_info = concatenate([recurrent_sentence, in_info])
encoded_model = Model([in_sentence, in_info], recurrent_sentence_and_info)

#second layer (document-level)
sequence_input = Input(shape=(MAX_SENTENCES, MAX_LENGTH), dtype='int32')
info_seq_input = Input(shape=(MAX_SENTENCES, INFO_LENGTH), dtype='float32')
seq_encoded = TimeDistributed(encoded_model)([sequence_input, info_seq_input])

# Encode the sentence sequence (document level)
seq_encoded = GRU(hidden_dim_2,return_sequences=True)(seq_encoded)

# Prediction
prediction = Dense(NUM_CLASSES, activation='softmax')(seq_encoded)
model = Model([sequence_input,info_seq_input], prediction)

For the sake of completeness, MAX_LENGTH is different from INFO_LENGTH, in general.
When I run
seq_encoded = TimeDistributed(encoded_model)([sequence_input, info_seq_input])
I get the following assertion error: assert len(input_shape) >= 3

Without the additional sentence-level information, TimeDistributed has a single input and everything seems to work fine. Is there any workaround to include multiple inputs?
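
For what it's worth, a minimal sketch of how the packing workaround described earlier in this thread could be adapted to this case, assuming Keras 2.x and that the word indices survive a float32 round-trip (all other names come from the code above):

from keras.models import Model
from keras.layers import Input, Lambda, Embedding, GRU, Dense, TimeDistributed, concatenate
from keras import backend as K

# per-sentence model with a single packed input: word indices first, info vector after
packed_in = Input(shape=(MAX_LENGTH + INFO_LENGTH,), dtype='float32')
word_idx = Lambda(lambda t: K.cast(t[:, :MAX_LENGTH], 'int32'))(packed_in)  # indices stored as floats
info_vec = Lambda(lambda t: t[:, MAX_LENGTH:])(packed_in)

embedded_sentence = Embedding(len(vocab) + 1, EMBEDDING_DIM,
                              weights=[embedding_matrix], trainable=False)(word_idx)
recurrent_sentence = GRU(hidden_dim_1)(embedded_sentence)
encoded_model = Model(packed_in, concatenate([recurrent_sentence, info_vec]))

# document level: one packed 3D input instead of two separate inputs
packed_seq_input = Input(shape=(MAX_SENTENCES, MAX_LENGTH + INFO_LENGTH), dtype='float32')
seq_encoded = TimeDistributed(encoded_model)(packed_seq_input)
seq_encoded = GRU(hidden_dim_2, return_sequences=True)(seq_encoded)
prediction = Dense(NUM_CLASSES, activation='softmax')(seq_encoded)
model = Model(packed_seq_input, prediction)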

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

I am using bucketing to group together batches of different lengths. This is time series data where I have multiple time series of the same length (but variable length across batches) as inputs to the different layers at the beginning of the model. Thus the input shape for each input layer is (None, 1), because I only have one column of data per input. How can I apply @farizrahman4u's original solution without a fixed input length?

I've tried passing None as a dimension and get the following error:

ValueError: Tried to convert 'shape' to a tensor and failed. Error: None values not supported.
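
One hedged possibility, if the goal is just to feed several single-column series through one wrapped layer: with (None, 1) inputs the series can be packed by concatenating along the feature axis, which needs no static length and so avoids the fixed-length Reshape (a plain Dense is wrapped below only for brevity; sizes are illustrative):

from keras.models import Model
from keras.layers import Input, Dense, TimeDistributed, Concatenate

num_inputs = 3

# variable-length series, one feature column each
seq_inputs = [Input(shape=(None, 1)) for _ in range(num_inputs)]
packed = Concatenate(axis=-1)(seq_inputs)     # (batch, timesteps, num_inputs)
per_step = TimeDistributed(Dense(8))(packed)  # applied independently at every timestep
model = Model(seq_inputs, per_step)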

What if the multiple inputs of TimeDistributed have different shapes? For example, there are three inputs A, B, and C, where only A is a 5D tensor and B and C are not, but I have to pass them together to a custom_conv function through a Lambda layer that is wrapped by TimeDistributed. Is it possible to support passing a list of inputs with different shapes?

@karenyun Can you elaborate? How exactly do you pass a 5D tensor and two non-5D tensors?

Here is a simple code that shows the problem of trying to combine a multi-input model with LSTM. It fails on the TimeDistributed line. Any ideas on how to fix it?

from keras.models import Sequential, Model
from keras.layers import Conv2DTranspose, Input, InputLayer, Reshape, Lambda, Concatenate
from keras.layers import TimeDistributed, Conv2D, MaxPooling2D, Flatten, Dropout, Dense
from keras.layers.recurrent import LSTM

sequence_len = 50
input_image_dim = (128,) * 2 + (3,)
input_vector_dim = (5,)
output_dim = 4

# simple multi-input model
input_x1 = Input(shape=input_image_dim, name='image_input')
x = input_x1
x = Conv2D(128, (4, 4), strides=5, activation='relu')(x)
x1 = Flatten()(x)

input_x2 = Input(shape=input_vector_dim, name='vector_input')
x = input_x2
x2 = Dense(32)(x)

x = Concatenate()([x1, x2])
x = Dense(32)(x)
model = Model([input_x1, input_x2], x, name='pre_lstm')
print(model.summary())

# pass in a model into LSTM
input_x = Input(shape=(sequence_len,) + input_image_dim, name='image_input_time_dist')
input_v = Input(shape=(sequence_len,) + input_vector_dim, name='vector_input_time_dist')  # vector input needs its own shape and a unique name

x = TimeDistributed(model)([input_x, input_v])  # <-- fails here
x = LSTM(256, return_sequences=True, dropout=0.5)(x)
x = Dense(32)(x)
model = Model([input_x, input_v], x, name='lstm_model')
print(model.summary())
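
A hedged sketch of one way around it (my own adaptation of the packing trick from earlier in the thread; sizes and names are illustrative): flatten each image, concatenate it with the vector, wrap a single-input model in TimeDistributed, and split the parts apart again inside that model.

from keras.models import Model
from keras.layers import Input, Dense, Conv2D, Flatten, Reshape, Lambda, Concatenate, TimeDistributed
from keras.layers.recurrent import LSTM

img_size = 128 * 128 * 3    # flattened image length
packed_dim = img_size + 5   # image + 5-dim vector

# single-input per-timestep model
packed = Input(shape=(packed_dim,))
img = Reshape((128, 128, 3))(Lambda(lambda t: t[:, :img_size])(packed))
vec = Lambda(lambda t: t[:, img_size:])(packed)
h1 = Flatten()(Conv2D(128, (4, 4), strides=5, activation='relu')(img))
h2 = Dense(32)(vec)
pre_lstm = Model(packed, Dense(32)(Concatenate()([h1, h2])))

# sequence side: flatten the images per timestep, pack, then distribute
seq_img = Input(shape=(50, 128, 128, 3))
seq_vec = Input(shape=(50, 5))
packed_seq = Concatenate(axis=-1)([TimeDistributed(Flatten())(seq_img), seq_vec])
x = TimeDistributed(pre_lstm)(packed_seq)
x = LSTM(256, return_sequences=True, dropout=0.5)(x)
lstm_model = Model([seq_img, seq_vec], Dense(32)(x))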

@iretiayo Have you found any solution to your problem? I'm currently looking at the exact same architectural problem. Like you, I have a 2D vector and an image as input, which are then fed to an LSTM. If you have any solution using a different approach, it would also be great if you could share it! My only idea for solving this is to use T input networks with shared weights, which are then used as a sequence to feed the LSTM layer.

@raharth Have you found any solution?
Here is the part of my code that throws out the error:


Unfortunately I don't see your code. As far as I remember I actually used shared weights to solve it, but it was a whole mess and really hacky. I actually decided to build the same architecture using pytorch, which is way more flexible and cleaner for an architecture like that. Since it doesn't compile the graph it doesn't care about what you did with the tensor before feeding it to a specific layer, so you just need to take care that the shape actually matches what it expects.

@farizrahman4u Thanks a lot for the proposed solution. I tried the Lambda layer, but it seems the nested model can't be trained when it is embedded in a Lambda layer. Do you have any suggestions regarding this issue?
Here you can find the snippet of my code where I am stuck. Any help is highly appreciated.
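
In case it helps, the usual explanation (an assumption about the cause here, since the snippet isn't visible): a Lambda layer has no weights of its own, so a model invoked inside a Lambda is not registered among the outer model's trainable weights and never gets updated. Calling the sub-model on the tensor directly keeps its weights in the graph. A minimal sketch with illustrative names:

from keras.models import Model
from keras.layers import Input, Dense, Lambda

# tiny nested model, just for illustration
inner_in = Input(shape=(16,))
sub_model = Model(inner_in, Dense(8)(inner_in))

x = Input(shape=(16,))
y_bad = Lambda(lambda t: sub_model(t))(x)  # sub_model's weights are hidden from the outer model
y_good = sub_model(x)                      # sub_model is used as a layer, so its weights stay trainable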

RepeatVector can help, e.g.

# RepeatVector and Dense need static integer sizes, so use K.int_shape (K is keras.backend)
TimeDistributed( Dense(K.int_shape(word_embeddings)[2]) )(Concatenate()([
    word_embeddings,
    RepeatVector( K.int_shape(word_embeddings)[1] )(sentence_embedding)
]))

I was able to solve this problem using the RepeatVector layer.

from keras.models import Model
from keras.layers import Input, Dense, BatchNormalization, LSTM, TimeDistributed #, Conv1D, LeakyReLU, MaxPool1D,
from keras.layers import Concatenate, RepeatVector
...
core_input_1 = Input(shape=(self.core_timesteps, self.core_input_1_dim), name='core_input_1')
core_branch_1 = BatchNormalization(momentum=0.0, name='core_1_bn')(core_input_1)
core_branch_1 = LSTM(self.core_nodes[0], activation='relu', name='core_1_lstm_1', return_sequences=True)(core_branch_1)
core_branch_1 = LSTM(self.core_nodes[1], activation='relu', name='core_1_lstm_2')(core_branch_1)

core_input_2 = Input(shape=(self.core_timesteps, self.core_input_2_dim), name='core_input_2')
core_branch_2 = BatchNormalization(momentum=0.0, name='core_2_bn')(core_input_2)
core_branch_2 = LSTM(self.core_nodes[0], activation='relu', name='core_2_lstm_1', return_sequences=True)(core_branch_2)
core_branch_2 = LSTM(self.core_nodes[1], activation='relu', name='core_2_lstm_2')(core_branch_2)

merged = Concatenate()([core_branch_1, core_branch_2])

full_branch = RepeatVector(self.output_timesteps)(merged)        
full_branch = LSTM(self.core_nodes[1], activation='relu', name='final_lstm', return_sequences=True)(full_branch)

full_branch = TimeDistributed(Dense(self.output_dim, name='td_dense_relu', activation='relu'))(full_branch)
full_branch = TimeDistributed(Dense(self.output_dim, name='td_dense'))(full_branch)  # layer names must be unique

Note that return_sequences before the concat is False, and that the RepeatVector layer repeats the concatenated vector for the same number of timesteps that the TimeDistributed layer at the end should output. This is a multivariate, multi-step forecasting model.

