Keras: How to use TimeDistributed if I have multiple inputs

Created on 24 Jun 2016 · 23 comments · Source: keras-team/keras

TimeDistributed works fine when there is only one input, as in the example at the bottom of the page. But when there are multiple inputs, TimeDistributed does not seem to work.

Say, if my model has 3 inputs,
seq_inputs = [Input(shape=(TIME_STEPS, FEATURE_LENGTH)) for i in range(3)]
outputs = TimeDistributed(model)(seq_inputs)
the reported error is: TypeError: can only concatenate tuple (not "list") to tuple

So I changed the last line to outputs=TimeDistributed(model)(*seq_inputs), but there is still an error: TypeError: call() takes at most 3 arguments (4 given)

################# below is my code

import pdb

from keras.models import Sequential, Model, Graph
from keras.layers import Input, Convolution2D, MaxPooling2D, LSTM, Dense, BatchNormalization, ZeroPadding2D, Flatten, merge, Masking, Dropout, TimeDistributed, Reshape, Lambda, Embedding
from keras import backend

NUM_INPUTS=3
TIME_STEPS=20

# inner model, shared across the three inputs
model = Sequential()
model.add(Dense(32, input_dim=32))  # input_dim matches the 32-dim inputs below

# apply the shared inner model to each input and concatenate the results
inputs = [Input(shape=(32,)) for i in range(NUM_INPUTS)]
temps = [model(x) for x in inputs]
merged = merge(temps, mode='concat')

merged_model = Model(input=inputs, output=merged)

merged_model(inputs)

pdb.set_trace()

seq_inputs = [Input(shape=(TIME_STEPS, 32)) for i in range(NUM_INPUTS)]
outputs = TimeDistributed(merged_model)(*seq_inputs)  # fails: call() takes at most 3 arguments (4 given)


Most helpful comment

If you guys need it, I could enable multi input support to the TimeDistributed wrapper.

All 23 comments

Have you found a solution yet?

num_inputs = 3
input_dim = 784
input_length = 20
output_dim = 32

model = Sequential()
model.add(Dense(output_dim, input_dim=input_dim))

merged_input = Input((num_inputs, input_dim))
temps = [model(merged_input[:, x, :]) for x in range(num_inputs)]
merged = merge(temps, 'concat')
merged_model = Model(input=merged_input, output=merged)

seq_inputs = [Input((input_length, input_dim)) for x in range(num_inputs)]
seq = [Reshape((input_length, 1, input_dim))(s) for s in seq_inputs]
seq = merge(seq, 'concat', concat_axis=2)
outputs = TimeDistributed(merged_model)(seq)

@farizrahman4u
Thanks, my solution looks similar.
I merge the inputs and then use Lambda layers (for slicing) to retrieve all the parts.
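
For readers who land here, a minimal sketch of this merge-then-slice workaround, written with the Keras 2 functional API (layer sizes and variable names below are illustrative assumptions, not taken from the posts above):

from keras.models import Model
from keras.layers import Input, Dense, Lambda, TimeDistributed, concatenate

TIME_STEPS, DIM_A, DIM_B = 20, 32, 16

# single-input inner model: slice the packed feature vector back into its parts
packed = Input(shape=(DIM_A + DIM_B,))
part_a = Lambda(lambda t: t[:, :DIM_A])(packed)   # first DIM_A features
part_b = Lambda(lambda t: t[:, DIM_A:])(packed)   # remaining DIM_B features
inner_out = Dense(8)(concatenate([Dense(16)(part_a), Dense(16)(part_b)]))
inner_model = Model(packed, inner_out)

# pack the two sequence inputs along the feature axis, then distribute over time
seq_a = Input(shape=(TIME_STEPS, DIM_A))
seq_b = Input(shape=(TIME_STEPS, DIM_B))
packed_seq = concatenate([seq_a, seq_b], axis=-1)  # (batch, TIME_STEPS, DIM_A + DIM_B)
outputs = TimeDistributed(inner_model)(packed_seq)
full_model = Model([seq_a, seq_b], outputs)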

If you guys need it, I could enable multi input support to the TimeDistributed wrapper.

I think it is not necessary, as everything needed for this already exists. Also, this thread easily pops up when searching the internet.

#3432

@farizrahman4u I believe the trick of merging multiple input tensors would not work if the shapes of the input tensors differ from each other. It would be nice to have native, robust support for using multiple input tensors with TimeDistributed.

They should have the same number of timesteps either way.

@farizrahman4u I think that would be useful, also for the sake of consistency.
Or at least it should be mentioned in the TimeDistributed documentation/error message that it doesn't support multiple inputs. This thread is easy to find once you figure out what the cause of the problem is, but it's not obvious right away that there should be this problem. It is easy to assume that you can pass multiple inputs here as anywhere else.

Is multi-input support planned? I'm currently packing multiple inputs into a single input, which is not an ideal design.

@farizrahman4u
Thank you for your solution. I tried it, but it fails.


With the Theano backend, I had

TypeError: ('Not a Keras tensor:', Subtensor{::, int64, ::}.0)

because of line 13

...
temps = [model(merged_input[:, x, :]) for x in range(num_inputs)]
...

With the TensorFlow backend, I had

AttributeError: 'NoneType' object has no attribute 'inbound_nodes'

because of line 15

...
merged_model = Model(input=merged_input, output=merged)
...

I am new to Keras and am using:

keras: 2.0.2
theano 0.10.0dev1.dev-RELEASE
tensorflow 1.1.0
ubuntu 16.04

Any update on multi-input support for TimeDistributed?
I would like to implement a hierarchical model to classify each sentence based on the context information provided by the whole document. Thus, I designed a model where the first recurrent layer works at sentence level and the second one at document level. I want to include further sentence-level information (e.g. sentence type or category), concatenate it to the output of the first layer and use the resulting augmented tensor to feed the second recurrent layer. Here is my code:

# input sentence
in_sentence = Input(shape=(MAX_LENGTH,), dtype='int32')
# additional sentence-level information (fixed length array - already "embedded")
in_info = Input(shape=(INFO_LENGTH,), dtype='float32')
# first layer (sentence-level)
# word embedding
embedded_sentence = Embedding(len(vocab) + 1,
                              EMBEDDING_DIM,
                              weights=[embedding_matrix],
                              trainable=False)(in_sentence)

recurrent_sentence = GRU(hidden_dim_1)(embedded_sentence)
recurrent_sentence_and_info = concatenate([recurrent_sentence, in_info])
encoded_model = Model([in_sentence, in_info], recurrent_sentence_and_info)

#second layer (document-level)
sequence_input = Input(shape=(MAX_SENTENCES, MAX_LENGTH), dtype='int32')
info_seq_input = Input(shape=(MAX_SENTENCES, INFO_LENGTH), dtype='float32')
seq_encoded = TimeDistributed(encoded_model)([sequence_input, info_seq_input])

# Encode the sentence sequence (document level)
seq_encoded = GRU(hidden_dim_2,return_sequences=True)(seq_encoded)

# Prediction
prediction = Dense(NUM_CLASSES, activation='softmax')(seq_encoded)
model = Model([sequence_input,info_seq_input], prediction)

For the sake of completeness, MAX_LENGTH is different from INFO_LENGTH, in general.
When I run
seq_encoded = TimeDistributed(encoded_model)([sequence_input, info_seq_input])
I get the following assertion error: assert len(input_shape) >= 3

Without the additional sentence-level information, TimeDistributed has a single input and everything seems to work fine. Is there any workaround to include multiple inputs?
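
For what it's worth, a minimal sketch of how the packing workaround described earlier in this thread could be adapted to this case, assuming Keras 2.x and that the word indices survive a float32 round-trip (all other names come from the code above):

from keras.models import Model
from keras.layers import Input, Lambda, Embedding, GRU, Dense, TimeDistributed, concatenate
from keras import backend as K

# per-sentence model with a single packed input: word indices first, info vector after
packed_in = Input(shape=(MAX_LENGTH + INFO_LENGTH,), dtype='float32')
word_idx = Lambda(lambda t: K.cast(t[:, :MAX_LENGTH], 'int32'))(packed_in)  # indices stored as floats
info_vec = Lambda(lambda t: t[:, MAX_LENGTH:])(packed_in)

embedded_sentence = Embedding(len(vocab) + 1, EMBEDDING_DIM,
                              weights=[embedding_matrix], trainable=False)(word_idx)
recurrent_sentence = GRU(hidden_dim_1)(embedded_sentence)
encoded_model = Model(packed_in, concatenate([recurrent_sentence, info_vec]))

# document level: one packed 3D input instead of two separate inputs
packed_seq_input = Input(shape=(MAX_SENTENCES, MAX_LENGTH + INFO_LENGTH), dtype='float32')
seq_encoded = TimeDistributed(encoded_model)(packed_seq_input)
seq_encoded = GRU(hidden_dim_2, return_sequences=True)(seq_encoded)
prediction = Dense(NUM_CLASSES, activation='softmax')(seq_encoded)
model = Model(packed_seq_input, prediction)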

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

I am using bucketing to group together batches of different lengths. This is time series data where I have multiple time series of the same length (but variable length across batches) as inputs to the different layers at the beginning of the model. Thus the input shape for each input layer is (None, 1), because I only have one column of data per input. How can I apply @farizrahman4u's original solution without a fixed input length?

I've tried passing None as a dimension and get the following error:

ValueError: Tried to convert 'shape' to a tensor and failed. Error: None values not supported.
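
One hedged possibility, if the goal is just to feed several single-column series through one wrapped layer: with (None, 1) inputs the series can be packed by concatenating along the feature axis, which needs no static length and so avoids the fixed-length Reshape (a plain Dense is wrapped below only for brevity; sizes are illustrative):

from keras.models import Model
from keras.layers import Input, Dense, TimeDistributed, Concatenate

num_inputs = 3

# variable-length series, one feature column each
seq_inputs = [Input(shape=(None, 1)) for _ in range(num_inputs)]
packed = Concatenate(axis=-1)(seq_inputs)     # (batch, timesteps, num_inputs)
per_step = TimeDistributed(Dense(8))(packed)  # applied independently at every timestep
model = Model(seq_inputs, per_step)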

What if the multiple inputs of TimeDistributed have different shapes? For example, there are three inputs A, B, and C, where only A is a 5D tensor and B and C are not, but I have to pass them together to a custom_conv function through a Lambda layer that is wrapped by TimeDistributed. Is it possible to support passing a list of inputs with different shapes?

@karenyun Can you elaborate? How exactly do you pass a 5D tensor and two non-5D tensors?

Here is a simple code that shows the problem of trying to combine a multi-input model with LSTM. It fails on the TimeDistributed line. Any ideas on how to fix it?

from keras.models import Sequential, Model
from keras.layers import Conv2DTranspose, Input, InputLayer, Reshape, Lambda, Concatenate
from keras.layers import TimeDistributed, Conv2D, MaxPooling2D, Flatten, Dropout, Dense
from keras.layers.recurrent import LSTM

sequence_len = 50
input_image_dim = (128,) * 2 + (3,)
input_vector_dim = (5,)
output_dim = 4

# simple multi-input model
input_x1 = Input(shape=input_image_dim, name='image_input')
x = input_x1
x = Conv2D(128, (4, 4), strides=5, activation='relu')(x)
x1 = Flatten()(x)

input_x2 = Input(shape=input_vector_dim, name='vector_input')
x = input_x2
x2 = Dense(32)(x)

x = Concatenate()([x1, x2])
x = Dense(32)(x)
model = Model([input_x1, input_x2], x, name='pre_lstm')
print(model.summary())

# pass in a model into LSTM
input_x = Input(shape=(sequence_len,) + input_image_dim, name='image_input_time_dist')
input_v = Input(shape=(sequence_len,) + input_vector_dim, name='vector_input_time_dist')  # vector input needs its own shape and a unique name

x = TimeDistributed(model)([input_x, input_v])  # <-- fails here
x = LSTM(256, return_sequences=True, dropout=0.5)(x)
x = Dense(32)(x)
model = Model([input_x, input_v], x, name='lstm_model')
print(model.summary())
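
A hedged sketch of one way around it (my own adaptation of the packing trick from earlier in the thread; sizes and names are illustrative): flatten each image, concatenate it with the vector, wrap a single-input model in TimeDistributed, and split the parts apart again inside that model.

from keras.models import Model
from keras.layers import Input, Dense, Conv2D, Flatten, Reshape, Lambda, Concatenate, TimeDistributed
from keras.layers.recurrent import LSTM

img_size = 128 * 128 * 3    # flattened image length
packed_dim = img_size + 5   # image + 5-dim vector

# single-input per-timestep model
packed = Input(shape=(packed_dim,))
img = Reshape((128, 128, 3))(Lambda(lambda t: t[:, :img_size])(packed))
vec = Lambda(lambda t: t[:, img_size:])(packed)
h1 = Flatten()(Conv2D(128, (4, 4), strides=5, activation='relu')(img))
h2 = Dense(32)(vec)
pre_lstm = Model(packed, Dense(32)(Concatenate()([h1, h2])))

# sequence side: flatten the images per timestep, pack, then distribute
seq_img = Input(shape=(50, 128, 128, 3))
seq_vec = Input(shape=(50, 5))
packed_seq = Concatenate(axis=-1)([TimeDistributed(Flatten())(seq_img), seq_vec])
x = TimeDistributed(pre_lstm)(packed_seq)
x = LSTM(256, return_sequences=True, dropout=0.5)(x)
lstm_model = Model([seq_img, seq_vec], Dense(32)(x))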

@iretiayo Have you found any solution to your problem? I'm currently looking at the exact same architectural problem. Like you, I have a 2D vector and an image as input, which are then fed to an LSTM. If you have any solution using a different approach, it would also be great if you could share it! My only idea for solving this is to use T input networks with shared weights, which are then used as a sequence to feed the LSTM layer.

@raharth Have you found any solution?
Here is the part of my code that throws out the error:


Unfortunately I don't see your code. As far as I remember I actually used shared weights to solve it, but it was a whole mess and really hacky. I actually decided to build the same architecture using pytorch, which is way more flexible and cleaner for an architecture like that. Since it doesn't compile the graph it doesn't care about what you did with the tensor before feeding it to a specific layer, so you just need to take care that the shape actually matches what it expects.

@farizrahman4u Thanks a lot for the proposed solution. I tried the Lambda layer, but it seems the nested model can't be trained when it is embedded in a Lambda layer. Do you have any suggestions regarding this issue?
Here you can find the snippet of my code where I am stuck. Any help is highly appreciated.
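
In case it helps, the usual explanation (an assumption about the cause here, since the snippet isn't visible): a Lambda layer has no weights of its own, so a model invoked inside a Lambda is not registered among the outer model's trainable weights and never gets updated. Calling the sub-model on the tensor directly keeps its weights in the graph. A minimal sketch with illustrative names:

from keras.models import Model
from keras.layers import Input, Dense, Lambda

# tiny nested model, just for illustration
inner_in = Input(shape=(16,))
sub_model = Model(inner_in, Dense(8)(inner_in))

x = Input(shape=(16,))
y_bad = Lambda(lambda t: sub_model(t))(x)  # sub_model's weights are hidden from the outer model
y_good = sub_model(x)                      # sub_model is used as a layer, so its weights stay trainable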

RepeatVector can help, e.g.

# RepeatVector and Dense need static integer sizes, so use K.int_shape (K is keras.backend)
TimeDistributed( Dense(K.int_shape(word_embeddings)[2]) )(Concatenate()([
    word_embeddings,
    RepeatVector( K.int_shape(word_embeddings)[1] )(sentence_embedding)
]))

I was able to solve this problem using the RepeatVector layer.

from keras.models import Model
from keras.layers import Input, Dense, BatchNormalization, LSTM, TimeDistributed #, Conv1D, LeakyReLU, MaxPool1D,
from keras.layers import Concatenate, RepeatVector
...
core_input_1 = Input(shape=(self.core_timesteps, self.core_input_1_dim), name='core_input_1')
core_branch_1 = BatchNormalization(momentum=0.0, name='core_1_bn')(core_input_1)
core_branch_1 = LSTM(self.core_nodes[0], activation='relu', name='core_1_lstm_1', return_sequences=True)(core_branch_1)
core_branch_1 = LSTM(self.core_nodes[1], activation='relu', name='core_1_lstm_2')(core_branch_1)

core_input_2 = Input(shape=(self.core_timesteps, self.core_input_2_dim), name='core_input_2')
core_branch_2 = BatchNormalization(momentum=0.0, name='core_2_bn')(core_input_2)
core_branch_2 = LSTM(self.core_nodes[0], activation='relu', name='core_2_lstm_1', return_sequences=True)(core_branch_2)
core_branch_2 = LSTM(self.core_nodes[1], activation='relu', name='core_2_lstm_2')(core_branch_2)

merged = Concatenate()([core_branch_1, core_branch_2])

full_branch = RepeatVector(self.output_timesteps)(merged)        
full_branch = LSTM(self.core_nodes[1], activation='relu', name='final_lstm', return_sequences=True)(full_branch)

full_branch = TimeDistributed(Dense(self.output_dim, name='td_dense_relu', activation='relu'))(full_branch)
full_branch = TimeDistributed(Dense(self.output_dim, name='td_dense'))(full_branch)  # layer names must be unique

Note that return_sequences before the concat is False, and that the RepeatVector layer repeats the concatenated vector for the same number of timesteps that the TimeDistributed layer at the end should output. This is a multivariate, multi-step forecasting model.

