Hi,
I am trying to merge three embedding layers by concatenation and then apply a Dense layer to the merged layer, very similar to the multiple-input, multiple-output example in the Functional API section (except that I have three layers to merge instead of two, and I am merging the embedded input layers rather than LSTM outputs).
The merge works fine, but when I try to add the dense layer with an activation, I get the following error:
Exception: Input 0 is incompatible with layer dense_1: expected ndim=2, found ndim=3
Following is the relevant code excerpt:
l1 = Input(shape=(1000,), dtype='int32', name='l1')
e1 = Embedding(output_dim=60, input_dim=1000, input_length=1000)(l1)

l2 = Input(shape=(1000,), dtype='int32', name='l2')
e2 = Embedding(output_dim=60, input_dim=1000, input_length=1000)(l2)

l3 = Input(shape=(20,), dtype='int32', name='l3')
e3 = Embedding(output_dim=60, input_dim=20, input_length=20)(l3)

I = merge([e1, e2, e3], mode='concat', concat_axis=1)
layer = Dense(60, activation='tanh')(I)
And following is the full error trace:
File "/home/.eclipse/org.eclipse.platform_4.5.1_1473617060_linux_gtk_x86_64/plugins/org.python.pydev_4.5.4.201601292234/pysrc/pydevd.py", line 1524, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/.eclipse/org.eclipse.platform_4.5.1_1473617060_linux_gtk_x86_64/plugins/org.python.pydev_4.5.4.201601292234/pysrc/pydevd.py", line 931, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/.eclipse/org.eclipse.platform_4.5.1_1473617060_linux_gtk_x86_64/plugins/org.python.pydev_4.5.4.201601292234/pysrc/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/workspace/src/main.py", line 75, in <module>
relTP_model = relTP()
File "/home/workspace/src/main.py", line 45, in relTP
h_a = Dense(96, input_shape=(3,), activation = 'relu') (I)
File "/usr/local/lib/python3.4/dist-packages/keras/engine/topology.py", line 441, in __call__
self.assert_input_compatibility(x)
File "/usr/local/lib/python3.4/dist-packages/keras/engine/topology.py", line 382, in assert_input_compatibility
str(K.ndim(x)))
I am not able to understand what is causing the error. Can someone please help me resolve this?
Thanks
Your embedding layers map 1000/1000/20 integers to 1000/1000/20 vectors of length 60, so after concatenation you get a tensor with shape = (?, 2020, 60). A Dense layer expects ndim = 2 input; that's why you get this error. Can you give an example to illustrate what you want to do (e.g. input, output)?
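For reference, a minimal sketch of one generic way around this error: flatten the merged 3-D tensor back to 2-D before the dense layer (Keras 1.x API; whether a 121200-wide flatten is what you actually want is another matter):
from keras.layers import Input, Embedding, Flatten, Dense, merge

l1 = Input(shape=(1000,), dtype='int32')
e1 = Embedding(output_dim=60, input_dim=1000, input_length=1000)(l1)
l2 = Input(shape=(1000,), dtype='int32')
e2 = Embedding(output_dim=60, input_dim=1000, input_length=1000)(l2)
l3 = Input(shape=(20,), dtype='int32')
e3 = Embedding(output_dim=60, input_dim=20, input_length=20)(l3)

m = merge([e1, e2, e3], mode='concat', concat_axis=1)  # shape: (?, 2020, 60)
f = Flatten()(m)                                       # shape: (?, 121200)
out = Dense(60, activation='tanh')(f)                  # shape: (?, 60)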
Hi Joel,
Thanks for your answer. It seems this is what I want, so I need to figure out how to modify it to make it work.
Basically, I have three different inputs (each has multiple labels), so each input comes as a one-hot vector. I want to first embed them and then concatenate them, and then add a conventional NN layer on that new merged input. The output will be a softmax/sigmoid activation of the new layer.
Please let me know if you can think of a way to resolve the error.
Thanks
@code-ball Let's formulate your task: x_1 ∈ N^1000, x_2 ∈ N^1000, x_3 ∈ N^20. After embedding, you get e_1 ∈ R^(1000x60), e_2 ∈ R^(1000x60), e_3 ∈ R^(20x60). Then, after concatenation and a dense layer (60 -> 60), you want an output y ∈ R^(2020x60), right?
Hi,
I think representing it this way will simplify my question a lot -- thanks for putting it like this.
Ok, so this is not what I want. The task is as follows: each of the three inputs is a single integer label, each label should be embedded into a 60-dimensional vector, and the three embeddings should then be concatenated and passed through a dense layer. Hence, I should finally get an output vector of size (1x60). I hope this clears up the formulation.
Thanks
First, you need to create the three batches yourself; each batch consists of batch_size integers from the corresponding x_i. Here is the model:
l1 = Input(batch_shape=(None,), dtype='int32', name='l1')
l2 = Input(batch_shape=(None,), dtype='int32', name='l2')
l3 = Input(batch_shape=(None,), dtype='int32', name='l3')
e1 = Embedding(output_dim=60, input_dim=1000, input_length=1)(l1)
e2 = Embedding(output_dim=60, input_dim=1000, input_length=1)(l2)
e3 = Embedding(output_dim=60, input_dim=20, input_length=1)(l3)
merged = merge([e1,e2,e3], mode='concat', concat_axis=1)
flatten = Reshape((180,))(merged)
activation = Dense(60, activation='tanh')(flatten)
This will produce Tensor("Tanh:0", shape=(?, 60), dtype=float32). Please test it and see if it works. P.S. I'm not sure why we need to reshape merged; its shape is fine, BUT its _keras_shape is not.
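A quick usage sketch of the model above, with made-up data and an arbitrary loss, just to check the shapes (Keras 1.x; the model name and random inputs are illustrative):
import numpy as np
from keras.models import Model

model = Model(input=[l1, l2, l3], output=activation)
model.compile(optimizer='adam', loss='mse')

batch_size = 32
x1 = np.random.randint(0, 1000, size=(batch_size,))  # one integer label per sample
x2 = np.random.randint(0, 1000, size=(batch_size,))
x3 = np.random.randint(0, 20, size=(batch_size,))
print(model.predict([x1, x2, x3]).shape)  # -> (32, 60)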
Hi Joel,
This seems to serve the purpose of what I had described, although it distorts the overall shape, which is creating problems for me in further computations. I will share more details once I have done my investigation, and will start another topic. But this surely is very helpful; thanks a lot for such quick help and turnaround. I really appreciate it!
Cheers!
@joelthchao Thanks Joel! That works for me!
Hi Joel
I have a similar problem with RNNs. I have:
look_back = 10
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
testX = np.reshape(testX, (testX.shape[0], testX.shape[1], 1))
print 'Reshape', trainX.shape, testX.shape
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(10, batch_input_shape=(batch_size, look_back, 1), stateful=True, return_sequences=True))
model.add(Dense(1))
When I try this, it gives me the same error in the Dense layer creation. Any suggestions?
Hi Parunach,
You are trying to process a sequence with a dense layer. Because of that, you get a dimension mismatch. It should work if you set
return_sequences=False
in the LSTM.
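A minimal sketch of the fixed model, keeping the same shapes as in the snippet above (batch_size is a placeholder, as in the original):
from keras.models import Sequential
from keras.layers import LSTM, Dense

batch_size, look_back = 1, 10
model = Sequential()
# return_sequences=False: the LSTM emits one 10-dim vector per sample
# (ndim=2), which is what the Dense layer expects.
model.add(LSTM(10, batch_input_shape=(batch_size, look_back, 1),
               stateful=True, return_sequences=False))
model.add(Dense(1))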
Hi all,
Could someone please help me? This is my source code:
import os
import pickle
import numpy as np
from keras.models import Sequential
import gensim
from keras.layers.recurrent import LSTM, SimpleRNN
from sklearn.model_selection import train_test_split
import theano
theano.config.optimizer = "None"
import sys
sys.setdefaultencoding("ISO-8859-1")
conversation.pickle.encode('utf-8').strip()
with open('new2.pickle', 'rb') as f:
    (vec_x, vec_y) = pickle.load(f)
vec_x = np.array(vec_x, dtype=np.object)
vec_y = np.array(vec_y, dtype=np.object)
x_train, x_test, y_train, y_test = train_test_split(vec_x, vec_y, test_size=0.2, random_state=1)
model = Sequential()
model.add(LSTM(output_dim=300, input_shape=x_train.shape[1:], return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.add(LSTM(output_dim=300, input_shape=x_train.shape[1:], return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.add(LSTM(output_dim=300, input_shape=x_train.shape[1:], return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.add(LSTM(output_dim=300, input_shape=x_train.shape[1:], return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
model.compile(loss='cosine_proximity', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, nb_epoch=500, validation_data=(x_test, y_test))
model.save('LSTM5000.h5')
predictions = model.predict(x_test)
mod = gensim.models.Word2Vec.load('word2vec.bin')
[mod.most_similar([predictions[10][i]])[0] for i in range(15)]
This is the error:
chatbotlstmtrain.py:29: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(recurrent_initializer="glorot_normal", activation="sigmoid", return_sequences=True, units=300, kernel_initializer="glorot_normal", input_shape=())`
model.add(LSTM(output_dim=300, input_shape=x_train.shape[1:], return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
Traceback (most recent call last):
File "chatbotlstmtrain.py", line 29, in
model.add(LSTM(output_dim=300, input_shape=x_train.shape[1:], return_sequences=True, init='glorot_normal', inner_init='glorot_normal', activation='sigmoid'))
File "/usr/local/lib/python3.5/dist-packages/keras/models.py", line 464, in add
layer(x)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 482, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 559, in __call__
self.assert_input_compatibility(inputs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 458, in assert_input_compatibility
str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=1
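Note that the Keras 2 warning above prints input_shape=(), i.e. x_train.shape[1:] is empty: calling np.array(vec_x, dtype=np.object) on sequences of unequal length yields a 1-D object array, which is why the LSTM sees ndim=1 instead of ndim=3. A hedged sketch of one way to build a proper 3-D array, assuming vec_x and vec_y are lists of variable-length sequences of 300-dim word2vec vectors:
import numpy as np
from keras.preprocessing.sequence import pad_sequences

# Pad every sequence to the same length and stack into a real
# (samples, timesteps, 300) float array instead of a 1-D object array.
max_len = max(len(seq) for seq in vec_x)
vec_x = pad_sequences(vec_x, maxlen=max_len, dtype='float32')
vec_y = pad_sequences(vec_y, maxlen=max_len, dtype='float32')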
> Hi Parunach,
> You are trying to process a sequence with a dense layer. Because of that, you get a dimension mismatch. It should work if you set return_sequences=False in the LSTM.
Thank you so much, your approach helped! But my question here is: what if we need to keep return_sequences=True?
I have an attention LSTM autoencoder. Before I added attention it was working fine, but when I added the attention it raised this error.
I'm not sure how I can fix it, because I need to keep return_sequences=True.
This is the link to my question, https://stackoverflow.com/questions/55637807/exception-input-0-is-incompatible-with-layer-dense-1-expected-ndim-2-found-nd, which nobody has answered.
Thanks
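For what it's worth, one standard way to keep return_sequences=True and still apply a dense layer is to wrap it in TimeDistributed, which applies the same Dense weights at every timestep (a generic sketch with placeholder shapes, not the actual autoencoder from the question):
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

model = Sequential()
model.add(LSTM(10, input_shape=(10, 1), return_sequences=True))  # (?, 10, 10)
model.add(TimeDistributed(Dense(1)))                             # (?, 10, 1)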