I have used the same imdb_lstm.py for my own data set (23 classes and 41 features); my program is given below, and the error is shown in the attached image. How can I correct this?
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print('Train...')
print(X_train.shape)
print(y_train.shape)
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=15,
          validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

The issue is dead obvious. The shape of your Y is (batch_size, 24) (it should actually be 23; make sure your labels are zero-indexed), whereas the output_shape of your model is (None, 1).
Your model is a binary classifier; it can only work with 2 classes. Since you have 23 classes, the output_dim of the final Dense layer should be 23, not 1, followed by a softmax activation instead of sigmoid. The output of your model will then be a probability distribution over the 23 classes.
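For illustration, a minimal sketch of that change, keeping the rest of the original script (the only assumptions are that the labels are integer class IDs in 0–22 and that this replaces the Dense/Activation/compile lines above):

from keras.utils.np_utils import to_categorical

# one-hot encode the integer labels: shape (n_samples,) -> (n_samples, 23)
y_train = to_categorical(y_train, 23)
y_test = to_categorical(y_test, 23)

model.add(Dense(23))              # one output unit per class
model.add(Activation('softmax')) # probability distribution over the 23 classes
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])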
Could you please explain this example step by step? Also, is it possible to do it like this:
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(23))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(1))
in order to create two LSTM layers (two memory blocks with two cells)?
Why do you add Dense(1) at the end? You have 23 classes.
And why add an LSTM after a Dense layer? LSTM is an RNN; it works on sequences.
Please go through the docs.
No, it's not about Dense(1) vs. Dense(23). I want to create the following network topologies: one memory block with one cell, two memory blocks with two cells, three memory blocks with three cells. Also, could you please explain the above program, because I am a beginner at this.
Whatever your model may be, its input shape and output shape must be compatible with your training data. The final Dense layer should have output_dim=23, not the middle layers.
Is this what you are trying to do:
model.add(LSTM(128, return_sequences=True, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(TimeDistributed(Dense(100)))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(23))
What is TimeDistributed(Dense(100))?
Also, how do I find the training and test accuracy?
Please check out the docs. The TimeDistributed wrapper runs a layer over a sequence of vectors; in other words, it applies the wrapped layer to every timestep in a time series.
I strongly recommend that you get your deep learning basics right and read the Keras docs thoroughly.
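For illustration, here is a minimal sketch of what TimeDistributed(Dense(100)) does; the input shapes are made up for the example:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# hypothetical input: sequences of 10 timesteps with 8 features each
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(10, 8)))  # -> (None, 10, 128)
model.add(TimeDistributed(Dense(100)))  # same Dense applied at every timestep -> (None, 10, 100)
model.summary()

As for training and test accuracy: model.fit returns a History object whose history dict holds the per-epoch training accuracy (and validation accuracy when validation_data is passed, typically under the 'acc' and 'val_acc' keys), and model.evaluate returns the test loss and accuracy exactly as in the first snippet in this thread.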
IMDB dataset explanation
In the IMDB dataset, user reviews are classified as positive or negative.
First, it converts words to IDs (assigning a unique ID to every word in the dataset), so a sentence is represented as a list of IDs (integers).
Then we embed that list, representing each ID as a vector (the embedding learns relations between IDs and is itself learned during training), so instead of a list of IDs we get a list of vectors.
Then we apply an LSTM network to classify these sequences.
Is that right?
Yes.
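For reference, this is roughly how the stock imdb_lstm.py example sets that pipeline up (nb_words is the Keras 1.x argument name; newer versions call it num_words):

from keras.preprocessing import sequence
from keras.datasets import imdb

max_features = 20000  # vocabulary size: keep only the 20k most frequent word IDs
maxlen = 80           # pad/truncate every review to a fixed length

# reviews arrive already converted to lists of integer word IDs
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)  # -> (n_samples, maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
# the Embedding layer then maps each ID to a learned vector,
# and the LSTM classifies the resulting sequence of vectors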
I had a similar question. My program is below (labelNum is 48):
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Convolution1D(nb_filter=nb_filter,
                        filter_length=filter_length,
                        border_mode='valid',
                        activation='relu',
                        subsample_length=1))
model.add(MaxPooling1D(pool_length=pool_length))
model.add(LSTM(lstm_output_size))
model.add(Dense(labelNum))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
And the problem is:
Exception: Error when checking model target: expected activation_1 to have shape (None, 48) but got array with shape (100, 1)
Thank you!
You haven't posted the actual code that causes the error, since this error arises from the fit call. But to save you time: you haven't converted your labels to categorical (one-hot) data.
Yes, that's the problem, thank you!
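For anyone hitting the same error, a tiny sketch of that conversion (the label values here are made up):

import numpy as np
from keras.utils.np_utils import to_categorical

labels = np.array([0, 3, 47])          # integer class IDs in 0..labelNum-1
one_hot = to_categorical(labels, 48)   # -> shape (3, 48), a single 1 per row
print(one_hot.shape)                   # now matches the Dense(labelNum) + softmax output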
@vinayakumarr You must use fit_generator() instead of fit().
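It's not obvious that fit_generator is actually required here, but for reference this is what the call looks like with a hypothetical batch generator, using the Keras 1.x signature seen elsewhere in this thread (Keras 2 renamed the arguments to steps_per_epoch and epochs):

def batch_generator(X, y, batch_size):
    # hypothetical generator: yields (inputs, targets) batches indefinitely
    while True:
        for i in range(0, len(X), batch_size):
            yield X[i:i + batch_size], y[i:i + batch_size]

model.fit_generator(batch_generator(X_train, y_train, batch_size),
                    samples_per_epoch=len(X_train),
                    nb_epoch=15)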
Hi guys! Thanks to anyone who can answer the following question.
model = Sequential()
model.add(LSTM(units=self.lstm_output_dim,
               input_dim=input_dim,
               activation=self.activation_lstm,
               input_length=66,
               dropout=self.drop_out,
               return_sequences=True))
for i in range(self.lstm_layer - 2):
    model.add(LSTM(units=self.lstm_output_dim,
                   activation=self.activation_lstm,
                   dropout=self.drop_out,
                   return_sequences=True))
model.add(LSTM(64, activation=self.activation_lstm, dropout=self.drop_out))
for i in range(self.dense_layer - 1):
    model.add(Flatten())
    model.add(Dense(units=self.lstm_output_dim, activation=self.activation_dense, use_bias=True))
    model.add(Dropout(self.drop_out))
model.add(Flatten())
model.add(Dense(2))
model.compile(loss=self.loss, optimizer=self.optimizer, metrics=['accuracy'])
X_train = np.reshape(np.array(trainX), (1, len(trainX), len(trainX[0])))
Y_train = np.reshape(np.array(trainY), (1, len(trainY), len(trainY[0])))
y_binary = to_categorical(Y_train)
history = model.fit(x=X_train, y=y_binary, epochs=self.epochs, batch_size=self.batch_size, validation_data=(testX, testY))
ValueError: A target array with shape (1, 66, 2) was passed for an output of shape (None, 2) while using as loss categorical_crossentropy. This loss expects targets to have the same shape as the output.
The input data trainX is 66*53 and trainY is 66*1, but the error above appeared. Could you tell me why? Thanks!