I have used the same imdb_lstm.py for my own data set (23 classes and 41 features); my program is given below, and the error is shown in the attached image. How can I correct this?
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen, dropout=0.2))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print('Train...')
print(X_train.shape)
print(y_train.shape)
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=15,
          validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

The issue is dead obvious. The shape of your Y is (batch_size, 24) (it should actually be 23; make sure your labels are zero-indexed), whereas the output_shape of your model is (None, 1).
Your model is a binary classifier; it can only work with 2 classes. Since you have 23 classes, the output_dim of the final Dense layer should be 23, not 1, followed by a softmax activation instead of sigmoid. The output of your model will then be a probability distribution over the 23 classes.
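For illustration, a minimal sketch of that change, keeping the rest of the original script (the only assumptions are that the labels are integer class IDs in 0–22 and that this replaces the Dense/Activation/compile lines above):

from keras.utils.np_utils import to_categorical

# one-hot encode the integer labels: shape (n_samples,) -> (n_samples, 23)
y_train = to_categorical(y_train, 23)
y_test = to_categorical(y_test, 23)

model.add(Dense(23))              # one output unit per class
model.add(Activation('softmax')) # probability distribution over the 23 classes
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])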
Could you please explain this example step by step? Also, is it possible to do it like this:
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(23))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(1))
in order to create two LSTM layers (two memory blocks with two cells)?
Why do you add Dense(1) at the end? You have 23 classes.
And why add an LSTM after a Dense layer? LSTM is an RNN; it works on sequences.
Please go through the docs.
No, it's not about Dense(1) vs. Dense(23). I want to create the following network topologies: one memory block with one cell, two memory blocks with two cells, three memory blocks with three cells. Also, could you please explain the above program, because I am a beginner at this.
Whatever your model may be, its input shape and output shape must be compatible with your training data. The final Dense layer should have output_dim=23, not the middle layers.
Is this what you are trying to do:
model.add(LSTM(128, return_sequences=True, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(TimeDistributed(Dense(100)))
model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2)) # try using a GRU instead, for fun
model.add(Dense(23))
What is TimeDistributed(Dense(100))?
Also, how do I find the training and test accuracy?
Please check out the docs. The TimeDistributed wrapper runs a layer over a sequence of vectors; in other words, it applies the wrapped layer to every timestep in a time series.
I strongly recommend that you get your deep learning basics right and read the Keras docs thoroughly.
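For illustration, here is a minimal sketch of what TimeDistributed(Dense(100)) does; the input shapes are made up for the example:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# hypothetical input: sequences of 10 timesteps with 8 features each
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(10, 8)))  # -> (None, 10, 128)
model.add(TimeDistributed(Dense(100)))  # same Dense applied at every timestep -> (None, 10, 100)
model.summary()

As for training and test accuracy: model.fit returns a History object whose history dict holds the per-epoch training accuracy (and validation accuracy when validation_data is passed, typically under the 'acc' and 'val_acc' keys), and model.evaluate returns the test loss and accuracy exactly as in the first snippet in this thread.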
IMDB dataset explanation
In the IMDB dataset, user reviews are classified as positive or negative.
First, it converts words to IDs (assigning a unique ID to every word in the dataset), so a sentence is represented as a list of IDs (integers).
Then we embed that list, representing each ID as a vector (the embedding learns relations between IDs and is itself learned during training), so instead of a list of IDs we get a list of vectors.
Then we apply an LSTM network to classify these sequences.
Is that right?
Yes.
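For reference, this is roughly how the stock imdb_lstm.py example sets that pipeline up (nb_words is the Keras 1.x argument name; newer versions call it num_words):

from keras.preprocessing import sequence
from keras.datasets import imdb

max_features = 20000  # vocabulary size: keep only the 20k most frequent word IDs
maxlen = 80           # pad/truncate every review to a fixed length

# reviews arrive already converted to lists of integer word IDs
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)  # -> (n_samples, maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
# the Embedding layer then maps each ID to a learned vector,
# and the LSTM classifies the resulting sequence of vectors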
I had a similar question. My program is below (labelNum is 48):
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Convolution1D(nb_filter=nb_filter,
                        filter_length=filter_length,
                        border_mode='valid',
                        activation='relu',
                        subsample_length=1))
model.add(MaxPooling1D(pool_length=pool_length))
model.add(LSTM(lstm_output_size))
model.add(Dense(labelNum))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
And the problem is:
Exception: Error when checking model target: expected activation_1 to have shape (None, 48) but got array with shape (100, 1)
Thank you!
You haven't posted the actual code that causes the error, since this error arises from the fit call. But to save you time: you haven't converted your labels to categorical (one-hot) data.
Yes, that's the problem, thank you!
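For anyone hitting the same error, a tiny sketch of that conversion (the label values here are made up):

import numpy as np
from keras.utils.np_utils import to_categorical

labels = np.array([0, 3, 47])          # integer class IDs in 0..labelNum-1
one_hot = to_categorical(labels, 48)   # -> shape (3, 48), a single 1 per row
print(one_hot.shape)                   # now matches the Dense(labelNum) + softmax output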
@vinayakumarr You must use fit_generator() instead of fit().
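It's not obvious that fit_generator is actually required here, but for reference this is what the call looks like with a hypothetical batch generator, using the Keras 1.x signature seen elsewhere in this thread (Keras 2 renamed the arguments to steps_per_epoch and epochs):

def batch_generator(X, y, batch_size):
    # hypothetical generator: yields (inputs, targets) batches indefinitely
    while True:
        for i in range(0, len(X), batch_size):
            yield X[i:i + batch_size], y[i:i + batch_size]

model.fit_generator(batch_generator(X_train, y_train, batch_size),
                    samples_per_epoch=len(X_train),
                    nb_epoch=15)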
Hi guys! Thanks to anyone who can answer the following question.
model = Sequential()
model.add(LSTM(units=self.lstm_output_dim,
               input_dim=input_dim,
               activation=self.activation_lstm,
               input_length=66,
               dropout=self.drop_out,
               return_sequences=True))
for i in range(self.lstm_layer - 2):
    model.add(LSTM(units=self.lstm_output_dim,
                   activation=self.activation_lstm,
                   dropout=self.drop_out,
                   return_sequences=True))
model.add(LSTM(64, activation=self.activation_lstm, dropout=self.drop_out))
for i in range(self.dense_layer - 1):
    model.add(Flatten())
    model.add(Dense(units=self.lstm_output_dim, activation=self.activation_dense, use_bias=True))
    model.add(Dropout(self.drop_out))
model.add(Flatten())
model.add(Dense(2))
model.compile(loss=self.loss, optimizer=self.optimizer, metrics=['accuracy'])
X_train = np.reshape(np.array(trainX), (1, len(trainX), len(trainX[0])))
Y_train = np.reshape(np.array(trainY), (1, len(trainY), len(trainY[0])))
y_binary = to_categorical(Y_train)
history = model.fit(x=X_train, y=y_binary, epochs=self.epochs, batch_size=self.batch_size, validation_data=(testX, testY))
ValueError: A target array with shape (1, 66, 2) was passed for an output of shape (None, 2) while using as loss categorical_crossentropy. This loss expects targets to have the same shape as the output.
The input data trainX is 66*53 and trainY is 66*1, but the error above appeared. Could you tell me why? Thanks!