Keras: LSTM with mean_squared_error doesn't reduce the loss over time

Created on 2 Jul 2015 · 2 comments · Source: keras-team/keras

Hello,

I am trying to use an LSTM on terribly simple data - just a saw-like sequence of two columns of values from 1 to 10. The code that generates the data can be found here (pandas required).

I am trying this very simple model (which should be sufficient for this task):

from keras.models import Sequential
from keras.layers.core import Dense, Activation  # Dense was missing from the original imports
from keras.layers.recurrent import LSTM

model = Sequential()
model.add(LSTM(2, 150, return_sequences=False))
model.add(Dense(150, 2))
model.add(Activation('softmax'))
model.compile(loss="mean_squared_error", optimizer="rmsprop")

The model compiles and everything seems to work OK (dimensions and so on), but during fitting:

model.fit(X_train, y_train, batch_size=450, nb_epoch=40, validation_split=0.05)

I get exactly the same loss value at the end of every epoch.

What is interesting is that this (constant) loss value does not depend on the number of training samples (it's the same for 10,000 and for 50,000). And when I compute the MSE between model.predict(X_test) and y_test, I get exactly the same number.

Any idea what is wrong?

By the way, when I use "categorical_crossentropy", the results seem better (although not scaled correctly, it at least captures the "shape" of the data).

All 2 comments

You should use a cross-entropy loss with softmax. Mean squared error should be reserved for regression tasks, with a linear output.
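To see why the loss appears frozen, note what a softmax layer does to the network's outputs. A minimal pure-Python sketch (no Keras; the input values are made up for illustration):

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Whatever the layers before it compute, a softmax squeezes the two
# outputs onto the probability simplex: each in (0, 1), summing to 1.
out = softmax([3.2, -1.7])
print(out, sum(out))
```

The saw-like targets here range from 1 to 10, so the MSE between a softmax output (bounded in (0, 1)) and those targets is dominated by a constant offset the model can never reduce, which matches the flat loss curve described above.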

Yes. Changing the softmax to a linear activation solved it. Thank you for the answer.
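With an unbounded (linear) output, MSE can actually decrease. A tiny pure-Python sketch of the same idea, fitting y = w*x + b by gradient descent to a stand-in ramp of targets 1..10 (not the poster's actual data or model):

```python
# Identity ramp as a stand-in for one column of the saw-like data.
data = [(float(x), float(x)) for x in range(1, 11)]

def mse(w, b):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

w, b, lr = 0.0, 0.0, 0.01
start = mse(w, b)
for _ in range(2000):
    # Gradients of the mean squared error w.r.t. w and b.
    gw = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    gb = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= lr * gw
    b -= lr * gb
end = mse(w, b)

print(start, end)  # the loss shrinks instead of staying flat
```

Because a linear output can reach any target value, nothing caps the fit the way the softmax did.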
