I have a list of sequential values that I want to feed into an RNN to predict the next value in the sequence.
[ 0.43589744 0.44230769 0.49358974 ..., 0.71153846 0.70833333 0.69230769]
I keep getting an accuracy of 1.0. I found a similar issue for classification, but none of the methods suggested there worked for me.
The loss decreases, but the accuracy is always 1.0.
How can I fix this?
from keras.models import Sequential
from keras.layers.core import Dense
from keras.layers.recurrent import SimpleRNN

model = Sequential()
model.add(SimpleRNN(1, 100))  # Keras 0.x API: (input_dim, output_dim)
model.add(Dense(100, 1, activation="sigmoid"))
model.compile(loss="mean_squared_error", optimizer="sgd")
Epoch 0
1517/1517 [==============================] - 0s - loss: 0.0726 - acc: 1.0000 - val_loss: 0.0636 - val_acc: 1.0000
Epoch 1
1517/1517 [==============================] - 0s - loss: 0.0720 - acc: 1.0000 - val_loss: 0.0629 - val_acc: 1.0000
...
You're working on a time-series regression problem here -- all that acc: 1.0000 means is that when the true value is > 0.5, so is your prediction, and vice versa. Focus on the loss field; that is the MSE you actually care about. When you call .fit(), I recommend setting show_accuracy=False.
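For reference, a minimal sketch of that call (Keras 0.x-era fit signature; x_train and y_train are hypothetical arrays not shown in the thread):

model.fit(x_train, y_train, nb_epoch=10, validation_split=0.1,
          show_accuracy=False)  # suppress the meaningless acc column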
As Luke points out, accuracy is not relevant at all for a regression problem.
But even if your targets were binary labels, since your output is a scalar (instead of a binary categorical vector), you would need to set class_mode='binary' in compile for the accuracy metric to make sense. By default it's set to categorical.
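A sketch of what that would look like if the targets really were 0/1 labels (Keras 0.x compile signature; my own illustration, not the poster's code):

model.compile(loss="binary_crossentropy", optimizer="sgd",
              class_mode="binary")  # score the scalar output as a binary label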
When my model finishes training, how should I measure the accuracy?
sklearn.metrics.accuracy_score, or manually comparing y_tests with model.predict(x_tests)?
Edit: Never mind the accuracy question, I forgot about model.evaluate(...)
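For reference, a minimal sketch of that call (hypothetical x_tests/y_tests arrays; evaluate returns the compiled loss, MSE here):

score = model.evaluate(x_tests, y_tests)  # mean squared error on the test set
print(score)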
Furthermore, am I using RNNs correctly? Is nb_timesteps inconsequential for RNNs or do I need to define it with something like an Embedding layer or the .shape of the x_trains?
Thanks for the quick replies.
The Embedding layer is usually designed for words; google word2vec if you're not familiar with it.
From my experience, I think you're using RNNs correctly.
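As an aside (my own addition, not from the thread): nb_timesteps is carried by the shape of the input itself, since Keras recurrent layers consume 3D arrays shaped (nb_samples, nb_timesteps, input_dim), so no Embedding layer is needed for numeric sequences. A minimal sketch:

import numpy as np
# hypothetical: 1517 windows of 10 timesteps, one feature per step
x_train = np.random.rand(1517, 10, 1)
print(x_train.shape)  # (1517, 10, 1) -- the middle axis is nb_timesteps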
If your loss were 0, it would mean your accuracy is 100%. But again, computing an accuracy does not make sense for a continuous output. If you're predicting whether the time series is going UP or DOWN, you can derive an accuracy, because that is a classification problem.
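A minimal sketch of deriving such a direction accuracy (my own illustration; x_tests and y_tests are hypothetical 1-D test arrays):

import numpy as np
preds = model.predict(x_tests).flatten()
# compare the sign of the step-to-step change, true vs. predicted
up_true = np.diff(y_tests) > 0
up_pred = np.diff(preds) > 0
direction_acc = np.mean(up_true == up_pred)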