Keras: improve LSTM accuracy

Created on 2 Jul 2019 · 7 comments · Source: keras-team/keras

I'm trying to build an LSTM architecture to predict a sickness rate. I'm currently stuck at 40% accuracy. I'm new to machine learning and have tried several tips, like changing the optimizer, the number of layer nodes, and the dropout value, without any improvement. Could you guys help me with some advice?

The x array is composed of 10 columns.

The y array is only one column: the sickness rate.

Here is my model:

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.constraints import NonNeg

def lstm_model():
    model = Sequential()
    model.add(LSTM(10, input_shape=(1, 10), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(100, return_sequences=True))
    model.add(LSTM(100, return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(50, kernel_constraint=NonNeg(),
                    kernel_initializer='normal', activation="relu"))
    model.add(Dense(1, activation="linear"))
    model.compile(optimizer='adam', loss='mean_squared_error',
                  metrics=['accuracy'])
    return model

lstm = lstm_model()
```
This is the output of `.evaluate()`:

```
1275/1275 [==============================] - 1s 526us/sample - loss: 0.0015 - acc: 0.3930
0.0014869439909029204 0.3930161
```
and thank you in advance

All 7 comments

Hi there,

You have a mismatch:

  1. Your model has a linear output activation and a `mean_squared_error` loss. It is not constrained to predict within the interval [0, 1].

  2. Accuracy as a metric assumes probabilities.

If you are trying to do classification, change the loss to crossentropy and the output activation to sigmoid/softmax (binary/categorical). If you are trying to do regression, the accuracy metric is misplaced.

Best

Thanks @briannemsick for your reply. To clarify my problem more: it is a regression one. I'm trying to predict the sickness rate (between 0-100%) using historical information about this disease. So which activation and loss function should I use for the output?

In that case, linear and mean_squared_error are both fine; accuracy is not a valid metric here (this is not a classification problem). Consider using mean_squared_error (the loss function) or mean_absolute_error as a metric, as in the sketch below.
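To make that concrete, here's a minimal sketch of a regression-appropriate `compile()` call; the one-layer architecture is just a placeholder, not the model from this thread:

```python
from keras.models import Sequential
from keras.layers import Dense

# Placeholder regression model: linear output, MSE loss, and a
# regression metric instead of 'accuracy'.
model = Sequential([Dense(1, input_shape=(10,), activation='linear')])
model.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=['mean_absolute_error'])
# For classification one would instead use a sigmoid/softmax output
# with a binary/categorical crossentropy loss.
```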

So if I understood well, accuracy doesn't make any sense for my case. Then, based on your experience, from which value of mean_absolute_error can I say that I have an efficient model with good predictions?

@cabiste007, I think you should have a look at what mean_squared_error actually means and/or how it is computed. In essence, MSE measures the average squared error of our predictions: for every prediction, the squared difference between the predicted value and its target value is computed, and all squared differences are then averaged.
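As a concrete illustration (a NumPy sketch with made-up numbers, not data from this thread):

```python
import numpy as np

y_true = np.array([0.40, 0.55, 0.30])  # hypothetical target sickness rates
y_pred = np.array([0.42, 0.50, 0.35])  # hypothetical model predictions

# MSE: average of the squared differences between predictions and targets
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # ~0.0018
```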

An acceptable MSE value will be different for different datasets. We cannot quantify an 'optimal MSE score'.
To consider an MSE score to be 'good', you must consider the values you are predicting, as well as the distribution of the variables in your original dataset.

As an example, let's consider two mock datasets. I'll only compare these datasets based on their variable ranges (or scale). Both datasets are well distributed.

1) The first dataset has reasonably large values, such as car prices. Cars in this mock dataset all fall somewhere in the range of 1,000 to 80,000 USD.
If our model has an MSE of 50, we could say that the average squared difference between the target values and the predicted values is about 50.
Our model's performance is quite good; we will only predict values that are not far off the truth.

2) The second dataset has smaller values, such as people's heights (in cm). Our variables all fall somewhere in the range of 120 to 250 cm.
If our second model has an MSE of 50 as well, we have to notice that this is quite a large error: because the possible range of our values is smaller, an MSE of 50 suddenly puts the average squared error of our predictions a lot further from the truth than in our first model.
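One way to put the same MSE in context (a sketch using the mock ranges above) is to compare the RMSE, the square root of the MSE, to the range of the target variable:

```python
import numpy as np

mse = 50.0
rmse = np.sqrt(mse)  # ~7.07, back on the scale of the targets

# Car prices span roughly 1,000-80,000 USD: the error is negligible.
print(rmse / (80_000 - 1_000))  # ~0.00009

# Heights span roughly 120-250 cm: the same error is far more significant.
print(rmse / (250 - 120))       # ~0.054
```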

Adding to @RooieRakkert's answer, there are two methods you can use to check whether your model is performing well on a regression task:

  • If you're using a root_mean_squared_error metric, make sure that the training, validation, and testing errors are low and close to each other in magnitude.
  • Use the R^2 (coefficient of determination) metric from the sklearn library, as in the sketch below. The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0. So if your R^2 score is close to 1, it's a good model.
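A minimal sketch of the second method, using scikit-learn's `r2_score` (the arrays are hypothetical):

```python
from sklearn.metrics import r2_score

y_true = [0.40, 0.55, 0.30, 0.72]  # hypothetical observed sickness rates
y_pred = [0.42, 0.50, 0.35, 0.70]  # hypothetical model predictions

# 1.0 is a perfect fit; 0.0 is no better than predicting the mean of y_true
print(r2_score(y_true, y_pred))  # ~0.94
```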

@briannemsick, @RooieRakkert, and @dabasajay, I'm so grateful; thanks to you, everything is clear in my mind now. So I need to change the metric to mean_absolute_error, mean_squared_error, or R^2, run fit(), and wait until my model converges close to 0.0.
So I implemented this architecture:
```python
def lstm_model():
    model = Sequential()
    model.add(LSTM(10, input_shape=(1, 10), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(100, return_sequences=True))
    model.add(LSTM(100, return_sequences=False))
    model.add(Dense(1, activation="linear"))
    model.compile(optimizer='adam', loss='mean_squared_error',
                  metrics=['mean_squared_error'])
    return model

lstm = lstm_model()
newlstmhis = lstm.fit(xtrr, ytr, epochs=1000,
                      validation_data=(xtstt, ytst),
                      verbose=2, shuffle=True)
```

but my model didn't converge, even though this ANN converges and gives me mse=0.8, which is reasonable for my case (my values are between 0% and 100%):
```python
import numpy
numpy.random.seed(8)

# Note: the original signature took an unused `rate` argument but was
# called without one; also, despite the name, there is no Dropout layer.
def build_dropout_model2():
    model = Sequential()
    model.add(Dense(1000, input_shape=(10,), activation="relu"))
    model.add(Dense(500, activation="relu"))
    model.add(Dense(1))
    model.summary()
    model.compile(loss="mean_squared_error", optimizer="adam",
                  metrics=["mean_squared_error"])
    return model

model2 = build_dropout_model2()
```

Based on your experience, what should I change in the LSTM model to make a good prediction? Thank you.
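Not part of the original thread, but rather than training for a fixed 1000 epochs and hoping for convergence, one common option is Keras's `EarlyStopping` callback; a sketch reusing the variable names from the code above:

```python
from keras.callbacks import EarlyStopping

# Stop when the validation loss hasn't improved for 20 epochs,
# and keep the best weights seen so far.
early_stop = EarlyStopping(monitor='val_loss', patience=20,
                           restore_best_weights=True)

newlstmhis = lstm.fit(xtrr, ytr, epochs=1000,
                      validation_data=(xtstt, ytst),
                      callbacks=[early_stop],
                      verbose=2, shuffle=True)
```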
