Keras: Different accuracy score between keras.model.evaluate and sklearn.accuracy_score

Created on 15 Mar 2018 · 3 Comments · Source: keras-team/keras

I have a similar problem with this Kaggle tutorial: https://www.kaggle.com/eliotbarr/text-mining-with-sklearn-keras-mlp-lstm-cnn, so I will refer to it.

If you look at code blocks 30 and 31:

print('Train...')
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=1,
          validation_data=(X_test, Y_test))
score, acc = model.evaluate(X_test, Y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

and

print("Generating test predictions...")
preds = model.predict_classes(X_test, verbose=0)
print('prediction 8 accuracy: ', accuracy_score(test['Rating'], preds+1))

I would expect the two accuracy scores to be the same, but they are different. How is that possible?
One accuracy is computed by model.evaluate and the other by accuracy_score (sklearn).

vinhqdang

Most helpful comment

Thanks for your reply.
After getting the predictions, I calculated the accuracy the same way Keras does:
accuracy1 = K.mean(K.equal(K.argmax(y_true, axis=-1), K.argmax(y_pred, axis=-1)))
accuracy1 and the scikit-learn accuracy are the same,
but neither matches the accuracy reported by Keras evaluate.

ramesh720 on 16 Oct 2018
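
The calculation in the comment above can be reproduced without a Keras backend. The following is a minimal sketch in plain Python, assuming one-hot targets and per-class probabilities. The helper names argmax and categorical_accuracy and the sample values are illustrative and not taken from the thread.

```python
def argmax(row):
    """Return the index of the largest value in a sequence."""
    return max(range(len(row)), key=row.__getitem__)


def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    matches = [argmax(t) == argmax(p) for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


# Made-up one-hot targets and predicted probabilities: 4 samples, 3 classes.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]

print(categorical_accuracy(y_true, y_pred))  # 3 of the 4 samples match
```

This is the same quantity that sklearn's accuracy_score returns when it is given class indices, which is consistent with the two numbers agreeing in the comment.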

All 3 comments

Furthermore, I see a huge difference between val_acc during training and the accuracy computed in the final step, even though I use the same test set for validation during training and for the final evaluation.

model.fit(X_train, Y_train, batch_size=batch_size, epochs=nb_epoch,
          validation_data=(X_test, Y_test))

Here is my training process:

Epoch 16/20
16256/16256 [==============================] - 3s - loss: 0.1011 - acc: 0.9604 - val_loss: 0.1191 - val_acc: 0.9545
Epoch 17/20
16256/16256 [==============================] - 4s - loss: 0.0986 - acc: 0.9615 - val_loss: 0.1224 - val_acc: 0.9536
Epoch 18/20
16256/16256 [==============================] - 3s - loss: 0.0965 - acc: 0.9622 - val_loss: 0.1197 - val_acc: 0.9550
Epoch 19/20
16256/16256 [==============================] - 3s - loss: 0.0946 - acc: 0.9631 - val_loss: 0.1213 - val_acc: 0.9542
Epoch 20/20
16256/16256 [==============================] - 3s - loss: 0.0929 - acc: 0.9634 - val_loss: 0.1288 - val_acc: 0.9519

The val_acc reaches 0.95.

But when I do:

preds = model.predict_classes(X_test, verbose=0)
print('prediction 8 accuracy: ', accuracy_score(test['Rating'], preds+1))

The accuracy score is only 0.68.

Note that Y_test is just the one-hot encoding of y_test, so both hold the same labels:

Y_test = np_utils.to_categorical(y_test, nb_classes)

vinhqdang on 15 Mar 2018
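
To illustrate the relationship between y_test and Y_test, here is a small sketch in plain Python. The one_hot helper is a hypothetical stand-in for np_utils.to_categorical, and the assumption that the ratings run from 1 to nb_classes (which would explain the preds+1 shift) is a guess rather than something the thread states.

```python
def one_hot(labels, nb_classes):
    """Hypothetical stand-in for to_categorical: 0-based labels -> one-hot rows."""
    return [[1 if k == label else 0 for k in range(nb_classes)] for label in labels]


nb_classes = 5
ratings = [1, 5, 3, 3, 2]          # assumed 1-based labels, like test['Rating']
y_test = [r - 1 for r in ratings]  # 0-based class indices
Y_test = one_hot(y_test, nb_classes)

# The position of the single 1 in each one-hot row is the 0-based class index.
recovered = [row.index(1) for row in Y_test]

# Adding 1 maps the class indices back onto the assumed rating scale.
print([c + 1 for c in recovered] == ratings)
```

If the rows of test['Rating'] were not in the same order as X_test, or the labels had been shifted differently, the comparison inside accuracy_score would be misaligned. Running this round trip on the real data is a cheap first check.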

The two accuracy measures are computed differently.

sklearn accuracy is pretty straightforward:
y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]
y_equals = [1, 0, 0, 1]
sklearn accuracy = 0.5, which is the fraction of predictions that exactly match the true labels.
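
The example above can be checked in a few lines of plain Python. The second half is an assumption rather than something stated in the thread: it shows how an element-wise comparison over one-hot vectors, which is how a binary-style accuracy would score the data, yields a higher number than the per-sample accuracy on the same predictions. The function names are illustrative.

```python
def sample_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def elementwise_accuracy(y_true, y_pred, nb_classes):
    """Fraction of matching entries after one-hot encoding both label lists."""
    equal = 0
    for t, p in zip(y_true, y_pred):
        for k in range(nb_classes):
            # Entry k of the one-hot row is 1 exactly when the label equals k.
            equal += int((t == k) == (p == k))
    return equal / (len(y_true) * nb_classes)


y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]

print(sample_accuracy(y_true, y_pred))          # 0.5
print(elementwise_accuracy(y_true, y_pred, 4))  # 0.75
```

Whether this applies to the model in question depends on the loss and metric it was compiled with, which the thread does not show.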