Flair: Micro average accuracy for multiclass classification

Created on 16 May 2019 · 3 Comments · Source: flairNLP/flair

Hi,
I am aware that accuracy is computed without taking true negatives (tn) into account, as per issue #483. However, the following is the output produced for a 3-class sentiment classification task (N|P|NEU; every example is predicted and assigned exactly one class):

Testing using best model ...
loading file flair-exps/eu/models100/best-model.pt
MICRO_AVG: acc 0.5221 - f1-score 0.6861
MACRO_AVG: acc 0.5197 - f1-score 0.6837
N          tp: 203 - fp: 108 - fn: 101 - tn: 757 - precision: 0.6527 - recall: 0.6678 - accuracy: 0.4927 - f1-score: 0.6602
NEU        tp: 300 - fp: 129 - fn: 147 - tn: 593 - precision: 0.6993 - recall: 0.6711 - accuracy: 0.5208 - f1-score: 0.6849
P          tp: 299 - fp: 130 - fn: 119 - tn: 621 - precision: 0.6970 - recall: 0.7153 - accuracy: 0.5456 - f1-score: 0.7060

It seems to me that the micro-averaged accuracy is not computed correctly in this case. I would expect the MICRO_AVG acc to be equal to the MICRO_AVG f1-score. In fact, if we compute accuracy (correct predictions / total predictions) with the above numbers, we get (203+300+299)/1169 = 0.6861.
Shouldn't it be MICRO_AVG: acc 0.6861 - f1-score 0.6861 instead of MICRO_AVG: acc 0.5221 - f1-score 0.6861?
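
To make the arithmetic easy to check, here is a small standalone sketch that recomputes both values from the per-class counts in the log. It does not use flair; the variable names are my own, and `reported_acc` only assumes the tp / (tp + fp + fn) formula described in this issue.

```python
# Per-class counts copied from the test log above: (tp, fp, fn, tn).
counts = {
    "N": (203, 108, 101, 757),
    "NEU": (300, 129, 147, 593),
    "P": (299, 130, 119, 621),
}

tp = sum(c[0] for c in counts.values())
fp = sum(c[1] for c in counts.values())
fn = sum(c[2] for c in counts.values())

# For single-label data, tp + fp + fn + tn of any one class is the test set size.
n_samples = sum(counts["N"])

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

reported_acc = tp / (tp + fp + fn)  # formula without tn, as in the log
expected_acc = tp / n_samples       # correct predictions / total predictions

print(f"micro f1     {micro_f1:.4f}")
print(f"reported acc {reported_acc:.4f}")
print(f"expected acc {expected_acc:.4f}")
```

With these counts, the summed fp and fn are equal, so micro precision, micro recall and micro f1 all coincide with correct / total.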

I think the problem is in training_utils.py#L120: when calling self.accuracy(None), the total number of predictions is computed as the sum of tps, fps and fns (here 802+367+367 = 1536), which is not the actual number of samples in the test set (1169).
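
A toy illustration of why that sum overshoots, assuming single-label predictions: every misclassified example is counted once as a false positive (for the predicted class) and once as a false negative (for the gold class). The helper `per_class_counts` and the four-example data are hypothetical and are not taken from flair.

```python
from collections import Counter

def per_class_counts(gold, pred):
    """Return tp, fp and fn counters for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the wrongly predicted class gains a false positive
            fn[g] += 1  # the gold class gains a false negative
    return tp, fp, fn

# Hypothetical data: 4 examples, 1 of them misclassified.
gold = ["N", "P", "NEU", "P"]
pred = ["N", "NEU", "NEU", "P"]
tp, fp, fn = per_class_counts(gold, pred)

n_samples = len(gold)
n_correct = sum(tp.values())
n_errors = n_samples - n_correct
denominator = sum(tp.values()) + sum(fp.values()) + sum(fn.values())

acc_over_counts = n_correct / denominator  # errors are counted twice
acc_over_samples = n_correct / n_samples   # correct / total

print(denominator, n_samples + n_errors)
print(acc_over_counts, acc_over_samples)
```

The same relation holds for the numbers in the log: 1536 = 1169 + 367, i.e. the test set size plus the number of misclassified examples.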

Again, I'm in a multiclass single-label text classification scenario. I haven't tested other tasks.

In any case, thanks for the great work!

bug wontfix

isanvicente

👍1

All 3 comments

Hello @isanvicente thanks for reporting this - we'll take a closer look!

alanakbik on 17 May 2019

👍2

I found a similar problem. The accuracy calculation in metrics.py should be (tp+tn)/(tp+tn+fp+fn).
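
For reference, a quick sketch of what that formula gives per class on the counts from the original report. The function name `class_accuracy` is made up for illustration and is not the metrics.py implementation.

```python
def class_accuracy(tp, fp, fn, tn):
    """Accuracy including true negatives: (tp + tn) / (tp + tn + fp + fn)."""
    return (tp + tn) / (tp + tn + fp + fn)

# Per-class counts from the log in the original report: (tp, fp, fn, tn).
counts = {
    "N": (203, 108, 101, 757),
    "NEU": (300, 129, 147, 593),
    "P": (299, 130, 119, 621),
}

per_class = {label: class_accuracy(*c) for label, c in counts.items()}
for label, acc in per_class.items():
    print(f"{label:<4} accuracy: {acc:.4f}")
```

Note that these are one-vs-rest accuracies for each class, so they come out higher than the overall correct / total figure of 0.6861 discussed above; the two are not the same quantity.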

hjian42 on 19 Jul 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.