Would you please add some code to the documentation that shows how to calculate the F1 score for BIO test and train files?
I want to compare the datasets prepared by flair with a model I trained myself, so I need to feed test data to the model and get the F1 score, like here
You can call the tagger evaluate method on a list of sentences from the corpus to get these results.
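from flair.datasets import ColumnCorpus
from flair.models import SequenceTagger

# Placeholder: replace with the folder that holds your BIO-formatted files
path_to_BIO_corpus = 'path/to/corpus'
# Column 0 holds the token, column 1 holds the BIO NER tag
corpus: ColumnCorpus = ColumnCorpus(path_to_BIO_corpus, column_format={0: 'text', 1: 'ner'})
# Load the pre-trained NER tagger (or pass the path to your own trained model)
tagger: SequenceTagger = SequenceTagger.load('ner')
# Evaluate on the test split and print the per-class metrics
result, _ = tagger.evaluate(corpus.test)
print(result.detailed_results)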
Output will look like:
MICRO_AVG: acc 0.6259 - f1-score 0.7699
MACRO_AVG: acc 0.5408 - f1-score 0.6944
LOC tp: 3 - fp: 2 - fn: 2 - tn: 3 - precision: 0.6000 - recall: 0.6000 - accuracy: 0.4286 - f1-score: 0.6000
MISC tp: 6 - fp: 4 - fn: 4 - tn: 6 - precision: 0.6000 - recall: 0.6000 - accuracy: 0.4286 - f1-score: 0.6000
ORG tp: 30 - fp: 7 - fn: 13 - tn: 30 - precision: 0.8108 - recall: 0.6977 - accuracy: 0.6000 - f1-score: 0.7500
PER tp: 48 - fp: 13 - fn: 7 - tn: 48 - precision: 0.7869 - recall: 0.8727 - accuracy: 0.7059 - f1-score: 0.8276
This works for any corpus or tagger. Simply change the path or name of the model/corpus.
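For reference, the numbers in the output above can be reproduced from the tp/fp/fn counts alone. The sketch below is plain Python and does not use flair; the helper name prf is made up for illustration, and the accuracy formula tp / (tp + fp + fn) is an assumption inferred from the printed values, not taken from the flair source.

```python
def prf(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, F1 and accuracy from raw counts (hypothetical helper)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Assumption: accuracy is tp / (tp + fp + fn), which matches the printed output
    accuracy = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# (tp, fp, fn) per class, copied from the example output
counts = {"LOC": (3, 2, 2), "MISC": (6, 4, 4), "ORG": (30, 7, 13), "PER": (48, 13, 7)}
per_class = {label: prf(*c) for label, c in counts.items()}

# Micro average: sum the counts over all classes, then compute the metrics once
micro = prf(*(sum(column) for column in zip(*counts.values())))

# Macro average: compute F1 per class, then take the unweighted mean
macro_f1 = sum(m["f1"] for m in per_class.values()) / len(per_class)

print(f"MICRO f1 {micro['f1']:.4f}  MACRO f1 {macro_f1:.4f}")
# prints: MICRO f1 0.7699  MACRO f1 0.6944
```

The micro average is dominated by the frequent classes (ORG and PER here), while the macro average weights every class equally, which is why the two values differ.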
@CamielK: Thank you for the reply.
Would you please tell me whether a: b means a, b? (For example, tagger: SequenceTagger.) I haven't seen this syntax in Python.
variable: type = value
It lets you attach a type hint to the variable (available since Python 3.6). It is not required, and Python does not enforce it at runtime, but it can be useful for readability and for editors or type checkers.
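A small sketch of the same syntax outside flair may help. The variable and function names below are made up for illustration only.

```python
from typing import List

# Annotated variables: "name: type = value"
count: int = 3
labels: List[str] = ["LOC", "ORG", "PER"]


# The same idea applies to function parameters and return values
def describe(label: str, score: float) -> str:
    return f"{label}: {score:.2f}"


# Hints are not enforced at runtime, so this assignment runs without error
count = "three"

print(describe("PER", 0.8276))
# The hints are stored on the function and can be inspected
print(describe.__annotations__)
```

Without the annotations the code behaves identically; tagger = SequenceTagger.load('ner') works just as well as the annotated form.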
@CamielK thanks for answering this and preparing the example!
A small correction:
Instead of
result, _ = tagger.evaluate(corpus.test)
we need
result, _ = tagger.evaluate([corpus.test])
I have a problem. I get the following error:
RuntimeError: The expanded size of the tensor (300) must match the existing size (0) at non-singleton dimension 1. Target sizes: [9, 300]. Tensor sizes: [9, 0]