Describe the bug
The metrics reported after training always show precision=1.0, for every class and every average.
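A quick sanity check on why this looks wrong. Assuming each sentence has exactly one gold label and the classifier predicts exactly one label, every error counts as one false positive and one false negative, so micro-averaged precision, recall, and accuracy should coincide. The sketch below is plain Python on made-up labels, not flair's evaluation code, and `micro_precision_recall` is a hypothetical helper written only for this illustration.

```python
from collections import Counter


def micro_precision_recall(gold, predicted):
    """Micro-averaged precision and recall for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[g] += 1  # gold class g was missed
    total_tp = sum(tp.values())
    precision = total_tp / (total_tp + sum(fp.values()))
    recall = total_tp / (total_tp + sum(fn.values()))
    return precision, recall


# Toy labels, made up for illustration (not TREC_6 data).
gold = ["DESC", "ENTY", "HUM", "NUM", "LOC", "ENTY"]
predicted = ["DESC", "DESC", "HUM", "NUM", "LOC", "ENTY"]

precision, recall = micro_precision_recall(gold, predicted)
accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
print(precision, recall, accuracy)  # all three values are equal
```

Under that assumption, a micro-averaged precision of 1.0000 next to a micro-averaged recall below 1.0 is not a plausible result.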
To Reproduce
Training code:
```python
from torch.optim.adam import Adam
from flair.data import Corpus
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer
# 1. get the corpus
corpus: Corpus = TREC_6()
# 2. create the label dictionary
label_dict = corpus.make_label_dictionary()
# 3. initialize transformer document embeddings (many models are available)
document_embeddings = TransformerDocumentEmbeddings('distilbert-base-uncased', fine_tune=True)
# 4. create the text classifier
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict)
# 5. initialize the text classifier trainer with Adam optimizer
trainer = ModelTrainer(classifier, corpus, optimizer=Adam)
# 6. start the training
trainer.train('/tmp/taggers/trec',
              learning_rate=3e-5,  # use very small learning rate
              mini_batch_size=16,
              mini_batch_chunk_size=4,  # optionally set this if transformer is too much for your machine
              max_epochs=5,  # terminate after 5 epochs
              )
```
Example of produced report:
```text
2020-07-09 09:50:21,395 Testing using best model ...
2020-07-09 09:50:21,395 loading file /tmp/taggers/trec/best-model.pt
2020-07-09 09:50:27,486 0.964
2020-07-09 09:50:27,487
Results:
- F-score (micro) 0.9823
- F-score (macro) 0.9745
- Accuracy 0.964
By class:
precision recall f1-score support
DESC 1.0000 0.9931 0.9965 145
ENTY 1.0000 0.8750 0.9333 96
ABBR 1.0000 0.8889 0.9412 9
HUM 1.0000 0.9851 0.9925 67
NUM 1.0000 0.9915 0.9957 117
LOC 1.0000 0.9762 0.9880 84
micro avg 1.0000 0.9653 0.9823 518
macro avg 1.0000 0.9516 0.9745 518
weighted avg 1.0000 0.9653 0.9818 518
samples avg 1.0000 0.9820 0.9880 518
2020-07-09 09:50:27,487 ----------------------------------------------------------------------------------------------------
```
Expected behavior
Reports correct metrics.
Screenshots
N/A
Environment (please complete the following information):
Additional context
Same problem with other datasets.
Thanks for reporting this! This seems to be an error in the evaluation routine that occurs if no label_type is passed to the model. Can you run the above code with the following change?
# 4. create the text classifier
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict, label_type='question_type')
I will put in a PR shortly that fixes this if no label type is passed.
(Edit: label_type instead of label_name)
This does not seem to solve the problem.
Here is what I tested, following the suggestion:
from torch.optim.adam import Adam
from flair.data import Corpus
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer
# 1. get the corpus
corpus: Corpus = TREC_6()
# 2. create the label dictionary
label_dict = corpus.make_label_dictionary()
# 3. initialize transformer document embeddings (many models are available)
document_embeddings = TransformerDocumentEmbeddings('distilbert-base-uncased', fine_tune=True)
# 4. create the text classifier
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict, label_type='question_type')
# 5. initialize the text classifier trainer with Adam optimizer
trainer = ModelTrainer(classifier, corpus, optimizer=Adam)
# 6. start the training
trainer.train('/tmp/taggers/trec',
              learning_rate=3e-5,  # use very small learning rate
              mini_batch_size=16,
              mini_batch_chunk_size=4,  # optionally set this if transformer is too much for your machine
              max_epochs=5,  # terminate after 5 epochs
              )
Results:
2020-07-09 11:45:16,849 ----------------------------------------------------------------------------------------------------
2020-07-09 11:45:16,850 Testing using best model ...
2020-07-09 11:45:16,850 loading file /tmp/taggers/trec/best-model.pt
2020-07-09 11:45:22,217 0.964
2020-07-09 11:45:22,218
Results:
- F-score (micro) 0.9823
- F-score (macro) 0.9845
- Accuracy 0.964
By class:
precision recall f1-score support
ENTY 1.0000 0.8947 0.9444 95
DESC 1.0000 0.9653 0.9823 144
ABBR 1.0000 1.0000 1.0000 10
HUM 1.0000 0.9851 0.9925 67
NUM 1.0000 1.0000 1.0000 120
LOC 1.0000 0.9756 0.9877 82
micro avg 1.0000 0.9653 0.9823 518
macro avg 1.0000 0.9701 0.9845 518
weighted avg 1.0000 0.9653 0.9820 518
samples avg 1.0000 0.9820 0.9880 518
2020-07-09 11:45:22,218 ----------------------------------------------------------------------------------------------------
Argh, you're right. I just pushed a PR that I believe fixes this. Could you try installing from master?
pip install --upgrade git+https://github.com/flairNLP/flair.git
After installing from master, the results look better:
2020-07-09 12:46:46,615 ----------------------------------------------------------------------------------------------------
2020-07-09 12:46:46,615 Testing using best model ...
2020-07-09 12:46:46,616 loading file /tmp/taggers/trec/best-model.pt
2020-07-09 12:46:51,939 0.97
2020-07-09 12:46:51,939
Results:
- F-score (micro) 0.97
- F-score (macro) 0.9665
- Accuracy 0.97
By class:
precision recall f1-score support
DESC 0.9384 0.9928 0.9648 138
ENTY 0.9882 0.8936 0.9385 94
ABBR 1.0000 0.8889 0.9412 9
HUM 0.9846 0.9846 0.9846 65
NUM 0.9739 0.9912 0.9825 113
LOC 0.9877 0.9877 0.9877 81
micro avg 0.9700 0.9700 0.9700 500
macro avg 0.9788 0.9564 0.9665 500
weighted avg 0.9709 0.9700 0.9697 500
samples avg 0.9700 0.9700 0.9700 500
2020-07-09 12:46:51,939 ----------------------------------------------------------------------------------------------------
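As a cross-check that this report is internally consistent, the averaged precision rows can be recomputed from the per-class rows. This is a small sketch using only numbers copied from the report above. It assumes the usual definitions: macro average is the unweighted mean over classes, and weighted average is the mean weighted by support.

```python
# Per-class (precision, support) copied from the report produced on master.
per_class = {
    "DESC": (0.9384, 138),
    "ENTY": (0.9882, 94),
    "ABBR": (1.0000, 9),
    "HUM": (0.9846, 65),
    "NUM": (0.9739, 113),
    "LOC": (0.9877, 81),
}

precisions = [p for p, _ in per_class.values()]
supports = [s for _, s in per_class.values()]

# Macro average: every class counts equally.
macro = sum(precisions) / len(precisions)
# Weighted average: every class counts in proportion to its support.
weighted = sum(p * s for p, s in per_class.values()) / sum(supports)

print(f"macro avg precision:    {macro:.4f}")     # 0.9788
print(f"weighted avg precision: {weighted:.4f}")  # 0.9709
```

Both values match the `macro avg` and `weighted avg` rows of the report, and the supports add up to 500.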
I have also encountered this problem:
By class:
precision recall f1-score support
4 1.0000 0.4470 0.6179 1322
1 1.0000 0.5561 0.7148 3064
3 1.0000 0.5726 0.7282 2513
2 1.0000 0.3269 0.4927 1661
5 1.0000 0.7080 0.8290 2325
0 1.0000 0.8719 0.9316 4677
micro avg 1.0000 0.6427 0.7825 15562
macro avg 1.0000 0.5804 0.7190 15562
weighted avg 1.0000 0.6427 0.7672 15562
samples avg 1.0000 0.7220 0.8147 15562
I tried running:
pip install --upgrade git+https://github.com/flairNLP/flair.git
but I get this error:
ERROR: Command errored out with exit status 128: git clone -q https://github.com/flairNLP/flair.git /tmp/pip-req-build-f1vp5jsw Check the logs for full command output.
Can you help me, @alanakbik?
Try doing a fresh pip install flair. We just released a new flair version so you don't have to install from master.
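To confirm which version ended up installed after the upgrade, something like the following can help. It is a minimal sketch using only the standard library (Python 3.8+), and `installed_version` is a hypothetical helper name. This thread does not say which release contains the fix, so the snippet only prints whatever is installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("flair:", installed_version("flair"))
```

If this prints `None`, the package is not installed in the environment the script runs in, which usually means pip installed it into a different interpreter.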
Okay, that works, thank you