Describe the bug
I am trying to build a sentence classifier and measure its F1 score, but training crashes when I use F1Measure in the model class shown below. I also tried setting "validation_metrics": "f1" for the trainer in the configuration, but it seems to have no effect. (What does that option mean?)
I looked for a tutorial on using metrics in AllenNLP but could not find one.
To Reproduce
I defined a model class as follows, and trained it on a dataset:
import torch
from allennlp.models import Model
from allennlp.training.metrics import CategoricalAccuracy, F1Measure

@Model.register("sentence_classifier")
class SentenceClassifier(Model):
    def __init__(self, vocab, model_text_field_embedder, internal_text_encoder, classifier_feedforward,
                 initializer, regularizer):
        super(SentenceClassifier, self).__init__(vocab, regularizer)
        self.model_text_field_embedder = model_text_field_embedder
        self.num_class = self.vocab.get_vocab_size("labels")
        self.internal_text_encoder = internal_text_encoder
        self.classifier_feedforward = classifier_feedforward
        # Metrics tracked during training and validation
        self.metrics = {
            "accuracy": CategoricalAccuracy(),
            "f1": F1Measure(positive_label=1)
        }
        self.loss = torch.nn.CrossEntropyLoss()
        initializer(self)
Run command: python3 run.py train experiments/newsgroups_with_cuda.json --include-package newsgroups.dataset_readers --include-package newsgroups.models -s serialization_gpu
Error:
  File "run.py", line 15, in <module>
    main(prog="python run.py")
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 70, in main
    args.func(args)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/train.py", line 102, in train_model_from_args
    args.recover)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/train.py", line 132, in train_model_from_file
    return train_model(params, serialization_dir, file_friendly_logging, recover)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/train.py", line 320, in train_model
    metrics = trainer.train()
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 720, in train
    train_metrics = self._train_epoch(epoch)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 517, in _train_epoch
    description = self._description_from_metrics(metrics)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 812, in _description_from_metrics
    metrics.items() if not name.startswith("_")]) + " ||"
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 812, in <listcomp>
    metrics.items() if not name.startswith("_")]) + " ||"
TypeError: must be real number, not tuple
I investigated a little, and at the time of the error the metrics variable contains the following dict:
{
    "f1": (0.0, 0.0, 0.0),
    "loss": 4.58939790725708
}
So the crash occurs because metrics["f1"] is a tuple, which cannot be formatted as a number inside the ', '.join(...) expression in Trainer._description_from_metrics().
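The failure can be reproduced outside the trainer. This is a minimal sketch, assuming the progress description formats each metric with a `%.4f` float specifier, as the error message suggests. The helper `describe` is hypothetical and is not AllenNLP's actual code.

```python
def describe(metrics):
    # Hypothetical stand-in for the trainer's description builder:
    # formats every metric whose name does not start with "_" as a float.
    return ", ".join("%s: %.4f" % (name, value)
                     for name, value in metrics.items()
                     if not name.startswith("_")) + " ||"


# Scalar metrics format without trouble.
scalar_metrics = {"accuracy": 0.5, "loss": 4.58939790725708}
print(describe(scalar_metrics))

# A tuple-valued metric cannot be rendered by a float specifier.
tuple_metrics = {"f1": (0.0, 0.0, 0.0), "loss": 4.58939790725708}
try:
    describe(tuple_metrics)
    raised = False
except TypeError as err:
    raised = True
    print("TypeError:", err)
```

Under that assumption, any metric that is not a single number triggers the same error, so each value handed to the trainer has to be a scalar.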
Expected behavior
The trainer should report accuracy and F1 score.
In your Model you must have defined a get_metrics function, and I assume it just does something like
    return {name: metric.get_metric() for name, metric in self.metrics.items()}
If you want to use F1 (which, as you noticed, returns a tuple), you'll have to explicitly pull the value out of the tuple inside get_metrics, something like
    return {
        # F1Measure's get_metric returns (precision, recall, f1)
        "f1": self.metrics["f1"].get_metric(reset=reset)[2],
        "accuracy": self.metrics["accuracy"].get_metric(reset=reset)
    }
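The same idea can be sketched without AllenNLP installed. `TupleF1` below is a hypothetical stand-in that, like the F1 metric described above, returns `(precision, recall, f1)`, and `get_metrics` is an illustrative helper, not library code. It unpacks the tuple once and reports each component under its own scalar key.

```python
class TupleF1:
    """Hypothetical stand-in for a metric that returns (precision, recall, f1)."""

    def __init__(self, positive_label=1):
        self.positive_label = positive_label
        self.tp = self.fp = self.fn = 0

    def __call__(self, predictions, gold):
        # Accumulate true positives, false positives and false negatives.
        for p, g in zip(predictions, gold):
            if p == self.positive_label and g == self.positive_label:
                self.tp += 1
            elif p == self.positive_label:
                self.fp += 1
            elif g == self.positive_label:
                self.fn += 1

    def get_metric(self, reset=False):
        predicted = self.tp + self.fp
        actual = self.tp + self.fn
        precision = self.tp / predicted if predicted else 0.0
        recall = self.tp / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        if reset:
            self.tp = self.fp = self.fn = 0
        return precision, recall, f1


def get_metrics(f1_metric, reset=False):
    # Call get_metric exactly once so that reset=True cannot clear the
    # counts before all three components have been read.
    precision, recall, f1 = f1_metric.get_metric(reset=reset)
    return {"precision": precision, "recall": recall, "f1": f1}


metric = TupleF1(positive_label=1)
metric([1, 1, 0, 0], [1, 0, 1, 0])
scalars = get_metrics(metric, reset=True)
print(scalars)
```

Unpacking a single call matters when `reset=True`: indexing three separate `get_metric(reset=True)` calls would return zeros for the second and third components in this sketch.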