Allennlp: How to use F1 score? Error when using F1Measure.

Created on 4 Oct 2018 · 1 Comment · Source: allenai/allennlp

Describe the bug
I am trying to create a sentence classifier and measure F1 scores, but I get an error when I use F1Measure in my model class, as shown below. I also tried setting "validation_metrics": "f1" for the trainer in the configuration, but it seems to have no effect. (What does this option mean?)
I looked for a tutorial online about using metrics in AllenNLP but could not find one.

To Reproduce
I defined a model class as follows, and trained it on a dataset:

import torch
from allennlp.models import Model
from allennlp.training.metrics import CategoricalAccuracy, F1Measure


@Model.register("sentence_classifier")
class SentenceClassifier(Model):
    def __init__(self, vocab, model_text_field_embedder, internal_text_encoder, classifier_feedforward,
                 initializer, regularizer):
        super(SentenceClassifier, self).__init__(vocab, regularizer)
        self.model_text_field_embedder = model_text_field_embedder
        self.num_class = self.vocab.get_vocab_size("labels")
        self.internal_text_encoder = internal_text_encoder
        self.classifier_feedforward = classifier_feedforward
        self.metrics = {
            "accuracy": CategoricalAccuracy(),
            # get_metric() on F1Measure returns a (precision, recall, f1) tuple
            "f1": F1Measure(positive_label=1)
        }
        self.loss = torch.nn.CrossEntropyLoss()
        initializer(self)

Run command: python3 run.py train experiments/newsgroups_with_cuda.json --include-package newsgroups.dataset_readers --include-package newsgroups.models -s serialization_gpu

Error:

  File "run.py", line 15, in <module>
    main(prog="python run.py")
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 70, in main
    args.func(args)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/train.py", line 102, in train_model_from_args
    args.recover)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/train.py", line 132, in train_model_from_file
    return train_model(params, serialization_dir, file_friendly_logging, recover)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/commands/train.py", line 320, in train_model
    metrics = trainer.train()
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 720, in train
    train_metrics = self._train_epoch(epoch)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 517, in _train_epoch
    description = self._description_from_metrics(metrics)
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 812, in _description_from_metrics
    metrics.items() if not name.startswith("_")]) + " ||"
  File "/home/ducbui/miniconda3/lib/python3.6/site-packages/allennlp/training/trainer.py", line 812, in <listcomp>
    metrics.items() if not name.startswith("_")]) + " ||"
TypeError: must be real number, not tuple

I investigated a little, and at the time of the error the metrics variable contains the following dict:

{ 
"f1": (0.0, 0.0, 0.0),
"loss": 4.58939790725708
}

So the reason for the crash is that metrics["f1"] is a tuple, which cannot be formatted as a real number inside the ', '.join(...) expression in Trainer._description_from_metrics().
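
The failure can be reproduced without AllenNLP. The sketch below assumes the trainer formats each metric value with a float conversion such as `%.4f`; the traceback only shows the tail of that expression, so the exact format string is an assumption, and `describe` is a hypothetical stand-in for `Trainer._description_from_metrics()`.

```python
def describe(metrics):
    # Hypothetical stand-in for Trainer._description_from_metrics():
    # every value is assumed to be formatted with a float conversion.
    return ", ".join(["%s: %.4f" % (name, value)
                      for name, value in metrics.items()
                      if not name.startswith("_")]) + " ||"


# Scalar metrics format without trouble.
print(describe({"accuracy": 0.5, "loss": 4.58939790725708}))

# A tuple-valued metric, like the "f1" entry above, raises TypeError.
try:
    describe({"f1": (0.0, 0.0, 0.0), "loss": 4.58939790725708})
except TypeError as err:
    print("TypeError:", err)
```

Any metric whose value is not a single number would fail the same way, so the values returned to the trainer need to be scalars.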

Expected behavior
The trainer should report accuracy and F1 score.

System (please complete the following information):

OS: Linux
Python version: 3.6.6
AllenNLP version: v0.6.1
PyTorch version: 0.4.1

Source

ducalpha

Most helpful comment

In your Model you must have defined a get_metrics function, and I assume it just does something like:

return {name: metric.get_metric() for name, metric in self.metrics.items()}

If you want to use F1 (which, as you noticed, returns a tuple), you'll have to explicitly pull the value out of the tuple inside get_metrics, something like:

return {
    # f1 get_metric returns (precision, recall, f1)
    "f1": self.metrics["f1"].get_metric(reset=reset)[2],
    "accuracy": self.metrics["accuracy"].get_metric(reset=reset)
}

joelgrus on 4 Oct 2018

👍5
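
A variation on the answer above is to unpack the whole tuple so that precision and recall are reported alongside F1. This is a minimal sketch, not the library's code: `FakeF1`, `FakeAccuracy`, and `SentenceClassifierMetrics` are hypothetical stand-ins that only mimic the return shapes described in this thread, so the example runs without AllenNLP installed.

```python
class FakeF1:
    """Hypothetical stand-in for F1Measure: get_metric returns a tuple."""

    def get_metric(self, reset=False):
        precision, recall = 0.75, 0.6  # made-up values for illustration
        f1 = 2 * precision * recall / (precision + recall)
        return (precision, recall, f1)


class FakeAccuracy:
    """Hypothetical stand-in for CategoricalAccuracy: returns a float."""

    def get_metric(self, reset=False):
        return 0.8  # made-up value for illustration


class SentenceClassifierMetrics:
    """Only the metric-related part of the model, for illustration."""

    def __init__(self):
        self.metrics = {"accuracy": FakeAccuracy(), "f1": FakeF1()}

    def get_metrics(self, reset=False):
        # Unpack the tuple so that every reported value is a plain number.
        precision, recall, f1 = self.metrics["f1"].get_metric(reset=reset)
        return {
            "accuracy": self.metrics["accuracy"].get_metric(reset=reset),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }


metrics = SentenceClassifierMetrics().get_metrics(reset=True)
print(", ".join("%s: %.4f" % item for item in metrics.items()))
```

On the configuration question: as far as I know, the trainer option is spelled `validation_metric` (singular) and takes a metric name prefixed with `+` or `-`, for example `"+f1"`, to say whether higher or lower is better. It selects which entry of the dict returned by get_metrics drives early stopping and best-model selection; it does not add a metric by itself. This is worth checking against the Trainer documentation for the installed version.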
