Ignite: Saving double execution cost during training

Created on 2 Jun 2020 · 3 Comments · Source: pytorch/ignite

Hi there,

Thanks for the great library; it has already proven quite useful for me. In line with #1059, I've noticed that all the examples presume that the training loader is iterated over twice per epoch: once for training and once more to compute the training metrics. During prototyping, and whenever inference is expensive, I would prefer to reuse the predictions already computed during training for the metric computations. I've experimented with using the trainer's output_transform (called on every iteration) as the place to store intermediate results in global variables

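    # global buffer holding the predictions of the current epoch
    ypred_epoch = []

    # called on every iteration
    def store_training_items(x, y, y_pred, loss):
        # store in global variable; detach first, because y_pred still requires grad
        ypred_epoch.extend(y_pred.detach().cpu().numpy().tolist())
        return loss.item()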

    trainer = create_supervised_trainer(model, optimizer, criterion,
                                        device=device, output_transform=store_training_items)

and then computing the metrics in a function decorated with @trainer.on(Events.EPOCH_COMPLETED). While this works, it's not very elegant and also does not integrate with existing logging solutions out of the box. Do you have any suggestions on what a better way might look like? I'd especially appreciate if logging could be easily integrated. Thank you in advance!

question

CreateRandom

Most helpful comment

Thanks for the detailed and fast suggestions, this works like a charm! I was able to attach metrics directly to the trainer and to compute them online by means of the output_transform @vfdev-5 included.

    # trainer, npt_logger and OutputHandler are defined earlier in my script
    metrics = {'accuracy': Accuracy(), 'p': Precision()}
    # attach metrics to the trainer
    for name, metric in metrics.items():
        metric.attach(trainer, name)

    # attach any logger directly to the trainer (logs results after each epoch)
    npt_logger.attach(trainer,
                      log_handler=OutputHandler(tag="training",
                                                metric_names='all'),
                      event_name=Events.EPOCH_COMPLETED)

Attaching the logger directly to the trainer rather than the evaluator then allowed me to log the metrics easily. Really appreciate it.

CreateRandom on 3 Jun 2020

👍2

All 3 comments

@CreateRandom thank you for your comments 😊 Very happy to read this.

First, your approach is a clever use of what ignite allows! However, it can't work with a huge dataset, because every prediction of the epoch is kept in memory.

I think you should try to attach the metrics to the trainer rather than to an evaluator. Using output_transform, you can return loss, y_pred, etc. and let the metrics update in a single pass.

At the moment I can't provide code, but if necessary I can write an example tomorrow!

HTH

sdesrozis on 2 Jun 2020

Yes, @CreateRandom, thanks for the great feedback!

Please correct me if I misunderstood your point: you would like to compute training metrics during the training phase, while the model is continuously being updated.

You can do this with the engine from your snippet (personally, I would define my own training step, but it is almost the same). For example, let's compute Accuracy during training:


import torch
import torch.nn as nn
from torch.optim import SGD
from torchvision.models import resnet18
from ignite.engine import create_supervised_trainer, Events
from ignite.metrics import Accuracy

device = "cuda"
model = resnet18(num_classes=10).to(device)
optimizer = SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()


def custom_output_transform(x, y, y_pred, loss):
    return {
        "y": y,
        "y_pred": y_pred,
        "loss": loss.item()
    }

trainer = create_supervised_trainer(model, optimizer, criterion, device, output_transform=custom_output_transform)
# Attach metric to compute:
accuracy = Accuracy()
accuracy.attach(trainer, "train_acc")

num_iters = 10
bs = 4
data = [(torch.rand(bs, 3, 32, 32), torch.randint(0, 10, size=(bs, ))) for _ in range(num_iters)]


@trainer.on(Events.ITERATION_COMPLETED)
def log_progress():
    print(".", end=" ")


@trainer.on(Events.EPOCH_COMPLETED)
def log_training_metric():
    print(trainer.state.epoch, trainer.state.metrics, "num samples in accuracy metric:", accuracy._num_examples)

trainer.run(data, max_epochs=5)

and this gives

. . . . . . . . . . 1 {'train_acc': 0.075} num samples in accuracy metric: 40
. . . . . . . . . . 2 {'train_acc': 0.225} num samples in accuracy metric: 40
. . . . . . . . . . 3 {'train_acc': 0.3} num samples in accuracy metric: 40
. . . . . . . . . . 4 {'train_acc': 0.675} num samples in accuracy metric: 40
. . . . . . . . . . 5 {'train_acc': 0.975} num samples in accuracy metric: 40
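The sample count of 40 per epoch is simply 10 iterations times a batch size of 4: an attached metric is reset when an epoch starts and updated once per iteration, so only running counters are kept. A toy pure-Python sketch of that reset/update/compute cycle follows; the `RunningAccuracy` class is hypothetical and is not ignite's `Accuracy` implementation.

```python
class RunningAccuracy:
    """Toy one-pass accuracy (hypothetical, for illustration only)."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Called at the start of every epoch.
        self.num_correct = 0
        self.num_examples = 0

    def update(self, y_pred, y):
        # Called once per iteration with that batch's predictions and targets.
        self.num_correct += sum(int(p == t) for p, t in zip(y_pred, y))
        self.num_examples += len(y)

    def compute(self):
        # Called at the end of the epoch.
        if self.num_examples == 0:
            raise ValueError("update() must be called before compute()")
        return self.num_correct / self.num_examples


metric = RunningAccuracy()
results = []
for epoch in range(2):
    metric.reset()
    for _ in range(10):  # 10 iterations per epoch
        # batch of 4 predictions, 3 of which match the targets
        metric.update([0, 1, 2, 3], [0, 1, 2, 0])
    results.append((metric.compute(), metric.num_examples))
print(results)
```

Because the model changes between iterations, a value accumulated this way averages over predictions made with different weights, unlike a separate evaluation pass after the epoch.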

If you need to accumulate predictions and targets over the whole epoch, this can be done with EpochMetric.

HTH

PS: @sdesrozis is very fast!

vfdev-5 on 2 Jun 2020

😄1
