The version from the master branch does not handle a dict result from `validation_step`. With a function like this:
```python
def validation_step(self, batch, batch_idx):
    x, y = batch
    y_pred = self.model(x)
    return {'val_loss': self.criterion(y_pred, y.view(-1, 1))}
```
I get this error:

```
RuntimeError: Early stopping conditioned on metric `val_loss` which is not available. Available metrics are:
```
If I change the function to return the loss directly:
```python
def validation_step(self, batch, batch_idx):
    x, y = batch
    y_pred = self.model(x)
    return self.criterion(y_pred, y.view(-1, 1))
```
then early stopping seems to start working.
But in both cases the `ModelCheckpoint` callback does not work and shows a warning:

```
RuntimeWarning: Can save best model only with val_loss available, skipping.
```
Hi! Thanks for your contribution, and great first issue!
This is not a bug.
Solution: implement `validation_end` and return a dict containing `'val_loss'` there.
`EarlyStopping` and `ModelCheckpoint` do not monitor metrics returned by `validation_step`. They need the error over the whole validation set, which is obtained by collecting the per-batch metrics from `validation_step` and aggregating them in `validation_end`.
https://pytorch-lightning.readthedocs.io/en/0.6.0/lightning-module.html
See also this basic example:
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/basic_examples/lightning_module_template.py
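As a rough sketch of the aggregation described above (plain floats instead of tensors so it runs standalone; in a real `LightningModule` these are methods and you would reduce the losses with something like `torch.stack(...).mean()`):

```python
# Sketch only: mimics how Lightning feeds the list of per-batch dicts
# returned by validation_step into validation_end at the end of the
# validation loop. Not a full LightningModule.

def validation_step(batch_loss):
    # each call returns one per-batch dict, as in the snippet above
    return {'val_loss': batch_loss}

def validation_end(outputs):
    # `outputs` is the list of all validation_step results for the epoch;
    # reduce them to one scalar under the key the callbacks monitor
    avg_loss = sum(o['val_loss'] for o in outputs) / len(outputs)
    return {'val_loss': avg_loss}

outputs = [validation_step(loss) for loss in (1.0, 2.0, 3.0)]
print(validation_end(outputs))  # {'val_loss': 2.0}
```

It is this final dict, not the per-batch ones, that `EarlyStopping` and `ModelCheckpoint` look at.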
@awaelchli Thank you for the explanation.
Can you please also help me understand the logger? Will it automatically log all metrics from `training_end` and `validation_end`, or do I need to log them manually in these functions?
See here: https://pytorch-lightning.readthedocs.io/en/latest/experiment_reporting.html
In the output dict, add another entry called `"log"`: a dict with all the metrics you would like to log. Example:
```python
def validation_end(self, outputs):
    loss = some_loss()
    ...
    logs = {'val_loss': loss}
    output = {
        'val_loss': loss,  # for early stopping, model checkpoint
        'log': logs,       # will be consumed by the logger
    }
    return output
```
@Borda could you remove the bug label?
@awaelchli Thank you for the help.
I am closing the issue as this is really not a bug.