Pytorch-lightning: Val_loss not available

Created on 7 Oct 2019 · 2 Comments · Source: PyTorchLightning/pytorch-lightning

Describe the bug
When I train my network, whose validation steps are defined much like the doc example:

    def validation_step(self, batch, batch_nb):
        x = torch.squeeze(batch['x'], dim=0).float()
        y = torch.squeeze(batch['y'], dim=0).long()

        output = self.forward(x)
        return {'batch_val_loss': self.loss(output, y),
                'batch_val_acc': accuracy(output, y)}

    def validation_end(self, outputs):
        avg_loss = torch.stack([x['batch_val_loss'] for x in outputs]).mean()
        avg_acc = torch.stack([x['batch_val_acc'] for x in outputs]).mean()

        return {'val_loss': avg_loss, 'val_acc': avg_acc}

with my custom EarlyStopping callback:

    early_stop_callback = EarlyStopping(monitor='val_loss', patience=5)

    tt_logger = TestTubeLogger(
        save_dir=log_dir,
        name="default",
        debug=False,
        create_git_tag=False
    )

    trainer = Trainer(logger=tt_logger,
                      row_log_interval=10,
                      checkpoint_callback=checkpoint_callback,
                      early_stop_callback=early_stop_callback,
                      gradient_clip_val=0.5,
                      gpus=gpus,
                      check_val_every_n_epoch=1,
                      max_nb_epochs=99999,
                      train_percent_check=train_frac,
                      log_save_interval=100)

the program cannot see my validation metrics:

Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,epoch,batch_nb,v_nb <class 'RuntimeWarning'>

This did not happen in a previous release, which I ran on Windows (I am now on macOS). However, that earlier version did not include TestTubeLogger.

Desktop (please complete the following information):

OS: macOS
Version: latest

bug / fix

Menion93

All 2 comments

The early-stopping metrics come from the “progress_bar” entry of the returned dict.

So add a “progress_bar” key and put val_loss in there.

I’ll update the docs.

williamFalcon on 7 Oct 2019

Thank you, this fixed my issue:


    def validation_end(self, outputs):
        avg_loss = torch.stack([x['batch_val_loss'] for x in outputs]).mean()
        avg_acc = torch.stack([x['batch_val_acc'] for x in outputs]).mean()

        return {
            'val_loss': avg_loss,
            'val_acc': avg_acc,
            'progress_bar': {'val_loss': avg_loss, 'val_acc': avg_acc},
        }

Menion93 on 7 Oct 2019
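
For readers who want to see why the extra key matters, here is a minimal sketch of the behaviour described in this thread. It is not the library's actual code: `collect_monitorable_metrics` and `check_monitor` are hypothetical helpers, and plain floats stand in for the tensors. The only assumption taken from the thread is that early stopping looks for its monitored metric inside the "progress_bar" entry and ignores top-level keys.

```python
import warnings


def collect_monitorable_metrics(step_output):
    """Hypothetical stand-in: keep only what is nested under 'progress_bar'."""
    return dict(step_output.get('progress_bar', {}))


def check_monitor(monitor, step_output):
    """Return the monitored value, or warn and return None if it is missing."""
    metrics = collect_monitorable_metrics(step_output)
    if monitor not in metrics:
        warnings.warn(
            f"Early stopping conditioned on metric `{monitor}` which is not "
            f"available. Available metrics are: {','.join(metrics)}",
            RuntimeWarning,
        )
        return None
    return metrics[monitor]


# Return value as in the original report: val_loss only at the top level.
before = {'val_loss': 0.42, 'val_acc': 0.9}

# Return value after the fix: val_loss repeated under 'progress_bar'.
after = {
    'val_loss': 0.42,
    'val_acc': 0.9,
    'progress_bar': {'val_loss': 0.42, 'val_acc': 0.9},
}

with warnings.catch_warnings():
    warnings.simplefilter('ignore', RuntimeWarning)
    missing = check_monitor('val_loss', before)  # metric is not visible

found = check_monitor('val_loss', after)  # metric is visible
```

Under that assumption, the first dict leaves the monitor with nothing to read, while the second exposes `val_loss` where the check looks for it.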

