Pytorch-lightning: RuntimeError: No `loss` value in the dictionary returned from `model.training_step()` with pytorch lightning

Created on 18 Aug 2020 · 6 Comments · Source: PyTorchLightning/pytorch-lightning

❓ Questions and Help

What is your question?

I am trying to train on a custom dataset using PyTorch Lightning, but training fails with the following error.
The input is an array of shape (n, m). Can anyone tell me what I am doing wrong?

Traceback (most recent call last):
  File "TestNet.py", line 285, in <module>
    trainer.fit(model)
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1003, in fit
    results = self.single_gpu_train(model)
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/distrib_parts.py", line 186, in single_gpu_train
    results = self.run_pretrain_routine(model)
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1213, in run_pretrain_routine
    self.train()
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/training_loop.py", line 370, in train
    self.run_training_epoch()
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/training_loop.py", line 452, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx)
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/training_loop.py", line 632, in run_training_batch
    self.hiddens
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/training_loop.py", line 783, in optimizer_closure
    training_step_output = self.process_output(training_step_output, train=True)
  File "/path/to/pytorch/lib64/python3.6/site-packages/pytorch_lightning/trainer/logging.py", line 159, in process_output
    'No `loss` value in the dictionary returned from `model.training_step()`.'
RuntimeError: No `loss` value in the dictionary returned from `model.training_step()`.
Exception ignored in: <object repr() failed>
Traceback (most recent call last):
  File "/path/to/pytorch/lib64/python3.6/site-packages/tqdm/std.py", line 1086, in __del__
  File "/path/to/pytorch/lib64/python3.6/site-packages/tqdm/std.py", line 1293, in close
  File "/path/to/pytorch/lib64/python3.6/site-packages/tqdm/std.py", line 1471, in display
  File "/path/to/pytorch/lib64/python3.6/site-packages/tqdm/std.py", line 1089, in __repr__
  File "/path/to/pytorch/lib64/python3.6/site-packages/tqdm/std.py", line 1433, in format_dict
TypeError: 'NoneType' object is not iterable

Code

My training_step and the other methods are defined as follows:

def training_step(self, train_batch, batch_idx):
    x, y = train_batch
    logits = self.forward(x)
    loss = F.l1_loss(logits, y)
    tensorboard_logs = {'train_loss': loss}
    return {'train_loss': loss, 'log': tensorboard_logs}

def validation_step(self, val_batch, batch_idx):
    x, y = val_batch
    logits = self.forward(x)
    loss = F.l1_loss(logits, y)
    return {'val_loss': loss}

def validation_epoch_end(self, outputs):
    # called at the end of the validation epoch
    # outputs is an array with what you returned in validation_step for each batch
    # outputs = [{'loss': batch_0_loss}, {'loss': batch_1_loss}, ..., {'loss': batch_n_loss}] 
    avg_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
    tensorboard_logs = {'val_loss': avg_loss}
    return {'avg_val_loss': avg_loss, 'log': tensorboard_logs}

def test_step(self, test_batch, batch_nb):
    x, y = test_batch
    logits = self.forward(x)
    loss = F.l1_loss(logits, y)
    correct = torch.sum(logits == y.data)

    # I want to visualize my predictions vs my actuals so here I'm going to 
    # add these lines to extract the data for plotting later on
    predictions_pred.append(logits)
    predictions_actual.append(y.data)
    return {'test_loss': loss, 'test_correct': correct, 'logits': logits}

def test_epoch_end(self, outputs):
    # called at the end of the test epoch
    # outputs is an array with what you returned in test_step for each batch
    # outputs = [{'loss': batch_0_loss}, {'loss': batch_1_loss}, ..., {'loss': batch_n_loss}] 
    avg_loss = torch.stack([x['test_loss'] for x in outputs]).mean()
    logs = {'test_loss': avg_loss}      
    return {'avg_test_loss': avg_loss, 'log': logs, 'progress_bar': logs }    

def train_dataloader(self):
    train_dataset = TensorDataset(torch.tensor(new_x_train).float(), torch.tensor(new_y_train).float())
    train_loader = DataLoader(dataset = train_dataset, batch_size = 32)
    return train_loader

def val_dataloader(self):
    val_dataset = TensorDataset(torch.tensor(new_x_val).float(), torch.tensor(new_y_val).float())
    val_loader = DataLoader(dataset = val_dataset, batch_size = 32)
    return val_loader

def test_dataloader(self):
    test_dataset = TensorDataset(torch.tensor(new_x_test).float(), torch.tensor(new_y_test).float())
    test_loader = DataLoader(dataset = test_dataset, batch_size = 32)
    return test_loader

def configure_optimizers(self):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    return optimizer

The custom dataset is defined outside the LightningModule, with new_x_train and new_y_train as the inputs and labels for the training set. The validation and test sets are named analogously.

What's your environment?

OS: Linux
Packaging: pip

question

GuptaVishu2002

Most helpful comment

should be:

def training_step(self, train_batch, batch_idx):
    x, y = train_batch
    logits = self.forward(x)
    loss = F.l1_loss(logits, y)
    tensorboard_logs = {'train_loss': loss}
    return {'loss': loss, 'log': tensorboard_logs}

The loss must be returned under the key 'loss' in training_step; that key tells Lightning which value to minimize.
https://pytorch-lightning.readthedocs.io/en/stable/api/pytorch_lightning.core.html#pytorch_lightning.core.LightningModule.training_step
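
To make the failure mode concrete, here is a minimal sketch of the kind of check that produces this error. It is a simplified stand-in, not Lightning's actual source: the function name `extract_loss` is hypothetical, and plain floats are used in place of loss tensors so the snippet runs without PyTorch installed.

```python
# Simplified, hypothetical stand-in for the check applied to the value
# returned by training_step(); this is not the library's real source.
def extract_loss(step_output):
    """Return the value to minimize from a training_step-style dict."""
    if "loss" not in step_output:
        raise RuntimeError(
            "No `loss` value in the dictionary returned from "
            "`model.training_step()`."
        )
    return step_output["loss"]


# Plain floats stand in for loss tensors here.
broken = {"train_loss": 0.25, "log": {"train_loss": 0.25}}  # key used in the question
fixed = {"loss": 0.25, "log": {"train_loss": 0.25}}         # key used in the answer

try:
    extract_loss(broken)
except RuntimeError as err:
    error_message = str(err)  # same wording as the error in the traceback

loss_value = extract_loss(fixed)  # succeeds because the 'loss' key is present
```

The name under 'log' can stay 'train_loss'; only the top-level key has to be 'loss'.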

rohitgr7 on 18 Aug 2020

👍2

All 6 comments

Hi! Thanks for your contribution! Great first issue!

github-actions[bot] on 18 Aug 2020

@GuptaVishu2002 does the solution fix your issue?

ananyahjha93 on 19 Aug 2020

Yes! It fixed my issue! Thank You so much!

GuptaVishu2002 on 19 Aug 2020

When I use the same code with cross_entropy loss, it works fine, but I get an error with mse_loss. Why is this happening?
return {'train_loss': loss, 'log': tensorboard_logs} is fine for cross_entropy but raises the error for mse_loss.

However, if I change train_loss to loss, it works fine. Why?

mostafiz67 on 19 Oct 2020

However, if I changed the train_loss to loss then it is working fine. Why?

Because loss is a special key that tells Lightning which value to optimize. You need to return a dictionary that contains a value under the loss key, or just return the loss tensor itself. For logging, check the new API.
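
The two accepted return shapes described above can be sketched as follows. This is only an illustration under stated assumptions: `resolve_loss` is a hypothetical helper, not a Lightning function, and floats again stand in for loss tensors.

```python
from collections.abc import Mapping


def resolve_loss(step_output):
    """Hypothetical helper: split a training_step-style return value
    into (loss, logs) for the two shapes described in the thread."""
    if isinstance(step_output, Mapping):
        # Dictionary form: the value to optimize must sit under 'loss'.
        loss = step_output["loss"]
        logs = dict(step_output.get("log", {}))
        return loss, logs
    # Bare form: the returned value is itself the loss, with nothing to log.
    return step_output, {}


# Floats stand in for loss tensors.
dict_form = resolve_loss({"loss": 0.5, "log": {"train_loss": 0.5}})
bare_form = resolve_loss(0.5)
```

With the bare form there is no 'log' entry to carry metrics, which is presumably why the reply points to the newer logging API (in later Lightning releases, a `self.log(...)` call inside training_step).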

rohitgr7 on 19 Oct 2020
