pytorch-lightning: build from master
Traceback (most recent call last):
File "main.py", line 140, in <module>
main(hparams)
File "main.py", line 72, in main
trainer.fit(model)
File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 881, in fit
self.ddp_train(task, model)
File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/distrib_data_parallel.py", line 539, in ddp_train
self.run_pretrain_routine(model)
File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1091, in run_pretrain_routine
self.train()
File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 376, in train
self.run_training_epoch()
File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 510, in run_training_epoch
self.run_training_epoch_end(epoch_output)
File "/mnt/lustre/maxiao1/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 535, in run_training_epoch_end
epoch_output = model.training_epoch_end(epoch_output)
File "/mnt/lustre/maxiao1/PVM/models/baseline.py", line 335, in training_epoch_end
avg_loss = torch.stack([x['loss'] for x in outputs]).mean()
File "/mnt/lustre/maxiao1/PVM/models/baseline.py", line 335, in <listcomp>
avg_loss = torch.stack([x['loss'] for x in outputs]).mean()
KeyError: 'loss'
This is my code:
def training_step(self, batch, batch_idx):
    ...
    return {'loss': loss, 'train_acc': acc}

def training_epoch_end(self, outputs):
    avg_loss = torch.stack([x['loss'] for x in outputs]).mean()
    avg_acc = torch.stack([x['train_acc'] for x in outputs]).mean()
    logs = {'loss': avg_loss, 'train_acc': avg_acc}
    progress_bar = {'train_loss': avg_loss, 'train_acc': avg_acc}
    results = {
        'log': logs,
        'progress_bar': progress_bar
    }
    return results
Try: avg_loss = torch.stack([x['batch_loss'] for x in outputs]).mean()
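Since the key name emitted in `outputs` evidently differs between versions ('loss' vs 'batch_loss'), a defensive sketch like the following can tolerate either one. `extract_loss` is a hypothetical helper, not part of Lightning's API:

```python
def extract_loss(step_output):
    """Return the loss value from a training_step output dict.

    Tries 'loss' first, then 'batch_loss', since the key name
    has differed across pytorch-lightning versions.
    """
    for key in ('loss', 'batch_loss'):
        if key in step_output:
            return step_output[key]
    raise KeyError(f"no loss key found among {list(step_output)}")
```

You could then write `avg_loss = torch.stack([extract_loss(x) for x in outputs]).mean()` and it would keep working across both behaviours.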
Thanks, it works
but the 'train_acc' key doesn't exist either, and neither does 'batch_train_acc'. How can I access the other keys returned from training_step?
As of now in lightning you can access them using x['callback_metrics']['loss'] and x['callback_metrics']['train_acc'], but I think it should be handled in a similar way we do this with validation_epoch_end and test_epoch_end.
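To illustrate the structure being described (plain dicts standing in for the real Lightning outputs; the exact nesting under 'callback_metrics' is an assumption based on the comment above, not verified against the library):

```python
# Mock of what each entry in `outputs` reportedly looks like on the
# affected versions: user-returned metrics are nested under
# 'callback_metrics' rather than sitting at the top level.
outputs = [
    {'batch_loss': 0.9, 'callback_metrics': {'loss': 0.9, 'train_acc': 0.5}},
    {'batch_loss': 0.7, 'callback_metrics': {'loss': 0.7, 'train_acc': 0.6}},
]

# Averaging then requires reaching one level deeper:
avg_loss = sum(x['callback_metrics']['loss'] for x in outputs) / len(outputs)
avg_acc = sum(x['callback_metrics']['train_acc'] for x in outputs) / len(outputs)
```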
Hi! One hint: for me it works with "loss" under Windows but not under Ubuntu.
Weird!! Why would this be platform dependent?? :thinking:
@Pet222, are you sure the versions on Ubuntu and Windows are the same?
Hey @williamFalcon is this intended behaviour? I was surprised to see this breaking change being introduced with no warning.
If it is intended, why not have behaviour consistent with validation_epoch_end and test_epoch_end?
If it is not intended, as it seems due to the "bug fix" tag, are you working on it or should I make a PR for this?
what is the behavior? that the "loss" key is not in training_epoch_end? If so, that's a bug because it should be there
@williamFalcon, on the latest version the 'loss' key was changed to 'batch_loss'. I think it was changed here
Yes, the fact that you need to access it through 'callback metrics'.
Got it!
@captainvera would love a PR :)
@captainvera @xiadingZ sorry about that! it was a bad bug.
Made a PR #2428 and added tests to make sure this doesn't happen again!
try master now!
we’ll push a new minor again since this is a key bug (and we have a few other key bugs)
Well, that was fast, thanks!