Pytorch-lightning: Runtime Error if validation_step is defined, but valid_loader isn't provided to Trainer

Created on 19 Aug 2020  ·  7 Comments  ·  Source: PyTorchLightning/pytorch-lightning

🐛 Bug

If validation_step is defined in your LightningModule but no validation loader is passed to the trainer, the model will not train: training fails with a runtime error.

You get this warning (as expected):

UserWarning: you defined a validation_step but have no val_dataloader. Skipping validation loop

But then this error, which prevents training:

/usr/local/lib/python3.6/dist-packages/pytorch_lightning/callbacks/progress.py in on_sanity_check_start(self, trainer, pl_module)
    294         super().on_sanity_check_start(trainer, pl_module)
    295         self.val_progress_bar = self.init_sanity_tqdm()
--> 296         self.val_progress_bar.total = convert_inf(trainer.num_sanity_val_steps * len(trainer.val_dataloaders))
    297         self.main_progress_bar = tqdm(disable=True)  # dummy progress bar
    298 

TypeError: object of type 'NoneType' has no len()
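The traceback can be reduced to a short sketch: when validation is skipped, `trainer.val_dataloaders` is still `None`, and the progress-bar callback calls `len()` on it. The variable names below mirror the traceback; the snippet itself is an illustrative reduction, not the actual trainer code.

```python
# Illustrative reduction of the crash above: validation was skipped, so
# val_dataloaders was never populated and len(None) raises TypeError.
num_sanity_val_steps = 2
val_dataloaders = None  # never set because no val loader was given to fit()

try:
    total = num_sanity_val_steps * len(val_dataloaders)
except TypeError as exc:
    message = str(exc)

print(message)  # prints: object of type 'NoneType' has no len()
```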

To Reproduce

  1. Define a LightningModule with validation_step.
  2. Call Trainer.fit() with only a training loader.

Code sample

https://colab.research.google.com/drive/1-pyGmHMAJaIg86T7s4y2PKxOX79dZq91?usp=sharing

Expected behavior

The user warning should still be emitted, but training should proceed, skipping the validation loop.

Labels: Priority P0, bug / fix, help wanted, information needed


All 7 comments

Hi! Thanks for your contribution, great first issue!

Pretty sure this will be solved automatically by PR #2892. But we should remember to add a test for this warning.

I don't think #2892 will make it into 0.9 because it has a lot going on...

Can we get this into 0.9.0? This will require a new PR.

The issue corresponding to #2892 has been fixed by #2917. But the code sample still raises errors. It seems the problem can be fixed by changing the initial values of test_dataloaders and val_dataloaders from
https://github.com/PyTorchLightning/pytorch-lightning/blob/7cca3859a7b97a9ab4a6c6fb5f36ff94bff7f218/pytorch_lightning/trainer/trainer.py#L383-L384
to

self.test_dataloaders = []
self.val_dataloaders = []
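With empty lists as the initial values, the expression from the traceback is well-defined and the sanity-check total is simply zero. This is a standalone sketch of the suggested change's effect, not the actual trainer code:

```python
# Sketch of the suggested fix: initializing to empty lists instead of None
# makes len() always valid, so the progress-bar total is just 0 and no
# TypeError is raised when validation is skipped.
num_sanity_val_steps = 2
test_dataloaders = []
val_dataloaders = []

total = num_sanity_val_steps * len(val_dataloaders)
print(total)  # prints 0 -- the sanity-check loop simply runs nothing
```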

Should a new issue be created?

> But the code sample still raises errors. It seems the problem can be fixed by changing the initial values of test_dataloaders and val_dataloaders.

@manipopopo I cannot reproduce it on master. What exactly is the remaining issue, and how do I reproduce it?

The code in your google colab link now runs without the reported error if I install from master branch. Closing this.
If there is something else that needs to be fixed, please open a new issue so I can take a look.

Hi @awaelchli , the issue has been fixed by #3197 .

