Pytorch-lightning: IndexError with multiple validation loaders and fast_dev_run

Created on 6 Jul 2020 · 2 Comments · Source: PyTorchLightning/pytorch-lightning

🐛 Bug

An IndexError is raised during validation when multiple validation dataloaders are used together with fast_dev_run=True.

To Reproduce

Steps to reproduce the behavior:

  1. Use multiple val_dataloaders
  2. Use fast_dev_run=True

Code sample

https://colab.research.google.com/drive/107nKJxF4ttWPtQbo8-Wb0RG3Sa_fxjQP?usp=sharing

Traceback

Traceback (most recent call last):
  File "/home/luca/Repositories/set-operations/src/run_experiment.py", line 73, in <module>
    trainer.fit(model,)
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 979, in fit
    self.single_gpu_train(model)
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/pytorch_lightning/trainer/distrib_parts.py", line 185, in single_gpu_train
    self.run_pretrain_routine(model)
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1156, in run_pretrain_routine
    self.train()
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 370, in train
    self.run_training_epoch()
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 470, in run_training_epoch
    self.run_evaluation(test_mode=False)
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 409, in run_evaluation
    eval_results = self._evaluate(self.model, dataloaders, max_batches, test_mode)
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 270, in _evaluate
    dl_max_batches = max_batches[dataloader_idx]
IndexError: list index out of range

Exception ignored in: <function tqdm.__del__ at 0x7fe5848ba710>
Traceback (most recent call last):
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/tqdm/std.py", line 1086, in __del__
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/tqdm/std.py", line 1293, in close
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/tqdm/std.py", line 1471, in display
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/tqdm/std.py", line 1089, in __repr__
  File "/home/luca/.cache/pypoetry/virtualenvs/set-operations-GbjOlTQ2-py3.7/lib/python3.7/site-packages/tqdm/std.py", line 1433, in format_dict
TypeError: cannot unpack non-iterable NoneType object

Process finished with exit code 1

Reason

With fast_dev_run=True, max_batches is set to [1] here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/afdfba1dc6061c5e1ee6eaf215500d6a56e95482/pytorch_lightning/trainer/evaluation_loop.py#L376-L377

Later on it does not pass this test, so it stays [1] regardless of how many dataloaders there are:
https://github.com/PyTorchLightning/pytorch-lightning/blob/afdfba1dc6061c5e1ee6eaf215500d6a56e95482/pytorch_lightning/trainer/evaluation_loop.py#L256-L257

The loop then iterates over all the dataloaders and indexes max_batches by dataloader index, which raises an IndexError at line 270 on the second iteration:
https://github.com/PyTorchLightning/pytorch-lightning/blob/afdfba1dc6061c5e1ee6eaf215500d6a56e95482/pytorch_lightning/trainer/evaluation_loop.py#L260-L270
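
The failure mode described above can be imitated without the library. The following is only a toy sketch: `evaluate` is a hypothetical stand-in written for this illustration, not the code in evaluation_loop.py, and the "dataloaders" are plain lists. It assumes, as reported above, that max_batches is a one-element list while two dataloaders are iterated.

```python
def evaluate(dataloaders, max_batches):
    """Toy stand-in for the evaluation loop (hypothetical, not library code)."""
    seen = []
    for dataloader_idx, dataloader in enumerate(dataloaders):
        # Same lookup as the failing line shown in the traceback.
        dl_max_batches = max_batches[dataloader_idx]
        for batch_idx, batch in enumerate(dataloader):
            if batch_idx >= dl_max_batches:
                break
            seen.append((dataloader_idx, batch))
    return seen


# Two fake validation loaders, each yielding two "batches".
val_loaders = [["a0", "a1"], ["b0", "b1"]]
# One entry only, as reported for fast_dev_run=True.
max_batches = [1]

try:
    evaluate(val_loaders, max_batches)
    error = None
except IndexError as exc:
    error = exc

print(repr(error))
```

The first loader is processed normally; the lookup for the second loader (index 1) is what fails, which matches the "second iteration" observation.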

Possible solution

  • Let fast_dev_run=True use all validation loaders
  • Modify the evaluation for loop to use only the first val loader
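
As a rough illustration of the first option, the limit could be expanded so that there is one entry per validation loader before the loop runs. This is a sketch under assumptions, not a patch against the library: `normalize_max_batches` is a hypothetical helper name introduced here, and where such a step would belong in the trainer is left open.

```python
def normalize_max_batches(max_batches, num_dataloaders):
    """Return a list with one batch limit per dataloader (hypothetical helper)."""
    if isinstance(max_batches, int):
        # A single integer applies to every dataloader.
        return [max_batches] * num_dataloaders
    limits = list(max_batches)
    if len(limits) == 1:
        # A one-element list, such as [1], is broadcast to all dataloaders.
        return limits * num_dataloaders
    if len(limits) != num_dataloaders:
        raise ValueError(
            f"expected {num_dataloaders} batch limits, got {len(limits)}"
        )
    return limits


limits = normalize_max_batches([1], num_dataloaders=2)
print(limits)  # [1, 1]
```

With limits of this shape, every validation loader contributes one batch in a fast dev run, so each dataset is exercised at least once.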

Environment

  • CUDA:
    • GPU:
    • available: False
    • version: 10.1
  • Packages:
    • numpy: 1.18.5
    • pyTorch_debug: False
    • pyTorch_version: 1.5.1+cu101
    • pytorch-lightning: 0.8.4
    • tensorboard: 2.2.2
    • tqdm: 4.41.1
  • System:
    • OS: Linux
    • architecture:
      • 64bit
      • -
    • processor: x86_64
    • python: 3.6.9
    • version: 1 SMP Wed Feb 19 05:26:34 PST 2020

Labels: bug / fix, help wanted

All 2 comments

Let fast_dev_run=True use all validation loaders

This is the better choice: the datasets behind different dataloaders can differ, so fast_dev_run needs to check all of them.

@lucmos it seems you have already dug into this... mind sending a PR?
