Pytorch-lightning: Run full validation epoch before training

Created on 3 May 2020  ·  8 comments  ·  Source: PyTorchLightning/pytorch-lightning

❓ Questions and Help

What is your question?

How can I manually trigger a full validation step before training?

I want to compute and log the validation metrics before I start the training (ideally also updating the progress bar dictionary).

The reason why I want to do this is that I am fine-tuning a pre-trained model, and I want to check the performances before training.

question


All 8 comments

To evaluate the performance of an existing model in your case, the best practice is to implement the test methods in your LightningModule and then invoke trainer.test(). So I imagine your workflow will roughly be

model = YourLightningModule.load_from_checkpoint(...)

trainer = Trainer(...)  # options for testing
trainer.test(model)  # in the future, this will return results directly
# for now:
metrics = trainer.progress_bar_metrics  # for example, to print them

# now training
model = YourLightningModule.load_from_checkpoint(...)
trainer = Trainer(...)  # args for training
trainer.fit(model)

There is a reason why validation is tied to training and cannot easily be run from the outside. Validation is conceptually not the same as testing and does not reflect the true performance of a model, because we do things like early stopping based on the validation loss.

Hi @awaelchli
I understand that I can run the test before training**, but that's a bit different from what I am trying to achieve.

Currently, the Trainer class accepts num_sanity_val_steps, which lets users define how many validation steps to execute before training starts.
It would be great to be able to set this to a special value, say -1, to tell the Trainer to run a full validation epoch before training instead of just a fixed number of steps.
Then, inside the on_sanity_check_end hook, people would be able to use the callback metrics to do whatever they want with the validation results*.

**: Currently you can't directly call test() on a pretrained model unless you manually call prepare_data() before the test() call; see https://github.com/PyTorchLightning/pytorch-lightning/issues/1562.

*: Currently trainer.callback_metrics inside on_sanity_check_end is not populated with the validation metrics.
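To make the proposed -1 semantics concrete, here is a minimal pure-Python sketch (not actual Lightning code; the run_sanity_check function and its arguments are hypothetical) of how a sanity-check loop could interpret the special value:

```python
def run_sanity_check(val_batches, num_sanity_val_steps):
    """Hypothetical sketch: -1 means 'run the full validation set',
    any other value caps the number of sanity-check batches."""
    if num_sanity_val_steps == -1:
        limit = len(val_batches)  # full validation epoch
    else:
        limit = min(num_sanity_val_steps, len(val_batches))
    # process (here: just collect) the first `limit` batches
    return val_batches[:limit]
```

With num_sanity_val_steps=2 only the first two batches are touched; with -1, every batch is.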

I agree with @simonepri; it would be good to run a full validation sanity check before training.
If we could know the full number of validation steps before declaring the Trainer, this would be simple, but I don't think that's possible.
There is no option for such a feature right now, is there?

Yes, maybe we could think about adding the option num_sanity_val_steps=-1.
@PyTorchLightning/core-contributors

I agree. It's good practice to run the validation before training, and I'll surely use it!

It's pretty straightforward; I guess I'm not the first to write this :)

from pytorch_lightning import Callback, Trainer

class RunValidationOnStart(Callback):
    """Run a full validation epoch before the first training epoch."""

    def on_train_start(self, trainer: Trainer, pl_module):
        # run_evaluation is an internal Trainer method here;
        # test_mode=False runs the validation loop, not the test loop
        return trainer.run_evaluation(test_mode=False)

# usage: trainer = Trainer(callbacks=[RunValidationOnStart()])

@awaelchli Yes, perfect. Let's do it that way:

num_sanity_val_steps=-1

PR for 0.8.0?

Happy to make it, but it would be easier to do after #1920.

