Pytorch-lightning: Hyperparameters for DataModules

Created on 1 Oct 2020 · 8 Comments · Source: PyTorchLightning/pytorch-lightning

🚀 Feature

Add a save_hyperparameters method to LightningDataModule and have the Trainer log its hyperparameters together with the model's.
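
For illustration, usage could mirror LightningModule.save_hyperparameters() (a sketch of the proposed behaviour, not an existing API):

import pytorch_lightning as pl

class MyDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=32, window_size=30, max_rul=125):
        super().__init__()
        # proposed: collect the constructor arguments into self.hparams,
        # analogous to what save_hyperparameters() does on a LightningModule
        self.save_hyperparameters()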

Motivation

DataModules are a great way to decouple the model code from the data it runs on.
People using datasets where pre-processing is not as fixed as in the common benchmarks need their DataModules to be configurable.
Examples would be:

  • size of sliding windows
  • maximum sequence length
  • type of scaling (min-max, standardization, etc.)

Logging these hyperparameters is just as important for evaluating model performance as logging the model's own hyperparameters.
Therefore, the Trainer should log them automatically, too.

Pitch

Are you still searching for the perfect way to pre-process your data for maximum performance?
Keep your experiments in order by logging the hyperparameters of your LightningDataModule.

Alternatives

Right now, I manually define a hyperparameter dictionary as a member of my DataModule.
Afterwards, I call update on the hparams property of my LightningModule.
This is pretty low-level code at the top level of my script.

Additional context

Code example for current solution:

class MyDataModule(pl.LightningDataModule):
    def __init__(self,
                 fd,
                 batch_size,
                 max_rul=125,
                 window_size=30,
                 percent_fail_runs=None,
                 percent_broken=None,
                 feature_select=None):

        ...

        # manually collect the hyperparameters so they can be merged into
        # the model's hparams and picked up by the Trainer's logging
        self.hparams = {'fd': self.fd,
                        'batch_size': self.batch_size,
                        'window_size': self.window_size,
                        'max_rul': self.max_rul,
                        'percent_broken': self.percent_broken,
                        'percent_fail_runs': self.percent_fail_runs}

...

data = datasets.MyDataModule(...)
# merge the DataModule's hyperparameters into the model's so they get logged together
model.hparams.update(data.hparams)
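
A related workaround, not taken from the issue and only a sketch assuming a TensorBoardLogger is used, is to hand the dictionary to the logger directly instead of merging it into the model's hparams:

from pytorch_lightning.loggers import TensorBoardLogger

logger = TensorBoardLogger('logs/')
# Lightning loggers expose log_hyperparams, so the DataModule's dict
# can be logged without going through the model's hparams
logger.log_hyperparams(data.hparams)
trainer = pl.Trainer(logger=logger)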
Labels: API / design · data / DataModule · enhancement · help wanted

All 8 comments

Hi! Thanks for your contribution, great first issue!

Yup, been battling with myself over whether or not this is necessary. But I think it is... do you want to submit a PR for this?

Sure, I'll give it a shot.

@tilman151 I like this idea!

What do you think about an add_datamodule_specific_args() static method in LightningDataModule as well? Do you see a use case for defining defaults for parameters like the train/val/test split or the positive/negative balance in the DataModule?

@adriantre I think you can still define a static method similar to that of LightningModule, which should work.
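
For illustration, a minimal sketch of such a static method, mirroring the add_model_specific_args() convention from LightningModule (the method name and arguments here are only a suggestion, not an existing API):

import argparse

import pytorch_lightning as pl

class MyDataModule(pl.LightningDataModule):
    @staticmethod
    def add_datamodule_specific_args(parent_parser):
        # hypothetical counterpart to LightningModule.add_model_specific_args
        parser = argparse.ArgumentParser(parents=[parent_parser], add_help=False)
        parser.add_argument('--batch_size', type=int, default=32)
        parser.add_argument('--val_split', type=float, default=0.2)
        return parser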

@SurajDonthi Yes, you are right of course 😅

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!

@edenlightning I am really sorry to bother you, but could you explain why this issue was closed? The associated PR was still under review.
