Pytorch-lightning: Evaluate Hydra for composing config and command line arguments

Created on 9 Feb 2020 · 24 comments · Source: PyTorchLightning/pytorch-lightning

🚀 Feature

We can evaluate hydra for configuration.

Hydra is an open-source Python framework that simplifies the development of research and other complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line. The name Hydra comes from its ability to run multiple similar jobs - much like a Hydra with multiple heads.

Motivation

The PyTorch ImageNet training example illustrates the problem: despite being a minimal example, the number of command-line flags is already high. Some of these flags logically describe the same component and should ideally be grouped (for example, flags related to distributed training), but there is no easy way to group those flags together and consume them as a group.

Hydra provides config file composition with overrides, command-line tab completion, and parameter sweeps; a minimal sketch follows the link below.

https://medium.com/pytorch/hydra-a-fresh-look-at-configuration-for-machine-learning-projects-50583186b710
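For anyone new to Hydra, here is a minimal sketch of the composition idea; the conf/ layout, names, and values below are illustrative assumptions, not taken from the ImageNet example:

# conf/config.yaml (illustrative):
#   defaults:
#     - optimizer: adam        # composes conf/optimizer/adam.yaml into the config
#   batch_size: 128
#
# train.py
import hydra
from omegaconf import DictConfig


@hydra.main(config_path="conf/config.yaml")
def main(cfg: DictConfig) -> None:
    # cfg is the composed, hierarchical config; any value can be overridden
    # on the command line, e.g. `python train.py batch_size=256 optimizer.lr=0.01`
    print(cfg.pretty())


if __name__ == "__main__":
    main()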

Pitch

Evaluate whether Hydra is more powerful than argparse, and check that its overhead does not make things too complex.

Labels: discussion, enhancement, help wanted, won't fix

Most helpful comment

Yeah, the way I'm currently using Hydra is:

  • In config.yaml
# Put all other args here
batch_size: 128
path: my_data/train

# can organize as needed
model:
  pretrained: albert-based-uncased
  num_layers: 2
  dropout: 0.5

optim:
  lr: 3e-4
  scheduler: CosineLR

# Group all trainer args under this
trainer:
  gpus: 1
  val_percent_check: 0.1
  # ... put all other trainer args here ...

and then in training

class MyModule(LightningModule):
    def __init__(self, hparams: DictConfig):
        super().__init__()
        self.hparams = hparams
        self.model = BaseModel(pretrained=self.hparams.model.pretrained,
                               num_layers=self.hparams.model.num_layers)

    def configure_optimizers(self):
        return Adam(self.parameters(), lr=self.hparams.optim.lr)
...
module = MyModule(hparams=cfg)
trainer = Trainer(**OmegaConf.to_container(cfg.trainer, resolve=True))
trainer.fit(module)

All 24 comments

In Lightning, hparams needs to be flat, so it cannot contain a dict, which can be problematic in certain cases. For example, I would like to define optimizer parameters within hparams['optimizer'].

Using hydra could probably solve this issue.
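To make the nesting concrete, here is a small illustrative sketch with OmegaConf (the library Hydra builds on); the keys and values are made up:

from omegaconf import OmegaConf

# Nested hparams instead of a flat namespace (keys/values are illustrative)
hparams = OmegaConf.create({
    "batch_size": 32,
    "optimizer": {"name": "adam", "lr": 3e-4, "weight_decay": 1e-5},
})

print(hparams.optimizer.lr)        # attribute-style access into the group
print(hparams["optimizer"]["lr"])  # dict-style access works too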

@hadim @sai-prasanna this sounds awesome. I'd love to support hydra. Mind looking into it and submitting a proposal here?

cc: @Borda

My current ML project is using Hydra. It is pretty awesome!

Have a look: https://github.com/nicolas-chaulet/deeppointcloud-benchmarks

I think that having parameters organised in groups would make it easier to navigate; on the other hand, passing a dict around is not very safe...
What about having classes for these kinds of parameter groups, which would ensure default values and resolve missing params, as is now handled by defaults...

from abc import ABC

class ParameterGroup(ABC):
    pass

class OptimizerParams(ParameterGroup):
    def __init__(self, param1: int = 123, ...):
        self.param1 = param1
        ...

@PyTorchLightning/core-contributors ^^

I like the idea of ParameterGroup (or using the hydra API would be OK too).

ParameterGroup should also have a way to serialize itself to common formats such as JSON/YAML/TOML.

That being said, I think it's important to keep allowing the dict type for hparams in simpler scenarios. We could simply convert hparams to a ParameterGroup whenever a dict is detected during init. We should also be careful when a nested dict is detected: is it a new ParameterGroup or just a parameter with a dict type?
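As a rough illustration of the serialization point (just a sketch of the idea using dataclasses and PyYAML, not an agreed design):

import json
from dataclasses import asdict, dataclass

import yaml  # assumes PyYAML is available


@dataclass
class OptimizerParams:
    lr: float = 3e-4
    weight_decay: float = 0.0


params = OptimizerParams(lr=1e-3)
print(json.dumps(asdict(params)))      # {"lr": 0.001, "weight_decay": 0.0}
print(yaml.safe_dump(asdict(params)))  # lr: 0.001 / weight_decay: 0.0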

YAML is great, and compared to JSON you can write comments inside... lol @williamFalcon ^^ ?

I would look into the current configuration flow and try to evaluate the pros and cons of hydra over it this weekend.

Not a current lightning user, but our team is evaluating switching over.

When you look at Hydra, you might want to take a look at jsonnet too. From what I can tell, Hydra has pretty mediocre support for variables compared to jsonnet, where you can cleanly define variables and do simple math and boolean logic on them. Jsonnet also implements composing multiple configs, just like Hydra, and it can implement command-line overrides via something like jsonargparse.

Anyway, it's useful for AllenNLP, so it might be worth taking a look to see if it is the right thing for pytorch-lightning.

I looked at hydra, and it is pretty high level and also opinionated in the way it handles configuration.

That being said, the configuration object they use is based on omegaconf, which looks really, really nice (similar to what @Borda proposes).

So I would propose to use/support omegaconf. Hydra can then be used by anyone who wants it; there is no need to integrate it into PL, IMO.
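A rough sketch of what "use/support omegaconf" could look like from the user side, loading a YAML file and merging dotted command-line overrides (the file name is illustrative):

import sys

from omegaconf import OmegaConf

# e.g. `python train.py optim.lr=1e-3 trainer.gpus=2`
base = OmegaConf.load("config.yaml")        # illustrative file name
cli = OmegaConf.from_dotlist(sys.argv[1:])  # key=value overrides from the CLI
cfg = OmegaConf.merge(base, cli)
print(cfg)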

I am an AllenNLP contributor. AllenNLP provides abstractions to make models generic, i.e. if you have a text classifier, instead of putting an LSTM inside it, you would rather receive an abstract Seq2SeqEncoder via the constructor and use its interface. LSTM would be a subclass of the abstract Seq2SeqEncoder class, registered under a string name. This allows one to use configuration to control what model is actually built. It relies primarily on constructor dependency injection. This is an opinionated way of doing things. IMO it won't suit PyTorch Lightning's ideology of providing training boilerplate and leaving the rest to the implementer (who can choose to do dependency injection or not).

I agree with @hadim's assessment that being too opinionated is the wrong choice for Lightning. Instead, it would be better to remove impediments for users to use their own config system, while maybe providing a sane default as an example.

I have two questions for regular users of lightning.

  1. What flaws do people see in the current configuration flow in Lightning, and what currently prevents users from using their configuration file/argparse system of choice in the current design?

  2. Does lightning have to concern itself with stuff like hyperparameter search?

  1. Ideally Lightning should support any hyperparameter configuration/search library. We would welcome a PR to make that happen.

  2. @sai-prasanna you can pass models into the constructor in Lightning right now, but why wouldn't you just import the model in the LightningModule and init it in the constructor?

from allennlp import lstm

class MyS2S(pl.LightningModule):

    def __init__(self, hparams):
        self.encoder = lstm()
        ...

@williamFalcon In AllenNLP it is done that way so the registry can be searched using the type signatures of the objects in the constructor.

@Model.register("seq2seq")
class Seq2Seq(Model):
    def __init__(self, encoder: Seq2SeqEncoder):
        ...

@Seq2SeqEncoder.register("lstm")
class LSTM(Seq2SeqEncoder):
    def __init__(self, h: int):
        ...

Now in a configuration

{
  "model": {
    "type": "seq2seq",
    "encoder": {
      "type": "lstm",
      "h": 1000
    }
  }
}

This allows multiple experiments to be run through configuration changes alone, without having to write a mapping between the configuration and the actual class.
Anyway, I would prefer that PL stay agnostic to all this, as it is now.

@williamFalcon I used PL a few months back, so I was a bit hazy. Since model creation is explicit in the current workflow, there is no need for anything more. Anyone wishing to use Hydra can use it instead of argparse to construct the models easily. We can have an example of using Hydra instead of argparse for some complex configuration case and be done with that.
Would that do?

In AllenNLP there is an effort now to get pytorch-lightning to work as its trainer abstraction.

https://github.com/allenai/allennlp/pull/3860

I think AllenNLP's jsonnet configuration works well for it. It would be interesting to see how this synthesis goes. AllenNLP, though focused on NLP, has very good dependency injection and configuration management abstractions that make it easy to write models that are amenable to experimentation via simple configuration changes.

With this integration, one can use the powerful config system in AllenNLP for many different types of tasks (maybe with a few changes here and there).

Hi all!
Hydra support is ready after merging #1152!
I plan to send a PR containing the following changes:

  1. LightningModule explicitly accepts DictConfig as the hparams
  2. log_hyperparams converts the hparams to a built-in dict using OmegaConf.to_container(self.hparams, resolve=True)

Please correct me if I'm wrong, and suggest other features to add!

Yeah, the way I'm currently using Hydra is:

  • In config.yaml
# Put all other args here
batch_size: 128
path: my_data/train

# can organize as needed
model:
  pretrained: albert-based-uncased
  num_layers: 2
  dropout: 0.5

optim:
  lr: 3e-4
  scheduler: CosineLR

# Group all trainer args under this
trainer:
  gpus: 1
  val_percent_check: 0.1
  # ... put all other trainer args here ...

and then in training

class MyModule(LightningModule):
    def __init__(self, hparams: DictConfig):
        super().__init__()
        self.hparams = hparams
        self.model = BaseModel(pretrained=self.hparams.model.pretrained,
                               num_layers=self.hparams.model.num_layers)

    def configure_optimizers(self):
        return Adam(self.parameters(), lr=self.hparams.optim.lr)
...
module = MyModule(hparams=cfg)
trainer = Trainer(**OmegaConf.to_container(cfg.trainer, resolve=True))
trainer.fit(module)

I've tried a few different approaches to using Hydra with PL, and I believe this is the cleanest one:

  1. https://github.com/yukw777/leela-zero-pytorch/blob/d90d5fc93e86647638a59a7957c6c930ec4268fe/leela_zero_pytorch/train.py#L15-L47 (note the use of strict=False)
  2. https://github.com/yukw777/leela-zero-pytorch/tree/master/leela_zero_pytorch/conf

Example command:

python -m leela_zero_pytorch.train network=huge train.dataset.dir_path=some/dataset/train pl_trainer.gpus=-1

Pros:

  1. We don't need more substantial changes on the PL side to integrate Hydra.
  2. You can specify all the PL related parameters in the config files or in the command line.

Cons:

  1. Not all the PL options are displayed in the help message.
  2. You do have to know a bit more about hydra (I don't think this is a big problem.)
  3. Some custom parsing logic and validation (e.g. --gpus) have to be ported.

I tried mixing argparse with hydra: https://github.com/yukw777/leela-zero-pytorch/blob/983d9568ed34ed06ebde47ecb65b1e3b2d3a37c0/leela_zero_pytorch/train.py#L17-L52

This way Hydra doesn't own the logging directory structure.
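For those who do not want to follow the link, roughly this kind of mixing is possible with Hydra's compose API (a sketch under the assumption of Hydra 1.0's hydra.experimental API and an illustrative conf/ directory, not necessarily what the linked train.py does):

from argparse import ArgumentParser

from hydra.experimental import compose, initialize

parser = ArgumentParser()
parser.add_argument("overrides", nargs="*", help="Hydra-style key=value overrides")
args = parser.parse_args()

# Composing the config manually means Hydra never takes over the working
# directory, so it does not own the logging directory structure either.
initialize(config_path="conf")  # assumes a conf/ directory containing config.yaml
cfg = compose("config", overrides=args.overrides)
print(cfg)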

Friends,

I did something like @lkhphuc:

  • config.yaml
#configs/config.yaml
dataset:
  train_path: "resources/dataset/train.jsonl"
  test_path: "resources/dataset/train.jsonl"
  val_path: "resources/dataset/train.jsonl"

train:
  batch_size: 32
test:
  batch_size: 64
val:
  batch_size: 32

preprocessing:
  max_length: 64

  • entry point
import hydra
from omegaconf import DictConfig
from pytorch_lightning import Trainer


@hydra.main(config_path="configs/config.yaml", strict=False)
def dev_run(cfg: DictConfig):

    print(cfg.pretty())

    model = JointEncoder(hparams=cfg, ... )
    trainer = Trainer(fast_dev_run=True)
    trainer.fit(model)


if __name__ == "__main__":
    dev_run()
  • LightningModule
class JointEncoder(LightningModule):
    """Encodes the code and docstring into an same space of embeddings."""

    def __init__(self,
                 hparams: DictConfig,
                 code_encoder,
                 docstring_encoder
                 ):
        super(JointEncoder, self).__init__()
        self.hparams = hparams
        ...
        self.loss_fn = NPairsLoss()

But I'm getting the following error:

ValueError: Unsupported config type of <class 'omegaconf.dictconfig.DictConfig'>.

I tried mixing argparse with hydra: https://github.com/yukw777/leela-zero-pytorch/blob/983d9568ed34ed06ebde47ecb65b1e3b2d3a37c0/leela_zero_pytorch/train.py#L17-L52

This way Hydra doesn't own the logging directory structure.

awesome this is great! mind adding a tutorial to the docs on this? (under hyperparameters)

Yeah, the way I'm currently using Hydra is:

  • In config.yaml
# Put all other args here
batch_size: 128
path: my_data/train

# can organize as needed
model:
  pretrained: albert-based-uncased
  num_layers: 2
  dropout: 0.5

optim:
  lr: 3e-4
  scheduler: CosineLR

# Group all trainer args under this
trainer:
  gpus: 1
  val_percent_check: 0.1
  # ... put all other trainer args here ...

and then in training

class MyModule(LightningModule):
    def __init__(self, hparams: DictConfig):
        super().__init__()
        self.hparams = hparams
        self.model = BaseModel(pretrained=self.hparams.model.pretrained,
                               num_layers=self.hparams.model.num_layers)

    def configure_optimizers(self):
        return Adam(self.parameters(), lr=self.hparams.optim.lr)
...
module = MyModule(hparams=cfg)
trainer = Trainer(**OmegaConf.to_container(cfg.trainer, resolve=True))
trainer.fit(module)

the problem here is that you always have to list out trainer args.

parser = Trainer.add_argparse_args(parser)

the above will add all Trainer arguments to the argparse parser; a short sketch of the full pattern follows below.
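A minimal sketch of that pattern (assuming the Trainer argparse helpers available at the time, including from_argparse_args):

from argparse import ArgumentParser

import pytorch_lightning as pl

parser = ArgumentParser()
parser = pl.Trainer.add_argparse_args(parser)  # registers every Trainer flag
args = parser.parse_args()

trainer = pl.Trainer.from_argparse_args(args)  # build the Trainer from the parsed args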

Friends,

I did something like @lkhphuc:

  • config.yaml
#configs/config.yaml
dataset:
  train_path: "resources/dataset/train.jsonl"
  test_path: "resources/dataset/train.jsonl"
  val_path: "resources/dataset/train.jsonl"

train:
  batch_size: 32
test:
  batch_size: 64
val:
  batch_size: 32

preprocessing:
  max_length: 64
  • entry point
import hydra
from omegaconf import DictConfig
from pytorch_lightning import Trainer


@hydra.main(config_path="configs/config.yaml", strict=False)
def dev_run(cfg: DictConfig):

    print(cfg.pretty())

    model = JointEncoder(hparams=cfg, ... )
    trainer = Trainer(fast_dev_run=True)
    trainer.fit(model)


if __name__ == "__main__":
    dev_run()
  • LightningModule
class JointEncoder(LightningModule):
    """Encodes the code and docstring into an same space of embeddings."""

    def __init__(self,
                 hparams: DictConfig,
                 code_encoder,
                 docstring_encoder
                 ):
        super(JointEncoder, self).__init__()
        self.hparams = hparams
        ...
        self.loss_fn = NPairsLoss()

But I'm getting the following error:

ValueError: Unsupported config type of <class 'omegaconf.dictconfig.DictConfig'>.

@Ceceu self.hparams = hparams works with omegaconf 2.0.0; you need to upgrade hydra-core as well. (https://github.com/PyTorchLightning/pytorch-lightning/issues/2197#issuecomment-667914874)

I tried mixing argparse with hydra: yukw777/leela-zero-pytorch@983d956/leela_zero_pytorch/train.py#L17-L52

This way Hydra doesn't own the logging directory structure.

(just passing by)

Note: strict=False is deprecated in Hydra 1.0 :(
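(For anyone who relied on the permissive behaviour, one possible workaround, an assumption on my part rather than something stated in this thread, is to relax struct mode on the composed config with OmegaConf:)

from omegaconf import OmegaConf

cfg = OmegaConf.create({"lr": 3e-4})  # stand-in for a composed Hydra config
OmegaConf.set_struct(cfg, True)       # Hydra 1.0 hands configs over in struct mode
OmegaConf.set_struct(cfg, False)      # relax it again to allow undeclared keys
cfg.new_key = 1                       # would raise while struct mode is on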

@vict0rsch thanks for passing by! I actually no longer use that approach, as it is no longer recommended. Please check out my blog post for more details! https://link.medium.com/KaROmhBvz9

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!
