Pytorch-lightning: Error using Hydra

Created on 29 May 2020 · 10 comments · Source: PyTorchLightning/pytorch-lightning

Hi all,

Thanks a lot for the awesome library. I'm trying to use Hydra with pytorch-lightning, on the latest release of pytorch-lightning.
However, I get the following error after my training step:

...
  File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 241, in on_validation_end
    self._do_check_save(filepath, current, epoch)
  File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 275, in _do_check_save
    self._save_model(filepath)
  File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 142, in _save_model
    self.save_function(filepath)
  File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/training_io.py", line 260, in save_checkpoint
    checkpoint = self.dump_checkpoint()
  File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/training_io.py", line 353, in dump_checkpoint
    raise ValueError(
ValueError: ('The acceptable hparams type is dict or argparse.Namespace,', ' not DictConfig')
Exception ignored in: <function tqdm.__del__ at 0x7f1f9e379a60>

It says that DictConfig is not supported for saving, but I saw in a pull request that this problem had already been fixed.
Can you point me in the right direction on how to fix this?

Code

import hydra
import pytorch_lightning as pl


@hydra.main("config/config.yaml")
def main(cfg=None):
    wrap_tb_logger()  # user-defined helper (not shown)
    # Instantiate the LightningModule described in the config, passing the full config.
    model = hydra.utils.instantiate(cfg.model, cfg)

    trainer = pl.Trainer(
        gpus=list(cfg.gpus),
        max_epochs=cfg.epochs,
        train_percent_check=0.4
    )

    trainer.fit(model)


if __name__ == "__main__":
    main()
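The checkpoint code at the time accepted only dict or argparse.Namespace for hparams, so the usual workaround was to convert the DictConfig to a plain container before storing it on the model. A minimal sketch, assuming the LightningModule assigns whatever it receives to self.hparams (the helper name below is illustrative, not from the original report):

from omegaconf import OmegaConf

def config_to_hparams(cfg):
    # OmegaConf.to_container turns a DictConfig into nested plain dicts/lists,
    # which is one of the hparams types the checkpoint callback accepts.
    return OmegaConf.to_container(cfg, resolve=True)

The model would then do self.hparams = config_to_hparams(cfg) instead of holding on to the DictConfig directly.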

What's your environment?

  • OS: Linux
  • Packaging: pip
  • Version: 0.7.1
Labels: bug / fix, question

All 10 comments

Hi! Thanks for your contribution, great first issue!

@Borda can you add this fix?

@williamFalcon @Borda I worked around this problem by converting hparams to a flat dictionary after initializing my model.

import omegaconf


def _to_dot_dict(cfg):
    # Flatten a nested DictConfig into a plain dict with dot-delimited keys,
    # keeping only scalar leaf values (str, int, float, bool).
    res = {}
    for k, v in cfg.items():
        if isinstance(v, omegaconf.DictConfig):
            res.update(
                {k + "." + subk: subv for subk, subv in _to_dot_dict(v).items()}
            )
        elif isinstance(v, (str, int, float, bool)):
            res[k] = v

    return res

I admit this is only a workaround while waiting for the fix to be released!
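For context, a hypothetical way to apply that helper after instantiating the model (the hparams assignment below is an assumption about where it was hooked in, not quoted from the thread):

# Replace the DictConfig hparams with the flattened plain dict so that
# ModelCheckpoint can serialize it.
model = hydra.utils.instantiate(cfg.model, cfg)
model.hparams = _to_dot_dict(cfg)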

@pvnieo check master?
I pushed a fix last night.

@williamFalcon Yes, thank you, I just tested with the latest dev version and it's working!
However, I get this warning, and I don't know where it's coming from:

UserWarning: you called `module.module_arguments` without calling self.auto_collect_arguments()

we can just disable it haha. submit a PR?

Sorry, submit a PR for what?

disabling the warning
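While waiting for that, a possible stopgap is to filter the warning in the training script with Python's standard warnings module (the message pattern below is derived from the warning text shown above):

import warnings

# Silence only the auto_collect_arguments UserWarning; other warnings
# still surface normally.
warnings.filterwarnings(
    "ignore",
    message=r".*module_arguments.*",
    category=UserWarning,
)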

OK.
I just discovered that when I kill my program using Ctrl+C, the process isn't killed and it continues training (I don't know if it's related to the latest dev version, but I didn't have this problem before; previously it said it was trying "to kill the process gracefully" or something like that).
I found in a Stack Overflow question that this may be related to using multiple threads, where the interrupt isn't handled well.
Is this a known issue?

this is fixed in #2029
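For anyone on an older release without that fix, a blunt stopgap is to register a SIGINT handler that exits immediately instead of waiting for the graceful-shutdown path. A minimal sketch, assuming it runs at the top of the training script:

import os
import signal

def _hard_exit(signum, frame):
    # Bypass graceful teardown entirely; only useful as a last resort when
    # worker threads keep the process alive after Ctrl+C.
    os._exit(1)

signal.signal(signal.SIGINT, _hard_exit)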
