Hi all,
Thanks a lot for the awesome library. I'm trying to use Hydra with pytorch-lightning, using the latest release of pytorch-lightning.
However, I got the following error after my training step:
...
File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 241, in on_validation_end
self._do_check_save(filepath, current, epoch)
File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 275, in _do_check_save
self._save_model(filepath)
File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 142, in _save_model
self.save_function(filepath)
File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/training_io.py", line 260, in save_checkpoint
checkpoint = self.dump_checkpoint()
File "/home/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/training_io.py", line 353, in dump_checkpoint
raise ValueError(
ValueError: ('The acceptable hparams type is dict or argparse.Namespace,', ' not DictConfig')
Exception ignored in: <function tqdm.__del__ at 0x7f1f9e379a60>
It says that DictConfig isn't supported when saving hparams, but I saw a pull request where this problem was supposedly fixed.
Can you point me in the right direction on how to fix this? Here is my entry point:
import hydra
import pytorch_lightning as pl


@hydra.main(config_path="config/config.yaml")
def main(cfg=None):
    wrap_tb_logger()
    model = hydra.utils.instantiate(cfg.model, cfg)
    trainer = pl.Trainer(
        gpus=list(cfg.gpus),
        max_epochs=cfg.epochs,
        train_percent_check=0.4,
    )
    trainer.fit(model)


if __name__ == "__main__":
    main()
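(For anyone else hitting this in the meantime: one possible workaround is to convert the config to a plain dict before Lightning stores it as hparams. A minimal sketch, assuming the model only needs the values and not a live DictConfig:)

from omegaconf import OmegaConf

# Sketch: turn the DictConfig into a plain dict (resolve=True expands
# ${...} interpolations) so checkpointing sees a supported hparams type.
plain_cfg = OmegaConf.to_container(cfg, resolve=True)
model = hydra.utils.instantiate(cfg.model, plain_cfg)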
Hi! Thanks for your contribution, great first issue!
@Borda can you add this fix?
@williamFalcon @Borda I worked around this problem by converting hparams to a flat dictionary after initializing my model:
import omegaconf


def _to_dot_dict(cfg):
    """Flatten a nested DictConfig into a flat dict with dotted keys."""
    res = {}
    for k, v in cfg.items():
        if isinstance(v, omegaconf.DictConfig):
            # Recurse into nested configs and prefix their keys.
            res.update(
                {k + "." + subk: subv for subk, subv in _to_dot_dict(v).items()}
            )
        elif isinstance(v, (str, int, float, bool)):
            # Keep only primitive leaf values.
            res[k] = v
    return res
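Usage looks roughly like this (a sketch; it assumes hparams is a plain attribute that can be reassigned after init):

# Replace the DictConfig hparams with the flattened plain dict
# so the checkpoint callback accepts it.
model = hydra.utils.instantiate(cfg.model, cfg)
model.hparams = _to_dot_dict(cfg)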
I admit this is only a stopgap while waiting for the fix to be released!
@pvnieo check master?
I pushed a fix last night.
@williamFalcon Yes, thank you! I just tested with the latest dev version and it's working!
However, I'm getting this warning, and I don't know where it's coming from:
UserWarning: you called `module.module_arguments` without calling self.auto_collect_arguments()
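For now I'm silencing it with a standard warnings filter (just a sketch; the match pattern is copied from the message above):

import warnings

# Temporary: ignore the auto_collect_arguments UserWarning until
# the upstream fix lands.
warnings.filterwarnings(
    "ignore",
    message=".*auto_collect_arguments.*",
    category=UserWarning,
)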
we can just disable it haha. submit a PR?
Sorry, submit a PR for what?
disabling the warning
Ok.
I just discovered that when I kill my program using Ctrl+C, the process isn't killed and it continues training (it used to say something like it's trying "to kill the process gracefully"). I don't know if it's related to the latest dev version, but I didn't have this problem before.
I found in this Stack Overflow question that it may be related to using multiple threads, where the kill signal isn't handled well.
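As a stopgap I'm installing my own SIGINT handler at the top of main() so Ctrl+C hard-kills the run (only a sketch, not a proper fix):

import os
import signal

# Stopgap: force-exit on Ctrl+C instead of relying on the graceful
# shutdown path, which seems to hang when worker threads are alive.
def _hard_exit(signum, frame):
    os._exit(1)

signal.signal(signal.SIGINT, _hard_exit)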
Is this a known issue?
this is fixed in #2029