Pytorch-lightning: ModelCheckpoint save_function() not set?

Created on 11 Oct 2020  路  2Comments  路  Source: PyTorchLightning/pytorch-lightning

I am training a PL model using the following code snippet:

    # logger
    tb_logger = pl_loggers.TensorBoardLogger(cfg.logs.path, name='rnn_exp')

    # checkpoint callback
    checkpoint_callback = ModelCheckpoint(
        filepath=cfg.checkpoint.path + "encoder_rnn{epoch:02d}",
        save_top_k=1,
        mode="min" # monitor is defined in val_step: EvalResult(checkpoint_on=val_loss)
    )

    # early stopping callback
    early_stopping_callback = EarlyStopping(
        monitor="val_loss",
        patience=cfg.val.patience,
        mode="min"
    )

    tokenizer = ...
    dm = MyDataModule(cfg, tokenizer)

    model = RNNEncoder(cfg)

    trainer = Trainer(
        fast_dev_run=False,
        max_epochs=cfg.train.max_epochs,
        gpus=1,
        logger=tb_logger,
        callbacks=[checkpoint_callback, early_stopping_callback]
    )

    # training
    dm.setup('fit')
    trainer.fit(model, datamodule=dm)

However, after the first epoch, the model presents the following error, probably when calling the model checkpoint callback:

    trainer.fit(model, datamodule=dm)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/trainer/states.py", line 48, in wrapped_fn
    result = fn(self, *args, **kwargs)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1073, in fit
    results = self.accelerator_backend.train(model)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu_backend.py", line 51, in train
    results = self.trainer.run_pretrain_routine(model)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1239, in run_pretrain_routine
    self.train()
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 394, in train
    self.run_training_epoch()
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 516, in run_training_epoch
    self.run_evaluation(test_mode=False)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 603, in run_evaluation
    self.on_validation_end()
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/trainer/callback_hook.py", line 176, in on_validation_end
    callback.on_validation_end(self, self.get_model())
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py", line 27, in wrapped_fn
    return fn(*args, **kwargs)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 380, in on_validation_end
    self._do_check_save(filepath, current, epoch, trainer, pl_module)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 421, in _do_check_save
    self._save_model(filepath, trainer, pl_module)
  File "/home/celso/projects/venvs/semantic_code_search/lib/python3.7/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 212, in _save_model
    raise ValueError(".save_function() not set")
ValueError: .save_function() not set

Could you tell me if I forgot to configure something?

bug / fix help wanted

All 2 comments

currently you need to set the ModelCheckpoint via Trainer(checkpoint_callback=...)

3990 will enable passing it to callbacks

Thanks, @awaelchli,

I've just thought that. Thanks a lot for the help.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

remisphere picture remisphere  路  3Comments

monney picture monney  路  3Comments

maxime-louis picture maxime-louis  路  3Comments

williamFalcon picture williamFalcon  路  3Comments

anthonytec2 picture anthonytec2  路  3Comments