Hi,
How do I resume training from a previous checkpoint on a custom dataset?
I know that I need to call
trainer.resume_or_load(resume=True)
Besides this, what should cfg.merge_from_file and cfg.MODEL.WEIGHTS be set to?
Many thanks!
This is answered in https://detectron2.readthedocs.io/modules/engine.html#detectron2.engine.defaults.DefaultTrainer.resume_or_load.
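If the last checkpoint exists in cfg.OUTPUT_DIR, cfg.MODEL.WEIGHTS will not be used, so it can be anything. The config passed to cfg.merge_from_file still has to describe the same model as the original run, because the checkpoint is loaded into the model built from it.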
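To make the decision concrete, here is a minimal sketch of how the choice between resuming and starting fresh can be thought of. It is not detectron2's code: the function name checkpoint_to_load and the file names in the demo are hypothetical, and the only behavior assumed from the documentation is that a last_checkpoint file in the output directory names the checkpoint to resume from.

```python
import os
import tempfile


def checkpoint_to_load(output_dir, model_weights, resume):
    """Return (path, resumed), mimicking the documented resume decision.

    Hypothetical helper for illustration only, not part of detectron2.
    """
    marker = os.path.join(output_dir, "last_checkpoint")
    if resume and os.path.isfile(marker):
        # The marker file holds the file name of the most recent checkpoint.
        with open(marker) as f:
            last_saved = f.read().strip()
        return os.path.join(output_dir, last_saved), True
    # No marker (or resume=False): fall back to the configured weights.
    return model_weights, False


with tempfile.TemporaryDirectory() as out_dir:
    # Fresh output directory: nothing to resume from.
    fresh = checkpoint_to_load(out_dir, "pretrained.pkl", resume=True)

    # Simulate an earlier run that saved a checkpoint (file name is made up).
    with open(os.path.join(out_dir, "last_checkpoint"), "w") as f:
        f.write("model_0004999.pth")
    resumed = checkpoint_to_load(out_dir, "anything-at-all", resume=True)
    ignored = checkpoint_to_load(out_dir, "pretrained.pkl", resume=False)

    print(fresh)
    print(resumed)
    print(ignored)
```

In the second call the weights argument is never read, which is the sense in which cfg.MODEL.WEIGHTS "can be anything" once a last checkpoint exists.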
The --resume option is defined in the argument parser as follows.
parser.add_argument(
    "--resume",
    action="store_true",
    help="whether to attempt to resume from the checkpoint directory",
)
Because the option uses action="store_true", it takes no value: passing --resume sets args.resume to True, and omitting it leaves it False. If you run the program as follows, training resumes from the checkpoint named in the last_checkpoint file of the specified OUTPUT_DIR. If the last_checkpoint file does not exist, normal training starts from cfg.MODEL.WEIGHTS. Put --resume before the trailing KEY VALUE overrides: they are collected with argparse.REMAINDER, so a flag placed after them would be read as part of the overrides.
python tools/train_net.py \
--config-file configs/PascalVOC-Detection/faster_rcnn_R_50_C4.yaml \
--resume \
OUTPUT_DIR output
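The flag's parsing behavior can be checked with a small stand-alone parser. This is a sketch, not the training script's actual parser: build_parser is a hypothetical helper, cfg.yaml is a placeholder name, and it assumes the trailing KEY VALUE overrides are gathered into a positional opts list with argparse.REMAINDER.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the training script's parser.
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", default="", metavar="FILE")
    parser.add_argument(
        "--resume",
        action="store_true",  # no value: present means True, absent means False
        help="whether to attempt to resume from the checkpoint directory",
    )
    # Assumption: everything after the first override is collected verbatim.
    parser.add_argument("opts", nargs=argparse.REMAINDER, default=None)
    return parser


parser = build_parser()

# Flag placed before the overrides.
before = parser.parse_args(
    ["--config-file", "cfg.yaml", "--resume", "OUTPUT_DIR", "output"]
)
# Flag placed after the overrides.
after = parser.parse_args(
    ["--config-file", "cfg.yaml", "OUTPUT_DIR", "output", "--resume"]
)
# Flag omitted entirely.
absent = parser.parse_args(["--config-file", "cfg.yaml"])

print("before:", before.resume, before.opts)
print("after: ", after.resume, after.opts)
print("absent:", absent.resume)
```

In this sketch, only the first invocation sets resume to True. When the flag comes after the overrides, it is swallowed into opts and resume stays False, which is why the ordering on the command line matters.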