Detectron2: How to resume training from last checkpoint?

Created on 23 Oct 2019 · 2 Comments · Source: facebookresearch/detectron2

Hi,

How to resume training from a previous checkpoint on a custom dataset?
I know that you need to call

trainer.resume_or_load(resume=True)

Besides this, what should be passed to cfg.merge_from_file, and what should cfg.MODEL.WEIGHTS be set to?

Many thanks!

var316

👍1

Most helpful comment

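This is answered in https://detectron2.readthedocs.io/modules/engine.html#detectron2.engine.defaults.DefaultTrainer.resume_or_load.
If resume=True and the last checkpoint exists in cfg.OUTPUT_DIR, cfg.MODEL.WEIGHTS is not used, so it can be anything. The config file passed to cfg.merge_from_file is still used to build the model, so keep it the same as in the original run.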
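
To make that rule concrete, here is a minimal standard-library sketch of the decision it describes. It is not detectron2's code: choose_checkpoint is a hypothetical helper, and the marker file name last_checkpoint is taken from the other comment in this thread. It only shows which path ends up being loaded.

```python
import os
import tempfile


def choose_checkpoint(output_dir, model_weights, resume):
    """Hypothetical helper: return the path that would be loaded.

    Assumes a text file named "last_checkpoint" in output_dir holds the
    file name of the most recently saved checkpoint.
    """
    marker = os.path.join(output_dir, "last_checkpoint")
    if resume and os.path.isfile(marker):
        with open(marker) as f:
            last_saved = f.read().strip()
        # Resuming: the configured weights are ignored.
        return os.path.join(output_dir, last_saved)
    # Fresh run (or resume=False): fall back to the configured weights.
    return model_weights


with tempfile.TemporaryDirectory() as out_dir:
    # No marker file yet, so the configured weights are used.
    print(choose_checkpoint(out_dir, "pretrained.pkl", resume=True))

    # Simulate a previous run that saved a checkpoint.
    with open(os.path.join(out_dir, "last_checkpoint"), "w") as f:
        f.write("model_0000999.pth")
    print(choose_checkpoint(out_dir, "pretrained.pkl", resume=True))
```

The second call returns a path inside the output directory regardless of what was passed as the weights, which is why the weights setting does not matter once a checkpoint exists.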

ppwwyyxx on 23 Oct 2019

👍3

All 2 comments

The --resume flag is defined as follows:

https://github.com/facebookresearch/detectron2/blob/662cbb71538fc5169dc2361f97ca0e4ed2961f75/detectron2/engine/defaults.py#L48

    parser.add_argument(
        "--resume",
        action="store_true",
        help="whether to attempt to resume from the checkpoint directory",
    )

Because the flag uses action="store_true", it takes no value: passing --resume sets args.resume to True, and omitting it leaves it False. If you run training as follows, the checkpoint named in the last_checkpoint file in the specified OUTPUT_DIR is loaded and training resumes from it. If the last_checkpoint file does not exist, training starts normally.

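# Put --resume before the trailing KEY VALUE overrides; everything after the
# first override is collected as a config override.
python tools/train_net.py \
    --config-file configs/PascalVOC-Detection/faster_rcnn_R_50_C4.yaml \
    --resume \
    OUTPUT_DIR output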
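
The flag behavior can be checked with argparse alone. The sketch below rebuilds a small parser rather than importing detectron2's; the trailing opts positional with nargs=argparse.REMAINDER is an assumption about how the KEY VALUE overrides are collected, and build_parser is a hypothetical name. It shows that a store_true flag needs no value, and that a flag placed after the first override is swallowed into the overrides instead of being parsed.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the training script's parser.
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", default="")
    parser.add_argument("--resume", action="store_true")
    # Assumption: trailing KEY VALUE overrides are gathered like this.
    parser.add_argument("opts", nargs=argparse.REMAINDER)
    return parser


parser = build_parser()

# Flag placed before the overrides: parsed as a boolean.
before = parser.parse_args(
    ["--config-file", "cfg.yaml", "--resume", "OUTPUT_DIR", "output"]
)
print(before.resume, before.opts)

# Flag placed after the overrides: captured as part of opts.
after = parser.parse_args(
    ["--config-file", "cfg.yaml", "OUTPUT_DIR", "output", "--resume"]
)
print(after.resume, after.opts)
```

Under that assumption, the safe habit is to list every flag first and leave the overrides for the very end of the command.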

Keiku on 15 Mar 2020

👍2
