Evaluation log:

```
2018-12-07 18:58:13,471 maskrcnn_benchmark.inference INFO: OrderedDict([('bbox', OrderedDict([('AP', 0.266143220179594), ('AP50', 0.4705279119903588), ('AP75', 0.2664711486678874), ('APs', 0.0742186384761436), ('APm', 0.26418817964465885), ('APl', 0.4618351991771723)])), ('segm', OrderedDict([('AP', 0.2169857479304357), ('AP50', 0.4159623962610022), ('AP75', 0.17807455425402843), ('APs', 0.029122872145021395), ('APm', 0.174442224182182), ('APl', 0.42977448859947454)]))])
```

Training options:

```
--config-file "../configs/cityscapes/e2e_mask_rcnn_R_50_FPN_1x_cocostyle.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.00125 SOLVER.MAX_ITER 200000 SOLVER.STEPS "(160000, 180000)" TEST.IMS_PER_BATCH 1
```
`maskrcnn-benchmark/maskrcnn_benchmark/utils/model_serialization.py` causes a problem, because `align_and_update_state_dicts` effectively does `model_state_dict[key] = loaded_state_dict[key_old]`, overwriting the original values:

```python
def load_state_dict(model, loaded_state_dict):
    model_state_dict = model.state_dict()
    # if the state_dict comes from a model that was wrapped in a
    # DataParallel or DistributedDataParallel during serialization,
    # remove the "module" prefix before performing the matching
    loaded_state_dict = strip_prefix_if_present(loaded_state_dict, prefix="module.")
    # this does model_state_dict[key] = loaded_state_dict[key_old], which
    # breaks when a layer's shape differs (e.g. a different class count)
    align_and_update_state_dicts(model_state_dict, loaded_state_dict)
    # use strict loading
    model.load_state_dict(model_state_dict)
```
So for fine-tuning I changed it to keep only the weights whose names and shapes match the current model:

```python
def load_state_dict(model, loaded_state_dict):
    model_state_dict = model.state_dict()
    # if the state_dict comes from a model that was wrapped in a
    # DataParallel or DistributedDataParallel during serialization,
    # remove the "module" prefix before performing the matching
    loaded_state_dict = strip_prefix_if_present(loaded_state_dict, prefix="module.")
    # align_and_update_state_dicts(model_state_dict, loaded_state_dict)
    # finetune: keep only the weights whose name and shape match the
    # current model, so class-dependent layers stay randomly initialized
    loaded_state_dict = {
        k: v for k, v in loaded_state_dict.items()
        if k in model_state_dict and model_state_dict[k].size() == v.size()
    }
    model_state_dict.update(loaded_state_dict)
    # use strict loading
    model.load_state_dict(model_state_dict)
```
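A side effect worth knowing (a sketch built on the function above; the layer names mentioned in the comment are the typical maskrcnn-benchmark ones, not verified against any particular model): the shape filter silently drops the class-dependent heads. You can list what gets skipped by running this before the dict comprehension:

```python
# inside the modified load_state_dict, before filtering loaded_state_dict:
skipped = sorted(
    k for k, v in loaded_state_dict.items()
    if k not in model_state_dict or model_state_dict[k].size() != v.size()
)
# with a COCO checkpoint (81 classes) and a cityscapes model (9 classes),
# this should list the cls_score / bbox_pred / mask logits weights,
# which then keep their fresh random initialization
print(skipped)
```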
`maskrcnn_benchmark/utils/checkpoint.py` also gives an error. I don't know why it should call `self.optimizer.load_state_dict` and `self.scheduler.load_state_dict`; the optimizer state contains a `'momentum_buffer'` parameter, and I don't understand why that is loaded. Can you explain? And how can I use a COCO pretrained model to fine-tune on cityscapes? Thanks!

```python
def load(self, f=None):
    if self.has_checkpoint():
        # override argument with existing checkpoint
        f = self.get_checkpoint_file()
    if not f:
        # no checkpoint could be found
        self.logger.info("No checkpoint found. Initializing model from scratch")
        return {}
    self.logger.info("Loading checkpoint from {}".format(f))
    checkpoint = self._load_file(f)
    self._load_model(checkpoint)
    if "optimizer" in checkpoint and self.optimizer:
        self.logger.info("Loading optimizer from {}".format(f))
        self.optimizer.load_state_dict(checkpoint.pop("optimizer"))
    if "scheduler" in checkpoint and self.scheduler:
        self.logger.info("Loading scheduler from {}".format(f))
        self.scheduler.load_state_dict(checkpoint.pop("scheduler"))
    # return any further checkpoint data
    return checkpoint
```
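For context on `'momentum_buffer'`: SGD with momentum keeps a running buffer of past gradients for every parameter, and `optimizer.state_dict()` saves it so a resumed run continues exactly where it stopped; the scheduler state is stored for the same reason (to restore the current LR step). When starting a fine-tune you normally want fresh optimizer/scheduler state, so one workaround is to strip those keys from the checkpoint before loading. A minimal sketch in plain PyTorch (the file names are hypothetical, and this bypasses rather than uses the `Checkpointer` API):

```python
import torch

# 1) why 'momentum_buffer' exists: SGD with momentum stores a running
#    buffer per parameter inside its state_dict
param = torch.nn.Parameter(torch.randn(3))
opt = torch.optim.SGD([param], lr=0.01, momentum=0.9)
param.grad = torch.randn(3)
opt.step()
print(opt.state_dict()["state"])  # {0: {'momentum_buffer': tensor([...])}}

# 2) fine-tuning workaround: drop the optimizer/scheduler entries so only
#    the model weights get restored
checkpoint = torch.load("coco_pretrained.pth", map_location="cpu")
checkpoint.pop("optimizer", None)
checkpoint.pop("scheduler", None)
torch.save(checkpoint, "coco_pretrained_weights_only.pth")
```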
Hi,
I believe the best results for cityscapes are obtained by starting from a model pre-trained on COCO, and then doing some model surgery so that the common classes between COCO and cityscapes are kept.
See this file for more information.
About your second question, I'm sorry, but I couldn't understand the problem you are facing. Can you give a bit more context?
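For readers hitting the same class-mismatch issue: the "model surgery" amounts to copying the predictor rows for the categories the two datasets share and re-initializing the rest. A rough sketch of the idea (the index mapping below is illustrative, not the official one; cityscapes' `rider` is left out because COCO has no such class, and `bbox_pred` would need the analogous 4-rows-per-class treatment):

```python
import torch

# hypothetical mapping: cityscapes class index -> coco class index
# (0 = background in both; person, car, truck, bus, train, motorcycle and
#  bicycle exist in both datasets, rider does not exist in COCO)
CITYSCAPES_TO_COCO = {0: 0, 1: 1, 3: 3, 4: 8, 5: 6, 6: 7, 7: 4, 8: 2}

def surgery_cls_score(coco_weight, num_cityscapes_classes=9):
    """Build a cityscapes cls_score weight matrix from a COCO one by
    copying rows for shared classes and re-initializing the others."""
    out = torch.empty(num_cityscapes_classes, coco_weight.size(1))
    torch.nn.init.normal_(out, std=0.01)  # fresh init for unmatched rows
    for cs_idx, coco_idx in CITYSCAPES_TO_COCO.items():
        out[cs_idx] = coco_weight[coco_idx]
    return out
```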
Thank you for the reply @fmassa. I got some help from #15, but I haven't reproduced the cityscapes instance segmentation result yet. I hope someone can share a cityscapes model so I can compare the differences.

* The second question is simple: "I want to fine-tune on cityscapes from pretrained COCO Detectron models with a different number of classes."
* Because the class count differs, I modified the `load_state_dict` function; but the code also loads the optimizer and scheduler parameters, which conflict as well, so I blocked the code that loads them.
* My results so far:
| time | setup | training data | seg AP | mAP |
| ------ | ------ | ------ | ------ | ------ |
| | paper | fine | 0.315 | |
| | paper | fine+coco | 0.365 | |
| 2018-12-06 | single gpu | fine | 0.217 | 0.266 |
| 2018-12-11 | multi gpu | fine | 0.238 | 0.278 |
| 2018-12-08 | single gpu | fine+coco | 0.285 | 0.331 |
I haven't trained models on cityscapes myself, so I might not be the best person to help you with that. Maybe @henrywang1 knows a bit better, as he's the one who originally added support for cityscapes.
Hi @ranjiewwen,
I only tried end-to-end training on cityscapes.
I followed the steps described by the paper, and the resulting AP[val] is about 0.316.

> We train with image scale (shorter side) randomly sampled from [800, 1024], which reduces overfitting; inference is on a single scale of 1024 pixels.

I didn't submit the code because I thought everyone might have their own transformation.
You can refer to the changes below:
In transform.py, add this class
class RandomResize(object):
def __init__(self, min_size, max_size):
self.min_size = min_size
self.max_size = max_size
def get_size(self, image_size):
w, h = image_size
min_size = self.min_size
max_size = self.max_size
rand = random.randint(min_size, max_size)
return rand, int(w*rand/h)
def __call__(self, image, target):
size = self.get_size(image.size)
image = F.resize(image, size)
target = target.resize(image.size)
return image, target
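A quick sanity check of what `get_size` returns (a toy sketch assuming the class above): for a standard 2048x1024 cityscapes frame, the height lands in [800, 1024] and the width scales with it.

```python
import random  # RandomResize relies on the module-level random import

resize = RandomResize(800, 1024)
h, w = resize.get_size((2048, 1024))  # PIL .size is (width, height)
print(h, w)  # e.g. 900 1800 -- the (height, width) tuple given to F.resize
assert 800 <= h <= 1024 and w == 2 * h
```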
In `build.py`, modify `build_transforms`:

```python
if "cityscapes" in cfg.DATASETS.TRAIN[0]:
    if is_train:
        transform = T.Compose(
            [
                T.RandomResize(800, 1024),
                T.RandomHorizontalFlip(flip_prob),
                T.ToTensor(),
                normalize_transform,
            ]
        )
    else:
        transform = T.Compose(
            [
                T.ToTensor(),
                normalize_transform,
            ]
        )
else:  # ...
```
Thanks @henrywang1. I will try training again and look forward to a good result!
I am wondering what the mAP in your results is. Is it the bbox mAP?
The mAP is for the bbox. You can read the original Mask R-CNN paper, or the evaluation code in `coco_eval.py`.
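To see where the two numbers come from (a minimal pycocotools sketch with hypothetical file names; maskrcnn-benchmark's `coco_eval.py` wraps the same `COCOeval` class):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations_val.json")         # hypothetical ground-truth file
coco_dt = coco_gt.loadRes("predictions.json")  # hypothetical detections file

# the evaluation runs twice, once per task, giving bbox AP and segm AP
for iou_type in ("bbox", "segm"):
    coco_eval = COCOeval(coco_gt, coco_dt, iouType=iou_type)
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()  # first printed line is AP @ IoU=0.50:0.95
```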
Hi @ranjiewwen,
Have you reproduced the results on the cityscapes dataset? Following the steps in the Mask R-CNN paper, I only get 0.250 using the fine dataset and 0.293 using fine + coco. After setting both MAX_SIZE_TRAIN and MAX_SIZE_TEST to 2048 and re-training, I get 0.316 using fine and 0.358 using fine + coco.
Hi @henrywang1,
What value did you set for MAX_SIZE_TRAIN and MAX_SIZE_TEST? Is it 2048?
Hi @zimenglan-sysu-512
I followed the settings in the paper, so I hard-coded MIN/MAX_SIZE_TRAIN (as described in https://github.com/facebookresearch/maskrcnn-benchmark/issues/259#issuecomment-449118259).
And I just noticed that my previous reply was incomplete.
For test, the paper says inference is on a single scale of 1024 pixels, so we have to let the transform be:

```python
transform = T.Compose(
    [
        T.Resize(1024, 1024),
        T.ToTensor(),
        normalize_transform,
    ]
)
```

For other settings or the training log, you could send me an e-mail.
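Related to the MAX_SIZE question above: maskrcnn-benchmark's stock `Resize` caps the longer side, so for 2048x1024 cityscapes frames the max size matters as much as the min size. A simplified sketch of the shorter/longer-side rule (paraphrased from the library's transform, not a verbatim copy):

```python
def get_size(image_size, min_size, max_size):
    """Scale so the shorter side equals min_size, unless that would push
    the longer side past max_size; then cap by the longer side instead."""
    w, h = image_size
    size = min_size
    min_orig, max_orig = float(min(w, h)), float(max(w, h))
    if max_orig / min_orig * size > max_size:
        size = int(round(max_size * min_orig / max_orig))
    if w < h:
        ow, oh = size, int(size * h / w)
    else:
        oh, ow = size, int(size * w / h)
    return oh, ow

print(get_size((2048, 1024), 1024, 1024))  # (512, 1024): image is halved
print(get_size((2048, 1024), 1024, 2048))  # (1024, 2048): full resolution
```

Under this rule, leaving MAX_SIZE at 1024 would halve every cityscapes image, which is consistent with the gains reported above after raising MAX_SIZE_TRAIN/TEST to 2048.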