Detectron2: issue while using a custom coco annotated dataset

Created on 13 Oct 2019  ·  5 Comments  ·  Source: facebookresearch/detectron2

❓ Empty masks after training on a custom COCO dataset

I have a custom COCO-annotated dataset (generated with the COCO Annotator tool), and I am registering it
with the method described in the docs:

from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.data.datasets import register_coco_instances


register_coco_instances("chunks/train", {}, "annotations/chunks_train.json",
                        "chunks_train/")

register_coco_instances("chunks/val", {}, "annotations/chunks_val.json",
                        "chunks_val/")

chunks_metadata_train = MetadataCatalog.get("chunks/train")
chunks_metadata_val = MetadataCatalog.get("chunks/val")

After that, I can train the model without any problem, again following the documentation:

import os

from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()

cfg.merge_from_file("./detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("chunks/train",)
cfg.DATASETS.TEST = ()    # no metrics implemented for this dataset
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"  # initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 3000    # 3000 iterations here (the tutorial uses 300 for its toy dataset)
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128   # faster, and good enough for this toy dataset
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1 # only has one class

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg) 
trainer.resume_or_load(resume=False)
trainer.train()

The problem occurs during inference:

d = DatasetCatalog.get('chunks/val')
im = cv2.imread(d[0]["file_name"])
outputs = predictor(im)

# If I take any of the predicted masks, it is completely empty (all zeros)
masks = np.array(outputs['instances'].get('pred_masks')[0].to('cpu'))
np.unique(masks.astype('uint8'))

==> array([0], dtype=uint8)

Then, obviously, if I try to visualize the predictions I get an error, because OpenCV's findContours cannot detect any contours when the masks are empty:

v = v.draw_instance_predictions(outputs["instances"].to("cpu"))

...
/data/home/doursand/notebooks/Detectron2/detectron2/detectron2/utils/visualizer.py in mask_to_polygons(self, mask)
108 res = cv2.findContours(mask.astype("uint8"), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
109 hierarchy = res[-1]
--> 110 has_holes = (hierarchy.reshape(-1, 4)[:, 3] >= 0).sum() > 0
111 res = res[-2]
112 res = [x.flatten() for x in res]

AttributeError: 'NoneType' object has no attribute 'reshape'
...

Does anybody know what I am missing here? Thanks in advance.

All 5 comments

In OpenCV, 'NoneType' errors are usually caused by the image not being read. Try changing the index of d to 1.

Thanks, however I think I know what the problem is. The COCO dataset I generated contains only masks, without any bboxes, because I was creating the bboxes from the mask coordinates in the __getitem__ method of the custom dataset I was using in the torchvision segmentation example. It sounds like the register_coco_instances function expects a "complete" COCO dataset including bboxes (i.e. they are not calculated automatically from the masks). So I can close this for now.

The visualizer does not handle empty masks very well. This will be fixed soon.
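Until then, one workaround is to drop empty masks before drawing. The following is a minimal sketch, not Detectron2 code: nonempty_mask_indices is a hypothetical helper that works on any sequence of 2D masks (nested lists or arrays) and returns the indices of masks with at least one nonzero pixel.

```python
def nonempty_mask_indices(masks):
    """Return indices of masks containing at least one nonzero pixel.

    `masks` is any sequence of 2D masks (nested lists or array-likes
    that can be iterated row by row).
    """
    keep = []
    for i, mask in enumerate(masks):
        # A mask is non-empty if any pixel in any row is truthy.
        if any(any(row) for row in mask):
            keep.append(i)
    return keep


if __name__ == "__main__":
    example = [
        [[0, 0], [0, 0]],  # empty mask
        [[0, 1], [0, 0]],  # one foreground pixel
    ]
    print(nonempty_mask_indices(example))
```

The pure-Python loop is slow on full-size masks. On tensor outputs, a vectorized equivalent would presumably be along the lines of computing a boolean keep vector from pred_masks and indexing the instances with it, but that variant is an untested assumption here.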

As for why your model predicts empty masks: you can first verify that your data format is correct by visualizing the data, just like the Colab tutorial did. If the data is correct but training fails to produce good models, we do not help people design models/parameters for their datasets.
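Alongside visualizing the data, a quick structural check of the annotation file can flag missing fields. The sketch below assumes a COCO-style JSON with an "annotations" list; summarize_annotations and summarize_annotation_file are hypothetical helper names, not part of Detectron2.

```python
import json


def summarize_annotations(coco):
    """Count annotations in a COCO-style dict that lack a bbox or a segmentation."""
    summary = {"total": 0, "missing_bbox": 0, "missing_segmentation": 0}
    for ann in coco.get("annotations", []):
        summary["total"] += 1
        bbox = ann.get("bbox")
        # A usable COCO bbox has four values: [x, y, width, height].
        if not bbox or len(bbox) != 4:
            summary["missing_bbox"] += 1
        if not ann.get("segmentation"):
            summary["missing_segmentation"] += 1
    return summary


def summarize_annotation_file(json_path):
    """Load a COCO-style JSON file and summarize its annotations."""
    with open(json_path) as f:
        return summarize_annotations(json.load(f))
```

For example, summarize_annotation_file("annotations/chunks_train.json") could be run on the training file from the question; a nonzero missing_bbox count would point to annotations without bounding boxes.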

@ppwwyyxx

Actually, I gave up trying to use the register_coco_instances function and instead created a modified version of get_balloon_dicts from the tutorial, which basically computes the bounding boxes from the masks. I also had to set thing_dataset_id_to_contiguous_id in the metadata for my corresponding object id.

Here is the new function I created, in case someone would like to try it. With this function I was able to train the model and get mask predictions without any issues.

import json
import os

import cv2
import numpy as np
from detectron2.structures import BoxMode


def get_peanuts_dicts(img_path, json_path):
    '''
    AD Oct 2019: loading function to calculate the bboxes from the masks (as it does not exist so far in Detectron2)
    expected format:

    {'file_name': 'path/to/image',
    'height': imgheight,
    'width': imgwidth,
    'annotations':
    [{'bbox': [xmin,ymin,xmax,ymax], 'bbox_mode': <BoxMode.XYXY_ABS: 0>,
    'segmentation': [[polygon/coordinates/x/y]],
    'category_id': 0, 'iscrowd': 0},
    {'bbox': [xmin2,ymin2,xmax2,ymax2], 'bbox_mode': <BoxMode.XYXY_ABS: 0>,
    'segmentation': [[polygon2/coordinates/x/y]],
    'category_id': 0, 'iscrowd': 0},

    etc ...]

    }
    INPUTS:
        img_path: str, path to the images
        json_path: str, path to the json annotation file (coco format)

    '''

    with open(json_path) as f:
        imgs_anns = json.load(f)
    dataset_dicts = []

    for img in imgs_anns['images']:
        record = {}

        filename = os.path.join(img_path, img['file_name'])
        height, width = cv2.imread(filename).shape[:2]
        image_id = img['id']

        record["file_name"] = filename
        record["height"] = height
        record["width"] = width

        # Initialize the list once per image. Resetting it inside the loop
        # below would keep only the last annotation of each image.
        objs = []
        for annos in imgs_anns['annotations']:
            if annos['image_id'] == image_id:
                poly = annos['segmentation']

                for p in poly:
                    # p is a flat polygon: [x0, y0, x1, y1, ...]
                    x, y = p[::2], p[1::2]
                    bbox = [float(np.min(x)), float(np.min(y)),
                            float(np.max(x)), float(np.max(y))]
                    obj = {"bbox": bbox,
                           "bbox_mode": BoxMode.XYXY_ABS,
                           "segmentation": [p],
                           "category_id": 0,  # single-class dataset
                           "iscrowd": 0}
                    objs.append(obj)

        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts

Sounds good. The model does expect the bounding boxes to exist, and by default they are not computed from the masks (because the computed box is not necessarily equal to the annotated one).
If you use a custom dataloader, you can enable this in your mapper here: https://github.com/facebookresearch/detectron2/blob/bc4cf198dc04e5e22a6dfd19fc8846f5ba7f0fc8/detectron2/data/dataset_mapper.py#L133
But of course, doing it in the dataset is also a good solution.
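Doing it in the dataset could also mean patching the annotation file itself. The following is a minimal sketch, assuming polygon segmentations stored as flat [x0, y0, x1, y1, ...] lists and COCO's [x, y, width, height] bbox convention. The helper names polygons_to_xywh and add_missing_bboxes are hypothetical and not part of Detectron2.

```python
def polygons_to_xywh(polygons):
    """Return the COCO-style [x, y, width, height] box enclosing all polygons.

    Each polygon is a flat list [x0, y0, x1, y1, ...].
    """
    xs = [x for poly in polygons for x in poly[0::2]]
    ys = [y for poly in polygons for y in poly[1::2]]
    if not xs or not ys:
        raise ValueError("annotation has no polygon coordinates")
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def add_missing_bboxes(coco):
    """Fill in 'bbox' for annotations that lack one; return how many were added."""
    added = 0
    for ann in coco["annotations"]:
        seg = ann.get("segmentation")
        # Only polygon segmentations (a non-empty list of lists) are handled;
        # RLE masks (dicts) are skipped.
        if not ann.get("bbox") and isinstance(seg, list) and seg:
            ann["bbox"] = polygons_to_xywh(seg)
            added += 1
    return added
```

After loading the JSON with json.load, calling add_missing_bboxes on it, and writing it back with json.dump, the patched file should contain a box for every polygon annotation. Note that these boxes are derived from the masks and, as mentioned above, may differ from hand-annotated ones.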
