Hi there! I've finally managed to get a Mask R-CNN to train, but unfortunately the results are not great. This is probably because I'm using medical imaging data (ultrasound images), and medical imaging datasets (especially annotated ones) are by nature small.
I was wondering if you have any advice on how to improve the results.
For context, I have 5635 images in total (train and val combined; I use a 90/10 split), each of size 420 x 580. I've been using pretrained ResNet-50 weights and am currently trying ResNet-101. There is typically only one mask per image.
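For reference, a 90/10 split over 5635 images works out to roughly 5072 train / 563 val. A minimal, reproducible way to do the split (this is just a stdlib sketch, not necessarily how the poster did it):

```python
import random

def train_val_split(items, val_frac=0.10, seed=0):
    """Shuffle with a fixed seed and split off the last val_frac as validation."""
    items = list(items)
    rng = random.Random(seed)      # fixed seed -> the split is reproducible
    rng.shuffle(items)
    n_val = int(len(items) * val_frac)
    return items[n_val:], items[:n_val]

# Hypothetical filenames standing in for the real dataset.
images = [f"img_{i:04d}.png" for i in range(5635)]
train, val = train_val_split(images)
print(len(train), len(val))  # 5072 563
```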
Here's a sample output from the model:

Here's what the mask should be (the blacked out area in the image):

Here's my config file for resnet-50 (note that I'm training on one GPU):
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    ANCHOR_SIZES: (32, 64, 128, 256, 512)
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
    NUM_CLASSES: 2
  ROI_MASK_HEAD:
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    FEATURE_EXTRACTOR: "MaskRCNNFPNFeatureExtractor"
    PREDICTOR: "MaskRCNNC4Predictor"
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 2
    RESOLUTION: 28
    SHARE_BOX_FEATURE_EXTRACTOR: False
  MASK_ON: True
DATASETS:
  TRAIN: ("nerve_train",)
  TEST: ("nerve_val",)
DATALOADER:
  NUM_WORKERS: 0
  SIZE_DIVISIBILITY: 32
INPUT:
  MIN_SIZE_TRAIN: 420
  MAX_SIZE_TRAIN: 580
  MIN_SIZE_TEST: 420
  MAX_SIZE_TEST: 580
SOLVER:
  BASE_LR: 0.0025
  WEIGHT_DECAY: 0.0001
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  IMS_PER_BATCH: 2
TEST:
  IMS_PER_BATCH: 2
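One sanity check on the solver settings: the maskrcnn-benchmark README recommends scaling the learning rate linearly with batch size from its reference setting (0.02 at 16 images per batch), which is where the single-GPU BASE_LR of 0.0025 comes from. A quick check of the arithmetic:

```python
# Linear LR scaling rule: lr is proportional to images per batch.
# Reference setting in maskrcnn-benchmark: lr 0.02 at 16 images/batch.
ref_lr, ref_batch = 0.02, 16

def scaled_lr(ims_per_batch):
    """Scale the reference learning rate to a different batch size."""
    return ref_lr * ims_per_batch / ref_batch

print(scaled_lr(2))  # 0.0025, matching BASE_LR above for IMS_PER_BATCH: 2
```

So the LR itself looks consistent with the batch size; note that STEPS and MAX_ITER should in principle be stretched by the same factor when the batch shrinks.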
Thank you so much in advance.
Hi,
There are a number of things that could be done or checked.
Is the sample output you showed from a training image or a test image?
Looking at the sample, I think the RPN is not working, since no predicted bounding box overlaps the ground-truth box. In that case I would first train the RPN head alone and see how the detection works.
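If I'm reading the config right and this is maskrcnn-benchmark, one way to test the RPN in isolation is its built-in proposal-only mode (a hedged sketch; check the key names against your version's `defaults.py`):

```yaml
MODEL:
  RPN_ONLY: True   # train/evaluate only the region proposal network
  MASK_ON: False   # disable the mask head while debugging proposals
```

If the RPN proposals already miss the nerve region, no amount of tuning in the box/mask heads will recover it, so this narrows down where the failure is.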
Have you figured it out? I'm facing the same problem. I thought it was an NMS problem, but I'm still testing.
I think the parameter that matters is ROI_HEADS.SCORE_THRESH: when I changed it from 0.05 to 0.3, most of the boxes disappeared.
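To illustrate what raising the score threshold does (a standalone toy sketch, not maskrcnn-benchmark's actual post-processing code): detections below the threshold are simply dropped before NMS and visualization, so a higher threshold hides the low-confidence clutter without changing the model itself.

```python
def filter_by_score(boxes, scores, score_thresh):
    """Keep only detections whose confidence meets the threshold."""
    return [(b, s) for b, s in zip(boxes, scores) if s >= score_thresh]

# Toy detections: two low-confidence background hits and one confident box.
boxes = [(10, 10, 50, 50), (12, 11, 52, 49), (200, 80, 260, 140)]
scores = [0.07, 0.12, 0.85]

print(len(filter_by_score(boxes, scores, 0.05)))  # 3: everything survives
print(len(filter_by_score(boxes, scores, 0.3)))   # 1: only the confident box
```

Note this only cleans up the output; if the remaining high-score boxes still miss the true region, the underlying detection problem is elsewhere (e.g. the RPN, as suggested above).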