First, thanks for the well-documented and awesome implementation of Mask-RCNN!
I am working on nucleus data that is much more crowded than the DSB 2018 data, but with the same goal: segmenting the nuclei. A single image contains many objects, up to 1,500 masks.
I have read the model.py code to understand how the RPN works. The Proposal layer takes the anchors from the RPN, sorts them by foreground class score, and applies NMS. So most of the ROIs will have a high probability of being foreground.
# Box Scores. Use the foreground class confidence. [Batch, num_rois, 1]
scores = inputs[0][:, :, 1]
ix = tf.nn.top_k(scores, pre_nms_limit, sorted=True, name="top_anchors").indices
scores = utils.batch_slice([scores, ix], lambda x, y: tf.gather(x, y), self.config.IMAGES_PER_GPU)
After the Proposal layer, those ROIs are sent to the DetectionTargetLayer, which samples positive and negative ROIs by comparing their IoU with the ground truth boxes. The sampled ROIs are used to train the classifier and regressor. The Mask-RCNN paper suggests that the ratio of positive to negative ROIs should be 1:3.
# 1. Positive ROIs are those with >= 0.5 IoU with a GT box
positive_roi_bool = (roi_iou_max >= 0.5)
positive_indices = tf.where(positive_roi_bool)[:, 0]
# 2. Negative ROIs are those with < 0.5 with every GT box. Skip crowds.
negative_indices = tf.where(tf.logical_and(roi_iou_max < 0.5, no_crowd_bool))[:, 0]
My question is: what if the RPN is trained so well that it catches almost every foreground object? In that case, the DetectionTargetLayer will receive many positive ROIs and can sample only a few negative ROIs, which might affect the training of the classifier.
In my case, many nuclei are small and crowded, and most sit side by side. So it is easy for the RPN to produce too many proposals with IoU >= 0.5, and I am wondering whether this might hurt the result.
If I have misunderstood anything, please correct me, and thanks again for the awesome implementation!
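The concern in this question can be illustrated with a toy sketch in plain Python. It is not the code from model.py: the function name `sample_rois`, its parameters, and the default values (200 ROIs per image, 25% positives to match the 1:3 ratio) are assumptions made for illustration. It only mimics the idea of thresholding proposals at IoU 0.5, capping the positives, and then taking up to three negatives per positive.

```python
import random


def sample_rois(roi_iou_max, rois_per_image=200, positive_ratio=0.25,
                iou_threshold=0.5, seed=0):
    """Toy positive/negative ROI sampling (hypothetical helper, not model.py).

    roi_iou_max: best IoU of each proposal with any ground truth box.
    Returns (positive_indices, negative_indices).
    """
    rng = random.Random(seed)
    positives = [i for i, iou in enumerate(roi_iou_max) if iou >= iou_threshold]
    negatives = [i for i, iou in enumerate(roi_iou_max) if iou < iou_threshold]

    # Cap the positives at their target share of the sampled ROIs.
    positive_count = int(rois_per_image * positive_ratio)
    rng.shuffle(positives)
    positives = positives[:positive_count]

    # Request enough negatives to keep the positive share at positive_ratio.
    # Slicing returns fewer if the proposals contain fewer negatives.
    negative_count = int(len(positives) / positive_ratio) - len(positives)
    rng.shuffle(negatives)
    negatives = negatives[:negative_count]
    return positives, negatives


# Crowded image: almost every proposal overlaps some ground truth box.
crowded = [0.8] * 1950 + [0.1] * 50
pos, neg = sample_rois(crowded)
print("crowded:", len(pos), "positives,", len(neg), "negatives")

# Sparser image: plenty of background proposals are available.
sparse = [0.8] * 500 + [0.1] * 1500
pos, neg = sample_rois(sparse)
print("sparse: ", len(pos), "positives,", len(neg), "negatives")
```

In the crowded case the sampler asks for 150 negatives but only 50 exist among the proposals, so the classifier sees a 1:1 mix instead of 1:3. In the sparser case the full 1:3 ratio is reached.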
@jimmy15923 Interesting viewpoint! I am currently suffering from a false positive rate that is too high. If your point is correct, maybe we can solve the problem by setting a higher threshold, for example a roi_iou_max_threshold in positive_roi_bool = (roi_iou_max >= roi_iou_max_threshold)!
@jimmy15923 Your understanding is correct. If the RPN passes mostly positive ROIs to the second stage, then the second stage classifier won't have enough negative samples for good training. One option is to increase the number of proposals passed to the second stage to ensure the DetectionTargetLayer has enough positive and negative proposals to pick from. This can be controlled with the config variables POST_NMS_ROIS_TRAINING and POST_NMS_ROIS_INFERENCE.
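A minimal sketch of that config change could look like the following. It assumes the `mrcnn.config.Config` base class from this repository's package layout. The subclass name `CrowdedNucleusConfig` and the value 4000 are hypothetical and untuned. The fallback class is only a stand-in so the snippet runs where the package is not installed.

```python
try:
    # Assumed import path for the base config class.
    from mrcnn.config import Config
except ImportError:
    class Config:
        """Minimal stand-in used only when mrcnn is not installed."""
        NAME = None
        POST_NMS_ROIS_TRAINING = 2000
        POST_NMS_ROIS_INFERENCE = 1000


class CrowdedNucleusConfig(Config):
    """Hypothetical config that keeps more proposals after NMS."""
    NAME = "nucleus_crowded"

    # Illustrative values only: keep more ROIs after NMS so the
    # DetectionTargetLayer has more negative candidates to sample from.
    POST_NMS_ROIS_TRAINING = 4000
    POST_NMS_ROIS_INFERENCE = 4000
```

Whether a larger value helps depends on how many low-IoU proposals actually survive NMS on the crowded images, so it is worth checking the sampled positive/negative counts after changing it.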
@keineahnung2345 Your idea might help reduce the false positive rate by requiring a higher IoU before the classifier detects a positive object. However, it seems to me that this is a different problem from the one that @jimmy15923 mentioned, which is that there are too many ground truth positive objects.