Mask_rcnn: Implementation of RPN Losses

Created on 30 Jul 2018 · 2 Comments · Source: matterport/Mask_RCNN

I am confused about the RPN loss implementation. Here is my understanding of the existing code:

build_rpn_targets:

From the ground-truth data, it computes _rpn_match_ with shape (batch, 4092, 1), where each value is -1, 1 or 0, and _rpn_bbox_ with shape (batch, 256, 1), which is intended to be a 50/50 split of positive and negative boxes (chosen by random sampling).
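To make the 50/50 sampling idea concrete, here is a minimal NumPy sketch. It is not the repository's code: the function name `sample_rpn_match`, the `total` parameter, and the convention that 1 means positive, -1 negative and 0 neutral (ignored) are assumptions made for illustration.

```python
import numpy as np

def sample_rpn_match(rpn_match, total=256, rng=None):
    """Subsample anchor labels so at most `total` anchors stay non-neutral.

    rpn_match: 1-D int array; assumed 1 = positive, -1 = negative, 0 = neutral.
    Surplus anchors are reset to 0 so later steps can ignore them.
    """
    rng = np.random.default_rng() if rng is None else rng
    rpn_match = rpn_match.copy()  # leave the caller's array untouched

    # Keep at most half of the budget as positives.
    pos = np.where(rpn_match == 1)[0]
    extra_pos = len(pos) - total // 2
    if extra_pos > 0:
        rpn_match[rng.choice(pos, extra_pos, replace=False)] = 0

    # Fill the remaining budget with negatives.
    neg = np.where(rpn_match == -1)[0]
    extra_neg = int(len(neg) - (total - np.sum(rpn_match == 1)))
    if extra_neg > 0:
        rpn_match[rng.choice(neg, extra_neg, replace=False)] = 0

    return rpn_match
```

When fewer than 128 positives exist, this sketch lets negatives take up the remaining slots, so the split is only 50/50 in the best case.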

rpn_losses:

_rpn_class_loss_ computes the classification loss on all 4,092 anchors.
_rpn_bbox_loss_ computes the box loss on only the 256 anchors from _rpn_bbox_ above.

However, this is the opposite of what the paper mentions ...

_rpn_bbox_loss_ by definition computes a loss only for positive anchors, so there is no need to balance with negative ones. And we should not have a loss of zero for the "extra" positive anchors (if there are more than 128).

_rpn_class_loss_ is the one for which the selection is important, as otherwise the positive ones are overwhelmed by the negative ones.
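The arrangement argued for above can be sketched in NumPy as follows. This is an illustration under assumptions, not the repository's implementation: the function names mirror the ones discussed, but the signatures are hypothetical, the box targets are assumed to be aligned per anchor with shape (N, 4), and a plain mean replaces any normalization constants the paper uses.

```python
import numpy as np

def rpn_class_loss(rpn_match, logits):
    """Cross-entropy over sampled (non-neutral) anchors only.

    rpn_match: (N,) with assumed labels 1 = positive, -1 = negative, 0 = neutral.
    logits:    (N, 2) scores ordered as [background, foreground].
    """
    keep = rpn_match != 0
    labels = (rpn_match[keep] == 1).astype(int)
    if labels.size == 0:
        return 0.0
    z = logits[keep]
    z = z - z.max(axis=1, keepdims=True)  # numerically stable log-softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(labels.size), labels].mean())

def rpn_bbox_loss(rpn_match, target_deltas, pred_deltas):
    """Smooth L1 over every positive anchor; negatives do not contribute.

    target_deltas, pred_deltas: (N, 4), one row per anchor (an assumption).
    """
    pos = rpn_match == 1
    if not pos.any():
        return 0.0
    diff = np.abs(target_deltas[pos] - pred_deltas[pos])
    loss = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return float(loss.mean())
```

In this sketch the positive/negative balance only matters for `rpn_class_loss`, through which anchors are left non-neutral in `rpn_match`; `rpn_bbox_loss` uses all positives regardless of how many there are.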

Am I missing something?

jnd77

Most helpful comment

I changed my implementation of rpn_losses to follow the paper (as per my comment in this issue), which is why I had to change [ix] -> [i].
However, if you keep the current implementation of rpn_losses, with sampling done for the bounding boxes, then the change is not required.

I'm also not following this repo anymore. I decided to reimplement the model in a more modular way and to use more recent versions of TensorFlow.

jnd77 on 5 Apr 2019

All 2 comments

Referring to your previous issue #1274, did you get to the bottom of this? I am having a pretty much identical issue and it has been driving me mad for a fortnight.

Is the fix for this a case of modifying the [ix] -> [i]?

Thanks!