I am confused about the RPN loss implementation. Here is my understanding of the existing code:
build_rpn_targets:
From the ground truth data, it computes _rpn_match_ (batch, 4092, 1), where each value is -1, 1 or 0; and it computes _rpn_bbox_ (batch, 256, 1), tentatively made of 50/50 positive/negative boxes (with random sampling).
rpn_losses:
_rpn_class_loss_ computes the classification loss on all 4,092 anchors.
_rpn_bbox_loss_ computes the box loss on only the 256 anchors from _rpn_bbox_ above.
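To make the sampling step concrete, here is a toy NumPy sketch of the 50/50 selection as I understand it. It is not the repo's code: `sample_anchors` and its arguments are hypothetical names, and I am assuming 1 marks a positive anchor, -1 a negative one and 0 a neutral one.

```python
import numpy as np

def sample_anchors(labels, batch_size=256, rng=None):
    """Pick up to batch_size anchors, at most half of them positive.

    labels: 1-D int array (assumed convention: 1 = positive,
            -1 = negative, 0 = neutral/ignored).
    Returns a boolean mask over the anchors that were selected.
    """
    rng = np.random.default_rng() if rng is None else rng
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == -1)

    # Cap positives at half the batch, then fill the rest with negatives.
    n_pos = min(len(pos), batch_size // 2)
    n_neg = min(len(neg), batch_size - n_pos)

    # Shuffle each group and keep the first n entries (random sampling
    # without replacement; also safe when a group is empty).
    keep = np.concatenate([rng.permutation(pos)[:n_pos],
                           rng.permutation(neg)[:n_neg]])

    mask = np.zeros(labels.shape, dtype=bool)
    mask[keep] = True
    return mask
```

With more than 128 positives the extra ones are simply left out of the mask; with fewer, negatives fill the remaining slots.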
However, this is the opposite of what the paper describes:
_rpn_bbox_loss_ by definition computes a loss only for positive anchors, so there is no need to balance with negative ones. And we should not have a loss of zero for the "extra" positive anchors (if there are more than 128).
_rpn_class_loss_ is the one for which the selection is important, as otherwise the positive ones are overwhelmed by the negative ones.
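Here is a rough NumPy sketch of what I would expect instead, only to illustrate the idea. The function names, argument names and shapes are all hypothetical, the label convention (1 positive, -1 negative, 0 neutral) is my assumption, and the normalisation is simplified to a plain mean rather than the paper's exact weighting.

```python
import numpy as np

def smooth_l1(diff):
    """Element-wise smooth L1: 0.5*x^2 if |x| < 1, else |x| - 0.5."""
    abs_diff = np.abs(diff)
    return np.where(abs_diff < 1.0, 0.5 * diff ** 2, abs_diff - 0.5)

def rpn_losses_sketch(labels, sampled, objectness, target_deltas, pred_deltas):
    """Classification loss on the sampled anchors, box loss on all positives.

    labels:        (N,) ints, assumed 1 = positive, -1 = negative, 0 = neutral
    sampled:       (N,) bool mask of the balanced mini-batch (e.g. 256 anchors);
                   assumed to contain at least one anchor
    objectness:    (N,) predicted foreground probability per anchor
    target_deltas: (N, 4) regression targets (only rows of positives are used)
    pred_deltas:   (N, 4) predicted box refinements
    """
    eps = 1e-7
    # Binary cross-entropy restricted to the sampled anchors, so the
    # negatives cannot overwhelm the positives.
    p = np.clip(objectness[sampled], eps, 1.0 - eps)
    t = (labels[sampled] == 1).astype(np.float64)
    class_loss = -np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

    # Box loss over every positive anchor, sampled or not.
    positive = labels == 1
    if positive.any():
        per_anchor = smooth_l1(pred_deltas[positive] - target_deltas[positive]).sum(axis=1)
        bbox_loss = per_anchor.mean()
    else:
        bbox_loss = 0.0
    return class_loss, bbox_loss
```

The point of the split is that the sampling mask only affects the classification term, while a positive anchor that was not sampled still contributes to the box term.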
Am I missing something?
Referring to your previous #1274, did you get to the bottom of this? I am having a pretty much identical issue and it has been driving me mad for a fortnight.
Is the fix simply a matter of changing [ix] -> [i]?
Thanks!
I have changed my implementation of rpn_losses to follow the paper (as per my comment in this issue), which is why I had to change [ix] -> [i].
However, if you keep the current implementation of rpn_losses, with the sampling done for the bounding boxes, then the change is not required.
I'm also not following this repo anymore. I decided to reimplement the model in a more modular way and to use more recent versions of TensorFlow.