https://github.com/tensorflow/models/tree/master/official/vision/detection/ops/roi_ops.py
In the multilevel_propose_rois function, the returned selected_rois should have shape [batch_size, rpn_post_nms_top_k, 4]. However, if the params.anchor.num_scales argument is anything other than 1, the leading dimension of the returned selected_rois is actually batch_size * num_scales. This happens because, when "this_level_anchors" is computed, a -1 is used to infer the leading dimension of the shape:
    this_level_anchors = tf.cast(
        tf.reshape(anchor_boxes[level], [-1, num_boxes, 4]),
        dtype=this_level_scores.dtype)
I assume this should instead be something like ...
    this_level_anchors = tf.cast(
        tf.reshape(anchor_boxes[level], [-1, num_boxes * num_scales, 4]),
        dtype=this_level_scores.dtype)
but this introduces shape mismatch errors in other parts of the code as well. For example, after making the above changes, I see a new error:
ValueError: Dimensions must be equal, but are 49152 and 147456 for '{{node multilevel_propose_rois/level_2/decode_boxes/mul_2}} = Mul[T=DT_FLOAT](multilevel_propose_rois/level_2/decode_boxes/strided_slice, multilevel_propose_rois/level_2/decode_boxes/add)' with input shapes: [1,49152,1], [1,147456,1].
This refers to a size mismatch in official/vision/detection/utils/box_utils.py, at the line decoded_boxes_yc = dy * anchor_h + anchor_yc, where dy has shape (1, 49152, 1) while anchor_h has shape (1, 147456, 1). Here we see the same factor of three that comes from the num_scales value.
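The factor of three can be checked with plain arithmetic. The sketch below mimics how a -1 in a reshape target is resolved (total element count divided by the product of the known dimensions), using the box counts from the error message above. It is pure Python rather than TensorFlow, and resolve_minus_one is a hypothetical helper written for illustration, not part of roi_ops.py.

```python
from math import prod


def resolve_minus_one(total_elements, known_dims):
    """Return the size a reshape would infer for a single -1 dimension.

    Hypothetical helper: a reshape fills in -1 so that the total
    number of elements is preserved.
    """
    known = prod(known_dims)
    if total_elements % known:
        raise ValueError("total size is not divisible by the known dimensions")
    return total_elements // known


batch_size = 1              # leading dimension seen in the error message
boxes_in_scores = 49152     # num_boxes derived from the RPN outputs (dy)
boxes_in_anchors = 147456   # boxes actually present in the anchors (anchor_h)

# The anchors hold batch_size * boxes_in_anchors boxes with 4 coordinates each.
anchor_elements = batch_size * boxes_in_anchors * 4

# Reshaping to [-1, boxes_in_scores, 4] pushes the surplus into dimension 0.
leading_dim = resolve_minus_one(anchor_elements, [boxes_in_scores, 4])
print(leading_dim)  # 3, i.e. batch_size * 3 instead of batch_size
```

Because the reshape succeeds without complaint, the extra boxes end up in the batch dimension without any error, which would explain why selected_rois comes back with a leading dimension of batch_size * num_scales.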
Set params.anchor.num_scales to something larger than 1. I was trying this with 3.
I expect the returned selected_rois to have shape [batch_size, rpn_post_nms_top_k, 4], but it actually has shape [batch_size*num_scales, rpn_post_nms_top_k, 4].
Thanks roserustowicz for pointing to this issue and for debugging.
In maskrcnn_config.py, rpn_head.anchors_per_location defaults to 3, which no longer matches anchor.num_scales times the number of anchor.aspect_ratios once you set anchor.num_scales=3.
Since rpn_head.anchors_per_location is a redundant config value that can be derived from anchor.num_scales and anchor.aspect_ratios, I have removed it and used the formula instead in this commit.
This should resolve your issue.
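A minimal sketch of the derivation described above, in pure Python. The three aspect ratios and the 128 x 128 feature map are assumptions chosen because they reproduce the 49152 and 147456 box counts from the error message; the function names are hypothetical and do not come from the repository.

```python
def anchors_per_location(num_scales, aspect_ratios):
    """Derive the per-location anchor count instead of hard-coding it."""
    return num_scales * len(aspect_ratios)


def boxes_on_level(height, width, per_location):
    """Total number of boxes on one feature level."""
    return height * width * per_location


aspect_ratios = [1.0, 2.0, 0.5]   # assumed: three aspect ratios
height = width = 128              # assumed: level-2 feature map size

hard_coded = 3                                      # old rpn_head default
derived = anchors_per_location(3, aspect_ratios)    # num_scales=3 gives 9

head_boxes = boxes_on_level(height, width, hard_coded)   # what the head predicts
anchor_boxes = boxes_on_level(height, width, derived)    # what the anchors contain

print(head_boxes, anchor_boxes)  # 49152 147456
```

With the value derived from the anchor config, the RPN head and the anchor generator agree on the number of boxes per location, so the two counts can no longer drift apart when num_scales changes.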