Models: Error with shape mismatch when changing `num_scales` for mask rcnn

Created on 1 Aug 2020 · 1 Comment · Source: tensorflow/models

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/official/vision/detection/ops/roi_ops.py

2. Describe the bug

In the multilevel_propose_rois function, the returned selected_rois should have shape [batch_size, rpn_post_nms_top_k, 4]. However, if the params.anchor.num_scales argument is anything other than 1, the leading dimension of the returned selected_rois is actually batch_size * num_scales. This happens because a -1 is used to fill in the shape when calculating this_level_anchors:

        this_level_anchors = tf.cast(
            tf.reshape(anchor_boxes[level], [-1, num_boxes, 4]),
            dtype=this_level_scores.dtype)

I assume this should instead be something like ...

        this_level_anchors = tf.cast(
            tf.reshape(anchor_boxes[level], [-1, num_boxes*num_scales, 4]),
            dtype=this_level_scores.dtype)

but this introduces shape mismatch errors in other parts of the code as well. For example, after making the above changes, I see a new error:

ValueError: Dimensions must be equal, but are 49152 and 147456 for '{{node multilevel_propose_rois/level_2/decode_boxes/mul_2}} = Mul[T=DT_FLOAT](multilevel_propose_rois/level_2/decode_boxes/strided_slice, multilevel_propose_rois/level_2/decode_boxes/add)' with input shapes: [1,49152,1], [1,147456,1].

This is referring to a size mismatch in the official/vision/detection/utils/box_utils.py script, for the line decoded_boxes_yc = dy * anchor_h + anchor_yc, in which dy has shape (1, 49152, 1), while anchor_h has shape (1, 147456, 1). Here, we see the same factor of three that comes from the num_scales value.
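The mismatch can be reproduced in isolation. The following is a minimal sketch that uses NumPy as a stand-in for TensorFlow and only the shapes quoted in the error message; the array contents are placeholders, not values from the model.

```python
import numpy as np

# Shapes copied from the error message; the values themselves are placeholders.
dy = np.zeros((1, 49152, 1), dtype=np.float32)         # decoded from the RPN box outputs
anchor_h = np.ones((1, 147456, 1), dtype=np.float32)   # derived from the generated anchors
anchor_yc = np.zeros((1, 147456, 1), dtype=np.float32)

# The anchor tensors hold num_scales (= 3) times as many boxes as the predictions.
ratio = anchor_h.shape[1] // dy.shape[1]
print("anchors per predicted box:", ratio)

try:
    decoded_boxes_yc = dy * anchor_h + anchor_yc
    broadcast_ok = True
except ValueError as err:
    # Axis 1 differs and neither size is 1, so the shapes cannot broadcast.
    broadcast_ok = False
    print("shape mismatch:", err)
```

Changing only the anchor reshape therefore just moves the disagreement downstream: the box predictions still carry one third as many entries as the anchors.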

3. Steps to reproduce

Set params.anchor.num_scales to a value larger than 1. I tested with 3.

4. Expected behavior

I expect the returned selected_rois to have shape [batch_size, rpn_post_nms_top_k, 4], but it actually has shape [batch_size*num_scales, rpn_post_nms_top_k, 4].
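A minimal sketch of how the -1 in the reshape can absorb the extra anchors into the batch dimension. It uses NumPy in place of tf.reshape (both infer a -1 dimension from the total element count), and it assumes an anchor layout of [batch, height, width, anchors * 4] with small hypothetical sizes; the real layout in roi_ops.py may differ.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
batch_size = 2
height, width = 4, 4
anchors_per_location = 3  # what the RPN head predicts per location
num_scales = 3            # the anchor generator now emits 3x as many anchors

# num_boxes is derived from the score tensor, so it does not include num_scales.
num_boxes = height * width * anchors_per_location

# Assumed anchor layout: [batch, height, width, anchors * 4].
anchor_boxes = np.zeros(
    (batch_size, height, width, anchors_per_location * num_scales * 4),
    dtype=np.float32)

# The -1 is inferred from the total element count, so the surplus anchors
# end up in the leading (batch) dimension.
this_level_anchors = anchor_boxes.reshape(-1, num_boxes, 4)
print(this_level_anchors.shape)  # (6, 48, 4), i.e. batch_size * num_scales in front
```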

5. Additional context

6. System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian GNU/Linux 9.11
Mobile device name if the issue happens on a mobile device: N/A
TensorFlow installed from (source or binary): binary (pip)
TensorFlow version (use command below): 2.3.0
Python version: 3.7.3
CUDA/cuDNN version: 10.1/7.6.4
GPU model and memory: NVIDIA Tesla K80

official bug

roserustowicz

Most helpful comment

Thanks roserustowicz for pointing to this issue and for debugging.
In maskrcnn_config.py, rpn_head.anchors_per_location defaults to 3, which no longer matches anchor.num_scales * anchor.aspect_ratios after you changed anchor.num_scales to 3.

Since rpn_head.anchors_per_location is a redundant config option and can be derived as anchor.num_scales * anchor.aspect_ratios, I have removed it and used the formula instead in this commit.
This should resolve your issue.
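A minimal sketch of the derived value, reading anchor.num_scales * anchor.aspect_ratios as the number of scales times the number of aspect ratios. The helper name and the aspect ratio values below are hypothetical and are not taken from the commit.

```python
def derive_anchors_per_location(num_scales, aspect_ratios):
    """Hypothetical helper: anchors per location = scales x number of aspect ratios."""
    return num_scales * len(aspect_ratios)


# Assumed aspect ratios for illustration; the actual config values may differ.
aspect_ratios = [1.0, 2.0, 0.5]

default_count = derive_anchors_per_location(1, aspect_ratios)
changed_count = derive_anchors_per_location(3, aspect_ratios)
print(default_count, changed_count)  # 3 9
```

Under these assumptions, a hard-coded value of 3 agrees with the derived count only while num_scales is 1, which is consistent with the factor-of-three mismatch reported above.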