Mask_rcnn: Repeated anchors in generate_pyramid_anchors()

Created on 8 Jan 2018 · 3 comments · Source: matterport/Mask_RCNN

I'm working on the source code of Mask_RCNN and I found something interesting.
Code:

print(config.RPN_ANCHOR_SCALES)
print(config.RPN_ANCHOR_RATIOS)
print(config.BACKBONE_SHAPES)
print(config.BACKBONE_STRIDES)
print(config.RPN_ANCHOR_STRIDE)

output:

(8, 16, 32, 64, 128)
[0.5, 1, 2]
[[32 32]
 [16 16]
 [ 8  8]
 [ 4  4]
 [ 2  2]]
[4, 8, 16, 32, 64]
1

We have 5 feature maps of different sizes: 32*32, 16*16, 8*8, 4*4, 2*2.
At each pixel of each feature map, we generate 3 anchors with different ratios. In other words, the number of anchors per feature map should be [32*32, 16*16, 8*8, 4*4, 2*2] * 3, but I find that the number of anchors generated by generate_pyramid_anchors() is three times that.
Code:

boxes = generate_pyramid_anchors(config.RPN_ANCHOR_SCALES,
                         config.RPN_ANCHOR_RATIOS,
                         config.BACKBONE_SHAPES,
                         config.BACKBONE_STRIDES,
                         config.RPN_ANCHOR_STRIDE)

output:

scales=  8 , shape=  [32 32]
boxes.shape=  (9216, 4)
scales=  16 , shape=  [16 16]
boxes.shape=  (2304, 4)
scales=  32 , shape=  [8 8]
boxes.shape=  (576, 4)
scales=  64 , shape=  [4 4]
boxes.shape=  (144, 4)
scales=  128 , shape=  [2 2]
boxes.shape=  (36, 4)
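The reported counts can be cross-checked against the expected per-level count H * W * num_ratios. A minimal sketch, with the shapes and counts copied from the outputs above:

```python
# Sanity check using the shapes and counts printed above. The expected number
# of anchors per level is H * W * len(ratios); each reported count is 3x that.
backbone_shapes = [(32, 32), (16, 16), (8, 8), (4, 4), (2, 2)]
num_ratios = 3                        # ratios [0.5, 1, 2]
reported = [9216, 2304, 576, 144, 36]

expected = [h * w * num_ratios for h, w in backbone_shapes]
print(expected)                                      # [3072, 768, 192, 48, 12]
print([r // e for r, e in zip(reported, expected)])  # [3, 3, 3, 3, 3]
```

So every level shows exactly a factor of 3 more anchors than one-anchor-per-ratio-per-cell would predict.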

Code:

print(boxes[-36:])

output:

array([[ -90.50966799,  -45.254834  ,   90.50966799,   45.254834  ],
       [ -90.50966799,  -45.254834  ,   90.50966799,   45.254834  ],
       [ -90.50966799,  -45.254834  ,   90.50966799,   45.254834  ],
       [ -64.        ,  -64.        ,   64.        ,   64.        ],
       [ -64.        ,  -64.        ,   64.        ,   64.        ],
       [ -64.        ,  -64.        ,   64.        ,   64.        ],
       [ -45.254834  ,  -90.50966799,   45.254834  ,   90.50966799],
       [ -45.254834  ,  -90.50966799,   45.254834  ,   90.50966799],
       [ -45.254834  ,  -90.50966799,   45.254834  ,   90.50966799],
       [ -90.50966799,   18.745166  ,   90.50966799,  109.254834  ],
       [ -90.50966799,   18.745166  ,   90.50966799,  109.254834  ],
       [ -90.50966799,   18.745166  ,   90.50966799,  109.254834  ],
       [ -64.        ,    0.        ,   64.        ,  128.        ],
       [ -64.        ,    0.        ,   64.        ,  128.        ],
       [ -64.        ,    0.        ,   64.        ,  128.        ],
       [ -45.254834  ,  -26.50966799,   45.254834  ,  154.50966799],
       [ -45.254834  ,  -26.50966799,   45.254834  ,  154.50966799],
......

These anchors are repeated 3 times. I wonder: is this a bug, or just for convenience?
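For reference, the three distinct shapes per cell can be reproduced from the formulas in utils.generate_anchors (heights = scale / sqrt(ratios), widths = scale * sqrt(ratios)). A sketch for the scale-128 level, centered at (0, 0):

```python
import numpy as np

# Reproduce the 3 distinct anchor shapes at scale 128, following the
# height/width formulas used by matterport/Mask_RCNN's utils.generate_anchors.
scale = 128
ratios = np.array([0.5, 1, 2])
heights = scale / np.sqrt(ratios)
widths = scale * np.sqrt(ratios)

# Boxes centered at (0, 0), in [y1, x1, y2, x2] order:
boxes = np.stack([-heights / 2, -widths / 2, heights / 2, widths / 2], axis=1)
print(np.round(boxes, 8))
# [[-90.50966799 -45.254834    90.50966799  45.254834  ]
#  [-64.         -64.          64.          64.        ]
#  [-45.254834   -90.50966799  45.254834    90.50966799]]
```

These match the three unique shapes in the boxes[-36:] output above, so only 3 distinct anchors per cell are expected; each appearing three times at the same center is what makes the counts look duplicated.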


All 3 comments

@Mabinogiysk I find that in the original paper they used k = 9 anchors (3 scales and 3 ratios), but this version generates only 3 (1 scale, 3 ratios). I'm confused about this.

@Superlee506

The reason this implementation of Mask R-CNN uses only 1 scale and 3 ratios per level for anchors is that it incorporates FPN. As stated in the FPN paper, Section 4.1, Feature Pyramid Networks for RPN:

"Because the head slides densely over all locations at all pyramid levels, it is not necessary to have multi-scale anchors on a specific level. Instead, we assign anchors of a single scale to each level."

In other words, the FPN takes care of the scale issue by virtue of having different pyramid levels, each addressing a different scale. Thus, there is no need for multiple anchor scales at each FPN level; we simply need different anchor ratios at each level.
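To make the contrast concrete, here is a hedged sketch of the single-scale-per-level scheme (the level names P2–P6 are illustrative): each level contributes 3 anchor shapes per cell, and object scale is covered by the pyramid rather than by multi-scale anchors on one map.

```python
import numpy as np

# One scale per pyramid level, three ratios per level: each level contributes
# len(ratios) = 3 anchor shapes per cell, not 3 scales * 3 ratios = 9.
scales = (8, 16, 32, 64, 128)        # RPN_ANCHOR_SCALES from the config above
ratios = np.array([0.5, 1, 2])

for level, scale in zip(("P2", "P3", "P4", "P5", "P6"), scales):
    heights = np.round(scale / np.sqrt(ratios), 1)
    widths = np.round(scale * np.sqrt(ratios), 1)
    print(level, "anchors/cell:", len(ratios),
          "(h, w):", list(zip(heights, widths)))
```

Note that for every ratio, height * width = scale², so each level's anchors all cover the same area; only the aspect ratio varies within a level.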

@FruVirus Copy that, thanks
