Hi, I'm not a deep learning expert, so I apologize if this is a trivial question.
I'm a bit confused by the
_C.MODEL.ROI_BOX_HEAD.POOLER_SCALES = (0.25, 0.125, 0.0625, 0.03125)
_C.MODEL.ROI_BOX_HEAD.POOLER_SAMPLING_RATIO = 2
parameters in the yacs config files. I looked at the Pooler and the relevant RoIAlign CUDA code, but I'm still not sure how these values are computed or what they mean. Could somebody please explain them? Thanks.
No need to apologize, there is no trivial question.
These scales correspond to the reduction in spatial resolution caused by the backbone's strides. By the way, a good grasp of the ResNet and ResNeXt architectures will help you follow this explanation.
For instance, suppose you found an RoI with coordinates [0, 0, 64, 64] in the input image, and suppose you want to pool its features from all of the backbone's levels (here, the backbone is a ResNet or ResNeXt architecture).
Since there is a stride of 2 in the conv1 layer and another stride of 2 at the end of the first block, the resulting feature map is 4x smaller than the original image, hence a scale of 0.25. And since there is a stride of 2 between consecutive convolution blocks of the backbone, the scale is divided by 2 at each subsequent level.
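As an illustrative sketch (not the library's actual code), the scales can be derived directly from the cumulative strides of the backbone levels:

```python
# Hypothetical sketch: deriving POOLER_SCALES from backbone strides.
# A ResNet/ResNeXt FPN backbone has cumulative strides of 4, 8, 16, 32
# at its four feature levels (conv1 stride 2, plus a stride of 2 per block).
cumulative_strides = [4, 8, 16, 32]

# Each pooler scale is simply the inverse of the cumulative stride.
pooler_scales = tuple(1.0 / s for s in cumulative_strides)
print(pooler_scales)  # (0.25, 0.125, 0.0625, 0.03125)
```

This reproduces exactly the tuple set in _C.MODEL.ROI_BOX_HEAD.POOLER_SCALES.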
Hence, the coordinates of your RoI will be:
- [0, 0, 16, 16] in the first-level feature map
- [0, 0, 8, 8] in the second-level feature map
- [0, 0, 4, 4] in the third-level feature map
- [0, 0, 2, 2] in the fourth-level feature map

The sampling_ratio parameter determines how many sampling points per output bin (per spatial dimension) are used in the bilinear interpolation of the RoIAlign algorithm.
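The per-level coordinate mapping above can be sketched as follows (an illustrative snippet, not the Pooler's actual implementation; variable names are my own):

```python
# Hypothetical sketch: mapping an image-space RoI onto each feature level,
# and counting RoIAlign bilinear sampling points per output bin.
roi = [0, 0, 64, 64]  # [x1, y1, x2, y2] in input-image coordinates
scales = (0.25, 0.125, 0.0625, 0.03125)  # POOLER_SCALES
sampling_ratio = 2  # POOLER_SAMPLING_RATIO

# Multiply each coordinate by the level's scale to get feature-map coordinates.
for level, scale in enumerate(scales, start=1):
    scaled_roi = [coord * scale for coord in roi]
    print(f"level {level}: {scaled_roi}")
# level 1: [0.0, 0.0, 16.0, 16.0]
# level 2: [0.0, 0.0, 8.0, 8.0]
# level 3: [0.0, 0.0, 4.0, 4.0]
# level 4: [0.0, 0.0, 2.0, 2.0]

# With sampling_ratio = 2, each output bin is sampled on a 2x2 grid,
# so 4 bilinearly interpolated values are averaged per bin.
points_per_bin = sampling_ratio * sampling_ratio
print(points_per_bin)  # 4
```

Note that RoIAlign keeps the fractional coordinates (no rounding), which is precisely what distinguishes it from the older RoIPool quantization.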
Thank you very much @LeviViana . That is a very good explanation!
Thanks for your great explanation @LeviViana !
thanks