Py-faster-rcnn: Why does it need a reshape layer in RPN's cls layer?

Created on 11 Aug 2016 · 1 Comment · Source: rbgirshick/py-faster-rcnn

I am trying to understand the flow of data through the RPN and am confused about one point.
My question is based on VGG (16 layers) and the Pascal VOC 2007 dataset with approximate joint training (end-to-end):

The last shareable layer in VGG has an output of 512 * 14 * 14 (assuming an input size of 3 * 224 * 224). The paper says "...each sliding window is mapped to a lower-dimensional feature (256-d for ZF and 512-d for VGG, with ReLU [33] following)." According to the train.prototxt file (see here), this is implemented as another convolution layer (kernel size 3, stride 1, pad 1, num_output 512), so its output has the same shape as its input (512 * 14 * 14). The result then enters the cls and reg layers. The cls layer has an output of dimension 18 * 14 * 14. What confuses me is that the data then gets reshaped (correct me if I am wrong) into 2 * 126 * 14, which looks odd to me. Can anyone explain why this is done? What does each dimension represent, especially the 126 * 14 part?
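For reference, the shapes quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: `conv_out` is a hypothetical helper, not part of py-faster-rcnn, and the 9 anchors per location are assumed from the paper's default of 3 scales and 3 aspect ratios.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution layer (integer division)."""
    return (size + 2 * pad - kernel) // stride + 1

feat = 14                      # spatial size of the last shared VGG layer (224 input)
rpn_conv = conv_out(feat, kernel=3, stride=1, pad=1)   # 3x3 RPN conv keeps the size

anchors = 9                    # assumed: 3 scales x 3 aspect ratios per location
cls_channels = 2 * anchors     # one bg and one fg score per anchor

cls_shape = (cls_channels, rpn_conv, rpn_conv)
reshaped = (2, cls_channels // 2 * rpn_conv, rpn_conv)

print(cls_shape)   # shape of the cls layer output
print(reshaped)    # shape after the reshape layer
```

Under these assumptions the 126 is simply 9 * 14: the anchor axis folded into the height axis.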

Most helpful comment

The scores blob (18x14x14) is temporarily reshaped to (2x126x14) so that softmax can be computed over the two classes. The 18 channels hold a bg and an fg score for each of the 9 anchors, and softmax normalizes along the channel axis, so reshaping to 2 channels makes the fg and bg scores add up to 1 for each of the 9*14*14 (= 126*14) anchors. The following layer, in the RoI proposal part of the network, reverts this reshaping:

layer {
  name: 'rpn_cls_prob_reshape'
  type: 'Reshape'
  bottom: 'rpn_cls_prob'
  top: 'rpn_cls_prob_reshape'
  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}
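To make the pairing concrete, here is a minimal pure-Python sketch of what the reshape plus softmax amounts to. It is not code from py-faster-rcnn: `reshape_index` and `softmax_pairs` are hypothetical helpers, the batch dimension is ignored, and the blob is assumed to be stored in row-major (c, h, w) order with the first 9 channels holding bg scores and the last 9 holding fg scores.

```python
import math

N_ANCHORS, H, W = 9, 14, 14      # 9 anchors per location, 14x14 feature map
C = 2 * N_ANCHORS                # 18 score channels

def reshape_index(c, h, w):
    """Map index (c, h, w) in the 18x14x14 blob to its index in the 2x126x14 view.

    A reshape does not move data; the flat memory offset stays the same
    and only the index arithmetic changes.
    """
    flat = (c * H + h) * W + w
    rows = N_ANCHORS * H         # 126
    return flat // (rows * W), (flat // W) % rows, flat % W

def softmax_pairs(scores):
    """Two-way softmax over the channel axis of the 2x126x14 view.

    `scores` is a flat list of C*H*W values in (c, h, w) order.
    Returns probabilities in the same flat layout.
    """
    half = N_ANCHORS * H * W     # offset between an anchor's bg and fg score
    probs = [0.0] * len(scores)
    for i in range(half):
        a, b = scores[i], scores[i + half]
        m = max(a, b)            # subtract the max for numerical stability
        ea, eb = math.exp(a - m), math.exp(b - m)
        probs[i] = ea / (ea + eb)
        probs[i + half] = eb / (ea + eb)
    return probs

# Dummy scores, only to exercise the functions.
scores = [float(i % 7) for i in range(C * H * W)]
probs = softmax_pairs(scores)
```

In this sketch, channel c and channel c + 9 of the original blob land at the same (row, column) position in channels 0 and 1 of the reshaped view, which is why a softmax across those two channels normalizes each anchor's bg/fg pair.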
