Vision: Adding more outputs to Faster R-CNN

Created on 22 Jul 2020 · 1 Comment · Source: pytorch/vision

🚀 Feature

In addition to the current output of the Faster R-CNN model, which only returns the detections remaining after NMS, expose the outputs from before NMS to give deeper insight into the class distributions.

Motivation

After training the network on a custom dataset (with good results, actually!), I found several false positives and misclassified objects. It would be useful to get deeper insight into the score assigned to the true object class and to use the full class distribution for further processing.

Pitch

My idea (proposal):
Add three more keys to the output dictionary, e.g. "boxes_raw", "labels_raw" and "scores_raw", containing the outputs from before NMS (a rough sketch follows the list below).

  • boxes_raw [N, num_classes, 4]
  • labels_raw [N, num_classes]
  • scores_raw [N, num_classes]
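
As a rough illustration, the extended per-image output dictionary could look like the sketch below. The *_raw keys and shapes are the proposal above, not an existing torchvision API; N is the number of proposals kept before NMS, and the tensor values are placeholders.

import torch

N, num_classes = 1000, 91  # e.g. default proposal count and COCO class count

detection = {
    # existing post-NMS outputs (length depends on how many detections survive)
    "boxes":  torch.zeros(10, 4),
    "labels": torch.zeros(10, dtype=torch.int64),
    "scores": torch.zeros(10),
    # proposed pre-NMS outputs (hypothetical keys)
    "boxes_raw":  torch.zeros(N, num_classes, 4),
    "labels_raw": torch.zeros(N, num_classes, dtype=torch.int64),
    "scores_raw": torch.zeros(N, num_classes),
}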

Alternatives

None from my side so far. I would love to hear other approaches.

Labels: feature request, models, object detection

Most helpful comment

Hi,

You can attach forward hooks to the model so that you can obtain the intermediate outputs that you want.

Here is an example:

import torch, torchvision
model = torchvision.models.detection.fasterrcnn_resnet50_fpn()
model.eval()

# list where the intermediate outputs will be stored
o = []
# register a forward hook on the box predictor to capture its outputs
# before score filtering and NMS are applied
hook = model.roi_heads.box_predictor.register_forward_hook(
    lambda module, input, output: o.append(output))

# run a forward pass so the hook fires
r = model([torch.rand(3, 200, 200)])

# the hook appends the (class_logits, box_regression) tuple to the list o
intermediate_output = o.pop()

# inspect the shapes: [num_proposals, num_classes] and [num_proposals, num_classes * 4]
print(intermediate_output[0].shape, intermediate_output[1].shape)
# torch.Size([1000, 91]) torch.Size([1000, 364])

and you can do your custom processing on this.
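
For instance, here is a minimal sketch of how that tuple could be turned into the per-proposal class distributions asked for in the pitch. It assumes the snippet above has already run and that intermediate_output holds the (class_logits, box_regression) pair from the box predictor; note that the regression outputs are raw box deltas, not decoded box coordinates.

import torch.nn.functional as F

class_logits, box_regression = intermediate_output
# class distribution over all 91 classes (background included) for each proposal,
# roughly the "scores_raw" from the pitch
scores_raw = F.softmax(class_logits, dim=-1)                          # [1000, 91]
# one 4-vector of regression deltas per class per proposal
deltas_raw = box_regression.reshape(box_regression.shape[0], -1, 4)   # [1000, 91, 4]
print(scores_raw.shape, deltas_raw.shape)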

I believe this is currently the best way of getting these intermediate outputs, and it allows you to retrieve more information than what you requested (for example, the pooled features before they pass through the classifier).

I'm closing this issue, but let us know if you have further questions.
