In addition to the current output of the Faster R-CNN model, which only returns the detections after NMS, add the outputs from before NMS for deeper insight into the class distributions.
After training the network on a custom dataset (with good results, actually!), I found several false positives and wrongly classified objects. It would be useful to see what score the true object class received and to use the class distribution for further processing.
My idea (proposal):
Adding 3 more keys to the output dictionary.
E.g. "boxes_raw", "labels_raw" and "scores_raw" with outputs from before the NMS.
None from my side so far. I would love to hear other approaches.
Hi,
You can attach forward hooks to the model so that you can obtain the intermediate outputs that you want.
Here is an example:
import torch, torchvision
model = torchvision.models.detection.fasterrcnn_resnet50_fpn()
model.eval()
# list that will collect the hooked outputs
o = []
# register a forward hook to get the box predictor outputs before filtering and NMS
hook = model.roi_heads.box_predictor.register_forward_hook(lambda module, input, output: o.append(output))
# run a forward pass to trigger the hook
r = model([torch.rand(3, 200, 200)])
# the intermediate outputs have been appended to the list o
intermediate_output = o.pop()
# remove the hook once it is no longer needed
hook.remove()
# (class logits, box regression deltas), one row per proposal
print(intermediate_output[0].shape, intermediate_output[1].shape)
#torch.Size([1000, 91]) torch.Size([1000, 364])
and you can do your custom processing on this.
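As one possible form of that custom processing, here is a minimal sketch that turns the class logits (`intermediate_output[0]` above) into a per-proposal class distribution. The helper name `class_distribution` and the `top_k` parameter are made up for illustration, and the random tensor only stands in for real logits. As far as I know, column 0 is the background class in torchvision's Faster R-CNN, so it shows up in the distribution too.

```python
import torch
import torch.nn.functional as F


def class_distribution(class_logits: torch.Tensor, top_k: int = 3):
    """Hypothetical helper: softmax over classes plus the top-k classes per proposal."""
    # Softmax over the class dimension gives one distribution per proposal.
    probs = F.softmax(class_logits, dim=-1)
    # Highest-scoring classes and their probabilities for each proposal.
    top_scores, top_labels = probs.topk(top_k, dim=-1)
    return probs, top_scores, top_labels


# Stand-in for intermediate_output[0]: 1000 proposals, 91 classes.
logits = torch.randn(1000, 91)
probs, top_scores, top_labels = class_distribution(logits)
print(probs.shape, top_scores.shape, top_labels.shape)
```

Looking at the second and third entries of `top_labels` for a misclassified object should show how close the true class was to winning.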
I believe this is currently the best way of returning these intermediate outputs, and it allows you to get more information than what you requested (for example, the pooled features before they pass through the classifier).
I'm closing this issue, but let us know if you have further questions.