Hello guys,
Thanks for your amazing work.
I have a question concerning the FPN backbone. It requires strides to be log2-contiguous; as far as I understand, this means that each input feature map of the FPN must be downsampled by a factor of 2 relative to the previous feature map.
As expected, running this bit of code:
import torch
from detectron2.modeling.backbone.fpn import build_resnet_fpn_backbone

inputs = torch.randn(10, 3, 500, 500)

# Minimal stand-in for the input shape object expected by the builder
class shape:
    def __init__(self, channels, height, width):
        self.channels = channels
        self.height = height
        self.width = width

# cfg_resnet is loaded from the config file shown at the end of this post
model_resnet_fpn = build_resnet_fpn_backbone(cfg_resnet, shape(3, 500, 500))
P = model_resnet_fpn(inputs)
throws the error:
RuntimeError Traceback (most recent call last)
<ipython-input-19-bfac22f8e8d0> in <module>
8
9 model_resnet_fpn = build_resnet_fpn_backbone(cfg_resnet, shape(3, 500, 500)).cuda()
---> 10 P = model_resnet_fpn(inputs)
~/venv/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
539 result = self._slow_forward(*input, **kwargs)
540 else:
--> 541 result = self.forward(*input, **kwargs)
542 for hook in self._forward_hooks.values():
543 hook_result = hook(self, input, result)
~/git_projects/detectron2/detectron2/modeling/backbone/fpn.py in forward(self, x)
130 top_down_features = F.interpolate(prev_features, scale_factor=2, mode="nearest")
131 lateral_features = lateral_conv(features)
--> 132 prev_features = lateral_features + top_down_features
133 if self._fuse_type == "avg":
134 prev_features /= 2
RuntimeError: The size of tensor a (63) must match the size of tensor b (64) at non-singleton dimension 3
This happens because the bottom-up feature maps have the following shapes:
res2 torch.Size([3, 256, 125, 125])
res3 torch.Size([3, 512, 63, 63])
res4 torch.Size([3, 1024, 32, 32])
res5 torch.Size([3, 2048, 16, 16])
Hence, in the FPN, the res4 map is upsampled by 2 from 32 to 64, which cannot be added to the 63-wide res3 map, and the error is thrown.
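To make the arithmetic explicit, here is a small sketch in plain Python. It assumes every stride-2 stage of the ResNet maps a size n to ceil(n / 2), which is consistent with the shapes listed above but is my assumption, not something taken from the detectron2 source. The helper names `bottom_up_sizes` and `top_down_mismatches` are made up for illustration.

```python
import math

def bottom_up_sizes(size, levels=("res2", "res3", "res4", "res5")):
    """Spatial size of each bottom-up feature map for a square input.

    Assumption: each stride-2 stage maps n to ceil(n / 2).
    """
    # The stem applies two stride-2 stages, so res2 sits at stride 4.
    current = math.ceil(math.ceil(size / 2) / 2)
    sizes = {}
    for name in levels:
        sizes[name] = current
        current = math.ceil(current / 2)
    return sizes

def top_down_mismatches(sizes):
    """List the levels whose size differs from 2x the next coarser level."""
    names = list(sizes)
    bad = []
    for fine, coarse in zip(names[:-1], names[1:]):
        upsampled = 2 * sizes[coarse]  # what a scale_factor=2 upsampling yields
        if upsampled != sizes[fine]:
            bad.append((fine, sizes[fine], upsampled))
    return bad

print(bottom_up_sizes(500))
print(top_down_mismatches(bottom_up_sizes(500)))
print(top_down_mismatches(bottom_up_sizes(512)))
```

Under that assumption, a 500-pixel input gives mismatches at res3 (63 vs 64) and res2 (125 vs 126), while a 512-pixel input gives none, because 512 is divisible by the coarsest stride of 32.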
However, when using the Detectron2 pipeline to train with odd-sized images, it runs smoothly and does not throw this error.
My question is then: how does Detectron2 handle this case, and how can I make it work using build_resnet_fpn_backbone as above?
My config file is:
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN" # others are not fully implemented yet
  BACKBONE:
    NAME: "build_resnet_fpn_backbone" # feature pyramid backbone
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  MASK_ON: False
  RESNETS:
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
    DEPTH: 50
  FPN:
    IN_FEATURES: ["res2", "res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: [[32], [64], [128], [256], [512]] # One size for each input feature map
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]] # Three aspect ratios (same for all input feature maps)
  RPN:
    IN_FEATURES: ["p2", "p3", "p4", "p5", "p6"]
  ROI_HEADS:
    NAME: "StandardROIHeads"
    IN_FEATURES: ["p2", "p3", "p4", "p5"]
Thank you
Images are padded inside ImageList.from_tensors, used by every model, e.g. here https://github.com/facebookresearch/detectron2/blob/4bd27960e1c56eb8950b04f24c60b845d5840d8a/detectron2/modeling/meta_arch/rcnn.py#L185
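For anyone calling the backbone directly, a rough way to mimic that padding by hand is sketched below in plain PyTorch. The helper `pad_to_divisible` is hypothetical (it is not a detectron2 function), and the divisor of 32 is an assumption based on res5 being the coarsest FPN input; in the linked code the value comes from the backbone's `size_divisibility`. Padding goes on the right and bottom so pixel coordinates of the original image stay unchanged.

```python
import torch
import torch.nn.functional as F

def pad_to_divisible(images, divisor=32, pad_value=0.0):
    """Pad an (N, C, H, W) batch on the right/bottom so that H and W
    become multiples of `divisor`. Hypothetical helper, for illustration."""
    height, width = images.shape[-2:]
    pad_h = (divisor - height % divisor) % divisor
    pad_w = (divisor - width % divisor) % divisor
    # F.pad takes (left, right, top, bottom) for the last two dimensions.
    return F.pad(images, (0, pad_w, 0, pad_h), value=pad_value)

inputs = torch.randn(2, 3, 500, 500)
padded = pad_to_divisible(inputs)
print(padded.shape)  # torch.Size([2, 3, 512, 512])
```

Passing `padded` instead of `inputs` to the backbone should then give feature maps whose sizes halve exactly at each level, so the top-down additions line up.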
This needs to be in the tutorial "Partially execute a model"!