I found the model was rewritten. Is there any way to do batched inference, i.e. to run inference on 4 pictures per GPU at a time?
The API is currently not available, though it has been implemented.
For single-GPU inference, you can modify the argument imgs_per_gpu and comment out the corresponding assert statements.
For multi-GPU inference, more modification is needed.
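For concreteness, a minimal sketch of the single-GPU route, assuming an mmdetection v1-style config (the exact location of the asserts, e.g. in mmdet/models/detectors/base.py, may vary by version):

```python
# Sketch only: raise the test-time batch size in the config; the
# single-image assert statements (e.g. in mmdet/models/detectors/base.py)
# then need to be commented out for the forward pass to accept a batch.
data = dict(
    imgs_per_gpu=4,     # 4 images per GPU per forward pass
    workers_per_gpu=2,  # dataloader workers, unrelated to batching
)
```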
Thanks for your reply. I have checked the code and commented out the assert line in base.py. However, the output of the following line is always for the first image, even when data['img'] has shape [4, 3, xxx, xxx]. How can I get the correct results here?

```python
result = model(return_loss=False, rescale=not show, **data)
```
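To make the symptom concrete (the shape values below are illustrative, not measured):

```python
# The input tensor really holds a batch...
print(data['img'][0].shape)  # e.g. torch.Size([4, 3, 800, 1216])
result = model(return_loss=False, rescale=not show, **data)
# ...but `result` is identical to what single-image inference on the
# first image returns; the other three images are silently ignored.
```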
Yes, I found the same problem. It doesn't support batch inference even in single-GPU inference; the result is always for the first image even when data['img'] is [4, 3, xxx, xxx]. @hellock @robeson1010
What is the current restriction on inference/training with a batch of different images?
Is it due to the image preprocessing, i.e. that the batch on each GPU must be of the same tensor size?
It looks like the scaling/padding may not ensure the same image tensor size within a batch, because the img_transform is set to pad to a multiple of cfg.data.size_divisor rather than to a particular size.
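To illustrate, padding each image independently only rounds its own size up to the divisor, so two images need not end up with a common padded shape (the sizes below are made up; mmcv.impad_to_multiple is the mmcv helper):

```python
import numpy as np
import mmcv

size_divisor = 32
a = np.zeros((427, 640, 3), dtype=np.uint8)
b = np.zeros((480, 600, 3), dtype=np.uint8)
# Each image is padded independently to the next multiple of 32...
pa = mmcv.impad_to_multiple(a, size_divisor)  # (448, 640, 3)
pb = mmcv.impad_to_multiple(b, size_divisor)  # (480, 608, 3)
# ...so the padded tensors still cannot be stacked into one batch.
```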
BTW, besides the suggestions above, I made some modifications to run inference on a batch of images of the same size.
In simple_test(), I saw that the rois are converted to bboxes with the image index in the 0th dim.
However, the computed det_bboxes contain the detected bboxes and scores but no such image index, because delta2bbox() drops the image indexes and the following multiclass_nms() does not care about them.
So the returned results are batch-agnostic, which is definitely unexpected.
Therefore, this part is not done yet.
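If one wanted to finish this, one option is to exploit the fact that delta2bbox() produces one output row per input roi, so rois[:, 0] still aligns with the decoded boxes before NMS, and NMS can be run per image. A rough sketch (batch_nms_by_image is a hypothetical helper, not existing mmdetection API):

```python
from mmdet.core import multiclass_nms

def batch_nms_by_image(rois, bboxes, scores, num_imgs,
                       score_thr, nms_cfg, max_per_img):
    # rois[:, 0] holds the image index that delta2bbox() later discards.
    img_inds = rois[:, 0].long()
    results = []
    for i in range(num_imgs):
        keep = img_inds == i
        det_bboxes, det_labels = multiclass_nms(
            bboxes[keep], scores[keep], score_thr, nms_cfg, max_per_img)
        results.append((det_bboxes, det_labels))
    return results
```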
```python
import mmcv
import torch

# in inference.py, to be called with a list of images (all of the same size)
def _inference_batch(model, imgs, img_transform, device):
    batch = []
    for img in imgs:
        img = mmcv.imread(img)
        data = _prepare_data(img, img_transform, model.cfg, device)
        batch.append(data['img'][0])
    # concatenation along dim 0 works only because every image here is
    # preprocessed to the same padded shape
    batch = torch.cat(batch)
    with torch.no_grad():
        # data['img_meta'] is the meta of the last image; reusing it for the
        # whole batch is valid only because all images share the same size
        result = model(return_loss=False, rescale=True,
                       img=[batch], img_meta=data['img_meta'])
    return result
```
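For reference, this is roughly how the helper above might be invoked; the config/checkpoint paths are placeholders and the import paths follow the old mmdetection v1 layout, so treat this as a sketch:

```python
from mmdet.apis import init_detector
from mmdet.datasets.transforms import ImageTransform

model = init_detector('configs/faster_rcnn_r50_fpn_1x.py',
                      'checkpoints/faster_rcnn_r50_fpn_1x.pth',
                      device='cuda:0')
img_transform = ImageTransform(
    size_divisor=model.cfg.data.test.size_divisor, **model.cfg.img_norm_cfg)
# All four images must have the same size for this helper to work.
result = _inference_batch(model, ['1.jpg', '2.jpg', '3.jpg', '4.jpg'],
                          img_transform, 'cuda:0')
```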
```python
# in test_mixins.py
def simple_test_rpn(self, x, img_meta, rpn_test_cfg):
    rpn_outs = self.rpn_head(x)
    # original: proposal_inputs = rpn_outs + (img_meta, rpn_test_cfg)
    # x[0] is the largest feature map, so len(x[0]) equals the batch size
    proposal_inputs = rpn_outs + (img_meta * len(x[0]), rpn_test_cfg)
    proposal_list = self.rpn_head.get_bboxes(*proposal_inputs)
    return proposal_list
```
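Note that img_meta * len(x[0]) simply repeats the single image meta for every image in the batch; this shortcut only holds because all images in the batch share the same shape and scale factor.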
Update
After reading through tools/test.py and some detector models, it seems the batch is assumed to contain data augmentations of a single image.
aug_test() is implemented only in some models that subclass TwoStageDetector; it calls aug_test_rpn() and aug_test_bboxes(), which process each augmented image in a loop, not necessarily in data parallel.
All the boxes are then merged and recovered for NMS, given that they all come from augmentations of the same image.
However, what is being asked for here is inference on a batch of different images.
Does such batch inference still work with the current design with minimal modification, or does it require some new API like batch_test()?
How about padding the input images to the max size in the batch, as torchvision does, rather than just to a multiple of the size divisor individually (see the sketch below)?
Any further suggestions?
Also, correct me if necessary.
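To make the padding suggestion concrete, a pad-to-max collate might look like this (pad_collate is a hypothetical name; plain PyTorch only):

```python
import torch
import torch.nn.functional as F

def pad_collate(tensors, size_divisor=32):
    """Pad a list of (3, H, W) tensors to the batch max, rounded up to
    size_divisor, so they can be stacked as torchvision does."""
    max_h = max(t.shape[1] for t in tensors)
    max_w = max(t.shape[2] for t in tensors)
    max_h = (max_h + size_divisor - 1) // size_divisor * size_divisor
    max_w = (max_w + size_divisor - 1) // size_divisor * size_divisor
    # pad = (left, right, top, bottom) for the last two dims
    padded = [F.pad(t, (0, max_w - t.shape[2], 0, max_h - t.shape[1]))
              for t in tensors]
    return torch.stack(padded)  # (N, 3, max_h, max_w)
```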