Vision: ValueError: All bounding boxes should have positive height and width. Found invalid box [500.728515625, 533.3333129882812, 231.10546875, 255.2083282470703] for target at index 0.

Created on 2 Oct 2020 · 15 comments · Source: pytorch/vision

I am training Detecto for custom object detection. Can anyone help me as soon as possible? I would be very grateful.
Here is the code:
from detecto import core, utils, visualize
dataset = core.Dataset('content/sample_data/newdataset/car/images/')
model = core.Model(['car'])
model.fit(dataset)

Here is the output:

ValueError Traceback (most recent call last)
in <module>()
4 model = core.Model(['car'])
5
----> 6 model.fit(dataset)

2 frames
/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
91 raise ValueError("All bounding boxes should have positive height and width."
92 " Found invalid box {} for target at index {}."
---> 93 .format(degen_bb, target_idx))
94
95 features = self.backbone(images.tensors)

ValueError: All bounding boxes should have positive height and width. Found invalid box [500.728515625, 533.3333129882812, 231.10546875, 255.2083282470703] for target at index 0.

Labels: question, object detection


All 15 comments

I guess you have a degenerate box case. The boxes should be in (xmin, ymin, xmax, ymax) format for FRCNN to work.
Your bounding boxes seem to be in exactly the opposite order (the degenerate case).

Hi,

The answer from @oke-aditya is correct. You are probably passing to the model bounding boxes in the format [xmin, ymin, width, height], while Faster R-CNN expects boxes to be in [xmin, ymin, xmax, ymax] format.

Changing this should fix the issue.

By the way, we recently added box conversion utilities to torchvision (thanks to @oke-aditya); they can be found in https://github.com/pytorch/vision/blob/a98e17e50146529cdfadb590ba063e6bbee71de2/torchvision/ops/boxes.py#L137-L156
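
For reference, here is a minimal sketch of that conversion using the linked utilities (this assumes a torchvision recent enough to ship torchvision.ops.box_convert, and boxes annotated as [xmin, ymin, width, height]):

import torch
from torchvision.ops import box_convert

# Boxes annotated as [xmin, ymin, width, height] ("xywh" format).
boxes_xywh = torch.tensor([[500.73, 533.33, 231.11, 255.21]])

# Faster R-CNN expects [xmin, ymin, xmax, ymax] ("xyxy" format).
boxes_xyxy = box_convert(boxes_xywh, in_fmt="xywh", out_fmt="xyxy")
print(boxes_xyxy)  # tensor([[500.73, 533.33, 731.84, 788.54]])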

So should I change my XML file format?

@kashf99 this question is better suited to the detecto repo, as this is part of their API: https://github.com/alankbi/detecto

Ok, thank you.

I guess you have a degenerate box case. The boxes should be in (xmin, ymin, xmax, ymax) format for FRCNN to work.
Your bounding boxes seem to be in exactly the opposite order (the degenerate case).

Yeah, thank you. It worked. But it's very slow, and I get a warning: overload of nonzero is deprecated.

Overload of nonzero is deprecated.

This has been fixed in torchvision master since https://github.com/pytorch/vision/pull/2705

Hi @fmassa. I am also getting the same error, but I passed [xmin, ymin, xmax, ymax] to the model. Can someone help me out?

Can you post details so that we can reproduce the issue?

@oke-aditya what should I share, code or abstract details?

Any code sample that can help people reproduce the error you get.

boxes.append([xmin, ymin, xmax, ymax])
boxes = torch.as_tensor(boxes, dtype=torch.float32)
These are the box coordinates I'm passing.

@MALLI7622 make sure that xmin < xmax and that ymin < ymax for all boxes
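
A quick sanity check along those lines (a sketch; boxes is assumed to be an N x 4 float tensor in [xmin, ymin, xmax, ymax] format):

import torch

def find_degenerate_boxes(boxes: torch.Tensor) -> torch.Tensor:
    # Width and height must both be strictly positive.
    widths = boxes[:, 2] - boxes[:, 0]   # xmax - xmin
    heights = boxes[:, 3] - boxes[:, 1]  # ymax - ymin
    bad = (widths <= 0) | (heights <= 0)
    return bad.nonzero(as_tuple=False).flatten()

boxes = torch.tensor([[10.0, 20.0, 50.0, 60.0],        # valid
                      [500.7, 533.3, 231.1, 255.2]])   # xmax < xmin: degenerate
print(find_degenerate_boxes(boxes))  # tensor([1])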

@fmassa I resolved that issue 4 days back, thanks for your help. Now I am getting another problem with Faster R-CNN: my model is producing the evaluation values below, and I don't know how to resolve this. I changed the class indices to start from 1 instead of 0 and increased the number of output classes by 1 accordingly. Can you help me resolve this issue?
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

When I ran prediction with this model, I didn't get anything. It predicted this:
[{'boxes': tensor([], device='cuda:0', size=(0, 4)),
'labels': tensor([], device='cuda:0', dtype=torch.int64),
'scores': tensor([], device='cuda:0')}]

@MALLI7622 this might be due to many things. I would encourage you to start with the finetuning tutorial at https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html, as you may not be training for long enough.
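
For the finetuning itself, the head replacement from that tutorial looks roughly like this (a sketch; note that label 0 is reserved for the background class, so num_classes is the number of your own classes plus one):

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a Faster R-CNN pre-trained on COCO.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# One foreground class ("car") plus the implicit background class 0.
num_classes = 2

# Replace the box predictor head with one sized for our classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)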
