Detectron: Training for Detection only with Rectangular bounding box and without polygonal mask

Created on 14 Feb 2018 · 14 comments · Source: facebookresearch/Detectron

Hello Everyone,

Can you explain how to train Detectron for the object detection task only? By object detection only, I mean that I have rectangular bounding boxes around the objects I need to detect. I don't have masks for the objects, and I don't want masks at inference either.

All I want is to train it the way object detection frameworks such as YOLO and SSD are trained, with just a rectangular bounding box around each object.

Any suggestions will be highly appreciated.
Thank you,
Regards,
Dharma KC

community help wanted

ghost

All 14 comments

You can create dummy segmentation masks. I did this and it seems to work.

https://github.com/facebookresearch/Detectron/issues/98#issuecomment-363709933

dhpollack on 14 Feb 2018

That is correct. Assuming a box is given as

box = [x, y, width, height]

then

[[box[0], box[1], box[0], box[1] + box[3], box[0] + box[2], box[1] + box[3], box[0] + box[2], box[1]]]

gives you the corresponding segmentation mask.

Works just fine in my case.
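A minimal sketch of the conversion described above, assuming COCO-style annotation dicts with a `bbox` key in `[x, y, width, height]` order. The helper names `box_to_polygon` and `add_dummy_masks` are hypothetical and are not part of Detectron.

```python
def box_to_polygon(box):
    """Return a rectangular polygon for a box given as [x, y, width, height].

    The result is a list holding one flat polygon [x1, y1, x2, y2, ...],
    with the same corner order as in the comment above.
    """
    x, y, w, h = box
    return [[x, y, x, y + h, x + w, y + h, x + w, y]]


def add_dummy_masks(annotations):
    """Fill in a dummy rectangular mask wherever an annotation has none.

    `annotations` is assumed to be a list of COCO-style dicts; entries that
    already carry a non-empty "segmentation" are left untouched.
    """
    for ann in annotations:
        if not ann.get("segmentation"):
            ann["segmentation"] = box_to_polygon(ann["bbox"])
    return annotations
```

For example, `box_to_polygon([10, 20, 30, 40])` lists the four corners of the box starting at the top-left one, and the polygon is left open (the first point is not repeated at the end).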

kampelmuehler on 14 Feb 2018

@kampelmuehler so the segmentation mask doesn't need to be a closed figure? Does the program just assume there is a line segment between the first and the last point?

dhpollack on 14 Feb 2018

Apparently yes; I didn't encounter any problems.

kampelmuehler on 14 Feb 2018

Thank you guys, I will try and let you know if it works.

ghost on 14 Feb 2018

Does anyone already have code to convert bounding boxes from the YOLO/SSD format to the COCO JSON format?
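A rough sketch of one such conversion, assuming Darknet-style YOLO label lines of the form `class_id x_center y_center width height` with coordinates normalized to the image size. SSD pipelines use varying formats and are not covered here. The function name `yolo_line_to_coco` and the `class_id + 1` category mapping are assumptions for illustration only.

```python
def yolo_line_to_coco(line, img_width, img_height):
    """Convert one YOLO label line to a COCO-style annotation fragment.

    Assumed input: "class_id x_center y_center width height", normalized to [0, 1].
    Output bbox is [x, y, width, height] in pixels, with (x, y) the top-left corner.
    """
    class_id, xc, yc, w, h = line.split()
    box_w = float(w) * img_width
    box_h = float(h) * img_height
    x = float(xc) * img_width - box_w / 2.0
    y = float(yc) * img_height - box_h / 2.0
    return {
        # Assumption: category ids start at 1, YOLO class ids at 0.
        "category_id": int(class_id) + 1,
        "bbox": [x, y, box_w, box_h],
        "area": box_w * box_h,
        "iscrowd": 0,
    }
```

The returned dict still needs `id` and `image_id` fields, plus a `segmentation` entry if masks are required, before it can go into a COCO JSON file.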

ghost on 15 Feb 2018

Were you successful in training bbox detection only?

I am obtaining low mAP results (20-30%) when using ResNet-50-FPN-x1. (With Faster R-CNN + VGG from the Caffe implementation I obtain 50%.) When using ResNeXt-101-FPN-x1, the network does not output anything at test time.
In both cases the loss decreases and the accuracy_cls value is higher than 0.9.

dmasmont on 5 Mar 2018

I didn't test for an mAP score, but I did get results that successfully found the object classes I was looking for with both of the networks you mentioned. I have been having a lot of problems with the RetinaNets, though.

dhpollack on 5 Mar 2018

@dhpollack Do you have more than 81 classes?

dmasmont on 5 Mar 2018

I did it with 131 classes and then combined those into 37 classes. A lot of what I am looking for is similar (fashion) and I noticed the network was not finding the subtle distinctions between these classes, but that wasn't very important to me hence consolidating the classes.

dhpollack on 5 Mar 2018

Thanks @dhpollack,
Did you use the default .yaml files? Did you use the default pre-trained weights or some special initialization like this one?

I don't understand why ResNeXt-101-FPN-x1 is not outputting anything at detection time but ResNet-50 does...

dmasmont on 5 Mar 2018

It is not exactly the same, but it's similar to the original yaml. For one, my dataset is a different size, so I changed the number of iterations / schedule and I'm using 4 GPUs so I adjusted that. I've also played with the batch size and the scale, but ultimately I think that I kept the scale at the default.

While using a different dataset, I've run into a few problems.

First, I used a dataset that only had one item/bounding box per image, so my final network would only find one item. Using another dataset with multiple boxes per image, the network would find multiple objects in the test images.

Secondly, I ran into a bunch of random segfaults that I frankly never fully resolved. It had to do with the scaling of images and/or the aspect ratio. I "solved" that by limiting the aspect ratio of the input images to square-ish images (1 / 1.8 to 1.8). This is not ideal because I lost a bit of training data, but it did get rid of the segfaults.
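A minimal sketch of the aspect-ratio filter described above, assuming a COCO-style dict with `images` entries that carry `id`, `width`, and `height`, and `annotations` entries that carry `image_id`. The function names are hypothetical, and the 1.8 limit is the value quoted in the comment.

```python
def is_squarish(width, height, max_ratio=1.8):
    """True if width / height lies between 1 / max_ratio and max_ratio."""
    if width <= 0 or height <= 0:
        return False
    ratio = width / height
    return 1.0 / max_ratio <= ratio <= max_ratio


def filter_by_aspect_ratio(coco, max_ratio=1.8):
    """Return a shallow copy of `coco` without images outside the ratio range.

    Annotations that belong to dropped images are removed as well.
    """
    kept_images = [
        img for img in coco["images"]
        if is_squarish(img["width"], img["height"], max_ratio)
    ]
    kept_ids = {img["id"] for img in kept_images}
    filtered = dict(coco)
    filtered["images"] = kept_images
    filtered["annotations"] = [
        ann for ann in coco["annotations"] if ann["image_id"] in kept_ids
    ]
    return filtered
```

Filtering the annotation file rather than the image folder keeps the original data intact, so the limit can be relaxed later without rebuilding the dataset.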

dhpollack on 5 Mar 2018

There is a Python script that can translate the original PASCAL VOC XML to the JSON format: https://github.com/CivilNet/Gemfield/blob/master/src/python/pascal_voc_xml2json/pascal_voc_xml2json.py

gemfield on 20 Mar 2018

I am working with Mask R-CNN and training on my own images. I have 1 + 1 classes. My JSON contains two shape types: 'rect' and 'polygon'. Training stopped working when I added 'rect' shapes to the same dataset; my program only works with polygon shapes. Please let me know how to handle the different shapes.

daoud on 9 Aug 2018
