Hello Everyone,
Can you explain how to train Detectron for the object detection task only? By object detection only, I mean that I have a rectangular bounding box around each object I need to detect. I don't have masks for the objects, and I don't want masks at inference either.
All I want is to train it the way one trains object detection frameworks such as YOLO and SSD, with just a rectangular bounding box around each object.
Any suggestions will be highly appreciated.
Thank you,
Regards,
Dharma KC
You can create dummy segmentation masks. I did this and it seems to work.
https://github.com/facebookresearch/Detectron/issues/98#issuecomment-363709933
That is correct. Assuming a box is given as
box = [x, y, width, height]
then
[[box[0], box[1], box[0], box[1] + box[3], box[0] + box[2], box[1] + box[3], box[0] + box[2], box[1]]]
gives you the corresponding segmentation mask.
Works just fine in my case.
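A minimal sketch of the conversion described above, assuming the box is given as [x, y, width, height] with (x, y) the top-left corner. The function name `box_to_segmentation` is hypothetical and not part of Detectron:

```python
def box_to_segmentation(box):
    """Return a dummy COCO-style polygon (list of one flat point list)
    that traces the rectangle given as [x, y, width, height]."""
    x, y, w, h = box
    # Corner order: top-left, bottom-left, bottom-right, top-right,
    # matching the expression quoted in this thread.
    return [[x, y, x, y + h, x + w, y + h, x + w, y]]


if __name__ == "__main__":
    print(box_to_segmentation([10, 20, 30, 40]))
```

The result can be stored under the "segmentation" key of each annotation in the COCO-style JSON file.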
@kampelmuehler so the segmentation mask doesn't need to be a closed figure? Does the program just assume there is a line segment between the first and the last point?
Apparently yes; I didn't encounter any problems.
Thank you guys, I will try and let you know if it works.
Do you guys already have code to convert bounding boxes from YOLO/SSD format to COCO JSON format?
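No such script is posted at this point in the thread, so here is a rough sketch only. It assumes the common YOLO label layout of one line per object, `class x_center y_center width height`, with coordinates normalized to the image size, and it assumes COCO category ids start at 1. The function name `yolo_line_to_coco` and its parameters are hypothetical; SSD label formats vary and are not covered:

```python
def yolo_line_to_coco(line, img_w, img_h, image_id, ann_id):
    """Convert one YOLO label line into a COCO-style annotation dict."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    # YOLO values are fractions of the image size; scale to pixels.
    bw, bh = w * img_w, h * img_h
    # COCO boxes are [x, y, width, height] with (x, y) the top-left corner.
    x = xc * img_w - bw / 2.0
    y = yc * img_h - bh / 2.0
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": class_id + 1,  # assumption: category ids are 1-based
        "bbox": [x, y, bw, bh],
        "area": bw * bh,
        "iscrowd": 0,
        # Dummy rectangular polygon, as suggested earlier in this thread.
        "segmentation": [[x, y, x, y + bh, x + bw, y + bh, x + bw, y]],
    }


if __name__ == "__main__":
    print(yolo_line_to_coco("0 0.5 0.5 0.5 0.5", 100, 200, image_id=1, ann_id=1))
```

The resulting dicts would go into the "annotations" list of the JSON file, next to matching "images" and "categories" entries.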
Were you successful in training bbox detection only?
I am getting low mAP results (20-30%) when using ResNet-50-FPN-x1. (With Faster R-CNN+VGG from the Caffe implementation I get 50%.) When using ResNeXt-101-FPN-x1, the network does not output anything at test time.
In both cases the loss function decreases and the accuracy_cls value is higher than 0.9.
I didn't test for an mAP score but I did get results that successfully found the object classes that I was looking for with both of the networks that you mentioned. Been having a lot of problems with the retinanets tho.
@dhpollack Do you have more than 81 classes?
I did it with 131 classes and then combined those into 37 classes. A lot of what I am looking for is similar (fashion) and I noticed the network was not finding the subtle distinctions between these classes, but that wasn't very important to me hence consolidating the classes.
Thanks @dhpollack,
Did you use the default .yaml files? Did you use the default pre-trained weights or some special initialization like this one?
I don't understand why ResNeXt-101-FPN-x1 is not outputting anything at detection time but ResNet-50 does...
It is not exactly the same, but it's similar to the original yaml. For one, my dataset is a different size, so I changed the number of iterations / schedule and I'm using 4 GPUs so I adjusted that. I've also played with the batch size and the scale, but ultimately I think that I kept the scale at the default.
While using a different dataset, I've run into a few problems.
First, I used a dataset that only had one item/bounding box per image, so my final network would only find one item. After switching to another dataset with multiple boxes per image, the network would find multiple objects in the test images.
Secondly, I ran into a bunch of random segfaults that I frankly never fully resolved. It had to do with the scaling of images and/or the aspect ratio. I "solved" that by limiting the aspect ratio of the input images to square-ish images (1 / 1.8 to 1.8). This is not ideal because I lost a bit of training data, but it did get rid of the segfaults.
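The aspect-ratio workaround above could be sketched roughly as follows. This is not the code used in the comment; the function name `has_squareish_aspect` and the `max_ratio` parameter are hypothetical, and only the 1 / 1.8 to 1.8 range comes from the text:

```python
def has_squareish_aspect(width, height, max_ratio=1.8):
    """Return True if width / height lies within [1 / max_ratio, max_ratio]."""
    if width <= 0 or height <= 0:
        return False
    ratio = width / float(height)
    return 1.0 / max_ratio <= ratio <= max_ratio


if __name__ == "__main__":
    # Hypothetical (file name, width, height) records for illustration.
    images = [("a.jpg", 1000, 1000), ("b.jpg", 2000, 1000), ("c.jpg", 1000, 1500)]
    kept = [name for name, w, h in images if has_squareish_aspect(w, h)]
    print(kept)
```

Images that fail the check would be dropped from the "images" list, together with their annotations, before training.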
There is a Python script that can translate the original PASCAL VOC XML to JSON format: https://github.com/CivilNet/Gemfield/blob/master/src/python/pascal_voc_xml2json/pascal_voc_xml2json.py
I am working on Mask R-CNN and training on my own images. I have 1 + 1 classes. In my JSON I have two shapes: 'rect' and 'polygon'. Training stopped working when I also added 'rect' shapes to the same dataset. Please let me know how to handle the different shapes. My program only works with polygon shapes.