Detectron2: How to obtain the Bounding Box Co-ordinates of any predicted Object in the Image

Created on 2 Jun 2020 · 14 Comments · Source: facebookresearch/detectron2

Hello all,
I would like to get the co-ordinates of the bounding box of a particular predicted object in an image.
For example, in the image linked below, Detectron2 has detected several different objects, such as cyclists, bottles, persons, etc.
Detectron2 image at source

_What output I am expecting_

I would like to get the co-ordinates of the bounding boxes of the 2 water bottles fixed on the bicycle frame.
Maybe store them in a text file to inspect later, or print them to understand which bounding box co-ordinates correspond to which object.
As there are many objects in a single image, I would like to print the list of detected objects along with their bounding box co-ordinates.

Thank you in advance.

deeplearner93

Most helpful comment

@deeplearner93
Hi. This is just an example.
Detectron2 has the file /detectron2/demo/predictor.py.
That file is used by the file /detectron2/demo/demo.py.
We will run the file /detectron2/demo/demo.py to do the test.
https://github.com/facebookresearch/detectron2/tree/master/demo

PART1
STEP1.
Open the file /detectron2/demo/predictor.py

STEP2.
Edit the function run_on_image(self, image) in the following way.
The last statement in the function run_on_image is:
return predictions, vis_output
Before that return statement, add the following print statements:

print(instances)
print(instances.pred_boxes)
print(instances.pred_boxes[0])

OUTPUT AND EXPLANATION
I got these outputs.

A) OUTPUT OF print(instances)
Instances(num_instances=4, image_height=360, image_width=640, fields=[pred_boxes, scores, pred_classes, pred_masks])

Explanation: this output tells me that 4 boxes were detected.

B) OUTPUT OF print(instances.pred_boxes)
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050],
[382.5501, 14.9712, 635.7133, 231.8446],
[467.1654, 66.3414, 611.7201, 226.0997],
[ 22.4782, 3.7928, 428.1484, 254.6716]]))

Explanation: this output gives the coordinates of the detected boxes, one row per box, in the order (x1, y1, x2, y2).
In particular, the first box (instances.pred_boxes[0]) has its top-left point at (x,y)=(289.3555, 17.8171) and its bottom-right point at (x,y)=(451.1482, 347.6050).

C) OUTPUT OF print(instances.pred_boxes[0])
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050]]))
Explanation: this command prints only the coordinates of the first box (instances.pred_boxes[0]).
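
To make the (x1, y1, x2, y2) layout concrete, here is a small sketch in plain Python. It uses the four rows printed above as a nested list; with detectron2 installed, the same list should be obtainable as instances.pred_boxes.tensor.tolist(). The helper name describe_box is only an illustration, not part of detectron2.

```python
# Rows copied from the printed Boxes tensor above. With detectron2 installed,
# the same nested list should come from instances.pred_boxes.tensor.tolist().
rows = [
    [289.3555, 17.8171, 451.1482, 347.6050],
    [382.5501, 14.9712, 635.7133, 231.8446],
    [467.1654, 66.3414, 611.7201, 226.0997],
    [22.4782, 3.7928, 428.1484, 254.6716],
]


def describe_box(row):
    """Split one (x1, y1, x2, y2) row into named corners plus width and height.

    describe_box is a hypothetical helper written for this example.
    """
    x1, y1, x2, y2 = row
    return {
        "top_left": (x1, y1),
        "bottom_right": (x2, y2),
        "width": x2 - x1,
        "height": y2 - y1,
    }


described = [describe_box(row) for row in rows]
for index, box in enumerate(described):
    print(index, box)
```

Each printed line shows one box with its two corner points and its size in pixels.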

PART2
SEE ALSO
A) https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
B) https://github.com/facebookresearch/detectron2/issues/356

PART3
This is my code. Basically, I have added 3 print statements before the return statement in the file https://github.com/facebookresearch/detectron2/blob/master/demo/predictor.py

START CODE

FILE /detectron2/demo/predictor.py

FUNCTION run_on_image(self, image)

def run_on_image(self, image):
    vis_output = None
    predictions = self.predictor(image)
    # Convert image from OpenCV BGR format to Matplotlib RGB format.
    image = image[:, :, ::-1]
    visualizer = Visualizer(image, self.metadata, instance_mode=self.instance_mode)
    if "panoptic_seg" in predictions:
        panoptic_seg, segments_info = predictions["panoptic_seg"]
        vis_output = visualizer.draw_panoptic_seg_predictions(
            panoptic_seg.to(self.cpu_device), segments_info
        )
    else:
        if "sem_seg" in predictions:
            vis_output = visualizer.draw_sem_seg(
                predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
            )
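        if "instances" in predictions:
            instances = predictions["instances"].to(self.cpu_device)
            vis_output = visualizer.draw_instance_predictions(predictions=instances)
            # The prints sit inside this branch because `instances` is only
            # defined when the model output contains an "instances" entry.
            print(instances)
            print(instances.pred_boxes)
            if len(instances) > 0:  # indexing box 0 fails when nothing was detected
                print(instances.pred_boxes[0])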
    return predictions, vis_output

END CODE
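
The original question also asked for a list of detected objects with their boxes, possibly saved to a text file. The following is a rough sketch of that step in plain Python, not part of the demo code. The class names, class ids and scores are made-up values for illustration; the comments note where the real values would presumably come from (the fields pred_classes, scores and pred_boxes shown in the printed Instances, and thing_classes on the metadata object). The helper format_detections and the file name detections.txt are hypothetical.

```python
import os
import tempfile


def format_detections(class_ids, scores, boxes, class_names):
    """Return one text line per detection: index, class name, score, x1 y1 x2 y2."""
    lines = []
    for index, (class_id, score, box) in enumerate(zip(class_ids, scores, boxes)):
        x1, y1, x2, y2 = box
        lines.append(
            f"{index} {class_names[class_id]} {score:.3f} "
            f"{x1:.1f} {y1:.1f} {x2:.1f} {y2:.1f}"
        )
    return lines


# Hypothetical values for illustration only. Inside run_on_image they would
# presumably be obtained as:
#   class_ids   = instances.pred_classes.tolist()
#   scores      = instances.scores.tolist()
#   boxes       = instances.pred_boxes.tensor.tolist()
#   class_names = self.metadata.thing_classes
class_names = ["person", "bicycle", "bottle"]  # made-up, shortened class list
class_ids = [0, 2]
scores = [0.98, 0.75]
boxes = [
    [289.3555, 17.8171, 451.1482, 347.6050],
    [467.1654, 66.3414, 611.7201, 226.0997],
]

lines = format_detections(class_ids, scores, boxes, class_names)

# Write the lines to a text file so they can be inspected later.
output_path = os.path.join(tempfile.gettempdir(), "detections.txt")
with open(output_path, "w") as handle:
    handle.write("\n".join(lines) + "\n")

print("\n".join(lines))
```

Filtering the lines by class name (for example, keeping only "bottle") would then give the boxes of one kind of object.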

PART4
To test my code I run these commands in the bash shell.
COMMAND1: cd /000myfiles/anacondadir1/detectron2/demo
COMMAND2: python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input my_image.jpg --opts MODEL.DEVICE cpu MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &

kenny1323 on 15 Jul 2020

All 14 comments

See tutorial: https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5

ppwwyyxx on 2 Jun 2020

@ppwwyyxx
Thank you for the reply,
I have checked the tutorial on Google Colab. Following the section "Run a pre-trained detectron2 model", I am able to view the information about the bounding boxes. But I do not see such a variable or line of code in the cloned detectron2 repository. After a complete search across the different executable files and folders, I don't see the exact lines of code mentioned in the Colab tutorial.

Please support.
Thank you.

deeplearner93 on 3 Jun 2020

The tutorial shows how to "print the list of objects detected along with the co-ordinates of Bounding Box." in https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=7d3KxiHO_0gb as you asked.

Tutorials show how a user can use detectron2, so the content does not need to be part of the repository.

ppwwyyxx on 3 Jun 2020

Screenshot 2020-06-03 at 10 45 23 AM

deeplearner93 on 3 Jun 2020

@ppwwyyxx Thank you once again. I understand that my query should have been framed more clearly, so I would like to reframe it.

1. I want to use detectron2 locally on my laptop, without Google Colab, just like running anything else locally on a PC.
2. I followed the instructions to set up the dependencies and requirements.
3. I would like to run detectron2 to make predictions on my locally stored images, and print/see the same bounding box information for the detected objects and their class assignments as can be seen in the Google Colab tutorial.

I need support with point 3.

How can I obtain the same information as in the Colab tutorial (the co-ordinates of the final predicted bounding boxes, along with the detected objects and their class assignments) when I run Detectron2 locally on my PC on my own set of images?
(Put simply, as shown in the image attached above: I need similar output of the final bounding boxes when executing locally on my PC.)
It would be very helpful if you could share the steps/procedure to achieve this.

deeplearner93 on 3 Jun 2020

The code will run on a PC if you write it in a Python file on the PC and execute that file.
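
A minimal sketch of what such a Python file might look like, following the approach of the Colab tutorial. The function names summarize and detect, the 0.5 score threshold and the CPU device setting are assumptions made for this illustration; the config name is the one used in the demo command elsewhere in this thread. Running detect requires detectron2 and OpenCV to be installed.

```python
def summarize(class_ids, scores, boxes, class_names):
    """Pair each detection's class name and score with its (x1, y1, x2, y2) box."""
    return [
        {"label": class_names[class_id], "score": score, "box": tuple(box)}
        for class_id, score, box in zip(class_ids, scores, boxes)
    ]


def detect(image_path, config_name="COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"):
    """Run a pre-trained model on one local image and return its detections."""
    # Imported here so that summarize() stays usable without these packages.
    import cv2
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data import MetadataCatalog
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # assumed confidence cut-off
    cfg.MODEL.DEVICE = "cpu"  # assumed; remove this line to use a GPU

    predictor = DefaultPredictor(cfg)
    image = cv2.imread(image_path)  # BGR array, the predictor's default input format
    instances = predictor(image)["instances"].to("cpu")
    class_names = MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes
    return summarize(
        instances.pred_classes.tolist(),
        instances.scores.tolist(),
        instances.pred_boxes.tensor.tolist(),
        class_names,
    )
```

Calling `for d in detect("my_image.jpg"): print(d)` at the end of the file should then print one dictionary per detected object, with its class name, score and box co-ordinates.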

ppwwyyxx on 3 Jun 2020

@ppwwyyxx By default, there are a lot of executables like visualizer.py and box_regression.py in the project, but it is unclear which one exactly gives the final bounding box output after detection. I would like to know if there is any file from which I can extract the same information as in Colab. Maybe I can work it out from there.

deeplearner93 on 3 Jun 2020

No file in the repository gives the coordinates of bounding boxes. The code in Colab shows how to get the coordinates of bounding boxes.

ppwwyyxx on 3 Jun 2020

@kenny1323

Wow!! Thank you very much!! I bow to your work.

deeplearner93 on 16 Jul 2020

Hi, I have a problem. In my case I want the box coordinates as individual values, because I need to extract the detected object from the main image. I can get all the coordinates as below:
Boxes(tensor([[2054.7739, 287.8489, 2595.0151, 728.5417]], device='cuda:0'))
But I have not been able to save each element as an individual value (x1=2054.7739, y1=287.8489...).
I need each element to crop the image and keep only the detected object. I tried to convert the box element to a list (.tolist) but that didn't work. Any help?

Warday on 23 Oct 2020

@Warday.
Hi.
Here you can find my directory /detectron2/demo
https://github.com/kenny1323/detectron2_ken

PART1
For the box extraction I have added 2 files.
1)cp demo.py extract_person_box.py;
2)cp predictor.py extract_person_box_core.py

I have edited extract_person_box.py and extract_person_box_core.py in the following way.

The file extract_person_box.py is basically the same as the file demo.py, with only a few differences.
The file extract_person_box_core.py has a new block of code tagged START_BOXES_ECTRACTION.
Inside the file extract_person_box_core.py, search in particular for the crop instruction.

You should read the file README.txt too.
https://github.com/kenny1323/detectron2_ken/blob/master/README.txt

BOX EXTRACTION EXAMPLE

BASH COMMAND

F="/SUPERDIR1"/allfile/1.png;
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_person_box.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
sleep 3

PART2
For the mask extraction I have added 2 files.
1)cp demo.py extract_mask.py;
2)cp predictor.py extract_mask_core.py

I have edited extract_mask.py and extract_mask_core.py in the following way.

The file extract_mask.py is basically the same as the file demo.py, with only a few differences.
The file extract_mask_core.py has a new block of code tagged START_MASK_EXTRACTION.
The image /detectron2/demo/000028.jpg._out1.png is an example of mask extraction.
Basically, the alpha channel of every pixel of the mask is set to zero.
url_image: https://github.com/kenny1323/detectron2_ken/blob/master/000028.jpg._out1.png

MASK EXTRACTION

BASH COMMAND

F="/SUPERDIR1"/allfile/1000.png;
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_mask_cumulative.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
sleep 3

Post Scriptum.
About the image 000028.jpg._out1.png: you should invert the transparency. Namely, for any pixel with alpha channel 0, change it to 255, and for any pixel with alpha channel not 0, change it to 0.
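
The inversion rule above can be sketched in plain Python as follows. This is only an illustration on a tiny made-up image stored as rows of (R, G, B, A) tuples; it is not the code from the repository, and invert_alpha is a hypothetical helper name.

```python
def invert_alpha(pixels):
    """Flip transparency: alpha 0 becomes 255, any other alpha becomes 0.

    `pixels` is a list of rows, each row a list of (r, g, b, a) tuples.
    """
    return [
        [(r, g, b, 255 if a == 0 else 0) for (r, g, b, a) in row]
        for row in pixels
    ]


# Tiny hypothetical 1x3 RGBA image: transparent, opaque, half-transparent.
image = [[(10, 20, 30, 0), (40, 50, 60, 255), (70, 80, 90, 128)]]
inverted = invert_alpha(image)
print(inverted)
```

The colour values are left untouched; only the alpha channel of each pixel changes.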

kenny1323 on 24 Oct 2020

Hi @Warday ,

Based on the @deeplearner93 image attached on this issue,

you can just do the following:

output_pred_boxes = outputs["instances"].pred_boxes
for i in output_pred_boxes:
    print(i.cpu().numpy())

and you will get the individual bounding boxes with ease.

elmonisch on 27 Oct 2020

Thanks kenny1323, by reading the source code of extract_mask_core.py I could extract each box.
Thanks elmonisch, I will check which is faster. I did
Box = outputs["instances"].pred_boxes
a = Box.tensor.cpu()
a = a.numpy()
and then navigated through each box.
Thanks for both answers.
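
For the cropping step that motivated this question, here is a rough sketch in plain Python of turning one float box into integer pixel bounds and slicing the image with them. The helper names box_to_pixel_bounds and crop, the tiny test image and the 4000x3000 image size are all made up for illustration. With a NumPy image such as the one returned by cv2.imread, the equivalent slice would be image[top:bottom, left:right].

```python
import math


def box_to_pixel_bounds(box, image_width, image_height):
    """Convert a float (x1, y1, x2, y2) box to integer bounds clamped to the image."""
    x1, y1, x2, y2 = box
    left = max(0, math.floor(x1))
    top = max(0, math.floor(y1))
    right = min(image_width, math.ceil(x2))
    bottom = min(image_height, math.ceil(y2))
    return left, top, right, bottom


def crop(image_rows, box):
    """Crop a row-major image (a list of pixel rows) to the given box."""
    height = len(image_rows)
    width = len(image_rows[0]) if height else 0
    left, top, right, bottom = box_to_pixel_bounds(box, width, height)
    return [row[left:right] for row in image_rows[top:bottom]]


# Tiny hypothetical 4x6 "image" whose pixel values encode (row, column).
image = [[(r, c) for c in range(6)] for r in range(4)]
patch = crop(image, [1.2, 0.5, 3.7, 2.1])
print(patch)

# The box from the question above, with an assumed 4000x3000 image size.
bounds = box_to_pixel_bounds([2054.7739, 287.8489, 2595.0151, 728.5417], 4000, 3000)
print(bounds)
```

Rounding outwards (floor for the top-left corner, ceil for the bottom-right) keeps the whole detected object inside the crop; rounding to the nearest integer would be an equally reasonable choice.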

Warday on 27 Oct 2020
