Detectron2: How to obtain the Bounding Box Co-ordinates of any predicted Object in the Image

Created on 2 Jun 2020 · 14 Comments · Source: facebookresearch/detectron2

Hello all,
I would like to get the coordinates of the bounding box of a particular predicted object in the image.
For example, in the image linked below, Detectron2 has detected different objects such as cyclists, a bottle, a person, etc.
Detectron2 image at source

_What output I am expecting_

I would like to get the coordinates of the bounding boxes of the 2 water bottles fixed on the bicycle frame.
Maybe store them in a text file to use later, or print them to understand which bounding box coordinates correspond to which object.
As we have many objects in a single image, I would like to print the list of objects detected along with the coordinates of their bounding boxes.

Thank you in advance.

Most helpful comment

@deeplearner93 .
Hi. This is just an example.
Detectron2 has the file /detectron2/demo/predictor.py.
The file /detectron2/demo/predictor.py is called by the file /detectron2/demo/demo.py
We will invoke the file /detectron2/demo/demo.py to do the test.
https://github.com/facebookresearch/detectron2/tree/master/demo

PART1
STEP1.
Open the file /detectron2/demo/predictor.py

STEP2
Edit the function run_on_image(self, image) in the following way.
The last instruction in the function run_on_image is:
return predictions, vis_output
Before this return instruction, add the following print instructions:

print(instances)
print(instances.pred_boxes)
print(instances.pred_boxes[0])

OUTPUT AND EXPLANATION
I got these outputs.

A) OUTPUT OF print(instances)
Instances(num_instances=4, image_height=360, image_width=640, fields=[pred_boxes, scores, pred_classes, pred_masks])

Explanation: this output tells me that 4 instances (boxes) were detected.

B) OUTPUT OF print(instances.pred_boxes)
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050],
[382.5501, 14.9712, 635.7133, 231.8446],
[467.1654, 66.3414, 611.7201, 226.0997],
[ 22.4782, 3.7928, 428.1484, 254.6716]]))

Explanation: this output gives the coordinates of the detected boxes, one row per box in the order (x1, y1, x2, y2).
In particular, the first box (instances.pred_boxes[0]) has its top-left point at (x,y)=(289.3555, 17.8171) and its bottom-right point at (x,y)=(451.1482, 347.6050).

C) OUTPUT OF print(instances.pred_boxes[0])
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050]]))
Explanation: with this command, I print only the coordinates of the first box (instances.pred_boxes[0]).
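
Each row of the printed tensor reads as (x1, y1, x2, y2), as the explanation above describes. Here is a minimal sketch of unpacking such rows, using the four boxes printed above as plain Python lists. With real output, instances.pred_boxes.tensor.tolist() should give the same nested-list shape. The helper name describe_box is hypothetical and not part of Detectron2.

```python
# Boxes copied from the output above, one [x1, y1, x2, y2] row per detection.
boxes = [
    [289.3555, 17.8171, 451.1482, 347.6050],
    [382.5501, 14.9712, 635.7133, 231.8446],
    [467.1654, 66.3414, 611.7201, 226.0997],
    [22.4782, 3.7928, 428.1484, 254.6716],
]


def describe_box(box):
    """Hypothetical helper: split one XYXY row into corners and size."""
    x1, y1, x2, y2 = box
    return {
        "top_left": (x1, y1),
        "bottom_right": (x2, y2),
        "width": x2 - x1,
        "height": y2 - y1,
    }


for index, box in enumerate(boxes):
    info = describe_box(box)
    print(
        f"box {index}: top_left={info['top_left']} "
        f"bottom_right={info['bottom_right']} "
        f"size={info['width']:.1f}x{info['height']:.1f}"
    )
```

The width and height are simply the differences between the two corners, which is usually what is needed when converting to an (x, y, w, h) style format.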

PART2
SEE ALSO
A) https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
B) https://github.com/facebookresearch/detectron2/issues/356

PART3
This is my code. Basically, I have added 3 print instructions before the return instruction in the file https://github.com/facebookresearch/detectron2/blob/master/demo/predictor.py

START CODE

FILE /detectron2/demo/predictor.py

FUNCTION run_on_image(self, image)

def run_on_image(self, image):
    vis_output = None
    predictions = self.predictor(image)
    # Convert image from OpenCV BGR format to Matplotlib RGB format.
    image = image[:, :, ::-1]
    visualizer = Visualizer(image, self.metadata, instance_mode=self.instance_mode)
    if "panoptic_seg" in predictions:
        panoptic_seg, segments_info = predictions["panoptic_seg"]
        vis_output = visualizer.draw_panoptic_seg_predictions(
            panoptic_seg.to(self.cpu_device), segments_info
        )
    else:
        if "sem_seg" in predictions:
            vis_output = visualizer.draw_sem_seg(
                predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
            )
        if "instances" in predictions:
            instances = predictions["instances"].to(self.cpu_device)
            vis_output = visualizer.draw_instance_predictions(predictions=instances)
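            # Print inside this branch: `instances` is only defined when the
            # predictions contain "instances" (not for panoptic-only output).
            print(instances)
            print(instances.pred_boxes)
            print(instances.pred_boxes[0])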
    return predictions, vis_output

END CODE

PART4
To test my code I run these commands in the bash shell.
COMMAND1: cd /000myfiles/anacondadir1/detectron2/demo
COMMAND2: python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input my_image.jpg --opts MODEL.DEVICE cpu MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
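
The original question also asked about storing the coordinates in a text file. The following is only a sketch of one way to do that with the standard csv module. It assumes the boxes, class ids and scores have already been converted to plain Python lists (for example with .tolist() on the tensors). The function names, the column layout, and the class ids and scores in the demo are made up for illustration.

```python
import csv
import os
import tempfile


def save_detections(path, boxes, class_ids, scores):
    """Write one detection per row: class id, score, then x1, y1, x2, y2."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["class_id", "score", "x1", "y1", "x2", "y2"])
        for box, class_id, score in zip(boxes, class_ids, scores):
            writer.writerow([class_id, f"{score:.4f}"] + [f"{v:.4f}" for v in box])


def load_detections(path):
    """Read the file back as a list of dicts keyed by the header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # Boxes taken from the output above; class ids and scores are placeholders.
    demo_boxes = [
        [289.3555, 17.8171, 451.1482, 347.6050],
        [382.5501, 14.9712, 635.7133, 231.8446],
    ]
    demo_path = os.path.join(tempfile.mkdtemp(), "detections.txt")
    save_detections(demo_path, demo_boxes, [0, 0], [0.99, 0.98])
    for row in load_detections(demo_path):
        print(row)
```

Keeping the class id and score next to each box makes it possible to tell later which coordinates belong to which object.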

All 14 comments

@ppwwyyxx
Thank you for the reply.
I have checked the tutorial on Google Colab. Following the section "Run a pre-trained detectron2 model", I am able to see the information about the bounding boxes. But I do not see such a variable or line of code in the cloned detectron2 repository. After a complete search across the different executable files and folders, I don't see the exact line of code mentioned in the Colab tutorial.

Please support.
Thank you.

The tutorial shows how to "print the list of objects detected along with the co-ordinates of Bounding Box." in https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=7d3KxiHO_0gb as you asked.

Tutorials show how a user can use detectron2, so the content does not need to be part of the repository.

Screenshot 2020-06-03 at 10 45 23 AM

@ppwwyyxx . Thank you once again. I understand my query should have been framed better, so I would like to rephrase it.

  1. I want to use detectron2 locally on my laptop, without Google Colab, like running anything else locally on a PC.
  2. I followed the instructions to set up the dependencies and requirements.
  3. I would like to run detectron2 to make predictions on my locally stored images. I would like to print/see the same kind of bounding box information for the detected objects and their class assignments (as can be seen in the Google Colab tutorial).

I need support for point 3.

  • How can I obtain the same information as in the Colab tutorial (the coordinates of the final predicted bounding boxes along with the detected objects and their class assignments) when I run Detectron2 locally on my PC on my own set of images?
    (Simply put, as shown in the image attached above: I need similar output of the final BB when executing locally on my PC.)
    It would be very helpful if you could share the steps/procedure to achieve this.

The code will run on a PC if you write the code in a python file on the PC and execute the python file.

@ppwwyyxx By default, you have a lot of executables like visualizer.py and box_regression.py in the project, but it is unclear which one exactly gives the final BB output after detection. I would like to know if there is any file from which I can extract the same information as in Colab. Maybe I can work it out from there.

No file in the repository gives the coordinates of bounding boxes. The code in Colab shows how to get the coordinates of bounding boxes.
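
For running the same thing locally, a standalone script along the following lines may be a reasonable starting point. It is a sketch modelled on the Colab tutorial's approach, not code from the repository. It assumes detectron2 and OpenCV are installed, that the Mask R-CNN config used elsewhere in this thread is wanted, and that inference runs on the CPU. The function names format_detections and detect, the score threshold of 0.5, and the script name print_boxes.py are choices made here for illustration.

```python
def format_detections(boxes, class_ids, scores, class_names):
    """Build one printable line per detection from plain Python lists."""
    lines = []
    for box, class_id, score in zip(boxes, class_ids, scores):
        x1, y1, x2, y2 = box
        lines.append(
            f"{class_names[class_id]} ({score:.2f}): "
            f"x1={x1:.1f} y1={y1:.1f} x2={x2:.1f} y2={y2:.1f}"
        )
    return lines


def detect(image_path, config_name="COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"):
    """Run a pre-trained model on one local image and describe its detections."""
    # Imported here so the formatting helper above works without detectron2.
    import cv2
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data import MetadataCatalog
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # assumed threshold
    cfg.MODEL.DEVICE = "cpu"  # assumption: no GPU available locally

    image = cv2.imread(image_path)  # BGR, the default input format
    instances = DefaultPredictor(cfg)(image)["instances"].to("cpu")
    class_names = MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes
    return format_detections(
        instances.pred_boxes.tensor.tolist(),
        instances.pred_classes.tolist(),
        instances.scores.tolist(),
        class_names,
    )


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        for line in detect(sys.argv[1]):
            print(line)
    else:
        print("usage: python print_boxes.py my_image.jpg")
```

Saved as a Python file and run with an image path, this should print one line per detected object with its class name, score and box corners.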

@kenny1323

Wow!!. Thank you very much !! All "Bow" to your work.

Hi, I have a problem. In my case I want the box coordinates as individual values, because I need to extract the detected region from the main image. I can get all the coordinates as below:
Boxes(tensor([[2054.7739, 287.8489, 2595.0151, 728.5417]], device='cuda:0'))
But I have not been able to save each one as an individual value (x1=2054.7739, y1=287.8489...).
I need each value to crop the image and get only the detected object. I tried to convert the box to a list (.tolist), but that didn't work. Any help?

@Warday.
Hi.
Here you can find my directory /detectron2/demo
https://github.com/kenny1323/detectron2_ken

PART1
About the box extraction I have added 2 files.
1)cp demo.py extract_person_box.py;
2)cp predictor.py extract_person_box_core.py

I have edited extract_person_box.py and extract_person_box_core.py in the following way.

The file extract_person_box.py is basically the same as demo.py, with only a few differences.
The file extract_person_box_core.py has a new block of code tagged START_BOXES_ECTRACTION.
Inside extract_person_box_core.py, search in particular for the crop instruction.

You should read the file readme.txt too.
https://github.com/kenny1323/detectron2_ken/blob/master/README.txt

BOX EXTRACTION EXAMPLE

BASH COMMAND

F="/SUPERDIR1"/allfile/1.png;
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_person_box.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
sleep 3

PART2
About the mask extraction I have added 2 files.
1)cp demo.py extract_mask.py;
2)cp predictor.py extract_mask_core.py

I have edited extract_mask.py and extract_mask_core.py in the following way.

The file extract_mask.py is basically the same as demo.py, with only a few differences.
The file extract_mask_core.py has a new block of code tagged START_MASK_EXTRACTION.
The image /detectron2/demo/000028.jpg._out1.png is an example of mask extraction.
Basically, the alpha channel of any pixel of the mask is set to zero.
url_image: https://github.com/kenny1323/detectron2_ken/blob/master/000028.jpg._out1.png

MASK EXTRACTION

BASH COMMAND

F="/SUPERDIR1"/allfile/1000.png;
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_mask_cumulative.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
sleep 3

Post Scriptum.
About the image 000028.jpg._out1.png: you should invert the transparency. Namely, for any pixel whose alpha channel is 0, change it to 255, and for any pixel whose alpha channel is not 0, change it to 0.
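
The inversion rule from the postscript can be sketched in plain Python as below. This assumes the pixels are already available as a list of (r, g, b, a) tuples with 8-bit channels. Reading them from the PNG and writing them back is left out, and the function name invert_alpha is hypothetical.

```python
def invert_alpha(pixels):
    """Swap transparency: alpha 0 becomes 255, any non-zero alpha becomes 0."""
    return [(r, g, b, 255 if a == 0 else 0) for (r, g, b, a) in pixels]


# Three example pixels: fully transparent, semi-transparent, fully opaque.
example = [(10, 20, 30, 0), (1, 2, 3, 128), (0, 0, 0, 255)]
print(invert_alpha(example))
```

Only the alpha value changes, so the color channels of every pixel are preserved.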

Hi @Warday ,

Based on the image @deeplearner93 attached to this issue,

you can just do something like

output_pred_boxes = outputs["instances"].pred_boxes
for box in output_pred_boxes:
    print(box.cpu().numpy())

and you will get the individual bounding boxes with ease.

Thanks kenny1323, by reading the source code of extract_mask_core.py I could extract each box.
Thanks elmonisch, I will check which is faster. I did
Box = outputs["instances"].pred_boxes
a = Box.tensor.cpu()
a = a.numpy()
and then stepped through each box.
Thanks for both answers.
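
Since the goal here was to crop the detected object out of the image, the float coordinates still need to become integer pixel bounds. Below is a minimal sketch of that step, assuming a box is given as a list of four floats (for example a[0].tolist() from the NumPy array above). The function name box_to_crop and the image size in the example are assumptions, and rounding outward and clamping to the image is one possible choice among several.

```python
import math


def box_to_crop(box, image_height, image_width):
    """Convert a float [x1, y1, x2, y2] box to integer bounds inside the image."""
    x1, y1, x2, y2 = box
    left = max(0, math.floor(x1))
    top = max(0, math.floor(y1))
    right = min(image_width, math.ceil(x2))
    bottom = min(image_height, math.ceil(y2))
    return left, top, right, bottom


# The box printed in the question above; the image size is an assumed example.
left, top, right, bottom = box_to_crop(
    [2054.7739, 287.8489, 2595.0151, 728.5417], image_height=2160, image_width=3840
)
print(left, top, right, bottom)
# With an OpenCV/NumPy image array, the crop would then be:
#     crop = image[top:bottom, left:right]
```

Note that image arrays are indexed rows first, so the y range comes before the x range when slicing.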
