Hello all,
I would like to get the coordinates of the bounding box of a particular predicted object in the image.
For example, in the image linked below, Detectron2 has detected different objects such as cyclists, bottles, and people.
Detectron2 image at source
_What output I am expecting_
I would like to get the coordinates of the bounding boxes of the 2 water bottles fixed on the bicycle frame.
Maybe store them in a text file to inspect later, or print them to understand which bounding box coordinates correspond to which object.
As there are many objects in a single image, I would like to print the list of detected objects along with the coordinates of their bounding boxes.
Thank you in advance.
@ppwwyyxx
Thank you for the reply,
I have checked the tutorial on Google Colab. Following the section "Run a pre-trained detectron2 model", I am able to visualise the bounding box information. However, I do not see such a variable or line of code in the cloned detectron2 repository. After a complete search across the different executable files and folders, I don't see the exact lines of code used in the Colab tutorial.
Please support.
Thank you.
The tutorial shows how to "print the list of objects detected along with the co-ordinates of Bounding Box." in https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=7d3KxiHO_0gb as you asked.
Tutorials show how a user can use detectron2, so the content does not need to be part of the repository.

@ppwwyyxx . Thank you once again. I understand my query should have been correctly framed. I would like to reframe my query.
Need support for the point 3.
The code will run on a PC if you write the code in a python file on the PC and execute the python file.
@ppwwyyxx By default, there are many executables in the project, such as visualizer.py and box_regression.py, but it is unclear which one gives the final bounding box output after detection. I would like to know whether there is a file from which I can extract the same information as in Colab. Maybe I can work it out from there.
No file in the repository gives the coordinates of bounding boxes. The code in Colab shows how to get the coordinates of bounding boxes.
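For anyone who wants the listing outside the notebook, here is a minimal sketch of that step in plain Python. It assumes the predictor output has already been converted to Python lists, for example with outputs["instances"].pred_boxes.tensor.tolist(), outputs["instances"].pred_classes.tolist() and outputs["instances"].scores.tolist(). The function name format_detections and the short class_names list are hypothetical stand-ins; in practice the names would come from the dataset metadata (thing_classes).

```python
from typing import List, Sequence


def format_detections(
    boxes: Sequence[Sequence[float]],
    classes: Sequence[int],
    scores: Sequence[float],
    class_names: Sequence[str],
) -> List[str]:
    """Return one text line per detection: index, label, score, x1 y1 x2 y2."""
    lines = []
    for i, (box, cls, score) in enumerate(zip(boxes, classes, scores)):
        x1, y1, x2, y2 = box
        lines.append(
            f"{i} {class_names[cls]} {score:.2f} {x1:.1f} {y1:.1f} {x2:.1f} {y2:.1f}"
        )
    return lines


# Made-up values for illustration only (not real model output).
class_names = ["person", "bicycle", "bottle"]  # hypothetical label list
boxes = [[10.0, 20.0, 110.0, 220.0], [50.0, 60.0, 70.0, 100.0]]
classes = [0, 2]
scores = [0.98, 0.87]

lines = format_detections(boxes, classes, scores, class_names)
print("\n".join(lines))

# Keep only the bottles, as asked in the original question.
bottle_lines = [
    line for line, cls in zip(lines, classes) if class_names[cls] == "bottle"
]
```

The resulting lines can be written to a text file with the usual open(...) and write(...) calls if they need to be inspected later.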
@deeplearner93 .
Hi. This is just an example.
Detectron2 has the file /detectron2/demo/predictor.py.
The file /detectron2/demo/predictor.py is called by the file /detectron2/demo/demo.py
We will invoke the file /detectron2/demo/demo.py to do the test.
https://github.com/facebookresearch/detectron2/tree/master/demo
PART1
STEP1.
Open the file /detectron2/demo/predictor.py
STEP2
Edit the function run_on_image(self, image) in the following way.
The last instruction in the function run_on_image is:
return predictions, vis_output
Add the following print instructions before the return instruction:
print(instances)
print(instances.pred_boxes)
print(instances.pred_boxes[0])
OUTPUT AND EXPLANATION
I got these outputs.
A) OUTPUT OF print(instances)
Instances(num_instances=4, image_height=360, image_width=640, fields=[pred_boxes, scores, pred_classes, pred_masks])
Explanation: this output tells me that 4 boxes were detected.
B) OUTPUT OF print(instances.pred_boxes)
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050],
[382.5501, 14.9712, 635.7133, 231.8446],
[467.1654, 66.3414, 611.7201, 226.0997],
[ 22.4782, 3.7928, 428.1484, 254.6716]]))
Explanation: this output gives the coordinates of the detected boxes.
In particular, the first box (instances.pred_boxes[0]) has its top-left point at (x, y) = (289.3555, 17.8171) and its bottom-right point at (x, y) = (451.1482, 347.6050).
C) OUTPUT OF print(instances.pred_boxes[0])
Boxes(tensor([[289.3555, 17.8171, 451.1482, 347.6050]]))
Explanation: this command prints only the coordinates of the first box (instances.pred_boxes[0]).
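Since each box is given as (x1, y1, x2, y2), its width and height follow by subtraction. A small sketch in plain Python, using the first box printed above; the helper name box_size is hypothetical and not part of Detectron2:

```python
def box_size(box):
    """Return (width, height) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return x2 - x1, y2 - y1


# Coordinates of the first box from the output above.
first_box = (289.3555, 17.8171, 451.1482, 347.6050)
width, height = box_size(first_box)
print(f"width={width:.4f}, height={height:.4f}")
```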
PART2
SEE ALSO
A) https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
B) https://github.com/facebookresearch/detectron2/issues/356
PART3
This is my code. Basically, I have added 3 print instructions before the return instruction in the file https://github.com/facebookresearch/detectron2/blob/master/demo/predictor.py
def run_on_image(self, image):
    vis_output = None
    predictions = self.predictor(image)
    # Convert image from OpenCV BGR format to Matplotlib RGB format.
    image = image[:, :, ::-1]
    visualizer = Visualizer(image, self.metadata, instance_mode=self.instance_mode)
    if "panoptic_seg" in predictions:
        panoptic_seg, segments_info = predictions["panoptic_seg"]
        vis_output = visualizer.draw_panoptic_seg_predictions(
            panoptic_seg.to(self.cpu_device), segments_info
        )
    else:
        if "sem_seg" in predictions:
            vis_output = visualizer.draw_sem_seg(
                predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
            )
        if "instances" in predictions:
            instances = predictions["instances"].to(self.cpu_device)
            vis_output = visualizer.draw_instance_predictions(predictions=instances)
            # Added prints. They sit inside this branch because `instances`
            # is only defined when the model returns instance predictions.
            print(instances)
            print(instances.pred_boxes)
            if len(instances) > 0:
                print(instances.pred_boxes[0])
    return predictions, vis_output
PART4
To test my code I run these commands in the bash shell.
COMMAND1: cd /000myfiles/anacondadir1/detectron2/demo
COMMAND2: python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input my_image.jpg --opts MODEL.DEVICE cpu MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
@kenny1323
Wow!!. Thank you very much !! All "Bow" to your work.
Hi, I have a problem. In my case I want the box coordinates as individual values because I need to extract the detected object from the main image. I can get all the coordinates as below:
Boxes(tensor([[2054.7739, 287.8489, 2595.0151, 728.5417]], device='cuda:0'))
But I have not been able to save each coordinate as an individual value (x1=2054.7739, y1=287.8489, ...).
I need each value to crop the image and keep only the detected object. I tried to convert the Boxes object to a list (.tolist), but that didn't work. Any help?
@Warday.
Hi.
Here you can find my directory /detectron2/demo
https://github.com/kenny1323/detectron2_ken
PART1
About the box extraction I have added 2 files.
1)cp demo.py extract_person_box.py;
2)cp predictor.py extract_person_box_core.py
I have edited extract_person_box.py and extract_person_box_core.py in the following way.
The file extract_person_box.py is basically the same as demo.py, with only a few differences.
The file extract_person_box_core.py has a new block of code tagged START_BOXES_ECTRACTION.
Inside the file extract_person_box_core.py, search in particular for the instruction crop.
You should read the file readme.txt too.
https://github.com/kenny1323/detectron2_ken/blob/master/README.txt
F="/SUPERDIR1"/allfile/1.png;
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_person_box.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
sleep 3
PART2
About the mask extraction I have added 2 files.
1)cp demo.py extract_mask.py;
2)cp predictor.py extract_mask_core.py
I have edited extract_mask.py and extract_mask_core.py in the following way.
The file extract_mask.py is basically the same as demo.py, with only a few differences.
The file extract_mask_core.py has a new block of code tagged START_MASK_EXTRACTION.
The image /detectron2/demo/000028.jpg._out1.png is an example of mask extraction.
Basically, the alpha channel of any pixel of the mask is set to zero.
url_image: https://github.com/kenny1323/detectron2_ken/blob/master/000028.jpg._out1.png
F="/SUPERDIR1"/allfile/1000.png;
cd /000myfiles/anacondadir1/detectron2/demo
python3 extract_mask_cumulative.py --config-file /000myfiles/anacondadir1/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input $F --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl &
sleep 3
Post Scriptum.
About the image 000028.jpg._out1.png, you should invert the transparency, namely: for any pixel with alpha channel 0, change it to alphachannel=255; and any pixel with alpha channel not 0, change it to alphachannel=0;
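The inversion described above can be sketched in plain Python as follows. This assumes the image is available as a flat list of (R, G, B, A) tuples with 8-bit channels; the helper name invert_alpha is hypothetical, and loading or saving the PNG is left to whichever image library you use.

```python
def invert_alpha(pixels):
    """Swap transparency: alpha 0 becomes 255, any other alpha becomes 0."""
    return [(r, g, b, 255 if a == 0 else 0) for (r, g, b, a) in pixels]


# Three made-up RGBA pixels: fully transparent, fully opaque, half transparent.
pixels = [(10, 20, 30, 0), (40, 50, 60, 255), (70, 80, 90, 128)]
inverted = invert_alpha(pixels)
print(inverted)
```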
Hi @Warday ,
Based on the image @deeplearner93 attached to this issue,
you can simply do:
output_pred_boxes = outputs["instances"].pred_boxes
for box in output_pred_boxes:
    print(box.cpu().numpy())
and you will get the individual bounding boxes with ease.
Thanks kenny1323, by reading the source code of extract_mask_core.py I was able to extract each box.
Thanks elmonisch, I will check which is faster. I did
Box = outputs["instances"].pred_boxes
a = Box.tensor.cpu()
a = a.numpy()
and then iterated over each box.
Thanks for both answers.
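For the cropping step itself, here is a minimal sketch in plain Python. It assumes the box is already a sequence of four floats (x1, y1, x2, y2), for example one row of the array obtained above, and that the image is stored row by row. The helper names box_to_pixel_bounds and crop are hypothetical. Rounding outward (floor for the top-left corner, ceil for the bottom-right corner) is one possible choice, not the only one.

```python
import math


def box_to_pixel_bounds(box, image_height, image_width):
    """Convert a float (x1, y1, x2, y2) box to integer bounds clamped to the image."""
    x1, y1, x2, y2 = box
    left = max(0, math.floor(x1))
    top = max(0, math.floor(y1))
    right = min(image_width, math.ceil(x2))
    bottom = min(image_height, math.ceil(y2))
    return left, top, right, bottom


def crop(image, box):
    """Crop a row-major image (a list of rows) to the given box."""
    height, width = len(image), len(image[0])
    left, top, right, bottom = box_to_pixel_bounds(box, height, width)
    return [row[left:right] for row in image[top:bottom]]


# Tiny 4x6 "image" whose pixel value encodes (row, column), for illustration.
image = [[(r, c) for c in range(6)] for r in range(4)]
patch = crop(image, (1.2, 0.5, 3.7, 2.1))
print(len(patch), len(patch[0]))
```

With a NumPy image (as returned by cv2.imread) the same bounds can be used directly as image[top:bottom, left:right].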