Detectron: Questions: Output of MaskRCNN

Created on 9 May 2018 · 10 Comments · Source: facebookresearch/Detectron

In the Getting Started tutorial, we have:
python2 tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
demo

This gives us PDF images that visualize the segmentation. Is it possible to get output in a format from which I can get the pixels in the object mask? Does Detectron contain functions to evaluate our test set for instance segmentation? Does Detectron output the confidence of the prediction?

Is there a tutorial from which I can learn in more detail how to use Detectron, or is the only way to look at this forum and the code?

natasasdj

👍3

All 10 comments

OK, I found out how to output segmentation masks in infer_simple.py: they are in the cls_segms variable.
I still have not found out how to output the prediction confidence.

natasasdj on 10 May 2018

After running "Inference with Pretrained Models/Directory of Image Files" from the Getting Started tutorial, I get the following for the image demo/16004479832_a748d55f21_k.jpg.

cls_boxes[17]
array([[0.0000000e+00, 8.1034927e+01, 2.9909589e+02, 5.6584467e+02, 9.9869162e-01],
[3.2148853e+02, 2.0170921e+02, 9.0500000e+02, 5.9527612e+02, 9.9953604e-01],
[5.7266614e+02, 2.4558983e+02, 9.0472571e+02, 5.5804816e+02, 1.9798927e-01],
[1.3561227e+02, 6.9481277e+01, 4.5119119e+02, 5.1436737e+02, 5.4683097e-02]], dtype=float32)

Class 17 is dog. I would like to check whether I interpret this output correctly.
Each row in this array represents a bounding box for one object instance.
In one row, the first and second elements represent the coordinates of the box center, the third and fourth elements represent the height and width (or the width and height?), and the fifth element represents the confidence (i.e. probability) of the box prediction.
Is this correct?

In this image I see 2 dogs, but there are 4 rows (boxes) in cls_boxes. Is this an error in the object detection, or am I misinterpreting something? In the visualization, only 2 boxes for dogs are shown.

natasasdj on 10 May 2018

This is a continuation of my previous post. I am considering the same image, demo/16004479832_a748d55f21_k.jpg.
The output variable cls_segms[17] also has 4 elements, like cls_boxes[17], though there are only 2 dogs (class 17) in the image.
Moreover, I don't understand the output of cls_segms. I suppose 'counts' is a segmentation mask, but what format is it in? How can I convert it into something human-readable?

'counts': 'nm0k4c=n0YO7h4cHoJc7Q5HgJe7Z5]HbJc7_5H\\J7f5cHSJ^7P6eHiI[7[6hH^IY7e6jHUIV7n6nHkHT7W7nHfHS7Z7oHcHQ7_7PI^HQ7b7RI[Hn6f7TIVHm6j7YInGh6T8_IbGb6_8j12M3N1O2O1N110O00010O000000O1O1000O010O010O010O0000010O1O002N100O1O1O2N1O2N10000O10O1O1O00100000O2O0O2N2N101N101O1O0O2O001N1O100O1000000000010O0O2O0O2N2M201N2O1O001O1N2O1N2M3N2N101N2O1O1O1N2N2N2L4L4M3N2N2O1N2M3M3M3L4M3M3N2O1N20000O10000O10000O101O001N101O1N2O1O1N3N1N3N2M3M4L5Kd0\O;E9G7I6J4Mg0XOa0@2N2N1N2O1O1N101O000O10000O100O10O0100O1O100O001O100O001O1O01O00001N1O2O0O2O0010O010O010O1O010O001O1O00000000O1O1O1O2O0O101O0100O010O100O10O0100O1O001O1O1O1O001O1O1O001O2N100O2N1O1O1O2N1O1O1N2O001O1O001O0010O0004L3M3L5L4G:BYhX;'

natasasdj on 10 May 2018

❤1

The counts field is run-length encoded.
https://en.wikipedia.org/wiki/Run-length_encoding

You can use mask utils to get the mask into a numpy array. I'm assuming you're familiar with OpenCV.

import cv2
import pycocotools.mask as mask_util

# found_segment is one entry of cls_segms[class_id]: a dict with 'size' and 'counts'
# this will make a 2d array of 1's and 0's
mask = mask_util.decode(found_segment)
cv2.imshow("mask", mask * 255)
cv2.waitKey(0)  # keep the window open until a key is pressed

# this will find the contours and potentially give you polygons representing the segments of the object
# [-2] picks the contour list under both the OpenCV 3 and OpenCV 4 return conventions
contours = cv2.findContours(mask.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
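To make the run-length idea concrete, here is a minimal pure-NumPy sketch. It assumes the uncompressed COCO-style convention: the run lengths alternate between background and foreground, start with background, and walk the image column by column. The 'counts' string shown in this thread is a further compressed form of such a list, so this helper (a hypothetical name, not part of pycocotools) only illustrates the principle; for real Detectron output, mask_util.decode remains the tool to use.

```python
import numpy as np


def decode_uncompressed_rle(counts, height, width):
    """Expand alternating run lengths (zeros first) into a height x width mask.

    Hypothetical helper for illustration only; it does not parse the
    compressed 'counts' strings that Detectron returns.
    """
    flat = np.zeros(height * width, dtype=np.uint8)
    position = 0
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with zeros
    for run in counts:
        flat[position:position + run] = value
        position += run
        value = 1 - value
    # Pixels are stored column by column, hence Fortran (column-major) order.
    return flat.reshape((height, width), order="F")


# 5 zeros, 2 ones, 2 zeros, 2 ones, 5 zeros on a 4 x 4 image
mask = decode_uncompressed_rle([5, 2, 2, 2, 5], height=4, width=4)

# (row, column) coordinates of every pixel that belongs to the object
pixels = np.argwhere(mask == 1)
print(mask)
print(pixels)
```

Once a mask is a 2D array like this, np.argwhere gives the pixel coordinates of the object and mask.sum() gives its area in pixels.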

p0wdrdotcom on 11 May 2018

👍5

The mask comes out in the same format you passed it in when training. The data goes in via the COCO format.
Have a look here for the cocoapi:

https://github.com/cocodataset/cocoapi

That package will give you pycocotools.

p0wdrdotcom on 11 May 2018

👍2

Thank you. The answer was very useful.
I figured out why I have more instances in the cls_segms variable than I see in the visualization. Every instance has a prediction confidence, and the visualization shows only those instances with confidence > 0.7.
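As a rough sketch of that filtering, the snippet below applies a 0.7 cutoff to the cls_boxes[17] array posted earlier in this thread. It assumes the last column is the confidence score and that the first four columns are corner coordinates (x1, y1, x2, y2) rather than center and size, which is what the posted values suggest; the variable names are only illustrative.

```python
import numpy as np

# cls_boxes[17] for demo/16004479832_a748d55f21_k.jpg, copied from the post above
dog_boxes = np.array([
    [0.0000000e+00, 8.1034927e+01, 2.9909589e+02, 5.6584467e+02, 9.9869162e-01],
    [3.2148853e+02, 2.0170921e+02, 9.0500000e+02, 5.9527612e+02, 9.9953604e-01],
    [5.7266614e+02, 2.4558983e+02, 9.0472571e+02, 5.5804816e+02, 1.9798927e-01],
    [1.3561227e+02, 6.9481277e+01, 4.5119119e+02, 5.1436737e+02, 5.4683097e-02],
], dtype=np.float32)

score_threshold = 0.7  # the visualization cutoff mentioned above

# Assumption: the last column holds the confidence of each detection.
scores = dog_boxes[:, -1]
keep = scores > score_threshold
kept_boxes = dog_boxes[keep]

# Assumption: columns 0-3 are the box corners (x1, y1, x2, y2).
for x1, y1, x2, y2, score in kept_boxes:
    print("box (%.0f, %.0f) to (%.0f, %.0f), confidence %.3f" % (x1, y1, x2, y2, score))
```

The same boolean index can be used to pick the matching entries of cls_segms[17], since both lists have one element per detection.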

natasasdj on 16 May 2018

Hi, how can I convert the "counts" to a binary mask? I used the command:
mask = mask_util.decode(found_segment)
and replaced "found_segment" with my "counts", and got an error like:
TypeError: string indices must be integers, not str
Can you tell me the correct way to use "decode"?
Thank you!

ll884856 on 14 Aug 2018

Oh, I have found out how to do that. Thank you!

ll884856 on 14 Aug 2018

@ll884856 How did you do it?

thecondofitz on 15 Aug 2018

@thecondofitz I used commands like this:
```
import pycocotools.mask as mask_util
import cv2
import numpy as np

# decode() needs the whole dict (both "counts" and "size"), not the bare counts string
mask = mask_util.decode({"counts": "l]a?V3an02O001N101O0O10001O0000001O00001O00001O1O001O010O00001O00000000010O01O001O001O00001O00001O000000001O00000000000001O0001O0000000001O01O000000010O001O00001O00001O000000001O000000000000000010O00001O00001O000000001O00000001O0001O0001O00001O001O001O00001O0000001O00000000001O0001O01O01O01000O010O010O01O01O0001O0001O00000000001O0000001O00001O00001O001O000001O01O01O00001O10O01O001O001O0001O01O00000001O00001O001O1O001O00001O0000001O000000001O0000001O00001O001O1O01O01O001O00000001O00001O001O0010O0001O00001O000000001O000001O00001O00001O1O001O001O00001O0000001O000000001O0000010O00001O001O010O000010O00001O000010O0001O00000000001O0000000000010O00000001O01O01O010O1O010O010O01O00010O00000000000001O0001O0000000001O00000000001O01O01O0010O00010O1O0010O01O001O01O0001O00000000000001O000000000000001O00000000001O00001O010O00001O01O0000010O00001O00010O00001O00000000001O00000000000000001O00000000000000000000000000000000000000000000000000010O00001O010O00100O0010O0010O0001O01O0000000001O00000000000000000000000000001O01O0000001O001O001O001O001O00001O0000000000000O101O000O10000O100O100O1O10000O10000O10001N10001N101N2N6Fmkc=", "size": [1080, 1440]})

cv2.imshow("mask", mask * 255)
cv2.waitKey(0)  # keep the window open until a key is pressed
```

ll884856 on 19 Aug 2018
