Hi,
Thanks a lot for the awesome repository.
I went through the train_shapes file, which describes how to train on your own dataset.
But everything there is done with randomly generated shapes. Could you explain the same process using actual images that have ground-truth masks, class labels, and bounding-box information?
Regards,
Pirag
Hi Pirag, the file coco.py is an example of training on the COCO dataset, which consists of natural images with ground-truth masks and bounding boxes.
Thanks a lot
Is there specific software you would advise for annotating the segments?
Annotating masks is time consuming, so start by searching for public datasets that might include the objects you're trying to identify. If you can't find any and you decide to do it manually then http://labelme.csail.mit.edu/ and http://labelml.com/ seem like good options.
Thanks
@waleedka
I checked the JSON file that comes with COCO and started the training successfully.
Can you please advise me on how to create one with a similar structure from my own images?
I checked http://labelme.csail.mit.edu/ and http://labelml.com/ but the output is a single XML file for each image.
Is there any document that explains what the annotation JSON file structure should be?
Thanks again for your great work.
I have recently discovered http://www.robots.ox.ac.uk/~vgg/software/via/ which outputs a .json for the polygons you classify on your image, but this .json output is different in structure from the COCO .json annotations. Is there some kind of COCO-specific GUI to generate annotations?
As an example, I want to train this on some medical images (tissue samples) and there are just no datasets that would be annotated already. So I need to do this on my own.
This is the JSON format used by DeepLab.
I guess it is similar to the COCO file.
deepLabJSONAnnotationFormat.txt
I'm not aware of good documentation of the COCO json format. I guess you can try to read the code of the official COCO APIs and infer it from there.
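For what it's worth, the top-level layout of a COCO-style annotation file is fairly simple. The sketch below writes a minimal one; the file names, IDs, and polygon values are placeholders, and the real COCO files also carry a few optional sections such as "info" and "licenses":

import json

# Minimal COCO-style annotation skeleton (all values are illustrative placeholders).
coco_style = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # One or more polygons, each a flat [x1, y1, x2, y2, ...] list.
            "segmentation": [[100, 100, 200, 100, 200, 200, 100, 200]],
            "bbox": [100, 100, 100, 100],   # [x, y, width, height]
            "area": 10000,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "my_class", "supercategory": "object"},
    ],
}

with open("annotations.json", "w") as f:
    json.dump(coco_style, f)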
Alternatively, you can use any other format. You just need to provide a load_mask() function in the Dataset class that can load that format and convert it to a Numpy array.
My dataset consists of JPEG images and JPEG segmentation masks.
How do I load it?
What format does the network ultimately use?
If you look in coco.py you'll see that the Dataset method load_mask() decodes COCO's polygonal format into binary image masks. Mask R-CNN uses image masks, not polygons, for training. In the COCO case, load_mask() calls annToMask(), which returns an image. For your new dataset, where the masks are already images, you can write your own load_mask() method that instead reads the JPEGs, recasts them as arrays of the appropriate size and type, and returns, for all the masks associated with the given image, a stack of binary masks plus a NumPy array of class IDs, just as is done in coco.py's version of load_mask().
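As an illustration only (the "masks" key in image_info, the directory layout, and the single-class assumption are all made up for this sketch, not part of the repo), a load_mask() for per-instance JPEG mask files could look roughly like this:

import numpy as np
import skimage.io
from mrcnn import utils   # older checkouts of the repo: import utils

class JpegMaskDataset(utils.Dataset):
    def load_mask(self, image_id):
        info = self.image_info[image_id]
        # assumed: add_image() was called with height=, width=, and
        # masks=[list of per-instance mask file paths]
        mask_paths = info["masks"]
        masks = np.zeros((info["height"], info["width"], len(mask_paths)), dtype=bool)
        for i, path in enumerate(mask_paths):
            m = skimage.io.imread(path)
            if m.ndim == 3:          # collapse RGB mask images to one channel
                m = m[..., 0]
            masks[:, :, i] = m > 0   # binarize: anything non-zero counts as the object
        # one class ID per instance; here every instance belongs to a single class with ID 1
        class_ids = np.ones(len(mask_paths), dtype=np.int32)
        return masks, class_ids

Note that the background never gets its own channel: the stack has one channel per object instance, and class_ids holds the class of each channel.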
Thank you!
Do the masks have to be of shape [height, width, instance_count]?
Hi,
If I have an image that contains only 2 classes, powerline and BG,
should the mask be [h, w, 2]?
I'm confused: does BG need to be one of the masks in the stack of binary masks?
Can anyone tell me how I can set the path to train Mask R-CNN:

parser = argparse.ArgumentParser(
    description='Train Mask R-CNN on MS COCO.')
parser.add_argument('-command',
                    metavar='train',
                    help="'train' or 'evaluate' on MS COCO")
parser.add_argument('-model',
                    metavar='COCO_DIR',
                    help='Directory of the MS-COCO dataset')
parser.add_argument('-dataset',
                    metavar='COCO_DIR/mask_rcnn_coco.h5',
                    help="Path to weights .h5 file or 'coco'")
args = parser.parse_args()

print("Command: ", args.command)
print("Model: ", args.model)
print("Dataset: ", args.dataset)

It outputs:
Command: None
Model: None
Dataset: None
@EricccChung did you work it out? Any advice? I also would like to train on 2 classes: background and object. Do we need the background class? Should the mask be [h, w, 2], as you mentioned in your last message?
@matiqul:
Train a new model starting from pre-trained COCO weights
python3 coco.py train --dataset=pathToCoCoDataset --model=coco
Train a new model starting from ImageNet weights
python3 coco.py train --dataset=/path/to/coco/ --model=imagenet
Continue training a model that you had trained earlier
python3 coco.py train --dataset=pathToCoCoDataset --model=pathToFile/mask_rcnn_coco.h5
Continue training the last model you trained. This will find the last trained weights in the model directory.
python coco.py train --dataset=pathToCoCoDataset --model=last
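For the argparse question above: the repo's script defines command as a positional argument and --dataset/--model as double-dash options, so a parser along these lines (a sketch of the usual pattern, not a verbatim copy of coco.py) picks up the commands listed above instead of printing None:

import argparse

parser = argparse.ArgumentParser(description='Train Mask R-CNN on MS COCO.')
parser.add_argument('command',                     # positional: 'train' or 'evaluate'
                    metavar='<command>',
                    help="'train' or 'evaluate' on MS COCO")
parser.add_argument('--dataset', required=True,
                    metavar='/path/to/coco/',
                    help='Directory of the MS-COCO dataset')
parser.add_argument('--model', required=True,
                    metavar='/path/to/weights.h5',
                    help="Path to weights .h5 file or 'coco'")
args = parser.parse_args()

print("Command: ", args.command)
print("Model:   ", args.model)
print("Dataset: ", args.dataset)

Called as python3 coco.py train --dataset=/path/to/coco/ --model=coco, all three values are filled in. Single-dash options like -command are optional in argparse, which is why they come back as None when nothing is passed.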
@mmejeras have you found a solution?
Because I am also working on some medical images, so I need to create my own dataset.
I've built a very simple JSON annotation tool because I faced the same problem of creating a dataset from scratch. It also has components that integrate it with a pretrained Mask R-CNN model, to make the workflow easier once a good model is trained. Here's the link: https://github.com/Deep-Magic/COCO-Style-Dataset-Generator-GUI.
How do I train the pet dataset with masks? I created the TF test and eval record files, but when I launch train.py the masks are not displayed. I think we have to launch training using the masks, but how? I am using the PNG masks, the XML files, and pet_tf_createrecord.py.
Thanks, I've spent weeks on this problem!
Hi, I have aerial images of vehicles in JPG format and bounding boxes in text files. How can I prepare them for training Mask R-CNN? Any help would be appreciated. :-)
@anookeen Check #297. It was discussed step by step how to prepare the data. Some functions have changed since then, but the main idea is still the same. Also, follow the nucleus demo.
Loading weights C:\Users\Administrator\Desktop\Mask_RCNN-master\logs\waterbodies20180903T1051\mask_rcnn_waterbodies_0011.h5
2018-09-10 11:07:41.152244: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Running on C:\Users\Administrator\Desktop\waterbodies\val\15.JPG
Processing 1 images
image          shape: (369, 305, 3)       min:    0.00000  max:  255.00000  uint8
molded_images  shape: (1, 1024, 1024, 3)  min: -123.70000  max:  151.10000  float64
image_metas    shape: (1, 14)             min:    0.00000  max: 1024.00000  float64
anchors        shape: (1, 261888, 4)      min:   -0.35390  max:    1.29134  float32
Traceback (most recent call last):
  File "waterbodies.py", line 298, in <module>
    detect_and_color_splash(model, image_path=args.image)
  File "waterbodies.py", line 216, in detect_and_color_splash
    skimage.io.imsave(file_name, splash)
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\skimage\io\_io.py", line 131, in imsave
    if is_low_contrast(arr):
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\skimage\exposure\exposure.py", line 503, in is_low_contrast
    dlimits = dtype_limits(image, clip_negative=False)
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\skimage\util\dtype.py", line 49, in dtype_limits
    imin, imax = dtype_range[image.dtype.type]
KeyError:
A bit late to the party, but for the record fast.ai has an excellent set of notebooks on object detection and segmentation that use a subset of the COCO dataset. FWIW, Jeremy Howard's code and video go through the format of the COCO annotations in a thorough and understandable way.
Hello, I want to know if anyone has been able to use JPG images with XML annotations for training. In the balloon example they use JSON; my annotations were made with labelImg.
I would appreciate any help.
Hi, I need help. I'd like to apply coco.py to my own dataset, which has 4 classes, but I get the following warning: WARNING:root:You are using the default load_mask(), maybe you need to define your own one. Maybe I need to modify load_mask()? I would appreciate any help. Thank you.
Can you be more specific? Can you share the code of your load_mask()?
Can someone please help me.
I am trying to use Mask R-CNN for 10 classes and I am not able to figure out how to annotate the images using the VGG annotator. I found https://github.com/bsaldivaremc2/keras_tutorials/blob/master/VGG_Data_extraction.ipynb
which helps with understanding the data, but how do we annotate it for 10 classes to train?
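One common approach (this is what the balloon sample does for a single class, extended here as a rough sketch for several): in VIA, define a region attribute, for example a dropdown called "class", and pick the right value while drawing each polygon. The exported JSON then carries the class name per region, which load_mask() can map to a class ID. The attribute name "class", the file name, and the class list below are assumptions for the sketch:

import json

CLASS_NAMES = ["car", "truck", "bus"]  # ...your 10 class names, in a fixed order

# Plain "export annotations (as json)" output from VIA; the file name is a placeholder.
annotations = json.load(open("via_region_data.json"))
for img in annotations.values():
    regions = img["regions"]
    if isinstance(regions, dict):      # VIA 1.x keys regions by index; 2.x uses a list
        regions = list(regions.values())
    for r in regions:
        xs = r["shape_attributes"]["all_points_x"]   # polygon vertices
        ys = r["shape_attributes"]["all_points_y"]
        label = r["region_attributes"]["class"]      # the attribute you defined in VIA
        class_id = 1 + CLASS_NAMES.index(label)      # ID 0 is reserved for background
        print(label, class_id, len(xs), "polygon points")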
@look585record10s
Please can you share how you were able to solve this problem:
WARNING:root:You are using the default load_mask(), maybe you need to define your own one.
I am currently stuck here. Please, I need help.
Dear kzish,
I was able to solve the issues, and training of Mask R-CNN is now progressing!
My notes are below:
In config.py:
line 28: GPU_COUNT = 4
line 73: NUM_CLASSES = 1 + 4  # background + number of my own classes
In coco.py:
line 84: GPU_COUNT = 4
line 87: NUM_CLASSES = 1 + 4
line 116:
self.add_class("coco", 1, "AAA")
self.add_class("coco", 2, "BBB")
self.add_class("coco", 3, "CCC")
self.add_class("coco", 4, "DDD")
I also needed to change the versions of tensorflow-gpu and keras:
tensorflow-gpu == 1.7.0
keras == 2.1.3
After modifying the above:
python setup.py install
python samples/coco/coco.py train --dataset=/home/dl-box/path/to/coco --model=imagenet
In particular, running "python setup.py install" is needed again after modifying your files. I strongly recommend not forgetting it.
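The same settings can also be expressed as a Config subclass instead of editing config.py in place; a minimal sketch (the class name and NAME value are made up, and the import assumes the packaged layout, while older checkouts use "from config import Config"):

from mrcnn.config import Config   # older checkouts: from config import Config

class FourClassConfig(Config):
    """Same settings as the notes above, as a Config subclass."""
    NAME = "my_dataset"
    GPU_COUNT = 4
    NUM_CLASSES = 1 + 4   # background + 4 custom classes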
Dear look585record10s,
I'm working on Mask R-CNN. I'm having trouble implementing this architecture as a Keras model. In particular, I don't know how to add more object classes.
from os import listdir
from xml.etree import ElementTree
from numpy import zeros, asarray
from mrcnn.utils import Dataset   # Dataset base class from the Mask R-CNN repo

class AvakoCounting(Dataset):
    # load the dataset definitions
    def load_dataset(self, is_train=True):
        # define the classes (background is class 0 and is added automatically)
        self.add_class("dataset", 1, "voiture")
        self.add_class("dataset", 2, "moto")
        self.add_class("dataset", 3, "autocar")
        self.add_class("dataset", 4, "camion_2_essieux")
        self.add_class("dataset", 5, "camion_3_essieux")
        self.add_class("dataset", 6, "camion_4_essieux")
        self.add_class("dataset", 7, "ens_articule_3_essieux")
        self.add_class("dataset", 8, "ens_articule_4_essieux")
        self.add_class("dataset", 9, "ens_articule_5_essieux")
        self.add_class("dataset", 10, "ens_articule_6_essieux")
        self.add_class("dataset", 11, "minicar")
        self.add_class("dataset", 12, "tricycle")
        self.add_class("dataset", 13, "camionnette")
        self.add_class("dataset", 14, "camion_tricycle")
        images_test = '/content/drive/My Drive/Stage_ISE/workspace/training/images/test/'
        images_train = '/content/drive/My Drive/Stage_ISE/workspace/training/images/train/'
        annotations_dir = '/content/drive/My Drive/Stage_ISE/workspace/training/annotations/'
        images_dir = '/content/images_completes/'
        # find all images
        d = 0
        for filename in listdir(images_dir):
            nom_image = filename.split('.jpg')[0]
            d = d + 1
            image_id = d
            # pick the train or the test/val directory for this image
            if is_train:
                directory = images_train
                ann_path = annotations_dir + nom_image + '.xml'
                print("train")
            if not is_train:
                directory = images_test
                ann_path = annotations_dir + nom_image + '.xml'
                print("test")
            # add to dataset
            img_path = directory + filename
            self.add_image('dataset', image_id=image_id, path=img_path, annotation=ann_path)

    # extract bounding boxes from an annotation file
    def extract_boxes(self, filename):
        # load and parse the file
        tree = ElementTree.parse(filename)
        # get the root of the document
        root = tree.getroot()
        # extract each bounding box
        boxes = list()
        for box in root.findall('.//bndbox'):
            xmin = int(box.find('xmin').text)
            ymin = int(box.find('ymin').text)
            xmax = int(box.find('xmax').text)
            ymax = int(box.find('ymax').text)
            # note: the object's class name lives on the parent <object> element,
            # not on <bndbox>, so it is not collected here (see the sketch below)
            coors = [xmin, ymin, xmax, ymax]
            boxes.append(coors)
        # extract image dimensions
        width = int(root.find('.//size/width').text)
        height = int(root.find('.//size/height').text)
        return boxes, width, height

    # load the masks for an image
    def load_mask(self, image_id):
        # get details of the image
        info = self.image_info[image_id]
        # box file location
        path = info['annotation']
        # load the XML
        boxes, w, h = self.extract_boxes(path)
        # create one array for all masks, each on a different channel
        masks = zeros([h, w, len(boxes)], dtype='uint8')
        # create masks
        class_ids = list()
        for i in range(len(boxes)):
            box = boxes[i]
            row_s, row_e = box[1], box[3]
            col_s, col_e = box[0], box[2]
            masks[row_s:row_e, col_s:col_e, i] = 1
            # every instance is currently labelled 'voiture', regardless of its real class
            class_ids.append(self.class_names.index('voiture'))
        return masks, asarray(class_ids, dtype='int32')
If you look in coco.py you'll see that the Dataset method load_mask() decodes COCO's polygonal format into binary image masks. Mask R-CNN uses image masks, not polygons, for training. In the COCO case, load_mask() calls annToMask(), which returns an image. For your new dataset, where the masks are already images, you can write your own load_mask() method that instead reads the JPEGs, recasts them as arrays of the appropriate size and type, and returns, for all the masks associated with the given image, a stack of binary masks plus a NumPy array of class IDs, just as is done in coco.py's version of load_mask().
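A sketch of the missing piece in the code above: load_mask() currently labels every box as 'voiture' because extract_boxes() never returns the class names. Assuming the annotations follow the usual Pascal VOC layout, where each <object> element holds a <name> and a <bndbox>, the two methods could be adjusted roughly like this (only the changed methods are shown; they reuse the imports and class definition above):

    def extract_boxes(self, filename):
        # Return boxes, their class names, and the image size from a VOC-style XML file.
        root = ElementTree.parse(filename).getroot()
        boxes, names = list(), list()
        # In Pascal VOC, <name> and <bndbox> are siblings under each <object>.
        for obj in root.findall('.//object'):
            names.append(obj.find('name').text)
            box = obj.find('bndbox')
            coors = [int(box.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
            boxes.append(coors)
        width = int(root.find('.//size/width').text)
        height = int(root.find('.//size/height').text)
        return boxes, names, width, height

    def load_mask(self, image_id):
        info = self.image_info[image_id]
        boxes, names, w, h = self.extract_boxes(info['annotation'])
        masks = zeros([h, w, len(boxes)], dtype='uint8')
        class_ids = list()
        for i, (box, name) in enumerate(zip(boxes, names)):
            xmin, ymin, xmax, ymax = box
            masks[ymin:ymax, xmin:xmax, i] = 1
            # map the per-object class name (e.g. 'moto', 'autocar') to its class ID
            class_ids.append(self.class_names.index(name))
        return masks, asarray(class_ids, dtype='int32')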
Has anyone managed to do this successfully? I work with images and mask files, so converting them to a COCO-like format is time consuming and obnoxious. If there's a way to train in a "flow from directory"-like fashion, I'd much rather use that!
@Mahi-Mai Do you know how to use binary masks in the Mask R-CNN model?
I have a single binary mask containing several objects, instead of a binary mask for each object.
Or does anyone know how to convert this mask with several objects into a JSON format?
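If the instances in that single mask don't touch each other, one common trick (a sketch, not something built into this repo; the file name is a placeholder) is to split it into per-instance channels with connected-component labelling:

import numpy as np
import skimage.io
from skimage.measure import label

# Load one binary mask image that contains several separate objects.
combined = skimage.io.imread("mask.png")
if combined.ndim == 3:                    # collapse RGB masks to one channel
    combined = combined[..., 0]
combined = combined > 0

labelled = label(combined)                # 0 = background, 1..N = instances
instance_ids = np.unique(labelled)[1:]    # drop the background label

# Build the [height, width, instance_count] stack that load_mask() should return.
masks = np.stack([labelled == i for i in instance_ids], axis=-1)
class_ids = np.ones(masks.shape[-1], dtype=np.int32)  # all instances share one class here

Touching or overlapping instances can't be separated this way, so in that case you need per-instance masks or polygon annotations from the start.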