Hi,
Thanks a lot for the awesome repository.
I went through the train_shapes file, which describes how to train on your own dataset.
But everything there is done with randomly generated shapes. Could you explain the same process using actual images that have ground-truth masks, class labels, and bounding-box information?
Regards,
Pirag
Hi Pirag, the file coco.py is an example of training on the COCO dataset, which consists of natural images with ground-truth masks and bounding boxes.
Thanks a lot
Is there specific software you would advise for annotating the segments?
Annotating masks is time consuming, so start by searching for public datasets that might include the objects you're trying to identify. If you can't find any and you decide to do it manually then http://labelme.csail.mit.edu/ and http://labelml.com/ seem like good options.
Thanks
@waleedka
I checked the JSON file that comes with COCO and started the training successfully.
Can you please advise me on how to create one with a similar structure from my own images?
I checked http://labelme.csail.mit.edu/ and http://labelml.com/ but the output is a single XML file for each image.
Is there any document that explains what the annotation JSON file structure should be?
Thanks again for your great work.
I have recently discovered http://www.robots.ox.ac.uk/~vgg/software/via/ which outputs a .json for the polygons you classify on your image, but this .json output is different in structure from the COCO .json annotations. Is there some kind of COCO-specific GUI to generate annotations?
As an example, I want to train this on some medical images (tissue samples) and there are just no datasets that would be annotated already. So I need to do this on my own.
This is the JSON format used by DeepLab.
I guess it is similar to the COCO file.
deepLabJSONAnnotationFormat.txt
I'm not aware of good documentation of the COCO json format. I guess you can try to read the code of the official COCO APIs and infer it from there.
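For what it's worth, the top-level layout of a COCO-style annotation file is fairly simple. The sketch below writes a minimal one; the file names, IDs, and polygon values are placeholders, and the real COCO files also carry a few optional sections such as "info" and "licenses":

import json

# Minimal COCO-style annotation skeleton (all values are illustrative placeholders).
coco_style = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # One or more polygons, each a flat [x1, y1, x2, y2, ...] list.
            "segmentation": [[100, 100, 200, 100, 200, 200, 100, 200]],
            "bbox": [100, 100, 100, 100],   # [x, y, width, height]
            "area": 10000,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "my_class", "supercategory": "object"},
    ],
}

with open("annotations.json", "w") as f:
    json.dump(coco_style, f)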
Alternatively, you can use any other format. You just need to provide a load_mask() function in the Dataset class that can load that format and convert it to a Numpy array.
My dataset consists of JPEG images and JPEG segmentation masks.
How do I load it?
What format does the network ultimately use?
If you look in coco.py you'll see that the Dataset method load_mask() decodes COCO's polygonal format into binary image masks. Mask R-CNN uses image masks, not polygons, for training. In the COCO case, load_mask() calls annToMask(), which returns an image. For your new dataset, where the masks are already images, you can write your own load_mask() method that instead reads the JPEGs, recasts them as arrays of the appropriate size and type, and returns, for all the masks associated with the given image, a stack of binary masks plus a NumPy array of class IDs, just as is done in coco.py's version of load_mask().
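As an illustration only (the "masks" key in image_info, the directory layout, and the single-class assumption are all made up for this sketch, not part of the repo), a load_mask() for per-instance JPEG mask files could look roughly like this:

import numpy as np
import skimage.io
from mrcnn import utils   # older checkouts of the repo: import utils

class JpegMaskDataset(utils.Dataset):
    def load_mask(self, image_id):
        info = self.image_info[image_id]
        # assumed: add_image() was called with height=, width=, and
        # masks=[list of per-instance mask file paths]
        mask_paths = info["masks"]
        masks = np.zeros((info["height"], info["width"], len(mask_paths)), dtype=bool)
        for i, path in enumerate(mask_paths):
            m = skimage.io.imread(path)
            if m.ndim == 3:          # collapse RGB mask images to one channel
                m = m[..., 0]
            masks[:, :, i] = m > 0   # binarize: anything non-zero counts as the object
        # one class ID per instance; here every instance belongs to a single class with ID 1
        class_ids = np.ones(len(mask_paths), dtype=np.int32)
        return masks, class_ids

Note that the background never gets its own channel: the stack has one channel per object instance, and class_ids holds the class of each channel.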
Thank you!
Do the masks have to be of shape [height, width, instance_count]?
Hi,
If I have an image that contains only 2 classes, powerline and BG,
should the mask be [h, w, 2]?
I'm confused: does BG need to be one of the masks in the stack of binary masks?
Can anyone tell me how I can set the path to train Mask R-CNN:

parser = argparse.ArgumentParser(
    description='Train Mask R-CNN on MS COCO.')
parser.add_argument('-command',
                    metavar='train',
                    help="'train' or 'evaluate' on MS COCO")
parser.add_argument('-model',
                    metavar='COCO_DIR',
                    help='Directory of the MS-COCO dataset')
parser.add_argument('-dataset',
                    metavar='COCO_DIR/mask_rcnn_coco.h5',
                    help="Path to weights .h5 file or 'coco'")
args = parser.parse_args()

print("Command: ", args.command)
print("Model: ", args.model)
print("Dataset: ", args.dataset)

It outputs:
Command: None
Model: None
Dataset: None
@EricccChung did you work it out? Any advice? I also would like to train on 2 classes: background and object. Do we need the background class? Should the mask be [h, w, 2], as you mentioned in your last message?
@matiqul:
Train a new model starting from pre-trained COCO weights
python3 coco.py train --dataset=pathToCoCoDataset --model=coco
Train a new model starting from ImageNet weights
python3 coco.py train --dataset=/path/to/coco/ --model=imagenet
Continue training a model that you had trained earlier
python3 coco.py train --dataset=pathToCoCoDataset --model=pathToFile/mask_rcnn_coco.h5
Continue training the last model you trained. This will find the last trained weights in the model directory.
python coco.py train --dataset=pathToCoCoDataset --model=last
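For the argparse question above: the repo's script defines command as a positional argument and --dataset/--model as double-dash options, so a parser along these lines (a sketch of the usual pattern, not a verbatim copy of coco.py) picks up the commands listed above instead of printing None:

import argparse

parser = argparse.ArgumentParser(description='Train Mask R-CNN on MS COCO.')
parser.add_argument('command',                     # positional: 'train' or 'evaluate'
                    metavar='<command>',
                    help="'train' or 'evaluate' on MS COCO")
parser.add_argument('--dataset', required=True,
                    metavar='/path/to/coco/',
                    help='Directory of the MS-COCO dataset')
parser.add_argument('--model', required=True,
                    metavar='/path/to/weights.h5',
                    help="Path to weights .h5 file or 'coco'")
args = parser.parse_args()

print("Command: ", args.command)
print("Model:   ", args.model)
print("Dataset: ", args.dataset)

Called as python3 coco.py train --dataset=/path/to/coco/ --model=coco, all three values are filled in. Single-dash options like -command are optional in argparse, which is why they come back as None when nothing is passed.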
@mmejeras have you found a solution?
Because I am also working on some medical images, so I need to create my own dataset.
I've built a very simple JSON annotation tool because I faced the same problem of creating a dataset from scratch. It also has components that integrate it with a pretrained Mask R-CNN model, to make the workflow easier once a good model is trained. Here's the link: https://github.com/Deep-Magic/COCO-Style-Dataset-Generator-GUI.
How do I train the pet dataset with masks? I created the TF test and eval record files, but when I launch train.py the masks are not displayed. I think we have to launch training using the masks, but how? I am using the PNG masks, the XML files, and pet_tf_createrecord.py.
Thanks, I've spent weeks on this problem!
Hi, I have aerial images of vehicles in JPG format and bounding boxes in text files. How can I prepare them for training Mask R-CNN? Any help would be appreciated. :-)
@anookeen Check #297. It was discussed step by step how to prepare the data. Some functions have changed since then, but the main idea is still the same. Also, follow the nucleus demo.
Loading weights C:\Users\Administrator\Desktop\Mask_RCNN-master\logs\waterbodies20180903T1051\mask_rcnn_waterbodies_0011.h5
2018-09-10 11:07:41.152244: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Running on C:\Users\Administrator\Desktop\waterbodies\val\15.JPG
Processing 1 images
image          shape: (369, 305, 3)       min:    0.00000  max:  255.00000  uint8
molded_images  shape: (1, 1024, 1024, 3)  min: -123.70000  max:  151.10000  float64
image_metas    shape: (1, 14)             min:    0.00000  max: 1024.00000  float64
anchors        shape: (1, 261888, 4)      min:   -0.35390  max:    1.29134  float32
Traceback (most recent call last):
  File "waterbodies.py", line 298, in <module>
    detect_and_color_splash(model, image_path=args.image)
  File "waterbodies.py", line 216, in detect_and_color_splash
    skimage.io.imsave(file_name, splash)
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\skimage\io\_io.py", line 131, in imsave
    if is_low_contrast(arr):
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\skimage\exposure\exposure.py", line 503, in is_low_contrast
    dlimits = dtype_limits(image, clip_negative=False)
  File "C:\Users\Administrator\Anaconda3\lib\site-packages\skimage\util\dtype.py", line 49, in dtype_limits
    imin, imax = dtype_range[image.dtype.type]
KeyError:
A bit late to the party, but for the record fast.ai has an excellent set of notebooks on object detection and segmentation that use a subset of the COCO dataset. FWIW, Jeremy Howard's code and video go through the format of the COCO annotations in a thorough and understandable way.
Hello, I want to know if anyone has been able to use JPG images with XML annotations for training. In the balloon example they use JSON; my annotations were made with labelImg.
I would appreciate any help.
Hi, I need help. I'd like to apply coco.py to my own dataset, which has 4 classes, but I get the following warning: WARNING:root:You are using the default load_mask(), maybe you need to define your own one. Maybe I need to modify load_mask()? I would appreciate any help. Thank you.
Can you be more specific? Can you share the code of your load_mask()?
Can someone please help me.
I am trying to use Mask R-CNN for 10 classes and I am not able to figure out how to annotate the images using the VGG annotator. I found https://github.com/bsaldivaremc2/keras_tutorials/blob/master/VGG_Data_extraction.ipynb
which helps with understanding the data, but how do we annotate it for 10 classes to train?
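One common approach (this is what the balloon sample does for a single class, extended here as a rough sketch for several): in VIA, define a region attribute, for example a dropdown called "class", and pick the right value while drawing each polygon. The exported JSON then carries the class name per region, which load_mask() can map to a class ID. The attribute name "class", the file name, and the class list below are assumptions for the sketch:

import json

CLASS_NAMES = ["car", "truck", "bus"]  # ...your 10 class names, in a fixed order

# Plain "export annotations (as json)" output from VIA; the file name is a placeholder.
annotations = json.load(open("via_region_data.json"))
for img in annotations.values():
    regions = img["regions"]
    if isinstance(regions, dict):      # VIA 1.x keys regions by index; 2.x uses a list
        regions = list(regions.values())
    for r in regions:
        xs = r["shape_attributes"]["all_points_x"]   # polygon vertices
        ys = r["shape_attributes"]["all_points_y"]
        label = r["region_attributes"]["class"]      # the attribute you defined in VIA
        class_id = 1 + CLASS_NAMES.index(label)      # ID 0 is reserved for background
        print(label, class_id, len(xs), "polygon points")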
@look585record10s
Please can you share how you were able to solve this problem:
WARNING:root:You are using the default load_mask(), maybe you need to define your own one.
I am currently stuck here. Please, I need help.
Dear kzish,
I was able to solve the issues, and training of Mask R-CNN is now progressing!
My notes are below:
In config.py:
line 28: GPU_COUNT = 4
line 73: NUM_CLASSES = 1 + 4  # background + number of my own classes
In coco.py:
line 84: GPU_COUNT = 4
line 87: NUM_CLASSES = 1 + 4
line 116:
self.add_class("coco", 1, "AAA")
self.add_class("coco", 2, "BBB")
self.add_class("coco", 3, "CCC")
self.add_class("coco", 4, "DDD")
I also needed to change the versions of tensorflow-gpu and keras:
tensorflow-gpu == 1.7.0
keras == 2.1.3
After modifying the above:
python setup.py install
python samples/coco/coco.py train --dataset=/home/dl-box/path/to/coco --model=imagenet
In particular, running "python setup.py install" is needed again after modifying your files. I strongly recommend not forgetting it.
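The same settings can also be expressed as a Config subclass instead of editing config.py in place; a minimal sketch (the class name and NAME value are made up, and the import assumes the packaged layout, while older checkouts use "from config import Config"):

from mrcnn.config import Config   # older checkouts: from config import Config

class FourClassConfig(Config):
    """Same settings as the notes above, as a Config subclass."""
    NAME = "my_dataset"
    GPU_COUNT = 4
    NUM_CLASSES = 1 + 4   # background + 4 custom classes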
Dear look585record10s,
I'm working on Mask R-CNN. I'm having trouble implementing this architecture as a Keras model. In particular, I don't know how to add more object classes.
from os import listdir
from xml.etree import ElementTree
from numpy import zeros, asarray
from mrcnn.utils import Dataset   # Dataset base class from the Mask R-CNN repo

class AvakoCounting(Dataset):
    # load the dataset definitions
    def load_dataset(self, is_train=True):
        # define the classes (background is class 0 and is added automatically)
        self.add_class("dataset", 1, "voiture")
        self.add_class("dataset", 2, "moto")
        self.add_class("dataset", 3, "autocar")
        self.add_class("dataset", 4, "camion_2_essieux")
        self.add_class("dataset", 5, "camion_3_essieux")
        self.add_class("dataset", 6, "camion_4_essieux")
        self.add_class("dataset", 7, "ens_articule_3_essieux")
        self.add_class("dataset", 8, "ens_articule_4_essieux")
        self.add_class("dataset", 9, "ens_articule_5_essieux")
        self.add_class("dataset", 10, "ens_articule_6_essieux")
        self.add_class("dataset", 11, "minicar")
        self.add_class("dataset", 12, "tricycle")
        self.add_class("dataset", 13, "camionnette")
        self.add_class("dataset", 14, "camion_tricycle")
        images_test = '/content/drive/My Drive/Stage_ISE/workspace/training/images/test/'
        images_train = '/content/drive/My Drive/Stage_ISE/workspace/training/images/train/'
        annotations_dir = '/content/drive/My Drive/Stage_ISE/workspace/training/annotations/'
        images_dir = '/content/images_completes/'
        # find all images
        d = 0
        for filename in listdir(images_dir):
            nom_image = filename.split('.jpg')[0]
            d = d + 1
            image_id = d
            # pick the train or the test/val directory for this image
            if is_train:
                directory = images_train
                ann_path = annotations_dir + nom_image + '.xml'
                print("train")
            if not is_train:
                directory = images_test
                ann_path = annotations_dir + nom_image + '.xml'
                print("test")
            # add to dataset
            img_path = directory + filename
            self.add_image('dataset', image_id=image_id, path=img_path, annotation=ann_path)

    # extract bounding boxes from an annotation file
    def extract_boxes(self, filename):
        # load and parse the file
        tree = ElementTree.parse(filename)
        # get the root of the document
        root = tree.getroot()
        # extract each bounding box
        boxes = list()
        for box in root.findall('.//bndbox'):
            xmin = int(box.find('xmin').text)
            ymin = int(box.find('ymin').text)
            xmax = int(box.find('xmax').text)
            ymax = int(box.find('ymax').text)
            # note: the object's class name lives on the parent <object> element,
            # not on <bndbox>, so it is not collected here (see the sketch below)
            coors = [xmin, ymin, xmax, ymax]
            boxes.append(coors)
        # extract image dimensions
        width = int(root.find('.//size/width').text)
        height = int(root.find('.//size/height').text)
        return boxes, width, height

    # load the masks for an image
    def load_mask(self, image_id):
        # get details of the image
        info = self.image_info[image_id]
        # box file location
        path = info['annotation']
        # load the XML
        boxes, w, h = self.extract_boxes(path)
        # create one array for all masks, each on a different channel
        masks = zeros([h, w, len(boxes)], dtype='uint8')
        # create masks
        class_ids = list()
        for i in range(len(boxes)):
            box = boxes[i]
            row_s, row_e = box[1], box[3]
            col_s, col_e = box[0], box[2]
            masks[row_s:row_e, col_s:col_e, i] = 1
            # every instance is currently labelled 'voiture', regardless of its real class
            class_ids.append(self.class_names.index('voiture'))
        return masks, asarray(class_ids, dtype='int32')
If you look in coco.py you'll see that the Dataset method load_mask() decodes COCO's polygonal format into binary image masks. Mask R-CNN uses image masks, not polygons, for training. In the COCO case, load_mask() calls annToMask(), which returns an image. For your new dataset, where the masks are already images, you can write your own load_mask() method that instead reads the JPEGs, recasts them as arrays of the appropriate size and type, and returns, for all the masks associated with the given image, a stack of binary masks plus a NumPy array of class IDs, just as is done in coco.py's version of load_mask().
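A sketch of the missing piece in the code above: load_mask() currently labels every box as 'voiture' because extract_boxes() never returns the class names. Assuming the annotations follow the usual Pascal VOC layout, where each <object> element holds a <name> and a <bndbox>, the two methods could be adjusted roughly like this (only the changed methods are shown; they reuse the imports and class definition above):

    def extract_boxes(self, filename):
        # Return boxes, their class names, and the image size from a VOC-style XML file.
        root = ElementTree.parse(filename).getroot()
        boxes, names = list(), list()
        # In Pascal VOC, <name> and <bndbox> are siblings under each <object>.
        for obj in root.findall('.//object'):
            names.append(obj.find('name').text)
            box = obj.find('bndbox')
            coors = [int(box.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
            boxes.append(coors)
        width = int(root.find('.//size/width').text)
        height = int(root.find('.//size/height').text)
        return boxes, names, width, height

    def load_mask(self, image_id):
        info = self.image_info[image_id]
        boxes, names, w, h = self.extract_boxes(info['annotation'])
        masks = zeros([h, w, len(boxes)], dtype='uint8')
        class_ids = list()
        for i, (box, name) in enumerate(zip(boxes, names)):
            xmin, ymin, xmax, ymax = box
            masks[ymin:ymax, xmin:xmax, i] = 1
            # map the per-object class name (e.g. 'moto', 'autocar') to its class ID
            class_ids.append(self.class_names.index(name))
        return masks, asarray(class_ids, dtype='int32')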
Has anyone managed to do this successfully? I work with images and mask files, so converting them to a COCO-like format is time consuming and obnoxious. If there's a way to train in a "flow from directory"-like fashion, I'd much rather use that!
@Mahi-Mai Do you know how to use binary masks in the Mask R-CNN model?
I have a single binary mask containing several objects, instead of a binary mask for each object.
Or does anyone know how to convert this mask with several objects into a JSON format?
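If the instances in that single mask don't touch each other, one common trick (a sketch, not something built into this repo; the file name is a placeholder) is to split it into per-instance channels with connected-component labelling:

import numpy as np
import skimage.io
from skimage.measure import label

# Load one binary mask image that contains several separate objects.
combined = skimage.io.imread("mask.png")
if combined.ndim == 3:                    # collapse RGB masks to one channel
    combined = combined[..., 0]
combined = combined > 0

labelled = label(combined)                # 0 = background, 1..N = instances
instance_ids = np.unique(labelled)[1:]    # drop the background label

# Build the [height, width, instance_count] stack that load_mask() should return.
masks = np.stack([labelled == i for i in instance_ids], axis=-1)
class_ids = np.ones(masks.shape[-1], dtype=np.int32)  # all instances share one class here

Touching or overlapping instances can't be separated this way, so in that case you need per-instance masks or polygon annotations from the start.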