Mask_rcnn: len(images) must be equal to BATCH_SIZE

Created on 14 Feb 2019  ·  15 Comments  ·  Source: matterport/Mask_RCNN

GPU_COUNT = 2
IMAGES_PER_GPU = 2
When I ran the balloon splash sample, it crashed and raised AssertionError: len(images) must be equal to BATCH_SIZE

len(images) = 1
BATCH_SIZE = 2

All 15 comments

Hi,
@penbury how did you solve this? Can you share with me?
I train with GPU_COUNT = 4
IMAGES_PER_GPU = 3

Set len(images) equal to GPU_COUNT * IMAGES_PER_GPU; it might work.

Thank you, but how do I set len(images)?
I'm using model.detect([image], verbose=1)[0]

Do you mean adding more images to the list?
GPU_COUNT = 4 & IMAGES_PER_GPU = 3 => batch_size = 12
then
model.detect([image_1, image_2, ..., image_12], verbose=1)[0], right?
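Exactly. The idea can be sketched with a small helper that pads a short list up to the batch size, so model.detect always receives exactly BATCH_SIZE images (pad_to_batch is a hypothetical helper, not part of Mask_RCNN):

```python
def pad_to_batch(images, batch_size):
    """Pad a short image list to batch_size by repeating the last image,
    or truncate a long one, so len(result) == batch_size."""
    if not images:
        raise ValueError("need at least one image")
    padded = list(images[:batch_size])
    while len(padded) < batch_size:
        padded.append(padded[-1])
    return padded

# e.g. two images, BATCH_SIZE = 12 (GPU_COUNT = 4, IMAGES_PER_GPU = 3)
batch = pad_to_batch(["img_1", "img_2"], 12)  # len(batch) == 12
```

model.detect(batch, verbose=1) is then called as usual, and the results for the repeated padding entries are simply discarded.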

@shaolinkhoa Maybe you can try this PR #1082.

Thank you @chAwater .
I tried all in the PR #1082 :

  • Changed "BATCH_SIZE" to "IMAGES_PER_GPU" on line 820 of model.py
  • Added "DETECTION_MAX_INSTANCES = 2000"

The training phase is OK, but the inference phase always raises an error when I use model.detect:

sub_list = []
missing_count = 0
for i, row in tqdm(sample_df.iterrows(), total=len(sample_df)):
    image = resize_image(str(DATA_DIR/'test'/row['ImageId']))
    result = model.detect([image])[0]
    if result['masks'].size > 0:
        masks, _ = refine_masks(result['masks'], result['rois'])
        for m in range(masks.shape[-1]):
            mask = masks[:, :, m].ravel(order='F')
            rle = to_rle(mask)
            label = result['class_ids'][m] - 1
            sub_list.append([row['ImageId'], ' '.join(list(map(str, rle))), label])
    else:
        # The system does not allow missing ids, this is an easy way to fill them 
        sub_list.append([row['ImageId'], '1 1', 23])
        missing_count += 1
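(resize_image, refine_masks, to_rle, DATA_DIR and sample_df are the poster's own helpers. For readers wondering about to_rle: a minimal pure-Python run-length encoder over an already-flattened binary mask, producing 1-indexed start/length pairs in the style common in segmentation competitions, might look like the sketch below. It is an illustration, not the poster's actual implementation; the surrounding code flattens the mask column-major with ravel(order='F') first.)

```python
def mask_to_rle(pixels):
    """Encode a flat sequence of 0/1 values as [start1, len1, start2, len2, ...],
    where each start is the 1-indexed position of a run of 1s."""
    rle = []
    run_start = None
    for i, v in enumerate(pixels, start=1):  # 1-indexed positions
        if v and run_start is None:
            run_start = i            # a run of 1s begins here
        elif not v and run_start is not None:
            rle.extend([run_start, i - run_start])  # run ended at i-1
            run_start = None
    if run_start is not None:        # mask ends mid-run
        rle.extend([run_start, len(pixels) + 1 - run_start])
    return rle

rle = mask_to_rle([0, 1, 1, 0, 1])  # [2, 2, 5, 1]
```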

Hi, @shaolinkhoa

First, changing DETECTION_MAX_INSTANCES is not necessary to make this PR work (it's also not the cause of your error). You should keep it at the default value unless you have another purpose.

The inference error is raised because your model still uses the "training config", so BATCH_SIZE is not 1, which causes an error on model.detect([image]).

There are two ways to solve that:

  1. Use model.detect(img_list) and make sure that len(img_list) == BATCH_SIZE
  2. Create a new model for inference:
## For training
class TrainConfig(Config):
    GPU_COUNT = 2
    IMAGES_PER_GPU = 2
    ...

train_config = TrainConfig()
model = modellib.MaskRCNN(mode="training", config=train_config, ...)

### Training ###

## For inference
class InferConfig(Config):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    ...

infer_config = InferConfig()
model = modellib.MaskRCNN(mode="inference", config=infer_config, ...)
model.load_weights(XXX)

Hi @chAwater, thank you for helping.
I added more pictures to the input image list and then it worked.

Hi @shaolinkhoa, please, can you help me?
I have the same error: len(images) must be equal to BATCH_SIZE
I train with GPU_COUNT = 1
IMAGES_PER_GPU = 2
BATCH_SIZE = 2.
I did everything in the topic: PR #1082, and changed line 820 here:
https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/model.py#L820

but I still have the same error.

Hi,
What is your model's config: training or inference?
If it is inference, your input has to contain 2 images in the list, e.g. input = [image_1, image_2]

Hi @shaolinkhoa, many thanks for your help.
My model config is "inference", and the input had 1 image in the list. Now I changed the input and it helped!!!! 👍

I also had this issue when running on multiple GPUs. For anyone who also has this issue, and finds the above chain hard to follow (like I did), here is what needs to be changed:

Mask_RCNN-master/mrcnn/model.py

Line 820: Change from

[self.config.BATCH_SIZE, self.config.DETECTION_MAX_INSTANCES, 6])

to

[self.config.IMAGES_PER_GPU, self.config.DETECTION_MAX_INSTANCES, 6])

Mask_RCNN-master/samples/balloon/inspect_balloon_model.ipynb

Import Block:

Change from

import random

to

from numpy import random

Run Detection Block:

Change from

image_id = random.choice(dataset.image_ids)
image, image_meta, gt_class_id, gt_bbox, gt_mask =\
    modellib.load_image_gt(dataset, config, image_id, use_mini_mask=False)
info = dataset.image_info[image_id]
print("image ID: {}.{} ({}) {}".format(info["source"], info["id"], image_id, 
                                       dataset.image_reference(image_id)))

# Run object detection
results = model.detect([image], verbose=1)

# Display results
ax = get_ax(1)
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'], 
                            dataset.class_names, r['scores'], ax=ax,
                            title="Predictions")
log("gt_class_id", gt_class_id)
log("gt_bbox", gt_bbox)
log("gt_mask", gt_mask)

to

# Change from a single image to a list of images, where len(images) == BATCH_SIZE
# (BATCH_SIZE == IMAGES_PER_GPU * GPU_COUNT), to run on multiple GPUs
image_ids = random.choice(dataset.image_ids, config.BATCH_SIZE)

images = []
gt_data = []
for image_id in image_ids:
    image, image_meta, gt_class_id, gt_bbox, gt_mask =\
        modellib.load_image_gt(dataset, config, image_id, use_mini_mask=False)
    info = dataset.image_info[image_id]
    images.append(image)
    # keep the ground truth for each image so it can be logged with its result
    gt_data.append((gt_class_id, gt_bbox, gt_mask))
    print("image ID: {}.{} ({}) {}".format(info["source"], info["id"], image_id,
                                           dataset.image_reference(image_id)))

# Run object detection
results = model.detect(images, verbose=1)

for i, image in enumerate(images):
    r = results[i]
    gt_class_id, gt_bbox, gt_mask = gt_data[i]

    visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                                dataset.class_names, r['scores'])
    log("gt_class_id", gt_class_id)
    log("gt_bbox", gt_bbox)
    log("gt_mask", gt_mask)
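The same pattern generalizes to an arbitrary number of images: split the list into chunks of BATCH_SIZE and pad the last chunk. The helper below is a generic sketch (chunk_batches is hypothetical, not part of Mask_RCNN):

```python
def chunk_batches(items, batch_size):
    """Yield (batch, valid_count) pairs. The last batch is padded by
    repeating its final element so every batch has exactly batch_size items;
    results at indices >= valid_count should be discarded."""
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        valid = len(batch)
        while len(batch) < batch_size:
            batch.append(batch[-1])
        yield batch, valid

# e.g. 7 images with BATCH_SIZE = 4 -> two batches, the second padded
batches = list(chunk_batches(list(range(7)), 4))
```

Each batch is then passed to model.detect as a whole, and only the first valid_count results of the final batch are kept.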

I can submit this as an actual pull request if need be.

I simply changed the IMAGES_PER_GPU value to 1 (rather than the train set count) and it worked.

class PredictionConfig(Config):
    # define the name of the configuration
    NAME = "Inlet_cfg"
    # number of classes (background + Inlet)
    NUM_CLASSES = 1 + 1
    # simplify GPU config
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1  # updated; initially it was the total train set count
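For background, this works because matterport's mrcnn/config.py derives the effective batch size from these two values in the Config constructor, which is why GPU_COUNT = 1 and IMAGES_PER_GPU = 1 makes single-image detect calls valid. A minimal sketch of that relationship (the real class sets many more attributes):

```python
class Config:
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    def __init__(self):
        # effective batch size: images processed per step across all GPUs
        self.BATCH_SIZE = self.IMAGES_PER_GPU * self.GPU_COUNT

class TrainConfig(Config):
    GPU_COUNT = 2
    IMAGES_PER_GPU = 2   # BATCH_SIZE becomes 4

class PredictionConfig(Config):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1   # BATCH_SIZE becomes 1
```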

Hey @Trotts, I have tried your solution and I'm still having the same issue. What did I miss?

What error are you receiving? What does your code look like?

@Trotts

I followed this tutorial

and came up with this code:

from os import listdir
from xml.etree import ElementTree

from matplotlib import pyplot
from matplotlib.patches import Rectangle
from numpy import zeros
from numpy import asarray, expand_dims
from mrcnn.utils import Dataset
from mrcnn.config import Config
from mrcnn.model import MaskRCNN
from mrcnn.utils import compute_ap
from mrcnn.model import load_image_gt
from mrcnn.model import mold_image

dataset_path = '/Users/temporaryuser/PycharmProjects/ImageDetection/weapon_detection/weapon_model/Mask_RCNN/knives'
model_path = '/Users/temporaryuser/PycharmProjects/ImageDetection/weapon_detection/weapon_model/Mask_RCNN' \
                '/mask_rcnn_coco.h5'


# class that defines and loads the knife dataset
class KangarooDataset(Dataset):
    # load the dataset definitions
    def load_dataset(self, dataset_dir, is_train=True):
        # define two class
        self.add_class('dataset', 1, 'gun')  # gun dataset
        self.add_class('dataset', 2, 'knife')  # knife dataset
        # define data locations
        images_dir = dataset_dir + '/images/'
        annotations_dir = dataset_dir + '/annots/'
        # find all images
        for filename in listdir(images_dir):
            # extract image id
            image_id = filename[:-4]
            # print('IMAGE ID: ', image_id)
            # skip all images at or above the boundary if we are building the train set
            crosser = 2000112
            if not image_id.isdigit():
                continue
            if is_train and int(image_id) >= crosser:  # set limit for your train and test set
                continue
            # skip all images below the boundary if we are building the test/val set
            if not is_train and int(image_id) < crosser:
                continue
            img_path = images_dir + filename
            ann_path = annotations_dir + image_id + '.xml'
            # add to dataset
            self.add_image('dataset', image_id=image_id, path=img_path, annotation=ann_path, class_ids=[0, 1, 2])
            # 0:BG, 1:knife, 2:gun

    # extract bounding boxes from an annotation file
    def extract_boxes(self, filename):
        # load and parse the file
        tree = ElementTree.parse(filename)
        # get the root of the document
        root = tree.getroot()
        # extract each bounding box
        boxes = list()
        # for box in root.findall('.//bndbox'):
        for box in root.findall('.//object'):  # Change required
            name = box.find('name').text  # Change required
            xmin = int(box.find('./bndbox/xmin').text)
            ymin = int(box.find('./bndbox/ymin').text)
            xmax = int(box.find('./bndbox/xmax').text)
            ymax = int(box.find('./bndbox/ymax').text)
            # coors = [xmin, ymin, xmax, ymax, name]
            coors = [xmin, ymin, xmax, ymax, name]  # Change required
            boxes.append(coors)
        # extract image dimensions
        width = int(root.find('.//size/width').text)
        height = int(root.find('.//size/height').text)
        return boxes, width, height

    # load the masks for an image
    def load_mask(self, image_id):
        # get details of image
        info = self.image_info[image_id]
        # define box file location
        path = info['annotation']
        # load XML
        boxes, w, h = self.extract_boxes(path)
        # create one array for all masks, each on a different channel
        masks = zeros([h, w, len(boxes)], dtype='uint8')
        # create masks
        class_ids = list()
        for i in range(len(boxes)):
            box = boxes[i]
            row_s, row_e = box[1], box[3]
            col_s, col_e = box[0], box[2]
            if box[4] == 'knife':  # Change required #change this to your .XML file
                masks[row_s:row_e, col_s:col_e, i] = 1  # masks are binary; the class label goes in class_ids
                class_ids.append(self.class_names.index('knife'))  # Change required
            else:
                masks[row_s:row_e, col_s:col_e, i] = 1
                class_ids.append(self.class_names.index('gun'))  # Change required
        return masks, asarray(class_ids, dtype='int32')

    # load an image reference
    def image_reference(self, image_id):
        info = self.image_info[image_id]
        return info['path']


# define a configuration for the model
class KangarooConfig(Config):
    # define the name of the configuration
    NAME = "kangaroo_cfg"
    # number of classes (background + knife + gun)
    NUM_CLASSES = 1 + 2  # Change required
    # number of training steps per epoch
    STEPS_PER_EPOCH = 90


# plot a number of photos with ground truth and predictions
def plot_actual_vs_predicted(dataset, model, cfg, n_images=5):
    # load image and mask
    for i in range(n_images):
        # load the image and mask
        image = dataset.load_image(i)
        mask, _ = dataset.load_mask(i)
        # convert pixel values (e.g. center)
        scaled_image = mold_image(image, cfg)
        # convert image into one sample
        sample = expand_dims(scaled_image, 0)
        # make prediction
        yhat = model.detect(sample, verbose=0)[0]
        # define subplot
        pyplot.subplot(n_images, 2, i * 2 + 1)
        # plot raw pixel data
        pyplot.imshow(image)
        pyplot.title('Actual')
        # plot masks
        for j in range(mask.shape[2]):
            pyplot.imshow(mask[:, :, j], cmap='gray', alpha=0.3)
        # get the context for drawing boxes
        pyplot.subplot(n_images, 2, i * 2 + 2)
        # plot raw pixel data
        pyplot.imshow(image)
        pyplot.title('Predicted')
        ax = pyplot.gca()
        # plot each box
        for box in yhat['rois']:
            # get coordinates
            y1, x1, y2, x2 = box
            # calculate width and height of the box
            width, height = x2 - x1, y2 - y1
            # create the shape
            rect = Rectangle((x1, y1), width, height, fill=False, color='red')
            # draw the box
            ax.add_patch(rect)
    # show the figure
    pyplot.show()


def train_model():
    # prepare train set
    train_set = KangarooDataset()
    # dataset_path = '/Users/temporaryuser/PycharmProjects/ImageDetection/weapon_detection/weapon_model/Mask_RCNN/knives'
    train_set.load_dataset(dataset_path, is_train=True)
    train_set.prepare()
    print('Train: %d' % len(train_set.image_ids))
    # prepare test/val set
    test_set = KangarooDataset()
    test_set.load_dataset(dataset_path, is_train=False)
    test_set.prepare()
    print('Test: %d' % len(test_set.image_ids))
    # prepare config
    config = KangarooConfig()
    config.display()
    # create config
    # define the model
    model = MaskRCNN(mode='training', model_dir='./', config=config)
    # load weights (mscoco) and exclude the output layers
    # file_path = '/Users/temporaryuser/PycharmProjects/ImageDetection/weapon_detection/weapon_model/Mask_RCNN' \
    #             '/mask_rcnn_coco.h5'
    model.load_weights(model_path, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])
    # train weights (output layers or 'heads')
    model.train(train_set, test_set, learning_rate=config.LEARNING_RATE, epochs=5, layers='heads')

    # plot predictions for train dataset
    # plot_actual_vs_predicted(train_set, model, config)
    # # plot predictions for test dataset
    # plot_actual_vs_predicted(test_set, model, config)


def test_model():
    # load the train dataset
    train_set = KangarooDataset()
    train_set.load_dataset(dataset_path, is_train=True)
    train_set.prepare()
    print('Train: %d' % len(train_set.image_ids))
    # load the test dataset
    test_set = KangarooDataset()
    test_set.load_dataset(dataset_path, is_train=False)
    test_set.prepare()
    print('Test: %d' % len(test_set.image_ids))
    # create config
    cfg = KangarooConfig()
    # define the model
    model = MaskRCNN(mode='inference', model_dir='./', config=cfg)
    # load model weights
    model.load_weights(model_path, by_name=True, exclude=[ "mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])
    # plot predictions for train dataset
    plot_actual_vs_predicted(train_set, model, cfg)
    # plot predictions for test dataset
    plot_actual_vs_predicted(test_set, model, cfg)



if __name__ == "__main__":
    test_model()
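As a side note, the extract_boxes logic in the code above can be sanity-checked in isolation against a small Pascal-VOC-style annotation. The XML below is invented for illustration, and extract_boxes_from_string is a hypothetical string-based variant of the same parsing:

```python
from xml.etree import ElementTree

SAMPLE_XML = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>knife</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

def extract_boxes_from_string(xml_text):
    # Same parsing logic as extract_boxes, but from a string instead of a file
    root = ElementTree.fromstring(xml_text)
    boxes = []
    for obj in root.findall('.//object'):
        name = obj.find('name').text
        xmin = int(obj.find('./bndbox/xmin').text)
        ymin = int(obj.find('./bndbox/ymin').text)
        xmax = int(obj.find('./bndbox/xmax').text)
        ymax = int(obj.find('./bndbox/ymax').text)
        boxes.append([xmin, ymin, xmax, ymax, name])
    width = int(root.find('.//size/width').text)
    height = int(root.find('.//size/height').text)
    return boxes, width, height
```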

I have trained my model and it worked fine. Then I tested it using the test_model() function and ran into the error:
len(images) must be equal to BATCH_SIZE

Then I followed this solution to fix it, and now I'm getting this error:

ImportError: cannot import name 'saving' from 'keras.engine'

which is probably caused by downgrading the Keras version.
Can you please help me, @Trotts?
