First of all, thank you for sharing with the community this amazing repository!
I'm experimenting with the Mask R-CNN methodology on a dataset of greyscale images and couldn't help noticing that only RGB images are supported.
While the approach suggested in the default load_image() method of the Dataset class does the trick (i.e. converting from greyscale to RGB), it seems to me that repeating the same data 3 times wastes GPU memory.
I played a bit with the Dataset and Config classes to feed single-channel images; however, I received errors from model.py, which made me realize that the 3-channel requirement might be hard-coded at a deeper level.
I was wondering if there are any plans to add support for images with an arbitrary number of channels.
Thanks in advance!
Did you try IMAGE_SHAPE = [height, width] or [height, width, 1]? I think the second is more likely to work.
Hi @waleedka,
Yes, I tried both as you suggested, but all combinations resulted in errors.
Inspecting the error messages, it looks like they come from resize or padding operations in either utils.py or model.py. For example, here are some of the combinations I tried...
IMAGE_SHAPE = [height, width]
keras/engine/topology.py, line 458, in assert_input_compatibility raised ValueError: Input 0 is incompatible with layer zero_padding2d_1: expected ndim=4, found ndim=3.
IMAGE_SHAPE = [height, width, 1]
utils.py, line 406: np.pad() raised ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (3,2) and requested shape (2,2), and also ValueError: Unable to create correctly shaped tuple from [(110, 110), (0, 0), (0, 0)].
I also tried adding a new axis to the 2-D image by making load_image() return image[..., np.newaxis], but then in utils.py, line 396, resize_image(), imresize() raised ValueError: 'arr' does not have a suitable array shape for any mode.
In both cases I also tried to provide MEAN_PIXEL either as a scalar or as a 1-dimensional array, but this had no effect.
Please let me know if you would like to see the complete stack trace and I will report it here.
Any progress on this issue?
Based on @mminervini's experiments, it looks like changing IMAGE_SHAPE alone is not enough. You'd want to find all the places where IMAGE_SHAPE is used and update the code to expect 1 in the third dimension instead of 3.
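For reference, a minimal sketch of the kind of config override being discussed (the GrayscaleConfig name is made up for illustration; newer versions of the repo expose an IMAGE_CHANNEL_COUNT setting, while older ones hard-code the 3 wherever IMAGE_SHAPE is built, so those spots would need patching too):

import numpy as np
from mrcnn.config import Config

class GrayscaleConfig(Config):
    NAME = "grayscale"
    # Single-channel input instead of RGB (only works if your copy of
    # config.py supports IMAGE_CHANNEL_COUNT).
    IMAGE_CHANNEL_COUNT = 1
    # One mean value instead of three.
    MEAN_PIXEL = np.array([123.7])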
@waleedka Thanks for the explanation. However, after changing IMAGE_SHAPE to [h, w, 1] I guess there are some issues with loading the pretrained weights; it works without using pretrained weights.
That's correct. You can get around that by using most (but not all) of the pretrained weights. The first layer is the only layer that sees the input image, so you can exclude the weights of the first layer when you load the weights (the load_weights function has an exclude parameter).
If you want an even better initialization, you can read the weights of the first layer, average the weights of the RGB channels, and produce a new set of weights that would be better than starting with random weights.
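For what it's worth, a rough sketch of that channel-averaging idea (the weights file name and the HDF5 group path are assumptions; inspect the file with h5py to confirm the layout before relying on it):

import h5py
import numpy as np

# Read the pretrained conv1 kernel (7, 7, 3, 64) and bias (64,) from the
# COCO weights file. The "conv1/conv1/kernel:0" path is a guess at the
# Keras HDF5 layout and may differ in your file.
with h5py.File("mask_rcnn_coco.h5", "r") as f:
    kernel = np.array(f["conv1"]["conv1"]["kernel:0"])
    bias = np.array(f["conv1"]["conv1"]["bias:0"])

# Average the RGB kernels into a single-channel kernel (7, 7, 1, 64).
gray_kernel = kernel.mean(axis=2, keepdims=True)

# After building the single-channel model and loading everything else with
# exclude=["conv1"], copy the averaged kernel into the new conv1 layer.
model.keras_model.get_layer("conv1").set_weights([gray_kernel, bias])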
Hello all, I implemented all the changes as mentioned but still got a Keras-related error:
conv1 (Conv2D)
bn_conv1 (BatchNorm)
res2a_branch2a (Conv2D)
bn2a_branch2a (BatchNorm)
... (weight-loading list continues through all the ResNet res*/bn* stages, the fpn_* layers, the rpn_model layers, and the mrcnn_* heads) ...
mrcnn_class_logits (TimeDistributed)
mrcnn_mask (TimeDistributed)
Epoch 1/30
(1, 1024, 1024, 3)
Traceback (most recent call last):
  File "m2_coco_all_gray.py", line 540, in <module>
    train(model, args.train_target)
  File "m2_coco_all_gray.py", line 281, in train
    augmentation=aug)
  File "C:\Users\S\Anaconda3\envs\tensorflow\lib\site-packages\mrcnn\model.py", line 2417, in train
    use_multiprocessing=True,
  File "C:\Users\S\Anaconda3\envs\tensorflow\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\S\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1417, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\S\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training_generator.py", line 213, in fit_generator
    class_weight=class_weight)
  File "C:\Users\S\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1211, in train_on_batch
    class_weight=class_weight)
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 750, in _standardize_user_data
    exception_prefix='input')
  File "C:\Users\S\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training_utils.py", line 138, in standardize_input_data
    str(data_shape))
ValueError: Error when checking input: expected input_image to have shape (None, None, 1) but got array with shape (1024, 1024, 3)
Any comment or suggestion would be appreciated!
@chengchu88 It is saying that your input is an RGB image rather than a grayscale image. If you are using cv2 to read the image, try cv2.imread('filepath', 0).
@keineahnung2345
Thanks, I figured it out. I should have modified load_image() in utils.py to comment out a few more lines that convert grayscale back to RGB.
Now I get a new error:
Epoch 1/30
(1, 2048, 2048)
Traceback (most recent call last):
  File "manos2_coco_all_gray.py", line 543, in <module>
    train(model, args.train_target)
  File "manos2_coco_all_gray.py", line 284, in train
    augmentation=aug)
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\mrcnn\model.py", line 2417, in train
    use_multiprocessing=True,
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1417, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training_generator.py", line 213, in fit_generator
    class_weight=class_weight)
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 1211, in train_on_batch
    class_weight=class_weight)
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training.py", line 750, in _standardize_user_data
    exception_prefix='input')
  File "C:\Users\Sandisk\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\training_utils.py", line 128, in standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_image to have 4 dimensions, but got array with shape (1, 2048, 2048)
Any help and comment is appreciated.
@chengchu88 Sorry, I forgot that the image read by cv2.imread('xxx', 0) is 2-dimensional. After reading the image, you should use
import numpy as np
img = img[..., np.newaxis]
to add the third dimension.
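Putting the two pieces together, a rough sketch of an overridden load_image() that returns an (H, W, 1) array (the GrayscaleDataset name is made up, and the import paths assume the pip-installed mrcnn package; the stock Dataset.load_image() in utils.py instead converts grayscale to RGB, which is what had to be commented out above):

import cv2
import numpy as np
from mrcnn import utils

class GrayscaleDataset(utils.Dataset):
    def load_image(self, image_id):
        # Read the image as single-channel grayscale (flag 0).
        image = cv2.imread(self.image_info[image_id]["path"], 0)
        # Add the channel axis so the shape is (H, W, 1) rather than (H, W).
        return image[..., np.newaxis]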
Thanks, I appreciate your comment. I will give it a try!
@chengchu88 Did you resolve the problem?
I encountered the same error.
I only found that it happens in data_generator(), but I don't know how to modify it yet.
No. I fixed one, and then another problem popped up. I've kind of put it off till later.
Regarding load_weights: how do you use the exclude parameter to exclude the first layer? Could you provide an example?
In your main function, change

model.load_weights(weights_path, by_name=True)

to

# Load weights of all layers except conv1, which will be initialized to random weights.
model.load_weights(weights_path, by_name=True, exclude=["conv1"])

See more in the last section of https://github.com/matterport/Mask_RCNN/wiki
Hi, does anyone have an idea as to how the pretrained weights are used for the 4th channel? Are they even used at all?
The wiki says,
- If you train a subset of layers, remember to include conv1 since it's initialized to random weights. This is relevant if you pass layers="head" or layers="4+", ...etc. when you call train().
...Uh...how do we do that? Meaning, how do I change
model.train(dataset_train, dataset_val,
learning_rate=config.LEARNING_RATE,
epochs=2,
layers='heads')
to reflect what was said in wiki point 5? As is, I'm getting image shape errors, as in...
ValueError: Error when checking input: expected input_image to have 4 dimensions, but got array with shape (4, 384, 512)
I don't yet have a working solution but this edit to model.py (https://github.com/matterport/Mask_RCNN/issues/1121) seems to be on the right track:
layer_regex = {
# all layers but the backbone
"heads": r"(conv1\_.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
# From a specific Resnet stage and up
"3+": r"(res3.*)|(bn3.*)|(res4.*)|(bn4.*)|(res5.*)|(bn5.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
"4+": r"(res4.*)|(bn4.*)|(res5.*)|(bn5.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
"5+": r"(res5.*)|(bn5.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
# All layers
"all": ".*",
}
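With that edit in place, loading the weights without conv1 and then training the heads would look roughly like the sketch below (assuming model, config, weights_path, and the datasets are already set up as in the samples). One caveat, if I read model.py's set_trainable() correctly: it matches layer names with re.fullmatch, and the backbone's first layer is named plainly conv1, so the (conv1\_.*) pattern above may need to be loosened (e.g. to (conv1.*)) before it actually matches.

# Load all pretrained weights except the first convolution, which has the
# wrong channel count for grayscale input and stays randomly initialized.
model.load_weights(weights_path, by_name=True, exclude=["conv1"])

# Train the heads; with the edited layer_regex, "heads" is meant to also
# cover conv1 so the randomly initialized first layer gets updated.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=2,
            layers="heads")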