Hi, I am still trying to read and understand the papers and the structure of the trained network. Mask R-CNN trained on COCO is already showing promising results on my data, but I was wondering how complicated it would be to present images as input augmented with a fourth channel (a depth image). Would that require retraining the whole network, or even changing the network's structure?
Check this out, the wiki has some guidelines for including more channels at the bottom: https://github.com/matterport/Mask_RCNN/wiki
One step it neglects to mention is that you will need to change part of the build function in mrcnn/model.py. I changed the input_image variable to accept an arbitrary number of channels, passed in through the config class:
# Inputs, changed from the original mrcnn to allow channel counts other than 3
input_image = KL.Input(
    shape=[None, None, config.CHANNELS_NUM], name="input_image")
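Note that CHANNELS_NUM isn't part of the stock Config class; it's an attribute I added in my config subclass. A minimal sketch of what that looks like (the class name and the fourth mean value are just my own choices, not anything the repo defines):

import numpy as np
from mrcnn.config import Config

class FourChannelConfig(Config):
    """Sketch of a config for 4-channel (e.g. RGB + depth) input."""
    NAME = "rgbd"
    CHANNELS_NUM = 4  # read by the modified KL.Input(...) above
    # MEAN_PIXEL is subtracted per channel when images are molded,
    # so it needs one entry per channel; the 4th value is a placeholder.
    MEAN_PIXEL = np.array([123.7, 116.8, 103.9, 100.0])
    # Depending on the repo version, the channel count baked into
    # IMAGE_SHAPE may also need to be updated to match.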
Thanks for the hint.
I tried following the wiki and what you mentioned, but I still have some trouble loading my pretrained model.
After excluding the layer 'input_image', it complains about the second layer:
ValueError: Layer #2 (named "conv1"), weight <tf.Variable 'conv1_3/kernel:0' shape=(7, 7, 4, 64) dtype=float32_ref> has shape (7, 7, 4, 64), but the saved weight has shape (64, 3, 7, 7).
It looks like you are trying to use pretrained weights from ImageNet or COCO, which are 3-channel datasets. I should have specified this in my earlier comment, but you can't use pretrained weights if your input data has a different number of channels than the pretraining dataset, at least not without changing more of the underlying mrcnn code. These code changes only allow you to train from scratch on your data.
Since most large image datasets are RGB, and most satellite remote sensing datasets have RGB plus near-infrared, it would be fantastic to be able to start from 3-channel pretrained weights with inputs of more than 3 channels. If anybody has suggestions on how to do this, please comment! I'm pretty new to CNNs, so advice is much appreciated.
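One approach I've seen mentioned for other networks, which I haven't tried here, is to keep the pretrained RGB kernels of conv1 and initialize only the extra channel yourself (e.g. with the mean of the RGB kernels, or zeros), then load everything else by name. A rough numpy sketch, where the random array is just a stand-in for the real kernel you'd read out of the COCO weights file:

import numpy as np

# Stand-in for the pretrained conv1 kernel, shape (7, 7, 3, 64);
# in practice you'd read it from the COCO .h5 file (e.g. with h5py).
kernel_rgb = np.random.randn(7, 7, 3, 64).astype(np.float32)

# Initialize the new depth channel from the mean of the RGB kernels so the
# layer's initial activation scale stays comparable (zeros would also work).
kernel_depth = kernel_rgb.mean(axis=2, keepdims=True)             # (7, 7, 1, 64)
kernel_rgbd = np.concatenate([kernel_rgb, kernel_depth], axis=2)  # (7, 7, 4, 64)

# Then, after building the 4-channel model and loading the other weights
# with by_name=True and exclude=["conv1", ...], set conv1 manually:
#   layer = model.keras_model.get_layer("conv1")
#   kernel, bias = layer.get_weights()
#   layer.set_weights([kernel_rgbd, bias])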
I excluded the second layer, 'conv1', instead of the first ('input_image'), and it seems to be training fine. It was just a quick try; I'm not sure whether only the second layer was excluded or all layers named conv1.
Hi mehditlili,
According to the network architecture, 'conv1' refers specifically to convolution layer 1: 'input_image' is just the input tensor, while 'conv1' is the actual first conv layer.
If you load the pre-trained weights with the following code, it works:
model.load_weights(COCO_MODEL_PATH, by_name=True,
                   exclude=["conv1", "mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])
I second doing both:
In model.py, in the build() function:
input_image = KL.Input(
    shape=[None, None, config.IMAGE_SHAPE[2]], name="input_image")
Then, before training from the COCO weights:
model.load_weights(weights_path, by_name=True, exclude=[
    "mrcnn_class_logits", "mrcnn_bbox_fc",
    "mrcnn_bbox", "mrcnn_mask", "conv1"])
I also opened a related PR: https://github.com/matterport/Mask_RCNN/pull/940
If you train only a subset of layers, remember to include conv1, since it's initialized with random weights. This is relevant if you pass layers="heads", layers="4+", etc. when you call train(); see the sketch below.
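For example, train() also accepts a raw regex for layers, so one way to do this (an untested sketch, assuming model, the datasets, and config are set up as in the samples, and using the repo's "heads" pattern with conv1 added) is:

# Train the heads plus the randomly initialized conv1.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers=r"(conv1)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)")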
@moorage
May I ask how to include conv1, please? I am trying to use grayscale images of size 2448x2048.
@lunasdejavu I just used layers='all'; see https://github.com/moorage/Mask_RCNN/blob/master/samples/greppy/greppy.py#L365-L368
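Paraphrasing what those linked lines do (the epoch count here is arbitrary):

# layers='all' matches every layer, so the freshly initialized conv1
# gets trained along with everything else.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=40,
            layers='all')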
@moorage thanks; I wrote my code based on sample/train_shapes.py, which trains 'heads' first and then 'all'.
But I am not sure what the difference is between training layers separately and training them all at once?
I also have these two questions (how to include conv1, and the difference between training layers separately and training them all at once). Can anyone help?