Mask_rcnn: What's the purpose of `TimeDistributedLayer` Wrapper in `fpn_classifier_graph()`?

Created on 13 Nov 2018 · 3 Comments · Source: matterport/Mask_RCNN

I see that every layer built in fpn_classifier_graph() in model.py is wrapped in TimeDistributed. I don't understand the purpose of this. According to the Keras documentation, the TimeDistributed wrapper is meant for temporal data, but I don't think the feature maps are temporal data. I also read the original paper and found no clues.

Could you please explain a little bit more about that?

Thanks very much!

apptech-evan-huang

Most helpful comment

@apptech-evan-huang Take the example from https://keras.io/layers/wrappers/:

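from keras.models import Sequential
from keras.layers import TimeDistributed, Conv2D

model = Sequential()
# Input is a sequence of 10 images, each of shape (299, 299, 3)
model.add(TimeDistributed(Conv2D(64, (3, 3)),
                          input_shape=(10, 299, 299, 3)))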

Here the same Conv2D, with the same weights, is applied to each of the 10 timesteps (or, if you prefer, samples) of the input.
The documentation says

"This wrapper applies a layer to every temporal slice of an input."

but the input doesn't have to be a time series. Instead, you can think of the input as 10 separate images, each of shape (299, 299, 3), because the operation is the same either way.
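To make "the operation is the same" concrete, here is a minimal NumPy sketch of the idea. It is not Keras's actual implementation: the helper `time_distributed` and the stand-in layer `conv1x1` (a 1x1 convolution written as a matrix product) are hypothetical names, and the shapes are kept small for illustration. It folds the second axis into the batch axis, applies the layer once, unfolds the result, and compares that with looping over the slices explicitly.

```python
import numpy as np

def time_distributed(fn, x):
    """Apply fn to every slice along axis 1 by folding that axis into the batch axis."""
    batch, steps = x.shape[:2]
    folded = x.reshape((batch * steps,) + x.shape[2:])   # (batch*steps, ...)
    out = fn(folded)                                      # fn sees an ordinary batch
    return out.reshape((batch, steps) + out.shape[1:])   # back to (batch, steps, ...)

rng = np.random.default_rng(0)
w = rng.standard_normal((3, 8))   # shared weights of a hypothetical 1x1 convolution

def conv1x1(images):
    # images: (n, h, w, 3) -> (n, h, w, 8); stand-in for a real Conv2D layer
    return images @ w

x = rng.standard_normal((2, 10, 16, 16, 3))   # 2 batch items, 10 slices each
y = time_distributed(conv1x1, x)

# Looping over the 10 slices one by one gives the same result
y_loop = np.stack([conv1x1(x[:, t]) for t in range(x.shape[1])], axis=1)

print(y.shape)                  # (2, 10, 16, 16, 8)
print(np.allclose(y, y_loop))   # True
```

Nothing in the sketch depends on the second axis being time. It is just an extra axis of independent samples that all go through the same layer.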

In this line of code: https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/model.py#L928,

x = KL.TimeDistributed(KL.Conv2D(fc_layers_size, (pool_size, pool_size), padding="valid"),
                           name="mrcnn_class_conv1")(x)

the input has the shape [batch, num_rois, POOL_SIZE, POOL_SIZE, channels], and the TimeDistributed layer here just applies Conv2D to each of the num_rois ROIs, which has nothing to do with time series.
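The shapes in that line can be traced with a small NumPy sketch. This is an illustration under assumptions, not the code from model.py: the sizes are made up, the bias term is left out, and `conv_valid_full_kernel` is a hypothetical stand-in for a Conv2D with padding="valid" whose kernel covers the whole (pool_size, pool_size) input. Such a convolution has only one output position per ROI, so it behaves like a fully connected layer applied to each ROI separately.

```python
import numpy as np

# Hypothetical small sizes, chosen only for illustration
batch, num_rois, pool_size, channels, fc_layers_size = 2, 5, 7, 4, 6

rng = np.random.default_rng(0)
rois = rng.standard_normal((batch, num_rois, pool_size, pool_size, channels))
kernel = rng.standard_normal((pool_size, pool_size, channels, fc_layers_size))

def conv_valid_full_kernel(images, kernel):
    # "valid" convolution whose kernel spans the whole input:
    # exactly one output position per image, so the result is (n, 1, 1, filters)
    out = np.einsum("nhwc,hwcf->nf", images, kernel)
    return out[:, None, None, :]

# What the wrapper amounts to: fold ROIs into the batch axis, apply, unfold
folded = rois.reshape(batch * num_rois, pool_size, pool_size, channels)
out = conv_valid_full_kernel(folded, kernel)
out = out.reshape(batch, num_rois, 1, 1, fc_layers_size)
print(out.shape)   # (2, 5, 1, 1, 6)

# Each ROI is processed on its own with the same kernel
single = conv_valid_full_kernel(rois[1, 3][None], kernel)[0]
print(np.allclose(out[1, 3], single))   # True
```

The num_rois axis plays the role that the "time" axis plays in the Keras documentation example: it only enumerates independent inputs to the same layer.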

keineahnung2345 on 14 Nov 2018

👍7 🎉3

It seems to start to make sense to me!
Thanks very much!

apptech-evan-huang on 14 Nov 2018

@apptech-evan-huang Haha, the Keras layer name TimeDistributed can be misleading here.

keineahnung2345 on 14 Nov 2018

👍1
