Mask_RCNN: What's the purpose of the `TimeDistributed` wrapper in `fpn_classifier_graph()`?

Created on 13 Nov 2018 · 3 Comments · Source: matterport/Mask_RCNN

I noticed that every layer built in `fpn_classifier_graph()` in `model.py` is wrapped in a `TimeDistributed` layer, and I don't understand why. The Keras documentation says the `TimeDistributed` wrapper has something to do with temporal data, but I don't think the feature maps are temporal data. I also read the original paper and found no clues.

Could you please explain a little bit more about that?

Thanks very much!

All 3 comments

@apptech-evan-huang Take the example from https://keras.io/layers/wrappers/:

from keras.models import Sequential
from keras.layers import Conv2D, TimeDistributed

model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3)),
                          input_shape=(10, 299, 299, 3)))

Here the Conv2D is applied to each of the 10 timesteps (or, if you prefer, samples) of the input. The documentation says:

This wrapper applies a layer to every temporal slice of an input.

but the input doesn't have to be a time series. You can just as well think of the input as 10 samples of images, each of size (299, 299, 3), because the operation is the same either way.
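For the example above, the per-slice output shape can be worked out directly (a sketch assuming the Keras defaults of strides=1 and padding='valid' for the Conv2D(64, (3, 3))):

```python
# Shape arithmetic for the wrapped Conv2D: each of the 10 slices of
# shape (299, 299, 3) is convolved independently with 64 3x3 filters.
timesteps, h, w, c = 10, 299, 299, 3
filters, k = 64, 3
out_h = h - k + 1  # 'valid' padding shrinks each spatial dim by k - 1
out_w = w - k + 1
print((timesteps, out_h, out_w, filters))  # -> (10, 297, 297, 64)
```

So the wrapper's output stacks one (297, 297, 64) feature map per slice, with no interaction between slices.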

In this line of code: https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/model.py#L928,

x = KL.TimeDistributed(KL.Conv2D(fc_layers_size, (pool_size, pool_size), padding="valid"),
                       name="mrcnn_class_conv1")(x)

the input has shape [batch, num_rois, POOL_SIZE, POOL_SIZE, channels], and the TimeDistributed layer simply applies the Conv2D to each of the num_rois ROIs; it has nothing to do with time series.
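Mechanically, TimeDistributed amounts to merging the batch and "time" (here, ROI) axes, applying the wrapped layer once across the merged batch, and splitting the axes back. A minimal numpy sketch of that idea (the `pool` stand-in for the Conv2D is hypothetical):

```python
import numpy as np

def time_distributed(fn, x):
    """Apply fn to every slice along axis 1, like Keras's TimeDistributed.

    x has shape (batch, num_rois, ...); fn is applied once to the
    flattened (batch * num_rois, ...) array, so all ROIs share weights.
    """
    batch, num_rois = x.shape[:2]
    flat = x.reshape((batch * num_rois,) + x.shape[2:])    # merge batch and ROI axes
    out = fn(flat)                                          # one shared op for all ROIs
    return out.reshape((batch, num_rois) + out.shape[1:])  # split the axes back

# Toy "layer": global average pooling over H and W, standing in for the Conv2D.
pool = lambda t: t.mean(axis=(1, 2))

# [batch, num_rois, POOL_SIZE, POOL_SIZE, channels] as in fpn_classifier_graph()
x = np.random.rand(2, 5, 7, 7, 256)
y = time_distributed(pool, x)
print(y.shape)  # (2, 5, 256): one feature vector per ROI
```

The key point is that the ROI axis is treated exactly like extra batch entries, which is why the wrapper works even though nothing here is temporal.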

It's starting to make sense to me!
Thanks very much!

@apptech-evan-huang Haha, the Keras layer name TimeDistributed can be misleading here.
