Keras: ResNet50 implementation pooling options do nothing

Created on 18 Sep 2017 · 9 Comments · Source: keras-team/keras

Hello,

I was looking at the Resnet50 implementation bundled with Keras: https://github.com/fchollet/deep-learning-models/blob/master/resnet50.py

Supposedly there is an optional pooling toggle that applies either no pooling, global average pooling, or global max pooling at the end of the network. However, looking at the code for that section:

... 

    x = AveragePooling2D((7, 7), name='avg_pool')(x)

    if include_top:
        x = Flatten()(x)
        x = Dense(classes, activation='softmax', name='fc1000')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

It seems that no matter which option you pick, Global Average Pooling (GAP) is always effectively applied, by means of the x = AveragePooling2D((7, 7), name='avg_pool')(x) line, which, if I understand correctly, does essentially the same thing as GAP by reducing the 7x7 feature map down to a (1, 1, 2048) output.

The (optional) Global Average Pooling or Global Max Pooling operations after this line have nothing left to work with: the output is already (1, 1) spatially, so nothing can be averaged or max-pooled at that point, making the toggle for them inoperable. For the same reason, the "no pooling" option is also non-functional.
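This no-op can be demonstrated outside Keras with plain NumPy (a minimal sketch; the channel count is shrunk from 2048 to 4 for readability):

```python
import numpy as np

# A toy (1, 1, channels) feature map, like what AveragePooling2D((7, 7))
# produces from a 7x7 map (channels shrunk from 2048 to 4 for readability).
fmap = np.array([[[3.0, -1.0, 0.5, 2.0]]])  # shape (1, 1, 4)

# Global average and global max pooling both reduce over the spatial axes.
gap = fmap.mean(axis=(0, 1))
gmp = fmap.max(axis=(0, 1))

# With a 1x1 spatial extent there is nothing left to pool: both operations
# just return the channel values unchanged, so the 'avg', 'max', and None
# settings are indistinguishable at this point.
assert np.array_equal(gap, gmp) and np.array_equal(gap, fmap[0, 0])
```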

I would suggest removing the offending average pooling line and letting the GlobalAveragePooling2D operation take care of this if requested by the user. The other options will then also function.

Most helpful comment

I agree with @mxvs, AveragePooling2D should be removed if pooling == None.

All 9 comments

The output shape of AveragePooling2D is a 4D tensor, something like (batch_size, pooled_rows, pooled_cols, channels); it is plain windowed averaging. They need it, perhaps because when you include the top layer, it is better to average first and reduce the number of parameters before connecting to the Flatten layer.

The output shape of GlobalAveragePooling2D, however, is a 2D tensor, (batch_size, channels); it can be viewed as averaging + Flatten. They need it, perhaps because when you don't include the top layer, they still want to flatten the output to a 2D tensor.

Also, if you don't want the average pooling layer, just pop it.

Hello,

My point is that the AveragePooling2D((7, 7)) operation prevents the other options from working.

If you first perform AveragePooling2D((7, 7)) followed by GlobalMaxPooling2D(), you don't get max pooling at all, since max pooling a (1, 1) spatial output has nothing to pool (it's already (1, 1)).

The correct code looks like this:

... 
    if include_top:
        x = AveragePooling2D((7, 7), name='avg_pool')(x)
        x = Flatten()(x)
        x = Dense(classes, activation='softmax', name='fc1000')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

This way you can get either:
A) average pooling + top layer (as in the ResNet paper),
B) global average pooling without the top layer,
C) global max pooling without the top layer, or
D) no pooling, and simply the output of the last convolutional layer (as mentioned in the Keras documentation).

The current implementation prevents options C) and D) from working (you always get average pooling after the last conv layer even if you don't want it), and option B) currently only flattens (you can't average-pool a (1, 1) volume).
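For a standard 224x224x3 input, the four options would produce these output shapes under the corrected logic (a shape-arithmetic sketch, not actual Keras code; it assumes ResNet50's conv stack maps 224x224 down to a 7x7x2048 feature map):

```python
# ResNet50's convolutional stack reduces a 224x224 input to a 7x7x2048 map.
FEATURE_MAP = (7, 7, 2048)

def output_shape(include_top, pooling=None, classes=1000):
    h, w, c = FEATURE_MAP
    if include_top:                    # A) 7x7 avg pool + Flatten + Dense
        return (classes,)
    if pooling in ('avg', 'max'):      # B/C) global pooling -> channel vector
        return (c,)
    return (h, w, c)                   # D) raw output of the last conv layer

assert output_shape(True) == (1000,)                  # A
assert output_shape(False, pooling='avg') == (2048,)  # B
assert output_shape(False, pooling='max') == (2048,)  # C
assert output_shape(False) == (7, 7, 2048)            # D
```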

I completely agree with @mxvs.
The current implementation also prevents the use of smaller (arbitrary) input_shape even if include_top = False.

@mxvs

It seems that no matter which option you pick, Global Average Pooling (GAP) is always effectively applied, by means of the x = AveragePooling2D((7, 7), name='avg_pool')(x) line, which, if I understand correctly, does essentially the same thing as GAP by reducing the 7x7 feature map down to a (1, 1, 2048) output.

This is incorrect. The output of AveragePooling2D((7, 7)) is 1x1 only for a certain range of input shapes (something like 197-224).
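The arithmetic behind that: AveragePooling2D defaults to strides = pool_size and padding='valid', so the pooled side is floor((n - 7) / 7) + 1, where n is the side of the feature map reaching that layer (roughly the input side divided by ResNet50's overall stride of 32). A sketch of the cutoff (it ignores the conv-padding details that set the exact 197 minimum):

```python
# Pooled side for AveragePooling2D(pool_size=7) with the Keras defaults
# (strides = pool_size, padding='valid'): floor((n - 7) / 7) + 1.
def pooled_side(n, pool=7):
    return (n - pool) // pool + 1

# ResNet50 divides the input side by roughly 32 before this layer.
assert pooled_side(224 // 32) == 1  # 224x224 input -> 7x7 map  -> 1x1
assert pooled_side(416 // 32) == 1  # 416x416 input -> 13x13 map -> still 1x1
assert pooled_side(448 // 32) == 2  # 448x448 input -> 14x14 map -> 2x2, not 1x1
```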

I agree with @mxvs, AveragePooling2D should be removed if pooling == None.

Just ran into this myself; I agree that the options are somewhat counterintuitive, and the network should end with the last activation layer if pooling == None.

I am a little surprised this hasn't been addressed yet. AveragePooling2D is redundant if include_top=False!

I agree with @mxvs. It looks like an oversight though.

Still a problem. A workaround is to tap the activation just before the unconditional avg_pool layer:

from keras.applications.resnet50 import ResNet50
from keras.layers import GlobalMaxPooling2D
from keras.models import Model

resnet_50 = ResNet50(include_top=False, input_shape=(224, 224, 3))
output = GlobalMaxPooling2D()(resnet_50.layers[-2].output)
model = Model(inputs=resnet_50.input, outputs=output)
