I have the following code, which works on the pre-trained VGG model but fails on the ResNet and Inception models.
import keras
from keras.models import Sequential

vgg_model = keras.applications.vgg16.VGG16(weights='imagenet')
type(vgg_model)
vgg_model.summary()
model = Sequential()
for layer in vgg_model.layers:
    model.add(layer)
Now, changing the model to ResNet as follows:
resnet_model = keras.applications.resnet50.ResNet50(weights='imagenet')
type(resnet_model)
resnet_model.summary()
model = Sequential()
for layer in resnet_model.layers:
    model.add(layer)
gives the following error:
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      1 model = Sequential()
      2 for layer in resnet_model.layers:
----> 3     model.add(layer)

/usr/local/lib/python2.7/dist-packages/keras/models.pyc in add(self, layer)
    490                                  output_shapes=[self.outputs[0]._keras_shape])
    491             else:
--> 492                 output_tensor = layer(self.outputs[0])
    493                 if isinstance(output_tensor, list):
    494                     raise TypeError('All layers in a Sequential model '

/usr/local/lib/python2.7/dist-packages/keras/engine/topology.pyc in __call__(self, inputs, **kwargs)
    601             # Raise exceptions in case the input is not compatible
    602             # with the input_spec set at build time.
--> 603             self.assert_input_compatibility(inputs)
    604
    605             # Handle mask propagation.

/usr/local/lib/python2.7/dist-packages/keras/engine/topology.pyc in assert_input_compatibility(self, inputs)
    511                                          str(axis) + ' of input shape to have '
    512                                          'value ' + str(value) +
--> 513                                          ' but got shape ' + str(x_shape))
    514         # Check shape.
    515         if spec.shape is not None:

ValueError: Input 0 is incompatible with layer res2a_branch1: expected axis -1 of input shape to have value 64 but got shape (None, 55, 55, 256)
Thank you!
[x] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps
[x] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.
[x] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
[x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
Had the same error
I have the same error, and I think it's because of the architecture itself. There are quite a lot of points where branches are added, so it's not a purely sequential architecture like VGG; copying its layers one by one into a Sequential model loses those skip connections. A sketch of the usual workaround is below.
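A minimal sketch, assuming a standard Keras 2.x setup: keep the pretrained network as a single functional sub-model and attach new layers on top with the functional API, instead of copying layers into a Sequential model. The GlobalAveragePooling2D/Dense head and the 10-class output here are just placeholders:

from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

# Keep the pretrained graph intact so its skip connections stay wired correctly.
base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Attach a new head with the functional API (placeholder 10-class classifier).
x = GlobalAveragePooling2D()(base.output)
outputs = Dense(10, activation='softmax')(x)

model = Model(inputs=base.input, outputs=outputs)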
Have the same issue.
Same issue here :(
@Chinmoy007 this is not an actual issue. The ResNet model provided in Keras is built with the functional API, and if you know the ResNet architecture, it is anything but sequential. If you want to modify layers, below is the "original" code for how the Keras team implemented the ResNet50 architecture. Modify what you need and use it instead of the default:
from keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, ZeroPadding2D, AveragePooling2D,
                          GlobalAveragePooling2D, GlobalMaxPooling2D, add)
from keras.models import Model


def identity_block(input_tensor, kernel_size, filters, stage, block):
    """The identity block is the block that has no conv layer at shortcut.

    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names

    # Returns
        Output tensor for the block.
    """
    filters1, filters2, filters3 = filters
    bn_axis = 3
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size,
               padding='same', name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

    x = add([x, input_tensor])
    x = Activation('relu')(x)
    return x
def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    """A block that has a conv layer at shortcut.

    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
        strides: Strides for the first conv layer in the block.

    # Returns
        Output tensor for the block.

    Note that from stage 3,
    the first conv layer at main path is with strides=(2, 2)
    And the shortcut should have strides=(2, 2) as well
    """
    filters1, filters2, filters3 = filters
    bn_axis = 3
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), strides=strides,
               name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size, padding='same',
               name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

    shortcut = Conv2D(filters3, (1, 1), strides=strides,
                      name=conv_name_base + '1')(input_tensor)
    shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)

    x = add([x, shortcut])
    x = Activation('relu')(x)
    return x
def resnet50_rgb(input_shape, pooling='avg'):
    img_input = Input(shape=input_shape)

    x = ZeroPadding2D(padding=(3, 3), name='pad1')(img_input)
    x = Conv2D(64, (7, 7), strides=(2, 2), padding='valid', name='conv1')(x)
    x = BatchNormalization(axis=3, name='bn_conv1')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x = AveragePooling2D((7, 7), name='avg_pool')(x)

    # Top is not included!
    if pooling == 'avg':
        x = GlobalAveragePooling2D()(x)
    elif pooling == 'max':
        x = GlobalMaxPooling2D()(x)

    model = Model(img_input, x, name='resnet50')
    return model
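If the goal is to modify this graph and still reuse the released ImageNet weights, one option (a sketch, assuming the stock model still uses the original layer names such as 'conv1', 'bn_conv1' and 'res2a_branch2a') is to build the custom model and copy the weights across layer by layer, matching on names:

from keras.applications.resnet50 import ResNet50

custom = resnet50_rgb(input_shape=(224, 224, 3))
pretrained = ResNet50(weights='imagenet', include_top=False)

# Layers without weights (Input, Activation, pooling) are skipped; any layer
# whose name or shape does not match is simply left at its initialisation.
for layer in custom.layers:
    if layer.weights:
        try:
            layer.set_weights(pretrained.get_layer(layer.name).get_weights())
        except ValueError:
            pass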
Can you use pretrained ImageNet parameters on an altered model? E.g. load the weights and then iterate over the layers, building the network up again with the 'new' architecture, as the OP does? I don't know how often it would make sense to do, but I'm trying to alter ResNet50 to take grayscale input (not by stacking the grayscale image into three channels), following what is proposed here: How can I use a pre-trained neural network with grayscale images?
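A sketch of that approach, under the same layer-name assumption as above and not tested end to end: only conv1's weight shape depends on the number of input channels, so average its RGB filters down to one channel and copy everything else by name:

from keras.applications.resnet50 import ResNet50

gray_model = resnet50_rgb(input_shape=(224, 224, 1))   # grayscale variant of the code above
rgb_model = ResNet50(weights='imagenet', include_top=False)

# conv1 is the only weighted layer whose shape changes: (7, 7, 3, 64) -> (7, 7, 1, 64).
kernel, bias = rgb_model.get_layer('conv1').get_weights()
gray_model.get_layer('conv1').set_weights([kernel.mean(axis=2, keepdims=True), bias])

# Every other weighted layer keeps the same name and shape, so copy it directly.
for layer in gray_model.layers:
    if layer.weights and layer.name != 'conv1':
        try:
            layer.set_weights(rgb_model.get_layer(layer.name).get_weights())
        except ValueError:
            pass  # no matching layer in the stock model; leave it as initialised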
The same error still persists.
Has anyone found a solution to this error?