In some cases, such as with fully-convolutional nets, it makes sense to apply a network to different image sizes. This was possible up to Keras 1.0.3 but not with more recent versions, due to a new check on input shapes in standardize_input_data when predict is called.
Would it make sense to add an optional parameter to skip this check? Since standardize_input_data() skips the check if shapes is None, predict() could be modified as follows:
def predict(self, x, batch_size=32, verbose=0, check_input_shapes=True):
    if check_input_shapes:
        x = standardize_input_data(x, self.input_names,
                                   self.internal_input_shapes,
                                   check_batch_dim=False)
    else:
        x = standardize_input_data(x, self.input_names,
                                   check_batch_dim=False)
I'm afraid skipping this check won't be helpful.
If your model contains conv layers and fc layers, in most cases you have a "Flatten" layer at the bottleneck. Once an input_shape is given, the shape of the tensor after the Flatten layer is determined, so feeding images with the wrong shape will trigger a shape-mismatch exception there.
I think this check_input_shape just reports that error in advance, so skipping the check won't help.
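The Flatten argument above comes down to simple shape arithmetic; a minimal sketch (the layer sizes below are illustrative, not taken from any specific model):

```python
# With Flatten followed by Dense, the flattened length depends on the
# input's spatial size, so the Dense weight matrix only fits the shape
# the model was built with.
def flattened_len(h, w, channels):
    return h * w * channels

train_len = flattened_len(32, 32, 64)   # Dense weights sized for this
test_len = flattened_len(64, 64, 64)    # different input -> mismatch
print(train_len, test_len)              # 65536 262144
```

A Dense layer built for 65536 inputs cannot consume a 262144-element vector, which is exactly the mismatch the check reports early.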
@MoyanZitto you are right but this is actually for networks without Flatten layers (image in / image out). This is the case for image segmentation and denoising in particular. One could train a network on image crops or patches and then apply it to complete images.
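Why an image-in/image-out network tolerates any input size is again just shape arithmetic; a sketch with illustrative kernel and padding values (not any particular model's configuration):

```python
# Spatial output size of a single convolution.
def conv_output_size(n, kernel, stride=1, pad=0):
    return (n + 2 * pad - kernel) // stride + 1

# A small fully-convolutional stack: three 3x3 'same'-padded convs and
# no Flatten/Dense, so the output size simply tracks the input size.
def fcn_output_size(n):
    for _ in range(3):
        n = conv_output_size(n, kernel=3, stride=1, pad=1)
    return n

print(fcn_output_size(32))    # train on 32x32 patches
print(fcn_output_size(256))   # predict on full 256x256 images
```

No layer's weights depend on the spatial extent, so nothing breaks when the input grows.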
@mmmikael Yes, I get it, so you mean this check_input_shape will prevent you from feeding an image with a different shape than the training samples to a trained FCN? That's surprising, because after removing the fc and Flatten layers from a pretrained model like VGG, you can feed images of any reasonable shape (which means the shape is big enough to produce non-zero feature maps at every conv layer). I think an FCN is just like the VGG model without fc/Flatten layers, so if the latter works, the former should work as well.
If you are very sure that your code is correct, maybe it's time to @ the author.
> I think a FCN is just like the VGG model without fc/flatten layers, so if the latter works, the former should work as well.
Yes, it works if the check on input shapes is bypassed.
@fchollet I can make a PR if this sounds reasonable to you.
I am trying to do exactly what you describe (training on patches, then applying to complete images). How could I use predict as you are saying? I'm willing to modify the code myself, but I couldn't find what I should modify.
@bernardohenz, at the moment I am simply skipping the check on input shapes by passing None to standardize_input_data:
diff --git a/keras/engine/training.py b/keras/engine/training.py
index 030d128..932f4c8 100644
--- a/keras/engine/training.py
+++ b/keras/engine/training.py
@@ -1158,7 +1158,7 @@ class Model(Container):
'''
# validate user data
x = standardize_input_data(x, self.input_names,
- self.internal_input_shapes,
+ None, #self.internal_input_shapes,
check_batch_dim=False)
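For the training side of this workflow (patches in, patches out), a minimal numpy sketch of random patch extraction; extract_patches is a hypothetical helper, not part of Keras:

```python
import numpy as np

def extract_patches(img, patch, n, rng=None):
    """Sample n random patch x patch crops from an (H, W, C) image."""
    rng = rng or np.random.RandomState(0)
    h, w = img.shape[:2]
    out = np.empty((n, patch, patch) + img.shape[2:], dtype=img.dtype)
    for i in range(n):
        y = rng.randint(0, h - patch + 1)
        x = rng.randint(0, w - patch + 1)
        out[i] = img[y:y + patch, x:x + patch]
    return out

image = np.arange(64 * 64 * 3, dtype=np.float32).reshape(64, 64, 3)
patches = extract_patches(image, patch=16, n=8)
print(patches.shape)  # (8, 16, 16, 3)
```

The resulting batch of fixed-size patches trains normally; it is only at predict time, on the full-size image, that the shape check gets in the way.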
@mmmikael Are you referring to this line?
If I changed it to None as you suggest, would I be able to feed in an X_train/y_train that is a list of different-sized numpy arrays, i.e. a dataset of different-sized inputs?
And it would still be able to process them as minibatches with size >1?
like
X_train = [np.array([1,2,3]), np.array([1,2,3,4,5,6]), np.array([1,2,3,4])]
y_train = [np.array([A,B,C]), np.array([A,B,C,D,E,F]), np.array([A,B,C,D])]
@9thDimension I guess you'll still need to have the same input shapes within a minibatch but you should be able to call train_on_batch on batches of different sizes (in the spatial dims).
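One way to satisfy the same-shape-within-a-minibatch constraint is to group samples by shape before calling train_on_batch; a sketch of that bucketing idea (shape_buckets is a hypothetical helper):

```python
from collections import defaultdict
import numpy as np

def shape_buckets(xs, ys):
    """Group (x, y) pairs by x.shape so each bucket forms one minibatch."""
    buckets = defaultdict(list)
    for x, y in zip(xs, ys):
        buckets[x.shape].append((x, y))
    for pairs in buckets.values():
        yield (np.stack([p[0] for p in pairs]),
               np.stack([p[1] for p in pairs]))

xs = [np.zeros(3), np.zeros(6), np.zeros(3)]
ys = [np.ones(3), np.ones(6), np.ones(3)]
for bx, by in shape_buckets(xs, ys):
    print(bx.shape, by.shape)  # feed each pair to model.train_on_batch
```

Each yielded pair is a well-formed minibatch of size >= 1, with batch sizes varying across buckets.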
That's OK, but not ideal. I don't see why Masking should be incompatible with convolutional layers, in principle. For instance, I could pad all of my variously sized inputs to the same shape and have Keras automatically ignore the padded regions during training by masking them.
Is there any chance of adding such functionality to future releases of Keras?
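Keras does not support masking for conv layers (the limitation discussed above), but the pad-and-mask idea itself is straightforward; a minimal numpy sketch with a hypothetical pad_with_mask helper:

```python
import numpy as np

def pad_with_mask(seqs):
    """Pad 1-D arrays to a common length; the mask marks real entries."""
    n = max(len(s) for s in seqs)
    x = np.zeros((len(seqs), n), dtype=seqs[0].dtype)
    mask = np.zeros((len(seqs), n), dtype=bool)
    for i, s in enumerate(seqs):
        x[i, :len(s)] = s
        mask[i, :len(s)] = True
    return x, mask

x, mask = pad_with_mask([np.array([1., 2., 3.]),
                         np.array([1., 2., 3., 4., 5., 6.])])
print(x.shape, mask.sum())  # (2, 6) 9
```

With such a mask available, a loss could in principle be computed only over the unpadded positions; wiring that through conv layers is the part Keras lacks.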
@mmmikael if the model has a merge layer, then even after changing it to None as you suggest, there is still a problem.

If you use the above model, there will be an error like GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[2] == 64, but the output's size on that axis is 576.
If the model has no merge layer, the results are fine!
I'm prototyping a denoising network with the same workflow as @mmmikael and running into the same problem. I'm extracting patches during training, but I was surprised that predict won't let me disable size checking during prediction, since my network is fully convolutional. This seems like a significant problem for image-in/image-out regression tasks like denoising, super-resolution, etc.
@alexjtaylor it's not ideal, but if you use Theano as the backend, you can just set model.internal_input_shapes to None after the model is compiled.