Currently, Conv2DTranspose infers the output shape using deconv_length, but because the output shape of a transposed convolution is ambiguous, it can infer an undesired shape.
Example:
```python
from keras.backend import int_shape
from keras.layers import Conv2D, Conv2DTranspose, Input

conv = Conv2D(16, 3, strides=2, padding='same')
transpose_conv = Conv2DTranspose(1, 3, strides=2, padding='same')

input_a = Input(shape=(23, 23, 1))
x_conv_a = conv(input_a)
x_transpose_a = transpose_conv(x_conv_a)
print("(a) Input shape: {}".format(int_shape(input_a)))
print("(a) Shape after convolution: {}".format(int_shape(x_conv_a)))
print("(a) Shape after transposed convolution: {}".format(int_shape(x_transpose_a)))
print()

input_b = Input(shape=(24, 24, 1))
x_conv_b = conv(input_b)
x_transpose_b = transpose_conv(x_conv_b)
print("(b) Input shape: {}".format(int_shape(input_b)))
print("(b) Shape after convolution: {}".format(int_shape(x_conv_b)))
print("(b) Shape after transposed convolution: {}".format(int_shape(x_transpose_b)))
```
The output:

```
(a) Input shape: (None, 23, 23, 1)
(a) Shape after convolution: (None, 12, 12, 16)
(a) Shape after transposed convolution: (None, 24, 24, 1)

(b) Input shape: (None, 24, 24, 1)
(b) Shape after convolution: (None, 12, 12, 16)
(b) Shape after transposed convolution: (None, 24, 24, 1)
```
From an input of shape (None, 12, 12, 16), a transposed convolution with stride 2 can output either (None, 23, 23, 1) or (None, 24, 24, 1), yet Conv2DTranspose always outputs (None, 24, 24, 1).
Shouldn't the user have to supply the output_shape (as in TensorFlow) or an output padding (as in PyTorch) to resolve the ambiguity?
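To see where the ambiguity comes from, here is a minimal sketch (plain Python, not Keras' actual code) of the forward output-length formula for 'same' padding: two different input sizes collapse to the same output size under stride 2, so the inverse mapping cannot be unique.

```python
def conv_output_length(input_length, stride):
    # Forward conv output length with 'same' padding: ceil(input / stride).
    return (input_length + stride - 1) // stride

# Both 23 and 24 map to 12 under stride 2, so inverting
# the convolution's shape is ambiguous.
print(conv_output_length(23, 2))  # 12
print(conv_output_length(24, 2))  # 12
```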
I have run into the same issue as you describe; hoping someone can answer this question.
This behavior also causes serious problems when defining convolutional autoencoders, where a predictable output dimensionality is paramount.
The backend function conv2d_transpose used by the Conv2DTranspose class actually requires the desired output dimensions.
It might work to add an optional argument to specify the output shape and, if given, pass it to the backend. Am I missing something?
@emiljoha That is the way I do it here: if the output_shape argument is set I use it as the output shape, otherwise I let the shape be inferred.
@davidtvs Nice! Your implementation indeed seems to fix this issue!
Why not open a pull request to solve the issue here in Keras?
@davidtvs After further testing, not specifying output_shape and letting it default to None causes a TypeError exception when the output_shape is cast to a tuple:

```python
self._output_shape = tuple(output_shape)
```

This is easily fixed by checking for that case and only casting to tuple if not None.
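The guard described above could look like the following sketch (store_output_shape is a hypothetical stand-in for the assignment inside the layer's __init__):

```python
def store_output_shape(output_shape):
    # tuple(None) raises TypeError, so only cast when the user
    # actually supplied an explicit shape.
    return tuple(output_shape) if output_shape is not None else None

print(store_output_shape(None))         # None
print(store_output_shape([24, 24, 1]))  # (24, 24, 1)
```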
@emiljoha Nice catch. Will fix as soon as I get the opportunity.
Also going to run some tests and review the changes, then I'll make a PR.
@davidtvs @emiljoha Does it support dynamic input sizes, e.g. [None, None, 3]?
@xiaomaxiao Passing [None, None, 3] as output_shape will give you a TypeError exception. Is there a way around it? I don't know.
@emiljoha I mean the input image is [None, None, 3], and I use K.shape(x) to get output_shape.
It compiles successfully, but I get an error when training it. I am debugging.
@davidtvs can you fix it?
@xiaomaxiao I don't think setting output_shape to (None, None, 3) is going to work, because that shape is passed directly to keras.backend.conv2d_transpose, which in turn calls tf.nn.conv2d_transpose when TensorFlow is the backend, and the latter doesn't accept None in any dimension of the output shape.
The output padding implementation from #10246 might solve your problem though, as long as you don't need the padding to change during training. From your description in #10185, you need output_padding=0.
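For reference, this sketch mirrors the size arithmetic behind output_padding for 'same' padding (a simplified stand-in, not Keras' actual deconv_length code): the inferred default always yields the stride multiple, while an explicit output_padding pins the size exactly.

```python
def deconv_length(input_length, stride, kernel, output_padding=None):
    # Transposed-conv output length with 'same' padding.
    if output_padding is None:
        # Inferred default: always the stride multiple (hence 24 above).
        return input_length * stride
    pad = kernel // 2
    return (input_length - 1) * stride + kernel - 2 * pad + output_padding

print(deconv_length(12, 2, 3))                    # 24 (inferred)
print(deconv_length(12, 2, 3, output_padding=0))  # 23
print(deconv_length(12, 2, 3, output_padding=1))  # 24
```

So with output_padding=0 the layer recovers the (23, 23) spatial size from example (a) above.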
@davidtvs
I want to set output_shape to tf.shape(x), not (None, None, 3), like this:

```python
x = Input([None, None, 3])
output_shape = tf.shape(x)
x = Conv2D(32, kernel_size=(3, 3), padding='same', strides=(2, 2))(x)
x = Conv2DTranspose(ch, kernel_size=(3, 3), strides=(2, 2), padding='same',
                    output_shape=output_shape)(x)
```
UPDATE: I solved it with the following code; now I can set the input to [None, None, 3]:

```python
output_shape = tf.stack([tf.shape(x)[0],
                         self._output_shape[1],
                         self._output_shape[2],
                         self._output_shape[3]])
outputs = tf.nn.conv2d_transpose(x,
                                 self.kernel,
                                 output_shape,
                                 (1, 2, 2, 1),
                                 'SAME')
```
PR #10246 has been merged. The shape of the output can now be controlled through a new optional argument (output_padding).