Currently, Conv2DTranspose infers the output shape using deconv_length, but because the output shape of a transposed convolution is ambiguous, it can infer an undesired shape.
Example:
```python
from keras.backend import int_shape
from keras.layers import Conv2D, Conv2DTranspose, Input

conv = Conv2D(16, 3, strides=2, padding='same')
transpose_conv = Conv2DTranspose(1, 3, strides=2, padding='same')

input_a = Input(shape=(23, 23, 1))
x_conv_a = conv(input_a)
x_transpose_a = transpose_conv(x_conv_a)
print("(a) Input shape: {}".format(int_shape(input_a)))
print("(a) Shape after convolution: {}".format(int_shape(x_conv_a)))
print("(a) Shape after transposed convolution: {}".format(int_shape(x_transpose_a)))
print()

input_b = Input(shape=(24, 24, 1))
x_conv_b = conv(input_b)
x_transpose_b = transpose_conv(x_conv_b)
print("(b) Input shape: {}".format(int_shape(input_b)))
print("(b) Shape after convolution: {}".format(int_shape(x_conv_b)))
print("(b) Shape after transposed convolution: {}".format(int_shape(x_transpose_b)))
```
The output:

```
(a) Input shape: (None, 23, 23, 1)
(a) Shape after convolution: (None, 12, 12, 16)
(a) Shape after transposed convolution: (None, 24, 24, 1)

(b) Input shape: (None, 24, 24, 1)
(b) Shape after convolution: (None, 12, 12, 16)
(b) Shape after transposed convolution: (None, 24, 24, 1)
```
From an input of shape (None, 12, 12, 16), a transposed convolution with stride 2 can output either (None, 23, 23, 1) or (None, 24, 24, 1), yet Conv2DTranspose always outputs (None, 24, 24, 1).
Shouldn't the user have to supply the output_shape (as in TensorFlow) or an output padding (as in PyTorch) to resolve the ambiguity?
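To see where the ambiguity comes from, here is a minimal sketch (plain Python, not Keras' actual code) of the forward output-length formula for 'same' padding: two different input sizes collapse to the same output size under stride 2, so the inverse mapping cannot be unique.

```python
def conv_output_length(input_length, stride):
    # Forward conv output length with 'same' padding: ceil(input / stride).
    return (input_length + stride - 1) // stride

# Both 23 and 24 map to 12 under stride 2, so inverting
# the convolution's shape is ambiguous.
print(conv_output_length(23, 2))  # 12
print(conv_output_length(24, 2))  # 12
```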
I have run into the same issue as you describe; hoping someone can answer this question.
This behavior also causes serious problems when defining convolutional autoencoders, where a predictable output dimensionality is paramount.
The backend function conv2d_transpose used by the Conv2DTranspose class actually requires the desired output dimensions.
It might work to add an optional argument to specify the output shape and, if given, pass it to the backend. Am I missing something?
@emiljoha That is the way I do it here: if the output_shape argument is set I use it as the output shape, otherwise I let the shape be inferred.
@davidtvs Nice! Your implementation indeed seems to fix this issue!
Why not open a pull request to solve the issue here in Keras?
@davidtvs After further testing, not specifying output_shape and letting it default to None causes a TypeError exception when the output_shape is cast to a tuple:

```python
self._output_shape = tuple(output_shape)
```

This is easily fixed by checking for that case and only casting to tuple if not None.
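The guard described above could look like the following sketch (store_output_shape is a hypothetical stand-in for the assignment inside the layer's __init__):

```python
def store_output_shape(output_shape):
    # tuple(None) raises TypeError, so only cast when the user
    # actually supplied an explicit shape.
    return tuple(output_shape) if output_shape is not None else None

print(store_output_shape(None))         # None
print(store_output_shape([24, 24, 1]))  # (24, 24, 1)
```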
@emiljoha Nice catch. Will fix as soon as I get the opportunity.
Also going to run some tests and review the changes, then I'll make a PR.
@davidtvs @emiljoha Does it support dynamic input sizes, e.g. [None, None, 3]?
@xiaomaxiao Passing [None, None, 3] as output_shape will give you a TypeError exception. Is there a way around it? I don't know.
@emiljoha I mean the input image is [None, None, 3], and I use K.shape(x) to get output_shape.
It compiles successfully, but I get an error when training it. I am debugging.
@davidtvs can you fix it?
@xiaomaxiao I don't think setting output_shape to (None, None, 3) is going to work, because that shape is passed directly to keras.backend.conv2d_transpose, which in turn calls tf.nn.conv2d_transpose when TensorFlow is the backend, and the latter doesn't accept None in any dimension of the output shape.
The output padding implementation from #10246 might solve your problem though, as long as you don't need the padding to change during training. From your description in #10185, you need output_padding=0.
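For reference, this sketch mirrors the size arithmetic behind output_padding for 'same' padding (a simplified stand-in, not Keras' actual deconv_length code): the inferred default always yields the stride multiple, while an explicit output_padding pins the size exactly.

```python
def deconv_length(input_length, stride, kernel, output_padding=None):
    # Transposed-conv output length with 'same' padding.
    if output_padding is None:
        # Inferred default: always the stride multiple (hence 24 above).
        return input_length * stride
    pad = kernel // 2
    return (input_length - 1) * stride + kernel - 2 * pad + output_padding

print(deconv_length(12, 2, 3))                    # 24 (inferred)
print(deconv_length(12, 2, 3, output_padding=0))  # 23
print(deconv_length(12, 2, 3, output_padding=1))  # 24
```

So with output_padding=0 the layer recovers the (23, 23) spatial size from example (a) above.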
@davidtvs
I want to set output_shape to tf.shape(x), not (None, None, 3), like this:

```python
x = Input([None, None, 3])
output_shape = tf.shape(x)
x = Conv2D(32, kernel_size=(3, 3), padding='same', strides=(2, 2))(x)
x = Conv2DTranspose(ch, kernel_size=(3, 3), strides=(2, 2), padding='same',
                    output_shape=output_shape)(x)
```
UPDATE: I solved it with the following code; now I can set the input to [None, None, 3]:

```python
output_shape = tf.stack([tf.shape(x)[0],
                         self._output_shape[1],
                         self._output_shape[2],
                         self._output_shape[3]])
outputs = tf.nn.conv2d_transpose(x,
                                 self.kernel,
                                 output_shape,
                                 (1, 2, 2, 1),
                                 'SAME')
```
PR #10246 has been merged. The shape of the output can now be controlled through a new optional argument (output_padding).