Dear mxnet community,
The current documentation on the Deconvolution layer is somewhat difficult to follow.
In particular, I want to reproduce u-net (for image segmentation purpose) available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/

I also looked at the example https://github.com/dmlc/mxnet/blob/master/example/fcn-xs/symbol_fcnxs.py
but its usage is still unclear to me.
Could you give me some direction (or an example) on how to use the Deconvolution layer for an image segmentation task like the following:
I have a collection of n training volume images (t+xy) and their associated segmentations:
(n, 64, 128, 128) ~> (n, 64, 128, 128)
where n is the number of training instances, 64 is the temporal dimension, and 128x128 are the spatial dimensions.
How would I construct a simple fully convolutional network on this problem using mxnet?
data ~> convolutional layer ~> pooling (downsample by 2) ~> deconvolutional layer ~> Upsampling by 2 ~> segmentation?
Thanks a lot
Deconvolution with 2x upsampling can be done like this:
scale = 2
# kernel = 2*scale, stride = scale, pad = scale // 2 gives exact 2x spatial upsampling
pred1 = mx.symbol.Deconvolution(data=pred1, kernel=(2*scale, 2*scale), stride=(scale, scale), pad=(scale // 2, scale // 2), num_filter=33, no_bias=True, workspace=workspace, name=prefix + 'deconv_pred1')
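For the whole pipeline you sketched (data ~> conv ~> pool ~> deconv ~> segmentation), a minimal 2D sketch could look something like the following; the layer names, kernel sizes, and filter counts are only placeholders (not from any existing example), and the temporal dimension would need either 3D kernels or folding into the channel axis:

import mxnet as mx

# minimal sketch: conv -> pool (downsample by 2) -> deconv (upsample by 2) -> per-pixel softmax
data = mx.symbol.Variable('data')                  # (batch, channels, H, W)
conv1 = mx.symbol.Convolution(data=data, kernel=(3, 3), pad=(1, 1), num_filter=32, name='conv1')
relu1 = mx.symbol.Activation(data=conv1, act_type='relu', name='relu1')
pool1 = mx.symbol.Pooling(data=relu1, pool_type='max', kernel=(2, 2), stride=(2, 2), name='pool1')
deconv1 = mx.symbol.Deconvolution(data=pool1, kernel=(4, 4), stride=(2, 2), pad=(1, 1), num_filter=2, no_bias=True, name='deconv1')  # back to input resolution, 2 classes
net = mx.symbol.SoftmaxOutput(data=deconv1, multi_output=True, name='softmax')  # per-pixel softmax over the class channel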
Thanks @piiswrong,
I have another question regarding the prediction.
My prediction is a segmented image (256x256) with 2 classes (membrane at value 0 and non-membrane at value 255). In this case, how should I write the softmax layer so that the network can output the correct segmentation?

I appreciate your help a lot.
Sincerely,
I think you should define your labels as 0 and 1, where 0 is membrane and 1 is not. Using 255 is not a good option unless you want that label to be ignored. @tmquan
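For example, if your ground-truth masks are stored as 0/255 images, a remapping along these lines (variable names are just illustrative) would give you 0/1 class labels:

import numpy as np

# assume `mask` is an (H, W) uint8 array with membrane = 0 and non-membrane = 255
label = (mask == 255).astype(np.float32)   # membrane -> 0, non-membrane -> 1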
@ascust
Thank you for your response in this thread.
Actually, I tried that too, as follows:
I set the network output to shape (30, 2, 512, 512), where channel 0 is the non-membrane score and channel 1 is the membrane score.
Then I set the deconvolution output as a LogisticRegressionOutput, as follows:
sm = mx.symbol.LogisticRegressionOutput(data=deconv_out, name="softmax")
In contrast, the example in the fcn-xs folder suggests using the SoftmaxOutput layer to make a prediction on the entire segmentation.
This confused me and I am struggling with the issue quite a bit.
Best regards,
LogisticRegressionOutput is for binary classification, so it is supposed to work; however, since I have never used this layer before, I cannot say much about it. SoftmaxOutput, on the other hand, is for multiclass classification, and binary classification is just a special case, so I am sure it will work. With your settings, if 30 is your batch size, there is a (2, 512, 512) score map for each image in the batch. For prediction, you can simply take the argmax along the first axis.
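A rough sketch of what that could look like (assuming deconv_out has shape (30, 2, 512, 512) and your labels are (30, 512, 512) arrays of 0/1 class indices):

# per-pixel softmax over the class channel; labels are expected as per-pixel class indices
sm = mx.symbol.SoftmaxOutput(data=deconv_out, multi_output=True, name='softmax')

# at prediction time, take the highest-scoring class per pixel
scores = output.asnumpy()          # assumed forward-pass output, shape (30, 2, 512, 512)
seg = scores.argmax(axis=1)        # shape (30, 512, 512), values 0 or 1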