CNTK: Deep Autoencoder + Back Propagation for Classification

Created on 15 Jul 2016 · 9 Comments · Source: microsoft/CNTK

What I am trying to do is:

1. Train a deep autoencoder that is symmetric around the bottleneck (an "input --> network --> input" model).
2. Once trained, delete the latter half of the network (the decoder part, after the bottleneck layer).
3. Treat the output of the bottleneck layer as features extracted from the input.
4. Add one hidden layer and a final classification layer after the bottleneck layer.
5. Train the two new layers with backpropagation to learn the classification from the extracted features.

I have completed the first 4 steps without any issue. For the 5th step, I am not sure what to use. I tried the adapt command, using as its reference node the criterion node I added with MEL in the 4th step. However, it seems to backpropagate through the whole network and update all the weights and biases, and I am not sure whether there is any way to prevent this. I thought of setting learningRateMultiplier to 0 using SetProperty in MEL, but that property does not appear to be supported.

Any suggestions? Or are there other ways of achieving the same behavior?

PS: I saw a new top-level command here (https://github.com/Microsoft/CNTK/wiki/Top-level-commands), named DoEncoderDecoder. Is that something I can use for this purpose?

zpbappi

Most helpful comment

Hi, as usual, we ran into some unexpected complications. This is now under review and testing.

> once trained, delete the latter half of the network (the decoder part, after the bottleneck layer)

Once it is there, your code would look something like this. Define a new network, and in that definition, you'd have code like this:

featExtNetwork = BS.Network.Load ("YOUR_TRAINED_AE_MODEL")
featExt = BS.Network.CloneFunction (
              featExtNetwork.input,    # input node that AE model read data from
              featExtNetwork.feat,     # output node in AE model that holds the desired features
              parameters="constant")   # says to freeze that part of the network

# define your new network, using featExt() like any old BrainScript function. E.g.
input = Input (...)
features = featExt (input)  # this will instantiate a clone of the above network
# and the rest is just BrainScript, e.g.
h = Sigmoid (W_hid * features + b_hid) # whatever your hidden layer looks like
z = W_out * h + b_out
ce = CrossEntropyWithSoftmax (labels, z)
criterionNodes = (ce)

The key is parameters="constant", which will lock all learnable parameters (setting learningRateMultiplier=0) inside so that they won't get updated during further training. It also locks BatchNormalization if you use it (that's where the unexpected complications came in).

Until the code is in master, you can already have a look at the documentation: [https://github.com/Microsoft/CNTK/wiki/CloneFunction]. And if you dare, you can try branch fseide/clonebs, but it may be premature.
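To make the effect of parameters="constant" concrete, here is a minimal, framework-free Python sketch of what a learning-rate multiplier of 0 means for a gradient update. It is not CNTK code: the Param class, the sgd_step function, and all numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """A scalar parameter with a per-parameter learning-rate multiplier (hypothetical)."""
    value: float
    lr_multiplier: float = 1.0  # 0.0 means the parameter is frozen


def sgd_step(params, grads, lr):
    """Plain SGD: each update is scaled by the parameter's own multiplier."""
    for p, g in zip(params, grads):
        p.value -= lr * p.lr_multiplier * g


# Stand-ins for a cloned (frozen) encoder weight and a new classifier weight.
encoder_w = Param(0.5, lr_multiplier=0.0)
classifier_w = Param(-0.2)

# Both receive the same gradient, but only the classifier weight moves.
sgd_step([encoder_w, classifier_w], grads=[1.0, 1.0], lr=0.1)
print(encoder_w.value, classifier_w.value)
```

Gradients still flow through the frozen part to reach nothing upstream of it that needs training; the point is only that the frozen weights themselves never change.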

frankseide on 22 Jul 2016

❤2

All 9 comments

Please expect a solution on Tuesday or Wednesday at the latest.

Unfortunately, an architectural change disabled the adapt command. I will be back in the office next week and will fix it.

Sorry I cannot help earlier.

frankseide on 15 Jul 2016

@frankseide thank you for the update. I will wait for the changes.

zpbappi on 15 Jul 2016

@frankseide I dared and built from the fseide/clonebs branch. It worked. :)

One finding: if I expose a node from somewhere in the middle of the network as an output node, CloneFunction cannot find that node in the loaded network. The bottleneck node I was interested in showed up as L4.y in the log file, even before I exposed it with outputNodes=(L4.y) in the AE training section. However, after loading the network in another train action, CloneFunction was unable to find the node. What I had to do was:

out = Constant(1) .* L4 #L4 is the extracted features layer
#... rest of the code
#...then, at the end
outputNodes = (out)

Then I was able to load the network as CloneFunction(featExtNet.features, featExtNet.out, parameters="constant").

I am not sure whether this is simply not the proper way to expose a node from a network or whether it is a known issue, but I thought I should bring it to your attention. I am closing this issue, as this solves the original problem I had.

Thank you very much for your time and for the very elaborate replies to all the questions I have asked. You rock. 👍

zpbappi on 26 Jul 2016

Super!

The problem with L4.y is that an expression like network.L4.y is currently not traversed correctly, since the nodes inside a network are no longer true BrainScript records. Fixing this is non-trivial.

I did, however, create a (somewhat ugly) workaround for this case. Could you try saying L4_y? It matches all dots as underscores (_).
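As an illustration of that matching rule only, the Python sketch below shows one way a lookup could treat dots as underscores. It is not CNTK's implementation; find_node and the node names in the list are made up for the example.

```python
def find_node(node_names, requested):
    """Return the single node name that equals `requested` once every '.' is read as '_'."""
    key = requested.replace(".", "_")
    matches = [name for name in node_names if name.replace(".", "_") == key]
    if len(matches) != 1:
        # No match means an unknown node; several matches would occur if,
        # for example, both "L4.y" and "L4_y" existed in the same network.
        raise LookupError(f"expected exactly one match for {requested!r}, found {matches}")
    return matches[0]


# Hypothetical node names, loosely modeled on the ones mentioned in this thread.
nodes = ["features", "L4.W", "L4.b", "L4.y"]
print(find_node(nodes, "L4_y"))  # prints L4.y
```

The sketch raises on ambiguous names instead of picking one, since a rule that maps every dot to an underscore cannot tell such names apart.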

frankseide on 26 Jul 2016

Works like a charm. Thanks again for the tip. :)

I see a lot of helpful tips and workarounds in issues. For example, someone was trying to load an already trained model for further training with more data (#680). The suggestion by @dongyu888 was to rename the final model from model.dnn to model.dnn.0, delete everything else, and start training on the new data. An amazingly simple solution. I think it would be easier for people to find these tips if they were included in the documentation wiki, or at least compiled as an FAQ. It would be unfair to ask for more of your time to do this, but it could be community-driven as well. I see a lot of CNTK users hitting a dead end and finding a workaround in the discussions. I believe they would be more than happy to contribute.
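The renaming tip can be sketched as a small Python helper. This is an illustration under assumptions, not part of CNTK: the function name, the directory layout, and the choice to delete only files whose names start with "model.dnn." (rather than literally everything else) are mine. Check which files your own training run wrote before deleting anything.

```python
from pathlib import Path


def restart_from_final_model(model_dir, final_name="model.dnn"):
    """Rename the final model to '<final_name>.0' after removing older '<final_name>.*' files.

    Hypothetical helper: it only touches files inside `model_dir` whose names
    start with '<final_name>.', and leaves all other files alone.
    """
    model_dir = Path(model_dir)
    final = model_dir / final_name
    if not final.is_file():
        raise FileNotFoundError(final)

    # Remove per-epoch models and checkpoints left over from the previous run.
    for path in list(model_dir.iterdir()):
        if path.is_file() and path.name.startswith(final_name + "."):
            path.unlink()

    # Present the final model as the "epoch 0" model for the next run.
    target = model_dir / (final_name + ".0")
    final.rename(target)
    return target


# Example call (the path is a placeholder):
# restart_from_final_model("path/to/model/directory")
```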

zpbappi on 27 Jul 2016

Yes, we need a tips & tricks section, "How do I...".

frankseide on 27 Jul 2016

Hi, this is now in master.

frankseide on 28 Jul 2016

👍1

Hi, I created a little "How do I..." text out of this Issue. Thanks again.

frankseide on 4 Aug 2016

👍1
