Caffe: L2 normalization of a vector

Created on 6 Oct 2014 · 25 comments · Source: BVLC/caffe

Before implementing one more new layer from scratch, I want to double-check.
I need to implement a vector normalization of the type z / l2_norm(z). Is there any way of doing this in the current caffe-dev (or a related branch)?

All 25 comments

Indeed, it seems the only missing ingredient is the element-wise division.
I have a CPU version that passes the unit test; I will add the GPU version before opening a pull request.

@rodrigob I'm about to implement L2 normalization as well -- before I duplicate the effort, were you successful?

@rodrigob @seanbell Looking forward to your L2 norm layer; it would be great if you could release it ASAP, time is money...

@kuprel I've seen your code; it seems you've implemented an angle loss for pair-wise learning. Is it going well? Could you share some experience with pair-wise learning?

Please ask usage questions on the caffe-users list. Thanks!

@seanbell Hi, is l2-normalization implemented?

You can already do L2 normalization using something like this (untested, but it should work with minor syntax changes; I've used it before):

from caffe import layers as L, params as P

def l2normed(vec, dim):
    """Returns L2-normalized instances of vec; i.e., for each instance x in vec,
    computes  x / ((x ** 2).sum() ** 0.5). Assumes vec has shape N x dim."""
    denom = L.Reduction(vec, axis=1, operation=P.Reduction.SUMSQ)      # N x dim -> N (sum of squares per row)
    denom = L.Power(denom, power=(-0.5))                                # N (reciprocal of each row's L2 norm)
    denom = L.Reshape(denom, num_axes=0, axis=-1, shape=dict(dim=[1]))  # N -> N x 1
    denom = L.Tile(denom, axis=1, tiles=dim)                            # N x 1 -> N x dim
    return L.Eltwise(vec, denom, operation=P.Eltwise.PROD)              # element-wise vec * (1 / norm)

For numerical stability you might want to change the Power layer to something like L.Power(denom, power=(-0.5), shift=1e-12).
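As a minimal usage sketch (not from the thread): assuming the helper above, and a placeholder 32 x 4096 DummyData blob standing in for a real fc7, it plugs into a NetSpec like this:

import caffe
from caffe import layers as L, params as P

# Hypothetical wiring of l2normed() (defined above) into a NetSpec.
# The DummyData source and the 32 x 4096 shape are placeholders for illustration.
n = caffe.NetSpec()
n.fc7 = L.DummyData(shape=dict(dim=[32, 4096]))
n.fc7norm = l2normed(n.fc7, 4096)  # each row of fc7norm has unit L2 norm
print(str(n.to_proto()))           # dump the generated prototxt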

@ducha-aiki Thanks. Have you used it before?
Would
...
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "nfeat"
  type: "NormalizeLayer"
  bottom: "fc7"
  top: "nfeat"
}

work?

@hyojinie, yes, it works. But you need to carefully initialize the layers after the normalized one; a Gaussian filler with std=0.01 does not work :)

@ducha-aiki Thanks a lot. It works! I am trying to use it right before the loss layer (contrastive loss). I am guessing initialization wouldn't matter much in that case..?

@ducha-aiki I have tried many Gaussian stds (0.1, 0.01, ...), but it doesn't work. A uniform filler does work, though. Do you know why this happens?
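Not from the thread, but as a rough NetSpec sketch of the kind of change being discussed, assuming an InnerProduct classifier sits on top of the normalized blob (the uniform range of +/-0.1 is an arbitrary assumption, not a value anyone reported):

from caffe import layers as L

# Hypothetical classifier on top of the normalized blob "nfeat".
# Per the discussion above, a uniform weight filler trained where
# gaussian std=0.01 did not; the +/-0.1 range here is only an assumption.
def classifier_after_norm(nfeat, num_classes):
    return L.InnerProduct(nfeat, num_output=num_classes,
                          weight_filler=dict(type='uniform', min=-0.1, max=0.1),
                          bias_filler=dict(type='constant', value=0))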

@hyojinie After adding the "nfeat" layer and feeding it into the "fc8" classifier, the softmax loss decreases very slowly during training compared with no Normalize layer. The validation accuracy first goes up and then down. Has anyone encountered this situation?

@jeffdonahue How does the division work in that code? The Reshape + Tile magic is a bit too magic for me. :sweat_smile:
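For what it's worth, a rough numpy sketch of what the layer stack above computes (including the optional shift from the numerical-stability note): the "division" is just a multiplication by the reciprocal norm, with shapes noted per step.

import numpy as np

# Rough numpy equivalent of the Reduction/Power/Reshape/Tile/Eltwise stack,
# for an (N, dim) array vec; the Caffe layer each line mirrors is noted.
def l2normed_reference(vec):
    sumsq = (vec ** 2).sum(axis=1)             # Reduction(SUMSQ, axis=1) -> (N,)
    denom = (sumsq + 1e-12) ** -0.5            # Power(power=-0.5, shift=1e-12) -> (N,)
    denom = denom.reshape(-1, 1)               # Reshape(axis=-1, num_axes=0, dim=[1]) -> (N, 1)
    denom = np.tile(denom, (1, vec.shape[1]))  # Tile(axis=1, tiles=dim) -> (N, dim)
    return vec * denom                         # Eltwise(PROD) -> (N, dim), each row unit-norm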

@ducha-aiki Can I ask what you mean by "layers after the normalized"? Do you mean all the layers after the normalization layer?

@xwang90 yes

@MenglaiWang Sorry to bother you. What do you mean by "I have tried many Gaussian stds ... but it doesn't work"? Does "doesn't work" mean low classification accuracy? Thank you. :)

@ducha-aiki Thanks a lot. You mentioned that "a Gaussian filler with std=0.01 does not work"; do you mean it would lead to low classification accuracy? I have just hit some low-classification-accuracy issues after incorporating this L2 normalization layer. Thanks again!


@hyojinie

I too am trying to use it before contrastive loss. Did you manage to get this normalization layer to work?

For the normalization layer, where is it in the .proto file?

I did. I have not looked at the original proto file. I added it myself to my proto.

Can you share a prototxt showing how you used the norm layer? Also, what are the input and output dimensions, as an example?

I have an (N x 128) feature vector I need to normalize. Does this layer ingest that and output the same size?

There's nothing fancy for the proto (see the "// added" line below). Something like this:
// DEPRECATED: use LayerParameter.
message V1LayerParameter {
  repeated string bottom = 2;
  repeated string top = 3;
  optional string name = 4;
  repeated NetStateRule include = 32;
  repeated NetStateRule exclude = 33;
  enum LayerType {
    NONE = 0;
    ABSVAL = 35;
    ACCURACY = 1;
    ARGMAX = 30;
    BNLL = 2;
    CONCAT = 3;
    CONTRASTIVE_LOSS = 37;
    CONVOLUTION = 4;
    DATA = 5;
    DECONVOLUTION = 39;
    DROPOUT = 6;
    DUMMY_DATA = 32;
    EUCLIDEAN_LOSS = 7;
    ELTWISE = 25;
    EXP = 38;
    FLATTEN = 8;
    HDF5_DATA = 9;
    HDF5_OUTPUT = 10;
    HINGE_LOSS = 28;
    IM2COL = 11;
    IMAGE_DATA = 12;
    INFOGAIN_LOSS = 13;
    INNER_PRODUCT = 14;
    LRN = 15;
    MEMORY_DATA = 29;
    MULTINOMIAL_LOGISTIC_LOSS = 16;
    MVN = 34;
    POOLING = 17;
    POWER = 26;
    RELU = 18;
    SIGMOID = 19;
    SIGMOID_CROSS_ENTROPY_LOSS = 27;
    SILENCE = 36;
    SOFTMAX = 20;
    SOFTMAX_LOSS = 21;
    SPLIT = 22;
    SLICE = 33;
    TANH = 23;
    WINDOW_DATA = 24;
    THRESHOLD = 31;

    // added
    NORMALIZE = 41;
  }

It should; the output has the same shape as the input.
