I'd like to use BatchNorm in my network, so I added a BatchNorm layer after my convolution layers:
layer {
  name: "conv1/7x7_s2"
  type: "Convolution"
  bottom: "data"
  top: "conv1/7x7_s2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    pad: 3
    kernel_size: 7
    stride: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  bottom: "conv1/7x7_s2"
  top: "conv1/7x7_s2_bn"
  name: "conv1/7x7_s2_bn"
  type: "BatchNorm"
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
layer {
  name: "conv1/relu_7x7"
  type: "ReLU"
  bottom: "conv1/7x7_s2_bn"
  top: "conv1/7x7_s2_bn"
}
layer {
  name: "pool1/3x3_s2"
  type: "Pooling"
  bottom: "conv1/7x7_s2_bn"
  top: "pool1/3x3_s2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
However, the loss does not decrease.
How should BatchNorm be used in Caffe?
In training, you have to set use_global_stats to true in your batch norm layer so the mean/var will get updated. In testing, set use_global_stats to false. Here is an example layer definition for training:
layer {
  bottom: "conv1/7x7_s2"
  top: "conv1/7x7_s2_bn"
  name: "conv1/7x7_s2_bn"
  type: "BatchNorm"
  batch_norm_param {
    use_global_stats: true
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
Dear happyharrycn,
1. Could you explain what the "use_global_stats" parameter means?
2. Should I set it to false in deploy.txt and keep it as true during training?
Thanks
I actually made a mistake in my previous reply. You should set use_global_stats = False in training, and use_global_stats = True in testing (deploy.txt).
When use_global_stats is set to False, the batch normalization layer is tracking the stats (mean/var) of its inputs. This is the desired behavior during training. When use_global_stats is set to True, the layer will use pre-computed stats (learned in training) to normalize the inputs.
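A minimal sketch of how the same layer could then look in each phase, keeping the layer and blob names used above:

# Training (train_val.prototxt): normalize with batch statistics and keep
# updating the running mean/var.
layer {
  bottom: "conv1/7x7_s2"
  top: "conv1/7x7_s2_bn"
  name: "conv1/7x7_s2_bn"
  type: "BatchNorm"
  batch_norm_param {
    use_global_stats: false
  }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}

# Testing (deploy.prototxt): normalize with the global stats accumulated
# during training.
layer {
  bottom: "conv1/7x7_s2"
  top: "conv1/7x7_s2_bn"
  name: "conv1/7x7_s2_bn"
  type: "BatchNorm"
  batch_norm_param {
    use_global_stats: true
  }
}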
Dear happyharrycn,
I get the following error:
I1123 22:00:20.378729 5626 caffe.cpp:212] Starting Optimization
I1123 22:00:20.378775 5626 solver.cpp:287] Solving DrivingNet
I1123 22:00:20.378782 5626 solver.cpp:288] Learning Rate Policy: step
I1123 22:00:20.394110 5626 solver.cpp:340] Iteration 0, Testing net (#0)
I1123 22:00:20.570456 5626 solver.cpp:408] Test net output #0: bb-loss = 1.99914 (* 10 = 19.9914 loss)
I1123 22:00:20.570492 5626 solver.cpp:408] Test net output #1: pixel-loss = 0.689463 (* 1 = 0.689463 loss)
F1123 22:00:21.310832 5626 batch_norm_layer.cu:95] Check failed: !use_global_stats_
* Check failure stack trace: *
@ 0x7efd1cd05ea4 (unknown)
@ 0x7efd1cd05deb (unknown)
@ 0x7efd1cd057bf (unknown)
@ 0x7efd1cd08a35 (unknown)
@ 0x7efd1d4950dd caffe::BatchNormLayer<>::Backward_gpu()
@ 0x7efd1d37c3fb caffe::Net<>::BackwardFromTo()
@ 0x7efd1d37c45f caffe::Net<>::Backward()
@ 0x7efd1d303748 caffe::Solver<>::Step()
@ 0x7efd1d3043e5 caffe::Solver<>::Solve()
@ 0x409596 train()
@ 0x40571b main
@ 0x7efd1c205a40 (unknown)
@ 0x405eb9 _start
@ (nil) (unknown)
Below is my prototxt; I changed use_global_stats to false for the training stage.
train_val_obn.txt
@happyharrycn please, I would like to know what these parameters mean:
param {
  lr_mult: 0
}
param {
  lr_mult: 0
}
param {
  lr_mult: 0
}
For your questions, see batch_norm_layer.hpp.
By default, the following is set in batch_norm_layer.cpp, which means you don't have to set use_global_stats in the prototxt at all:
use_global_stats_ = this->phase_ == TEST;
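As far as I understand, the three param { lr_mult: 0 } entries correspond to the layer's three internal blobs (the running mean, the running variance, and the moving-average correction factor), which are updated as running averages rather than by the solver, so their learning rate multipliers are pinned to 0. Relying on the phase-based default above, a minimal sketch of the layer can omit batch_norm_param entirely:

layer {
  bottom: "conv1/7x7_s2"
  top: "conv1/7x7_s2_bn"
  name: "conv1/7x7_s2_bn"
  type: "BatchNorm"
  # use_global_stats is omitted on purpose: it defaults to false in the TRAIN
  # phase and true in the TEST phase (use_global_stats_ = this->phase_ == TEST).
  param { lr_mult: 0 }  # running mean (assumed blob order; not learned by the solver)
  param { lr_mult: 0 }  # running variance
  param { lr_mult: 0 }  # moving-average correction factor
}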
I am closing this thread, as this tracker is for Caffe issues, which this is not. Please use the Caffe-users list for such questions.
Is it still the case that param {lr_mult: 0} must be set three times in the BN layer definition?
@jeremy-rutman I believe it is not necessary to set lr_mult to 0 now, given the following lines in the code.
Is "set use_global_stats = False in training, and use_global_stats = True in testing (deploy.txt)" still required?