Hello,
Why define blobs_lr and weight_decay twice in the conv layer?
layers {
  name: "conv1"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  ...
}
The first blobs_lr is for the convolution filter weights; the second blobs_lr is for the bias parameter. The two weight_decay entries map the same way.
Cf. the MNIST tutorial for why people use two different strategies for the learning rate:
blobs_lr are the learning rate adjustments for the layer's learnable parameters. In this case, we will set the weight learning rate to be the same as the learning rate given by the solver during runtime, and the bias learning rate to be twice as large as that - this usually leads to better convergence rates.
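For concreteness, here is a minimal sketch of a fuller conv1 definition in the same legacy V1 prototxt syntax. The bottom/top names, convolution_param values, and fillers are illustrative placeholders in the style of the MNIST LeNet example, not taken from the question. Each blobs_lr multiplies the solver's base learning rate and each weight_decay multiplies the solver's global weight decay, so with base_lr: 0.01 in the solver, the filters train at 0.01 while the bias trains at 0.02, and regularization is applied to the filters only:

layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"    # illustrative input blob name
  top: "conv1"
  blobs_lr: 1       # filter weights: lr = 1 x solver base_lr
  blobs_lr: 2       # bias: lr = 2 x solver base_lr
  weight_decay: 1   # filter weights: full weight decay
  weight_decay: 0   # bias: no weight decay
  convolution_param {
    num_output: 20  # illustrative values
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}

The repeated fields are positional: they line up one-to-one with the layer's learnable blobs (filters first, then bias), which is why the same field name appears twice.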