I'm trying to train v3 on my own dataset, but the learning rate becomes 0 from the first iteration. It's still 0 even after 1k iterations.
1: 775.313843, 775.313843 avg, 0.000000 rate, 7.198354 seconds, 64 images
I met the same problem. Have you solved it?
Have you reset the number of iterations?
Add "-clear 1" after the weights path and the net will be trained as if it were seeing its first image, so the learning rate should be set properly too.
It seems related to "burn_in" in the cfg file. For example, if I set learning_rate to 0.001 and burn_in to 1000, then the learning rate ramps up to 0.001 over the first 1000 iterations. But I'm not sure of the exact formula relating them.
In src/network.c line 95:
if (batch_num < net->burn_in) return net->learning_rate * pow((float)batch_num / net->burn_in, net->power);
so: learning_rate * (iterations / burn_in)^power
net->power defaults to 4 according to src/parser.c line 686:
net->power = option_find_float_quiet(options, "power", 4);
So at iteration 1:
LR = 0.001 * (1/1000)^4 = 10^-3 * 10^-12 = 10^-15, so yes, it is printed as 0.000000.
@ralek67 Thx!
Thanks so much! But when I removed "burn_in", I hit a new problem.
Loaded: 0.000044 seconds
2, 0.003: inf, inf avg, 0.099920 rate, 1.575415 seconds, 64 images
Loaded: 0.000054 seconds
3, 0.004: inf, inf avg, 0.099880 rate, 1.549280 seconds, 96 images
Do you know what an "inf" loss is?
@ralek67, so what's the solution? How can we get a proper learning rate from iteration 1?
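If you want the LR to be usable from early iterations without deleting the warm-up entirely (which, as shown above, can make the loss blow up to inf), one option is to tune the [net] section instead. A sketch, assuming the stock darknet cfg layout; the exact values here are illustrative, not recommendations:

```
[net]
learning_rate=0.001
burn_in=100     # shorter warm-up: LR reaches 0.001 by iteration 100
power=1         # linear ramp, LR = 0.001 * (iter/burn_in), instead of the default ^4
```

With power=1 the printed rate is already 0.000010 at iteration 1 (for burn_in=100), rather than rounding to 0.000000.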
103: 486.020782, 338.144409 avg, 0.000000 rate, 22.209713 seconds, 13184 images
I met the same problem.
Noobs, don't do that. Let it train for 1k+ iterations and then check the log.