@AlexeyAB could you provide a brief description of the following parameters in the .cfg files:
burn_in=1000
max_batches = 80200
policy=steps
steps=40000,60000
scales=.1,.1
Also, why do steps and scales have two values passed?
Thanks.
max_batches=80200 means that the total number of training iterations will be 80200.
At iteration 40000 the learning_rate will be multiplied by 0.1, and at iteration 60000 it will be multiplied by 0.1 again. steps and scales are paired lists, which is why each has two values: the i-th scale is applied when training reaches the i-th step.
So if the initial learning_rate=0.001, then for iterations 1000-40000 it will be 0.001, for 40000-60000 it will be 0.0001, and for 60000-80200 it will be 0.00001.
burn_in=1000 means that for iterations 0-1000 the learning rate ramps up gradually according to the formula here: https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/network.c#L94
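To make the schedule concrete, here is a minimal standalone C sketch of how the learning rate is chosen for policy=steps. It is not the actual darknet source; it assumes the burn-in formula learning_rate * (iteration / burn_in)^power with the default power=4, and the function name `current_rate` is just for illustration:

```c
#include <math.h>
#include <stdio.h>

/* Sketch of the policy=steps schedule with burn-in (assumed power=4). */
float current_rate(int iteration, float learning_rate, int burn_in, float power,
                   const int *steps, const float *scales, int num_steps)
{
    /* During burn-in the rate ramps up from ~0 to learning_rate. */
    if (iteration < burn_in)
        return learning_rate * powf((float)iteration / burn_in, power);

    /* After burn-in, multiply by each scale once its step is reached. */
    float rate = learning_rate;
    for (int i = 0; i < num_steps; ++i) {
        if (steps[i] > iteration) return rate;  /* this step not reached yet */
        rate *= scales[i];
    }
    return rate;
}

int main(void)
{
    int   steps[]  = {40000, 60000};
    float scales[] = {0.1f, 0.1f};
    int   iters[]  = {500, 1000, 40000, 60000, 80200};

    for (int i = 0; i < 5; ++i)
        printf("iteration %5d -> lr = %g\n",
               iters[i], current_rate(iters[i], 0.001f, 1000, 4.0f, steps, scales, 2));
    return 0;
}
```

With learning_rate=0.001 this prints 0.001 for iterations 1000-39999, 0.0001 for 40000-59999, and 0.00001 afterwards, matching the description above.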
Thank you, that was very helpful. Is there a general rule as to when the learning rate should be multiplied by 0.1? For example, if I wanted to only run 10,000 iterations, then would I even need to decrease the learning rate?
Usually you should reduce the learning rate at 90% and 95% of the total number of iterations (max_batches).
So set steps=9000,9500 if max_batches=10000.
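For example, the relevant lines of the .cfg for a 10,000-iteration run would look like this (assuming the initial learning_rate=0.001, with all other parameters left unchanged):

```
learning_rate=0.001
burn_in=1000
max_batches=10000
policy=steps
steps=9000,9500
scales=.1,.1
```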
Great, thank you for the quick responses!