I looked at the parameters of the latest matches at http://zero.sjeng.org/ and noticed a pattern:
if the first number stays fixed, the second number doubles each time, starting from 8k;
once the second number reaches 128k, it resets to 8k and the first number increases.
For example, when 3.160M+128k failed, we started 3.175M+8.00k.
Why do we do it this way? Why not try a second number of 256k, or 136k? Not long ago, the king 7fde81e8's parameters
were 2.861M+1.54M, and there the second number was quite large.
Neural network training for Leela Zero is done with training runs, where the neural network is sampled and tested at several points along the training process. The first number is the total number of games in the dataset at the time when the data was parsed for the training run (i.e. it tells which games are included in the training for this network, and which aren't yet). The second number is the number of training steps used.
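To make the naming concrete, here is a minimal sketch of how such a label could be decoded. The `parse_network_label` function and its handling of the `M`/`k` suffixes are my own illustration, not code from the project:

```python
import re

def parse_network_label(label: str) -> tuple[int, int]:
    """Split a label like '3.160M+128k' into (total_games, training_steps).

    'M' means millions and 'k' means thousands; both parts may have decimals.
    """
    units = {"k": 1_000, "M": 1_000_000}
    match = re.fullmatch(r"([\d.]+)([kM])\+([\d.]+)([kM])", label)
    if match is None:
        raise ValueError(f"unrecognized label: {label}")
    games, g_unit, steps, s_unit = match.groups()
    return int(float(games) * units[g_unit]), int(float(steps) * units[s_unit])

print(parse_network_label("3.160M+128k"))   # (3160000, 128000)
print(parse_network_label("2.861M+1.54M"))  # (2861000, 1540000)
```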
As you observed, @gcp is using an exponential step-count schedule for this. Larger step counts (like 256k) were already used earlier in the training process, but aren't used at the moment, since our learning rate is again somewhat higher than it was in the end stage of the 5x64 network. The network 7fde81e8 was trained differently from regular networks, since we increased the network size at the time through a process called bootstrapping.
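Putting the question and this answer together, the doubling-with-reset pattern can be sketched as follows. This is only an illustration of the schedule as described; the dataset sizes are invented, and this is not the project's actual training code:

```python
def step_schedule(start: int = 8_000, cap: int = 128_000):
    """Yield the step counts tried for one dataset snapshot: 8k, 16k,
    32k, 64k, 128k. If the capped network also fails, the run moves to
    a newer dataset (a larger first number) and the steps reset to 8k."""
    steps = start
    while steps <= cap:
        yield steps
        steps *= 2

# Invented dataset sizes, for illustration only.
for games in (3_160_000, 3_175_000):
    for steps in step_schedule():
        print(f"{games / 1e6:.3f}M+{steps / 1e3:.0f}k")
```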
Things have changed:
2eb817db 3.224M+48.0k
The step count schedule isn't always exactly the same. Sometimes more networks are tried, as is happening right now, for example. For the most part, though, the increments have been exponential: 8k, 16k, 32k, 64k, 128k.
I try to strike a balance between testing networks and letting the clients generate self-play games, so it's just a matter of fiddling around a bit to see what works well.