Leela-zero: Silly question: Is network reset for each training run?

Created on 23 Oct 2018 · 1 Comment · Source: leela-zero/leela-zero

Sorry, this is not exactly an issue, but I could not find a clear answer in the Zero paper, and I thought someone with basic knowledge could answer it immediately, for my benefit and that of all mankind:

1) Do we train each new network from scratch (random weights, games from the training window)?
2) Do we start from the latest qualified network and train it on a new random batch of positions?
3) Do we continuously train one network on new games, and only promote it after it wins 55% of its games against the previous one?

The last one is the closest interpretation from my cursory read of the Zero paper's methodology, since it never says the network is reset. But it sounds like the network could stray completely in the wrong direction, never to find its way back out of the woods... If it's not that, I'm not sure whether 1 or 2 is correct.

jokkebk

Most helpful comment

gcp's training starts from the latest promoted network (option 2 in your list). IIRC he thinks this helps avoid training on a similar dataset for too many steps, which would lead to overfitting, since our self-play games are generated at an irregular rate and not as fast as AGZ's.
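
To make option 2 concrete, here is a rough Python sketch of the idea, not the actual Leela Zero training code. The names `train_candidate`, `maybe_promote`, and `GATE_WIN_RATE`, the toy single-weight "network", and the fake update step are all made up for illustration; the only point is that each run begins from a copy of the latest promoted weights, and a candidate replaces them only if it passes the gate.

```python
import copy
import random

# Hypothetical promotion threshold, chosen only for this illustration.
GATE_WIN_RATE = 0.55


def train_candidate(best_weights, window, steps, rng):
    """Start each run from a copy of the latest promoted weights (option 2),
    not from random weights (option 1) and not from the previous
    unpromoted candidate (option 3)."""
    weights = copy.deepcopy(best_weights)
    for _ in range(steps):
        # Draw a random mini-batch from the current training window.
        batch = rng.sample(window, k=min(4, len(window)))
        # Stand-in for a real gradient step: nudge a toy scalar weight.
        weights["w"] += 0.01 * sum(batch) / len(batch)
    return weights


def maybe_promote(best_weights, candidate, win_rate):
    """Gate: the candidate becomes the new best only if it wins often enough;
    otherwise the old best stays and the next run starts from it again."""
    return candidate if win_rate >= GATE_WIN_RATE else best_weights
```

With this structure a candidate that fails the gate is simply discarded, so a bad run cannot drag later networks along with it, which is the worry raised about option 3.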