Hi, I have a question. In self-play mode, LZ uses temperature to select moves for the first 30 moves, and then uses Dirichlet noise for the remaining moves, so LZ can play many different games. But when evaluating weights, the temperature is set to near zero, so the selected move must be the node with the most visits. There is no randomness, so I can't understand why LZ can still play different games?
When each position is evaluated, the board is randomly rotated (and reflected), giving 8 slightly different evaluations. Also, Leelaz runs with two threads by default, which introduces randomness based on which thread completes first.
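To make the rotation point concrete, here is a rough Python sketch of the idea. It is not Leela Zero's actual code: `symmetries`, `toy_net`, and `evaluate` are hypothetical names, and `toy_net` is a deliberately asymmetric stand-in for the real network (which is only slightly asymmetric, so its 8 evaluations differ far less than in this toy).

```python
import random


def symmetries(board):
    """Return the 8 rotations/reflections of a square board (list of rows)."""
    def rotate(b):
        # Rotate 90 degrees clockwise: reverse the row order, then transpose.
        return [list(row) for row in zip(*b[::-1])]

    def reflect(b):
        # Mirror each row left-to-right.
        return [row[::-1] for row in b]

    out = []
    current = [list(row) for row in board]
    for _ in range(4):
        out.append(current)
        out.append(reflect(current))
        current = rotate(current)
    return out


def toy_net(board):
    # Hypothetical stand-in for the network: a position-weighted sum,
    # so its output depends on the orientation of the board.
    n = len(board)
    return sum(board[r][c] * (r * n + c + 1)
               for r in range(n) for c in range(n))


def evaluate(board, rng):
    # Pick one of the 8 symmetries at random before calling the "network".
    return toy_net(rng.choice(symmetries(board)))


board = [[1, 2, 0],
         [0, 0, 0],
         [0, 0, 0]]
rng = random.Random()
print(sorted({evaluate(board, rng) for _ in range(200)}))
```

Even with temperature near zero, the search sees a slightly different evaluation each time the same position is visited, so visit counts (and eventually the chosen move) can differ between games.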