Stockfish: Bug and fix in do_null_move

Created on 5 Sep 2020  路  9Comments  路  Source: official-stockfish/Stockfish

I'm not really sure what the correct procedure is for bug fixes. The testing guidelines tell me to discuss it in the forum, but the forum doesn't seem to be the right place anymore.

do_null_move() copies the accumulator from the previous state into the new state, which is correct. It then clears the "computed_score" flag because the side to move has changed, and with the other side to move NNUE will return a completely different evaluation (normally with changed sign but also with different NNUE-internal tempo bonus).

The problem is that do_null_move() clears the wrong flag. It clears the computed_score flag of the old state, not of the new state.

It turns out that this almost never affects the search. For example, fixing it does not change the current bench (but it does change the previous bench).

If others agree that this is a bug, I can start a non-regression test on the following fix:
https://github.com/syzygy1/Stockfish/commit/503fa0177e62c7a3f9c765aaa330fc0969ee3927#

It would also be possible to simplify the bug away by removing the computed_score flag. SF never uses it (or at least never in a correct manner). Where SF needs the evaluation of the current position multiple times, it already keeps it in a local variable.

Lastly, it should be possible to speed up do_null_move() a little by copying the previous accumulator only when it is useful (when computed_accumulator is set). I tried this a while ago but (to my surprise) could not detect a clear speed up, so I am not convinced that it would pass as an improvement (I did not try it in fishtest). I could combine this small probable improvement with the bug fix in a non-regression test, but that would be kind of cheating. (Any ideas?)

Most helpful comment

I don't think it is a good idea to leave in a known bug just because it is almost never triggered. At some point it might start to get triggered more often without anyone noticing it.

The reason it is not often triggered are lines 798-801 and 1490-91 of search.cpp, which avoid calling evaluate() on the position after a null move. It seems to be triggered only if the TT entry's static eval field equals VALUE_NONE, see lines 785-86 and 1480-81. It's not immediately clear to me how that can happen at all but apparently it can.

I do agree that removing the flag altogether is probably an even better solution unless there is a good reason to keep it.

All 9 comments

Thanks for reporting, I agree that starting from an issue is the better way these days.

Your explanation sounds right, let's ask @nodchip and/or @dorzechowski, but feel free to start the non-regression test already.

Maybe it would be useful to know if the computed_score flag is useful for the learner. If not, I agree it could just go.

I agree that this is a bug but also fixing it doesn't matter.

The only place the flag computed_score is checked is in the ComputeScore() function in evaluate_nnue.cpp. However, when I add dbg_hit_on(!refresh && accumulator.computed_score) there, I get almost no hits.
Standard bench - zero hits: Total 669316 Hits 0 hit rate (%) 0.
Longer bench 128 1 20: only 3 hits: Total 9085482 Hits 3 hit rate (%) 0.

Looks like this flag is basically not used and can be simplified altogether (assuming we can't make it be useful). I'm not familiar with the learner code but from what I see, this flag is not used anywhere outside the playing code we already have in master here so if removing it is basically no-op, it shouldn't break anything.

As for other small optimizations, I think it's fine to try them if the code doesn't become too complicated in the result. But it's more a question to a maintainer.

I don't think it is a good idea to leave in a known bug just because it is almost never triggered. At some point it might start to get triggered more often without anyone noticing it.

The reason it is not often triggered are lines 798-801 and 1490-91 of search.cpp, which avoid calling evaluate() on the position after a null move. It seems to be triggered only if the TT entry's static eval field equals VALUE_NONE, see lines 785-86 and 1480-81. It's not immediately clear to me how that can happen at all but apparently it can.

I do agree that removing the flag altogether is probably an even better solution unless there is a good reason to keep it.

I meant that instead of fixing it, I prefer removing it directly, not that we should leave the bug in the code.

The STC test passed, but I agree we should probably remove it completely instead.

Removing computed_score also means removing "score" from the Accumulator struct.

How about also removing the "refresh" option, which is not used and will probably never be needed? The accumulator will anyway be refreshed when it is necessary.
In this case compute_eval() could be removed and ComputeScore() could become evaluate().

The update_eval() function is currently not used but could become useful in the future, so it seems better to keep it. (On the other hand, it could also be re-added very easily.)

I have no strong preferences here, but if we remove computed_score it seems more consistent to remove the other unused (and probably unnecessary) parts as well.

I think that score from Accumulator, refresh, compute_eval(), update_eval() and UpdateAccumulatorIfPossible() both in evaluate_nnue and feature_transformer could just go away in one commit. IMO it's better to reintroduce things when needed and not keep dead code that might prove useful some day. Also cleaning up and simplifying NNUE code will make it easier to read and understand and as a result maybe improve it.

sounds good.

Then I'll prepare something.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

rayoh123 picture rayoh123  路  5Comments

ZagButNoZig picture ZagButNoZig  路  6Comments

Silver-Fang picture Silver-Fang  路  7Comments

NightlyKing picture NightlyKing  路  7Comments

mstembera picture mstembera  路  5Comments