Recently a patch intended to correctly implement the 3-fold repetition rule has been applied. (See the discussion of #925) Although it indeed fixes the behaviour of the engine in the playing mode, of the two competing versions the one which doesn't completely fixes the issue in the analysis mode with MultiPV > 1 was chosen.
This mode isn't used for playing but is a very useful tool for analysis.
The difference between the applied version of the patch and the correct version (https://github.com/atumanian/Stockfish/tree/3fold_root) is that the former doesn't allow a 2-fold repetition of the root position during a search while the latter does.
I've invented a position to illustrate this.
5r1k/6n1/B1b4N/8/8/8/P4P2/1K5R w - - 0 1
If you analyze it with the current version and with large enough MultiPV, you'll see that move 1.Rh4 is scored about 2 pawns while 1.Nf7+ is permanently scored as a draw. But if you play 1.Nf7+, the score will jump up to about 2! This is a more correct score because white can force returning to the initial position and play the continuation move: 1. Nf7+ Kg8 2. Kh6+ Kh8 3. Rh4. The current version doesn't see this move as winning because it incorrectly assumes a 2-fold repetition of the root position is an immediate draw. Moreover, the engine orders many moves before 1.Nf7+, e.g. move 1.Rd1 as 0.87, which is actually worse than 1.Nf7+.
The other version of the patch scores 1.Nf7+ about the same as 1.Rh4, which is correct.
There can also be practical reasons to play 1.Nf7+. For example, when a player is in a time pressure and has to play a few moves before a time control, they may decide to go for the repetition so that they can think more deeply about the continuation. Another possible reason: white can try to trap the opponent to play 1. Nf7+ Kg8 2. Nh6+ Kh7, which is replied by 3.Bd3+ and is a faster win.
Komodo does the same and I don't think there have been complaints about that. (I tried with Komodo 8.)
Nf7+ simply does not make progress, so scoring it as a draw for analysis purposes seems completely fine to me. There is no reason that I can see why this would be "wrong". (What does "wrong" mean anyway in this context? MultiPV is just a tool to aid with analysis and all that counts is whether its output helps the user understand the position.)
While it might sometimes be useful to know that Nf7+ can be used to repeat the position and gain some time on the clock, scoring it as +2 does not give you that information anyway.
Komodo does the same and I don't think there have been complaints about that. (I tried with Komodo 8.)
It's an own problem of Komodo. Even if there haven't been complaints, it doesn't mean it isn't a bug and we shouldn't fix it.
Nf7+ simply does not make progress, so scoring it as a draw for analysis purposes seems completely fine to me. There is no reason that I can see why this would be "wrong". (What does "wrong" mean anyway in this context? MultiPV is just a tool to aid with analysis and all that counts is whether its output helps the user understand the position.)
While it might sometimes be useful to know that Nf7+ can be used to repeat the position and gain some time on the clock, scoring it as +2 does not give you that information anyway.
The score of a move is an estimate of the result of the position that follows the move. Whether the moves makes progress isn't relevant here.
Well, I guess you have still some ways to go in convincing people other than yourself that this is a "bug"...
@syzygy1, I've already explained here why this is a bug and you haven't provided valid counterarguments.
I've scheduled a test to see how much the correct version is stronger at LTC.
MultiPV is just a tool to aid with analysis and all that counts is whether its output helps the user understand the position.
I agree and to that effect, returning a draw value (zero) violates the principle of least astonishment. When the user plays the "drawing" move during analysis, the evaluation changes because the new position is in fact not drawn.
The point is to aid the user in understanding the position, not to avoid any possible surprise. If a position really has just one move that makes progress, then how does it help the user to get 2, 3 or more moves that all seem to be winning? That only a single move makes progress means that the position is critical and that - in analysis - the single good continuation has to be looked at very carefully (for the same reason why singular extensions work). If two or three moves are equally good, that creates the impression that the side to move has plenty of resources and that even if one move turns out to be bad later, there is no need now to analyse the position any further.
This is a best a matter of taste and certainly not an objective bug. And for game play / multipv = 1 it is objectively correct to score a repetition of the root position as a draw. Do we really want to complicate the implementation of multipv to fix a non-issue?
I've scheduled a test to see how much the correct version is stronger at LTC.
You're testing a "fix" of a multipv > 1 issue at LTC? Earlier you were pretty clear about this just being an analysis thing. Why waste resources?
Regarding "no valid counterargument", I cannot explain what you choose to not want to understand. (Really, if you want a serious discussion, then acknowledge the counterarguments instead of pretending there are none.)
Note that if this would need to be fixed for multipv > 1, then one should still treat repetition of the root position as a draw for the first (best) move being searched.
And one should then make sure that if the first line found leads to a repetition of the root position, then all other repetitions of the root position are scored as draw. Because otherwise a 2nd line might show a positive score (and be sorted to the top) even though it merely repeats the root position.
So really "fixing" this correctly needs a lot more thought (and code) than what it might seem at first.
@syzygy1, the program can aid the user in understanding a position only if they understand what the scores of moves mean. I've given the definition: the score of a move is an estimate of the result of the position that follows the move. The output scores aren't designed to show other information about the position such as the progress of the move or whether the position is critical. If you don't agree with my definition, provide your alternative.
You're testing a "fix" of a multipv > 1 issue at LTC? Earlier you were pretty clear about this just being an analysis thing. Why waste resources?
To ensure the new version isn't weaker and to see if it's stronger so that it would be another reason to accept the patch.
@syzygy1, if you have a better version that fixes the bug then we can consider it. But I've shown that at least the version I provided is better than the current one.
@syzygy1, though your idea may be good. We can score the first repetition of the root position as a draw if PVIdx == 1 and with the score of the 1st move otherwise. However, I haven't thought about this thoroughly.
The result of the match between the version which allows a 2-fold repetition of the root position and the current version, LTC:
ELO: 0.38 +-1.7 (95%) LOS: 66.8%
Total: 40000 W: 5166 L: 5122 D: 29712
http://tests.stockfishchess.org/tests/view/587910bc0ebc5915193f754b
Maintainers/developers, Just wanted to highlight this issue, will it be possible to make a decision on atumanian's implementation and whether it shud be committed?
Please advise.
this patch should be approved, it fixes an undesirable artifact when using SF for analysis, the solution is so simple, it is code neutral and it's probably a slight elo gain as at LTC, nice work @atumanian
ps, it will be in McBrain 1.3
pv 10
with:
info depth 26 seldepth 51 multipv 1 score cp 227 nodes 162798575 nps 2296787 hashfull 999 tbhits 0 time 70881 pv h6f7 h8g8 f7h6 g8h8 h1h4 c6d7 a6d3 d7e6 a2a4 g7e8 h6g4 h8g8 a4a5 e8c7 a5a6 f8a8 h4h7 a8a6 g4f6 g8f8 d3a6 c7a6 b1c1 a6c5 c1d2 e6f7 d2e3 f8e7 f6e4 c5e6 f2f4 e6f8 h7h6 f7g6 e3d4 f8e6 d4e5 g6e4 e5e4
without:
info depth 26 seldepth 50 multipv 7 score cp 0 nodes 204301094 nps 2316680 hashfull 999 tbhits 0 time 88187 pv h6f7 h8g8 f7h6 g8h8
edit: nice work @saproj!
@MichaelB7, thank you! The idea is of @saproj 's though.
@snicolet, @VoygerOne, @zamar @mcostalba Will it be possible to decide on @atumanian implementation here. Just wanted to bring this thread back to life.
Anyone can test changes in fishtest and open a PR.
In that case, @atumanian, as a lot has changed in SF since the last time you tested this, not sure if this mandates your patch to be tested again but then maybe it is better to once again test it and confirm the behavior of your patch and also verify that it doesn't regress the current SF master in any way. Could you then open a PR again for this? @atumanian
Sorry for the late reply. Of course, I'll test the patch with the new version for regression.
Most helpful comment
this patch should be approved, it fixes an undesirable artifact when using SF for analysis, the solution is so simple, it is code neutral and it's probably a slight elo gain as at LTC, nice work @atumanian
ps, it will be in McBrain 1.3
pv 10
with:
info depth 26 seldepth 51 multipv 1 score cp 227 nodes 162798575 nps 2296787 hashfull 999 tbhits 0 time 70881 pv h6f7 h8g8 f7h6 g8h8 h1h4 c6d7 a6d3 d7e6 a2a4 g7e8 h6g4 h8g8 a4a5 e8c7 a5a6 f8a8 h4h7 a8a6 g4f6 g8f8 d3a6 c7a6 b1c1 a6c5 c1d2 e6f7 d2e3 f8e7 f6e4 c5e6 f2f4 e6f8 h7h6 f7g6 e3d4 f8e6 d4e5 g6e4 e5e4without:
info depth 26 seldepth 50 multipv 7 score cp 0 nodes 204301094 nps 2316680 hashfull 999 tbhits 0 time 88187 pv h6f7 h8g8 f7h6 g8h8edit: nice work @saproj!