Stockfish: SF DC and future TCEC/tournament submissions

Created on 18 Apr 2020 · 13 Comments · Source: official-stockfish/Stockfish

It was agreed here
https://github.com/official-stockfish/Stockfish/issues/2082
to have a volunteer submit SF binaries for TCEC. This includes picking the binary and appropriate UCI parameters. However, submitting modified, unofficial versions of SF oversteps that volunteer's authority. In fact, this was agreed very explicitly. Quoting @mcostalba:

"@Alayan-stk-2 I agree with you on this point. This role should have some flexibility in picking the binary, as long as the engine is still the official Stockfish and not something else. It should be an engine, not necessary from the current master, but IMO picked from this official SF repository. The flexibility I see is regarding the commit from which to pick, the compiled binary and the choice of UCI parameters values."

@Alayan-stk-2 himself gave this comment a thumbs up. SF is a team project, and given what was submitted for the TCEC 17 SuFi, I am opening this issue to confirm the above. Not only was a modified, poorly tested SF submitted, but it also contains a bug that is a direct result of the submitter's modification. We are lucky that it's mostly harmless and that everyone's hard work isn't jeopardized in the biggest of all CC tournaments. I know this is all just for fun, but having a modified, buggy SF play in the SuFi isn't fun.

I hope this will not happen going forward. If we need to make an exception I hope that:

  • there is a good reason
  • the modification is discussed by the team
  • the modification is tested for efficacy and well in advance

I welcome comments from other contributors and maintainers.

Regards Fisherman

All 13 comments

I agree that submitting a modified version of the code was a poor choice, and beyond the mandate given. I'm pretty sure @Alayan-stk-2 realizes this now, and that this will not be repeated in the future.
Fortunately, the misfix was harmless, and the wrong decision clearly happened in a rush of excitement. As such, I feel it should remain inconsequential. Future tournaments should use an SF from the official repo, and any change from default input parameters should be discussed in advance and motivated with statistically relevant data.

Having said that, this misfix greatly contributed to the entertainment factor and the aura of scam around the event. Rumors have it that I realized the impact of this mistake prior to submission, and saw it as the easiest way to have SF play, essentially, with default contempt ;-)

SF isn't doing too badly at TCEC; a -9 Elo performance is pretty decent. Sadly, our latest influx of Elo did not arrive in time, but nothing much can be done about that.
In future, I think that before submitting modified versions to TCEC, @Alayan-stk-2 should create an issue on GitHub and tag the most active devs and the maintainer :) Well, maybe just for the finals, since DivP is a whitewash for SF and Leela.

Please simplify this configurable contempt value away and keep the switches only for analysis (cosmetic) reasons.

Things like this are good to know in advance before trying them.

This idea of different contempt with black and white might have merit or it might be completely useless.

Hence, I would consider it valuable to test opposite, uneven Contempt settings over a fixed number of games at an inexpensive 5+0.05 time control. If the result is within the error margin, we can forget the idea, but if the side with the higher white contempt shows considerable superiority, we have future options.

The requested test, to magnify the effect, could pit C40 for white and C0 for black against C0 for white and C40 for black.
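To make "within error" concrete for a fixed-games test like the one proposed above, here is a minimal sketch that converts a win/draw/loss record into an Elo difference with an approximate 95% interval. The function name `elo_with_margin` is hypothetical, and the normal approximation used here is an assumption for illustration; it is not necessarily how the testing framework computes its bounds.

```python
import math


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    Hypothetical helper: uses the logistic Elo formula and a normal
    approximation of the mean score (z=1.96 is roughly a 95% interval).
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")

    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")

    # Per-game variance of the score around its mean.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)

    def to_elo(s):
        # Clamp to avoid log of zero or a negative number at the extremes.
        s = min(max(s, 1e-9), 1.0 - 1e-9)
        return -400.0 * math.log10(1.0 / s - 1.0)

    return to_elo(score), to_elo(score - z * stderr), to_elo(score + z * stderr)


def within_error(wins, draws, losses):
    """True if an Elo difference of zero lies inside the interval."""
    _, low, high = elo_with_margin(wins, draws, losses)
    return low <= 0.0 <= high
```

If `within_error(...)` returns True for the uneven-contempt match, the result cannot be distinguished from no effect at this sample size.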

@noobpwnftw I don't think it's a good idea to get rid of contempt configuration. On the one hand, contempt 0 has shown a clear Elo gain vs contempt 24 (with the most recent dynamic formula improvement by SG); furthermore, contempt 24, 22 and 20 failed non-regression bounds vs C0.

So it's natural that many people prefer a non-default contempt, and they should be able to tweak it easily as they please.
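For readers unfamiliar with how such a tweak is applied, here is a minimal sketch of the UCI commands a GUI or script would send. The helper names are hypothetical, and the option name `Contempt` is assumed to match the SF version in use; check the engine's `uci` output to confirm.

```python
from typing import List


def uci_setoption(name: str, value) -> str:
    """Format a UCI 'setoption' command line."""
    return f"setoption name {name} value {value}"


def contempt_commands(contempt: int) -> List[str]:
    """Commands to set contempt, then wait for the engine to be ready.

    Assumes the engine exposes an option called 'Contempt'.
    """
    return [uci_setoption("Contempt", contempt), "isready"]
```

For example, `contempt_commands(0)` yields the two lines to write to the engine's standard input before starting a game.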

If the defaults are losing Elo, then they should be changed as a parameter tweak; if removing the parameter passes a non-regression test, then it should be simplified away.

Contempt is a special case: C0 would probably pass Elo-gaining bounds over C24, but the strategy is to sacrifice a few Elo in order to enjoy a lower draw rate and increased resolution for patch selection.

Contempt, in my opinion, is crucial to any minimax search, especially with a sloppy eval function: it indirectly penalizes adding unrelated compensations here and there within a narrow margin around 0.00 (CT), preventing the accumulation of inaccuracies as depth goes up (DCT). I do not see it as being as configurable as you do, and it should be treated just like any other parameter in the program.
The reason there is a switch, to my understanding, is more for visual effect, or for other uses of the scores that depend on them being absolutely objective.
Finally, in tournament play, the last thing you want to be is pessimistic, if only for the sake of sportsmanship.

@Alayan-stk-2
There are now only around 24 hours until the end of the TCEC superfinal, which I understand is the deadline for submissions for the next TCEC Cup. Have you submitted the version and contempt settings for this yet?

But if we treat contempt just like other parameters, we would set its default to the value that maximizes self-play Elo. And it's a proven fact that in self-play C24 is inferior to C0.

A pessimistic contempt is a negative one, and I agree it's the least useful and bad for the sport. With C0, SF plays more objective ("correct") chess, and this naturally means more draws. More draws in the framework mean a reduced ability to identify improvements.

I agree that if setting C=0 improves self-play Elo, then I see no reason not to make it the default, while DCT may still be in effect to cover some of the problems I mentioned (and it proves to give positive Elo).

Draws can be reduced in many ways, not necessarily limited to C=24, or we can use that only during testing.

I find your suggestion logical. I guess the default is mostly a matter of taste, and also of the field of opponents. Furthermore, in the era when SN introduced the rule "default = highest CT that passes non-regression bounds vs C0", SF was head and shoulders above the competition, but not anymore.
Note that this rule was abruptly abandoned after the last functional DCT improvement, in order to not lower the default: https://github.com/official-stockfish/Stockfish/pull/2382

I think it's a good time for the community to discuss what to do with this. IMO, NNs being that good makes a strong case for a lowered default, or 0.

Regarding the framework, we can indeed use what we like, and in this case I think that @snicolet's rule should re-apply. This is because contempt-derived resolution is not 100% clean, but includes some bias (regarding transferable correlation to C0). So using the same contempt for the default and for testing has the benefits of simplicity and cohesion.

Since the discussion has become really off-topic, I'll close this issue, even though I had expected a short reaction from @Alayan-stk-2.

If you really insist on a discussion of default contempt, please open a new issue. However, I don't think this is the most interesting nor pressing topic, to be honest.
