Stockfish: Set contempt to the max self-play strength

Created on 7 Feb 2019 · 18 Comments · Source: official-stockfish/Stockfish

In order to approach max objective playing strength.

Until now we have opted to sacrifice some self-play elo for better results vs weaker opposition, more spectacular chess, a lower chance of misfortune in tournament stages (due to small sample size), and a lower draw rate, which helps the resolution of our patch selection.

If, for example, contempt 12 passes STC+LTC (0,4) vs 24, would you prefer it as the default or not? @mcostalba @snicolet
I am asking because, if not, then maybe it's better not to run the test, unless we want to have this information available for special purposes.

All 18 comments

If anything, I'd rather test a higher default contempt for non-regression (in the 30 to 40 range); the difference in self-play strength is small. My personal opinion.

It's about 3 self-play elo that we pay for the aforementioned benefits, maybe not too much.
But I am also worried that the high contempt search tree deteriorates positional play, while being more resourceful tactically. A medium one would keep a balance.

> But I am also worried that the high contempt search tree deteriorates positional play, while being more resourceful tactically. A medium one would keep a balance.

I'm not so sure about this. Contempt means SF plays more often with more pieces on the board. So improvements to positional play should become more apparent and bring more benefits with contempt, as SF would have a better idea of whether a simplification is worth it or not. I.e., contempt may push SF into a few more positional mistakes, but it will also make improvements in that area easier.

I had mixed feelings, but I gave it some thought and I am convinced that eventually all parameters converge to max self-play elo under any contempt anyway, and this quest helps with the resolution.
So the max self-play contempt keeps rising as we raise the default by comparing against 0. But it's a slow process, and we regularly check whether we can raise it, meaning that the max self-play estimate always lies a little higher than default/2 (or a little lower if we have just raised it and everything is still tuned to the old one). Extensive tests a year ago did show that with default 18 the max self-play value was 10.
I only keep the argument for using the max self-play value on special occasions
(i.e. matches against an equal-strength opponent).

For finals of tournaments (like TCEC SuFi), ideally tests would be run with different contempt values to find out what works best against the specific opponent. This has not been done for the current TCEC SuFi (not enough interest? not enough SF supporters with adequate hardware to do so?)

For tournaments against multiple opponents, higher default contempt would help.

I think some people prefer to generally/ always use the same (default)
setting, and I have some sympathy for this view.

When sf was the dominant engine, some way ahead of the competition, it was
easy to argue for a higher contempt setting, to get more interesting play
and better results against weaker opposition.

Now that leela (and perhaps other nn engines in future) is giving sf real
competition, I think we have to consider using a more conservative value,
say 12 or so for approx absolute best play, or even lower for more
conservative play. This could be just in key matches (where allowed) or as
a new default.
Ideally we would test this, but who has the resources? What book/tc would
we use? Would a lower contempt help anyway?
Plenty of questions to consider ...

As the framework is empty, I ran a non-regression test for C=32 against C=0, and not too surprisingly it failed: http://tests.stockfishchess.org/tests/view/5c62f6640ebc5925cffbd930

Elo | -1.47 [-3.64,0.83] (95%)
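
For anyone wondering how a line like the one above can be read, here is a minimal sketch of getting an elo estimate and an approximate 95% interval from win/draw/loss counts. It assumes the logistic elo model and a normal approximation; fishtest's own calculation may differ in detail, and the names `elo_from_score`, `EloEstimate` and `elo_estimate` are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Elo difference implied by an expected score s in (0, 1), logistic model.
double elo_from_score(double s) {
    assert(s > 0.0 && s < 1.0);
    return -400.0 * std::log10(1.0 / s - 1.0);
}

// Hypothetical result type for this sketch.
struct EloEstimate {
    double elo;   // point estimate
    double low;   // lower end of the approximate 95% interval
    double high;  // upper end of the approximate 95% interval
};

// Estimate elo and a normal-approximation 95% interval from game counts.
EloEstimate elo_estimate(long wins, long draws, long losses) {
    const double n = double(wins + draws + losses);
    assert(n > 0.0);
    const double s = (wins + 0.5 * draws) / n;  // mean score per game
    // Per-game variance of the score (outcomes are 1, 0.5 and 0).
    const double var = (wins   * (1.0 - s) * (1.0 - s)
                      + draws  * (0.5 - s) * (0.5 - s)
                      + losses * (0.0 - s) * (0.0 - s)) / n;
    const double se = std::sqrt(var / n);       // standard error of the mean score
    const double z  = 1.96;                     // two-sided 95%
    return { elo_from_score(s), elo_from_score(s - z * se), elo_from_score(s + z * se) };
}
```

An interval that straddles 0, like the one reported above, means the test on its own cannot tell a small loss from a small gain.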

@Alayan-stk-2
You might want to consider a separate contempt when playing black, with something like this:

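  // W_Contempt / B_Contempt: separate options for each side (both need registering)
  int w_ct = int(Options["W_Contempt"]) * PawnValueEg / 100; // From centipawns
  int b_ct = int(Options["B_Contempt"]) * PawnValueEg / 100; // From centipawns
  int ct = (us == WHITE) ? w_ct : b_ct; // Contempt for the side we are playing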

I came up with -17 for black (when playing against an engine of SF-like strength, keeping +24 for white).
You also need to update ucioption.cpp.
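
To make the arithmetic concrete, here is a standalone sketch of the same idea outside the engine. Everything in it is an assumption for illustration: the `PawnValueEg` value is a placeholder (the real constant lives in Stockfish's types.h and has changed between versions), the -100..100 range is assumed, and `from_centipawns` and `contempt_for` are hypothetical helper names.

```cpp
#include <cassert>

enum Color { WHITE, BLACK };

// Assumption: placeholder endgame pawn value in internal units.
const int PawnValueEg = 208;

// Convert a contempt setting in centipawns to internal units.
int from_centipawns(int cp) {
    assert(cp >= -100 && cp <= 100);  // assumed option range
    return cp * PawnValueEg / 100;    // integer division truncates toward zero
}

// Pick the contempt for the side the engine is playing.
int contempt_for(Color us, int white_cp, int black_cp) {
    return from_centipawns(us == WHITE ? white_cp : black_cp);
}
```

With the placeholder pawn value, `contempt_for(WHITE, 24, -17)` gives 49 and `contempt_for(BLACK, 24, -17)` gives -35 in internal units.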

@MichaelB7 After seeing your post I've tried this in local tests and it doesn't seem to work for me :(

I've created a test for contempt 12 against master on fishtest, see here
Do people think this is ok? I'm assuming contempt 12 is somewhere around the value for strongest play, although a different value might be best against leela of course.

Well, first, sf usually plays against a pool of weaker engines, and contempt brings quite a lot of elo there.
Second, contempt lowers the draw rate, which makes tests converge faster; that is especially critical with the new SPRT bounds.
Third, the little data we have says that contempt is basically harmless vs leela up to really high values like 50+. Don't overestimate its impact: c=24 even vs a much stronger engine is maybe a 5 elo loss if the engine is something like 100 elo stronger, but who cares if the engine is actually 100 elo stronger anyway?
And about positional play: if anything, contempt encourages sf to play more positionally because it keeps more pieces on the board. I don't think lowering contempt without significant proof that it helps is the way to go, because the default value has a lot of gains compared to 0 and basically no proven cons.

High default contempt has many benefits and should be kept, but I would like a c=12 vs c=24 fixed-games test just for info. Self-play performance should be closely related to objective best play, and objective best play imo is the best bet vs a similar-strength opponent. So if +3 or +4 elo is shown for c=12, I would blindly send this to fight Leela any day of the week. But for solid proof, granted, specific tests are required.

Interesting, I haven't seen any data on results of different contempt values against leela.

On positional play, encouraging more positional play might be good if we want to improve that aspect of sf's play, but in terms of playing to win against leela it seems like a bit of an own goal.

After seeing a comment in TCEC chat I ran a few tests with varying contempt according to game phase. See in particular half to full contempt here and full to half contempt here. Both tests gave approx +1.2 Elo, suggesting the gain actually came from the partially lower contempt rather than any particular mg/eg effect. From this I guess contempt 12 would gain around 2 - 2.5 Elo at fishtest STC so would probably pass as an Elo gainer if we wanted to try it.
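
As a rough illustration of what "contempt varying with game phase" could look like, here is a minimal sketch that linearly blends a middlegame and an endgame contempt. It is not the code from those tests: the 0..128 phase scale is an assumption borrowed from the usual tapered-eval convention, and `PhaseMidgame` and `tapered_contempt` are hypothetical names.

```cpp
#include <cassert>

// Assumption: phase runs from 0 (bare endgame) to PhaseMidgame (full board).
const int PhaseMidgame = 128;

// Linearly blend a middlegame contempt and an endgame contempt by game phase.
int tapered_contempt(int mg_ct, int eg_ct, int phase) {
    assert(phase >= 0 && phase <= PhaseMidgame);
    return (mg_ct * phase + eg_ct * (PhaseMidgame - phase)) / PhaseMidgame;
}
```

Under this reading, "full to half" would be something like `tapered_contempt(24, 12, phase)` and "half to full" `tapered_contempt(12, 24, phase)`; both average out below a flat 24 over a game, which fits the guess that the gain came from the lower contempt itself.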

I think that when it comes to tournaments like TCEC, the idea of always using default contempt should be scrapped. DivP should use a high contempt value (like 50 or even more) to beat up weaker engines and have more spectacular play, while the final should use the value which performs best in tests against the expected opponent.

The default contempt value doesn't need to be at the max self-play setting because if and when it is needed, the contempt value can be adapted, but for writing new patches a higher default contempt helps to make SF better at "knowing" when it should exchange despite contempt and when it can safely complicate.

SFI's tests revealed something interesting.

After about 25K games each, in non-regression tests vs contempt 0, contempt 24 performed better (+1.17 elo) than contempt 22 (+0.24), which performed better than contempt 20 (-0.57).

Of course, this is still largely within error bars, and it's not possible to conclude this is more than noise. But this may indicate that the contempt-to-playing-strength curve is not so simple, with SF being tuned for the default contempt altering it?

There have been some extensive tests run by @Vizvezdenec on different contempt values (I can't find the results at the moment, but people were joking that he was going to measure the optimal contempt value to the nth decimal place).

The behavior is pretty straightforward: with less contempt you will get more draws, thus a smaller expected elo difference in those tests. The idea was to pick the largest value that does not end up regressing and that's all. In any case, looking for more opportunities away from draws at the expense of a few centipawns seems good regardless, unless you are mostly playing against significantly stronger opponents and playing for draws is the best expected outcome.

https://github.com/glinscott/fishtest/wiki/UsefulData
has actual data about contempt.
Apart from noise, the behaviour is pretty straightforward:
as contempt increases, elo increases up to a certain value and then drops fast.
Unless there is more convincing evidence, I don't believe that there is some "contempt extremum" in terms of elo in selfplay. It's just "close to 0, close to 0, close to 0, rapid decline" in elo.

It's probably not an optimal solution to use the default optimistic contempt in selfplay from set-up positions, particularly for the underdog. Isn't it correct that the static value can influence the dynamic one if adjusted to zero?

See here - https://groups.google.com/forum/?nomobile=true#!topic/fishcooking/h-sYL8s0QBs
