Stockfish: Idea by Darth_Sidious for a possible speedup

Created on 17 Sep 2020 · 8 Comments · Source: official-stockfish/Stockfish

This was posted on Discord by @Darth_Sidious a few days ago. I'm replicating it here for future reference.

Hi guys. I have some suggestions that I wrote in an archived channel. Please read these messages: https://discord.com/channels/435943710472011776/744337346585166016/753992057650937856 and https://discord.com/channels/435943710472011776/744337346585166016/753995112773582888
I've made a simple test to prove my first idea: https://pastebin.com/eKNuvT5c On my computer it gives about a 1.4x speedup, even without removing zero-vectors. Feel free to run this test and to ask questions. Is there anyone who wants to help me with the frequencies map?

All 8 comments

Those links don't do anything for me.

Discord's uselessness confirmed: http://talkchess.com/forum3/viewtopic.php?t=74353

OK, so he wants to sort the 641*64 vectors by frequency to improve caching. I doubt that will make any difference, since the L1/L2/L3 caches work in 64-byte cache lines anyway and each of those vectors is much longer than a cache line. In other words, unused vectors should not end up in cache in the first place.

A better idea seems to be to prefetch the relevant cache lines. This is on my TODO list already.
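
To make the cache-line argument and the prefetch idea concrete, here is a minimal sketch. It is not Stockfish's actual code: the names (`prefetchFeature`, `refreshAccumulator`, `kHalfDims`) are hypothetical, the 256 int16 weights per vector is an assumed size, and `__builtin_prefetch` is a GCC/Clang builtin, so other compilers fall back to doing nothing. It shows that one vector of that assumed size spans several 64-byte cache lines, and how the next active feature's lines could be requested while the current one is being summed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed sizes for illustration only; not taken from the Stockfish source.
constexpr std::size_t kNumFeatures   = 64 * 641;  // the 641*64 vectors from the thread
constexpr std::size_t kHalfDims      = 256;       // assumed int16 weights per vector
constexpr std::size_t kCacheLineSize = 64;        // bytes per cache line

// Number of cache lines that one weight vector spans (rounded up).
constexpr std::size_t linesPerVector() {
    return (kHalfDims * sizeof(std::int16_t) + kCacheLineSize - 1) / kCacheLineSize;
}

// Hint the CPU to start loading every cache line of one feature's weights.
inline void prefetchFeature(const std::int16_t* weights, std::size_t feature) {
    assert(feature < kNumFeatures);
    const char* p = reinterpret_cast<const char*>(weights + feature * kHalfDims);
    for (std::size_t i = 0; i < linesPerVector(); ++i) {
#if defined(__GNUC__) || defined(__clang__)
        __builtin_prefetch(p + i * kCacheLineSize);
#else
        (void)p;  // no portable prefetch; the hint is simply skipped
#endif
    }
}

// Rebuild the accumulator from scratch as the sum of the active features'
// weight vectors, prefetching the next vector while summing the current one.
void refreshAccumulator(const std::vector<std::int16_t>& weights,
                        const std::vector<std::size_t>& active,
                        std::vector<std::int32_t>& acc) {
    acc.assign(kHalfDims, 0);
    for (std::size_t k = 0; k < active.size(); ++k) {
        if (k + 1 < active.size())
            prefetchFeature(weights.data(), active[k + 1]);
        const std::int16_t* col = weights.data() + active[k] * kHalfDims;
        for (std::size_t j = 0; j < kHalfDims; ++j)
            acc[j] += col[j];
    }
}
```

A prefetch is only a hint and does not change the result, so whether it helps has to be measured; if the weights are already cached, the extra instructions buy nothing.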

Yes, very likely the right prefetches will improve things.

@syzygy1 did you look into the prefetch approach you mentioned before?

Not yet. I'll probably look into it soon. (If someone else beats me, then that's fine.)

@vondele
I've made a number of attempts to use prefetch for refreshing the accumulator, but I could not get a speedup. I suspect that feature weights are almost always already in one of the caches.

OK, sounds reasonable given the current small network size. I'll close the issue.
