Lately I have had problems with builds made under MSYS2 (it is updated to the latest packages, so I use gcc 9.1).
The problem is running the bench command under the MSYS2 shell.
I build, for example, with make build ARCH=x86-64-modern -j
which produces a stockfish.exe.
When I run ./stockfish.exe bench, it often does not run to the end, as here:
$ ./stockfish.exe bench > /dev/null
Position: 1/42
Position: 2/42
Position: 3/42
Position: 4/42
Position: 5/42
Position: 6/42
Position: 7/42
Position: 8/42
Position: 9/42
Position: 10/42
Position: 11/42
Position: 12/42
Position: 13/42
Position: 14/42
Then it just stalls: the CPU is idle and nothing more happens. I can stop it with CTRL+C.
It does not always stop at position 14; it has also stopped at 24 and 32, so the position seems to be completely random.
In about 50% of the cases it runs to the end.
The same binary has no problem when invoked from the Windows "cmd" prompt; there it always runs to the end.
I don't need to run the binary from the MSYS2 bash, so you could argue that this is no problem.
But it IS a problem for me, because it makes a profile build impossible (or very hard).
When I put this machine on fishtest, I have a similar problem (I start fishtest directly from the MSYS2 bash).
Does anybody else have this problem, too?
Any clues?
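One way to put a number on "about 50%" is to repeat the bench under a time limit and count the runs that never finish. The following is only a sketch: it assumes GNU timeout from coreutils is available (it reports exit status 124 when the limit is hit), and whether timeout can actually terminate a hung native Windows process from the MSYS2 shell is an assumption to verify. The function name count_hangs is made up for illustration.

```shell
# count_hangs RUNS LIMIT_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly and prints how many runs hit the time limit.
# The body is a subshell ( ... ) so its variables do not leak into the caller.
count_hangs() (
  runs=$1
  limit=$2
  shift 2
  hangs=0
  i=1
  while [ "$i" -le "$runs" ]; do
    status=0
    # GNU timeout exits with status 124 when it had to stop the command.
    timeout "$limit" "$@" > /dev/null 2>&1 || status=$?
    if [ "$status" -eq 124 ]; then
      hangs=$((hangs + 1))
    fi
    i=$((i + 1))
  done
  echo "$hangs"
)
```

For example, count_hangs 20 60 ./stockfish.exe bench would print how many of 20 runs did not finish within 60 seconds.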
Don't see how this could be a Stockfish issue.
Perhaps you can get more info by running under strace?
$strace ./stockfish.exe bench
It works fine for me under the MSYS2 shell. Maybe you need to run "pacman -Syuu"?
dew@asus MSYS ~/REPOS/Stockfish/src
$ ./stockfish.exe bench > /dev/null
Position: 1/42
Position: 2/42
Position: 3/42
Position: 4/42
Position: 5/42
Position: 6/42
Position: 7/42
Position: 8/42
Position: 9/42
Position: 10/42
Position: 11/42
Position: 12/42
Position: 13/42
Position: 14/42
Position: 15/42
Position: 16/42
Position: 17/42
Position: 18/42
Position: 19/42
Position: 20/42
Position: 21/42
Position: 22/42
Position: 23/42
Position: 24/42
Position: 25/42
Position: 26/42
Position: 27/42
Position: 28/42
Position: 29/42
Position: 30/42
Position: 31/42
Position: 32/42
Position: 33/42
Position: 34/42
Position: 35/42
Position: 36/42
Position: 37/42
Position: 38/42
Position: 39/42
Position: 40/42
Position: 41/42
Position: 42/42
===========================
Total time (ms) : 2047
Nodes searched : 3206912
Nodes/second : 1566639
@d3vv
That's weird, can you confirm that you have
gcc.exe (Rev3, Built by MSYS2 project) 9.1.0?
Did you run the bench with your self-compiled binary 10 times, and it always worked?
Which Windows version do you use?
I tried today on a different machine with completely different hardware and could reproduce it:
The binary produced under MSYS2 has problems running bench under MSYS2. I then tried builds from Abrok.eu (they say these are produced with gcc 7.1) => no problem.
Then I downloaded gcc 8.1 from sourceforge and integrated it into MSYS2; when I compile with this version, there is also no problem.
@LouisZulli adding strace does not give any additional output.
If your original binary worked from Windows command line, then it seems that this is neither a Stockfish issue nor a gcc issue. But since I don't use Windows, I'll retire from this discussion.
Under Linux, I can compile using gcc-9.1.0, and the binary running under bash has no issues with bench.
@CoffeeOne I compiled the latest SF from GitHub (git clone https://github.com/official-stockfish/Stockfish.git),
went into the src directory, and ran your commands ("make build ARCH=x86-64-modern -j", then "./stockfish.exe bench > /dev/null" several times). No problem. So:
C:\Users\dew>ver
Microsoft Windows [Version 10.0.18362.239]
$ uname -a
MSYS_NT-10.0-18362 asus 3.0.7-338.x86_64 2019-07-11 10:58 UTC x86_64 Msys
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-msys/9.1.0/lto-wrapper.exe
Target: x86_64-pc-msys
Configured with: /msys_scripts/gcc/src/gcc-9.1.0/configure --build=x86_64-pc-msys --prefix=/usr --libexecdir=/usr/lib --enable-bootstrap --enable-shared --enable-shared-libgcc --enable-static --enable-version-specific-runtime-libs --with-arch=x86-64 --with-tune=generic --disable-multilib --enable-__cxa_atexit --with-dwarf2 --enable-languages=c,c++,fortran,lto --enable-graphite --enable-threads=posix --enable-libatomic --enable-libgomp --enable-libitm --enable-libquadmath --enable-libquadmath-support --disable-libssp --disable-win32-registry --disable-symvers --with-gnu-ld --with-gnu-as --disable-isl-version-check --enable-checking=release --without-libiconv-prefix --without-libintl-prefix --with-system-zlib --enable-linker-build-id --with-default-libstdcxx-abi=gcc4-compatible
Thread model: posix
gcc version 9.1.0 (GCC)
dew@asus MSYS ~/REPOS/Stockfish/src
$ ldd stockfish.exe
ntdll.dll => /c/WINDOWS/SYSTEM32/ntdll.dll (0x7ffc0f5c0000)
KERNEL32.DLL => /c/WINDOWS/System32/KERNEL32.DLL (0x7ffc0e1f0000)
KERNELBASE.dll => /c/WINDOWS/System32/KERNELBASE.dll (0x7ffc0d280000)
msys-2.0.dll => /usr/bin/msys-2.0.dll (0x180040000)
msys-stdc++-6.dll => /usr/bin/msys-stdc++-6.dll (0x526840000)
msys-gcc_s-seh-1.dll => /usr/bin/msys-gcc_s-seh-1.dll (0x5e8160000)
@d3vv Thx for posting the details.
I have the exact same Windows version, ver and uname -a are the same
But:
$ gcc -v
Using built-in specs.
COLLECT_GCC=C:\msys64\mingw64\bin\gcc.exe
COLLECT_LTO_WRAPPER=C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/9.1.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-9.1.0/configure --prefix=/mingw64 --with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --with-native-system-header-dir=/mingw64/x86_64-w64-mingw32/include --libexecdir=/mingw64/lib --enable-bootstrap --with-arch=x86-64 --with-tune=generic --enable-languages=c,lto,c++,fortran,ada,objc,obj-c++ --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-filesystem-ts=yes --enable-libstdcxx-time=yes --disable-libstdcxx-pch --disable-libstdcxx-debug --disable-isl-version-check --enable-lto --enable-libgomp --disable-multilib --enable-checking=release --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --enable-plugin --with-libiconv --with-system-zlib --with-gmp=/mingw64 --with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64 --with-pkgversion='Rev3, Built by MSYS2 project' --with-bugurl=https://sourceforge.net/projects/msys2 --with-gnu-as --with-gnu-ld
Thread model: posix
gcc version 9.1.0 (Rev3, Built by MSYS2 project)
You have a different compiler. The most obvious difference is that yours has no "Built by MSYS2 project" string.
It seems that you are not using the mingw-w64-x86_64-gcc 9.1.0-3 package.
The ldd information is also very different:
peter@DESKTOP-S4LGVH7 MINGW64 ~/gitrepos/stockfish/src
$ ldd stockfish.exe
ntdll.dll => /c/WINDOWS/SYSTEM32/ntdll.dll (0x7ffdeb5a0000)
KERNEL32.DLL => /c/WINDOWS/System32/KERNEL32.DLL (0x7ffde9660000)
KERNELBASE.dll => /c/WINDOWS/System32/KERNELBASE.dll (0x7ffde88c0000)
msvcrt.dll => /c/WINDOWS/System32/msvcrt.dll (0x7ffdea360000)
libstdc++-6.dll => /mingw64/bin/libstdc++-6.dll (0x6fc40000)
libwinpthread-1.dll => /mingw64/bin/libwinpthread-1.dll (0x64940000)
libgcc_s_seh-1.dll => /mingw64/bin/libgcc_s_seh-1.dll (0x61440000)
I will try re-installing MSYS2 in a new folder, but I did this already on another PC (That's why I have written I reproduced the issue), so I am not optimistic
Can you post the output of pacman -Qe of your installation?
So I guess you compile with mingw-w64-cross-gcc hmm, have to check
No.
My ldd output lists msys-2.0.dll and yours does not (you compiled with the mingw64 gcc, not the pure MSYS2 gcc).
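The difference between the two ldd listings can be checked mechanically. The sketch below reads ldd output on stdin and reports which runtime DLLs appear; the DLL names are the ones shown in this thread, while the function name runtime_flavor and its three labels are invented for illustration.

```shell
# runtime_flavor: reads `ldd` output on stdin and prints which runtime
# the binary depends on, based on the DLL names seen in this thread.
runtime_flavor() {
  awk '
    # msys-2.0.dll marks a binary built with the pure MSYS2 gcc.
    /msys-2\.0\.dll/ { msys = 1 }
    # These DLLs mark a dynamically linked mingw64 build.
    /libwinpthread-1\.dll|libstdc\+\+-6\.dll|libgcc_s_seh-1\.dll/ { mingw = 1 }
    END {
      if (msys)       print "msys"
      else if (mingw) print "mingw-dynamic"
      else            print "no-gcc-runtime-dlls"
    }'
}
```

Usage would be ldd stockfish.exe | runtime_flavor, run in the same shell that built the binary.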
@CoffeeOne
dew@asus MSYS ~
$ pacman -Qe
asciidoc 8.6.10-2
autoconf 2.69-5
autoconf2.13 2.13-2
autogen 5.18.16-1
automake-wrapper 11-1
bash 4.4.023-1
bash-completion 2.9-1
bc 1.07.1-2
bison 3.4.1-1
bsdcpio 3.4.0-3
bsdtar 3.4.0-3
bzip2 1.0.7-1
ccache 3.7.1-1
clang-svn 60106.1d5b05f-1
cmake 3.14.5-1
cocom 0.996-2
coreutils 8.31-1
crypt 1.3-1
ctags 5.8-2
curl 7.65.1-1
cvs 1.11.23-3
dash 0.5.10.2-1
diffstat 1.62-1
diffutils 3.7-1
dos2unix 7.4.0-1
doxygen 1.8.15-1
file 5.37-1
filesystem 2018.12-1
findutils 4.6.0-1
fish 3.0.2-1
flex 2.6.4-1
gawk 5.0.1-1
gcc 9.1.0-1
gcc-fortran 9.1.0-1
gcc-libs 9.1.0-1
gdb 8.2.1-3
gettext-devel 0.19.8.1-1
git 2.22.0-1
glib2-devel 2.54.3-1
gperf 3.1-1
grep 3.0-2
gzip 1.10-1
help2man 1.47.10-1
inetutils 1.9.4-2
info 6.6-1
intltool 0.51.0-2
lemon 3.21.0-1
less 551-1
libcrypt-devel 2.1-2
libtool 2.4.6-7
libunrar 5.7.5-1
libunrar-devel 5.7.5-1
lndir 1.0.3-1
make 4.2.1-1
man-db 2.8.5-2
mc 4.8.23-1
mercurial 5.0.2-1
mingw-w64-cross-crt-clang-git 5.0.0.4631.3deeda3-1
mingw-w64-i686-binutils 2.30-6
mingw-w64-i686-clang 8.0.0-10
mingw-w64-i686-crt-git 7.0.0.5482.43d67723-1
mingw-w64-i686-drmingw 0.9.1-2
mingw-w64-i686-gcc 9.1.0-3
mingw-w64-i686-gcc-ada 9.1.0-3
mingw-w64-i686-gcc-fortran 9.1.0-3
mingw-w64-i686-gcc-libgfortran 9.1.0-3
mingw-w64-i686-gcc-libs 9.1.0-3
mingw-w64-i686-gcc-objc 9.1.0-3
mingw-w64-i686-gdb 8.3-8
mingw-w64-i686-glib2 2.60.4-2
mingw-w64-i686-headers-git 7.0.0.5482.43d67723-1
mingw-w64-i686-jasper 2.0.16-1
mingw-w64-i686-libmangle-git 7.0.0.5230.69c8fad6-1
mingw-w64-i686-libwinpthread-git 7.0.0.5480.e14d23be-1
mingw-w64-i686-make 4.2.1-4
mingw-w64-i686-pkg-config 0.29.2-1
mingw-w64-i686-qt5-static 5.12.4-1
mingw-w64-i686-tools-git 7.0.0.5479.8db8dd5a-1
mingw-w64-i686-winpthreads-git 7.0.0.5480.e14d23be-1
mingw-w64-i686-winstorecompat-git 7.0.0.5479.8db8dd5a-1
mingw-w64-x86_64-SDL2 2.0.9-1
mingw-w64-x86_64-SDL2_image 2.0.5-1
mingw-w64-x86_64-SDL2_mixer 2.0.4-1
mingw-w64-x86_64-binutils 2.30-6
mingw-w64-x86_64-boost 1.70.0-2
mingw-w64-x86_64-clang 8.0.0-10
mingw-w64-x86_64-clang-analyzer 8.0.0-10
mingw-w64-x86_64-clang-tools-extra 8.0.0-10
mingw-w64-x86_64-cmake 3.14.5-2
mingw-w64-x86_64-codelite 13.0.0-1
mingw-w64-x86_64-cppcheck 1.88-1
mingw-w64-x86_64-crt-git 7.0.0.5482.43d67723-1
mingw-w64-x86_64-ctags 5.8-5
mingw-w64-x86_64-enet 1.3.14-1
mingw-w64-x86_64-gcc 9.1.0-3
mingw-w64-x86_64-gcc-ada 9.1.0-3
mingw-w64-x86_64-gcc-fortran 9.1.0-3
mingw-w64-x86_64-gcc-libgfortran 9.1.0-3
mingw-w64-x86_64-gcc-libs 9.1.0-3
mingw-w64-x86_64-gcc-objc 9.1.0-3
mingw-w64-x86_64-gdb 8.3-8
mingw-w64-x86_64-glew 2.1.0-1
mingw-w64-x86_64-headers-git 7.0.0.5482.43d67723-1
mingw-w64-x86_64-iconv 1.16-1
mingw-w64-x86_64-jemalloc 5.2.0-1
mingw-w64-x86_64-lcov 1.14-1
mingw-w64-x86_64-libmangle-git 7.0.0.5230.69c8fad6-1
mingw-w64-x86_64-libwinpthread-git 7.0.0.5480.e14d23be-1
mingw-w64-x86_64-libxml++ 3.0.1-1
mingw-w64-x86_64-libzip 1.5.2-2
mingw-w64-x86_64-lldb 8.0.0-10
mingw-w64-x86_64-make 4.2.1-4
mingw-w64-x86_64-mesa 19.1.2-1
mingw-w64-x86_64-nasm 2.14.02-1
mingw-w64-x86_64-ntldd-git r15.e7622f6-2
mingw-w64-x86_64-opencl-headers 2~2.2.20170516-1
mingw-w64-x86_64-pkg-config 0.29.2-1
mingw-w64-x86_64-python2-certifi 2019.6.16-1
mingw-w64-x86_64-python2-ipython 5.8.0-1
mingw-w64-x86_64-python2-setuptools 41.0.1-1
mingw-w64-x86_64-python2-urllib3 1.25.3-1
mingw-w64-x86_64-qemu 4.0.0-1
mingw-w64-x86_64-qt-creator 4.9.2-1
mingw-w64-x86_64-qt-installer-framework 3.1.1-1
mingw-w64-x86_64-qt5-static 5.12.4-1
mingw-w64-x86_64-sqlite-analyzer 3.16.1-1
mingw-w64-x86_64-tools-git 7.0.0.5479.8db8dd5a-1
mingw-w64-x86_64-usbview-git 42.c4ba9c6-1
mingw-w64-x86_64-vulkan-loader 1.1.112-1
mingw-w64-x86_64-winpthreads-git 7.0.0.5480.e14d23be-1
mingw-w64-x86_64-winstorecompat-git 7.0.0.5479.8db8dd5a-1
mingw-w64-x86_64-wxWidgets 3.0.4-3
mingw-w64-x86_64-yasm 1.3.0-4
mintty 1~3.0.1-1
moreutils 0.63-1
msys2-keyring r9.397a52e-1
msys2-launcher-git 0.3.32.56c2ba7-2
msys2-runtime 3.0.7-6
nano 4.3-1
nasm 2.14.02-1
ncurses 6.1.20190615-1
ncurses-devel 6.1.20190615-1
pacman 5.1.3-3
pacman-mirrors 20180604-2
pactoys-git r2.07ca37f-1
patch 2.7.6-1
patchutils 0.3.4-1
pax-git 20161104.2-1
pkgfile 19-1
procps 3.2.8-2
psmisc 23.2-1
python 3.7.4-1
quilt 0.66-2
rcs 5.9.4-2
rebase 4.4.4-1
rsync 3.1.3-1
scons 3.0.4-1
screenfetch 3.8.0-1
sed 4.7-1
subversion 1.12.0-1
swig 4.0.0-1
texinfo 6.6-1
texinfo-tex 6.6-1
tftp-hpa 5.2-3
time 1.9-1
tmux 2.9-1
ttyrec 1.0.8-2
tzcode 2019.a-1
ucl 1.03-2
ucl-devel 1.03-2
unrar 5.7.5-1
unzip 6.0-2
upx 3.95-2
util-linux 2.34-1
which 2.21-2
whois 5.4.3-1
winln 1.1-1
xmlto 0.0.28-2
xorriso 1.5.0-1
yasm 1.3.0-2
zip 3.0-3
I have gcc 9.1.0-1 for the MSYS2 shell
and mingw-w64-i686-gcc 9.1.0-3 for the mingw64 shell.
Both work perfectly for me.
For the mingw64 shell you must use packages from http://repo.msys2.org/mingw/x86_64/
and for the MSYS2 shell you must use the pure packages from http://repo.msys2.org/msys/x86_64/.
Otherwise you will end up in include-file hell with a ton of bugs.
That's all there is to it.
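A quick consistency check between the shell and the compiler might look like the sketch below. It assumes the MSYSTEM environment variable set by the MSYS2 launchers and uses the two target triples visible in the gcc -v outputs earlier in this thread; the function names are hypothetical.

```shell
# expected_target SHELL_NAME: prints the gcc target triple that matches
# the given MSYS2 shell, using the triples shown by `gcc -v` in this thread.
expected_target() {
  case "$1" in
    MSYS)    echo "x86_64-pc-msys" ;;
    MINGW64) echo "x86_64-w64-mingw32" ;;
    *)       echo "unknown" ;;
  esac
}

# toolchain_matches SHELL_NAME TARGET: succeeds when TARGET is the triple
# expected for SHELL_NAME.
toolchain_matches() {
  [ "$(expected_target "$1")" = "$2" ]
}
```

In practice one could run toolchain_matches "$MSYSTEM" "$(gcc -dumpmachine)" || echo "compiler does not match this shell" before building.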
Moreover, I see in your ldd output:
libstdc++-6.dll => /mingw64/bin/libstdc++-6.dll (0x6fc40000)
libwinpthread-1.dll => /mingw64/bin/libwinpthread-1.dll (0x64940000)
libgcc_s_seh-1.dll => /mingw64/bin/libgcc_s_seh-1.dll (0x61440000)
So, for pure MSYS2 with the pure MSYS2 gcc you can of course just use "make build ARCH=x86-64-modern -j".
But for the mingw64 environment the safest way is to use "COMP=mingw" to get gcc's "-static" flag.
Please try to use "make profile-build ARCH=x86-64-modern COMP=mingw" for that.
Then you have native Windows SEH, native threads, and a portable binary that runs on any system without the mingw environment:
dew@asus MINGW64 ~/REPOS/Stockfish/src
$ ldd stockfish.exe
ntdll.dll => /c/WINDOWS/SYSTEM32/ntdll.dll (0x7ffc0f5c0000)
KERNEL32.DLL => /c/WINDOWS/System32/KERNEL32.DLL (0x7ffc0e1f0000)
KERNELBASE.dll => /c/WINDOWS/System32/KERNELBASE.dll (0x7ffc0d280000)
msvcrt.dll => /c/WINDOWS/System32/msvcrt.dll (0x7ffc0e150000)
Hello again.
I still have this issue and can reproduce it on 2 machines.
To repeat what the issue is:
After building a stockfish binary in the MSYS2 Mingw-w64 64 bit shell, the created exe causes problems when it is executed in the MSYS2 Mingw-w64 64 bit shell. Please note that I don't see the issue in the normal Windows cmd.
So LouisZulli may be right that this is probably not a stockfish issue, but using MSYS2 is the recommended procedure to build a Windows binary; it is described on the fishtest pages.
Nobody else has confirmed the issue so far, but since I can reproduce it on 2 installations, it can't be just bad luck.
So the issue is that the "bench" sometimes does not process all 42 positions; it sometimes just stops earlier.
I did some poor man's debugging and added 2 lines:
diff --git a/src/uci.cpp b/src/uci.cpp
index a4235f2b..adcd9523 100644
--- a/src/uci.cpp
+++ b/src/uci.cpp
@@ -159,7 +159,9 @@ namespace {
{
cerr << "\nPosition: " << cnt++ << '/' << num << endl;
go(pos, is, states);
+ cerr << "\nAfter go(pos, is, state)";
Threads.main()->wait_for_search_finished();
+ cerr << "\nAfter Threads.main()->wait_for_search_finished()";
nodes += Threads.nodes_searched();
}
else if (token == "setoption") setoption(is);
So when it stops it looks like this:
After Threads.main()->wait_for_search_finished()
Position: 14/42
After go(pos, is, state)
So it seems that the program does not return from
Threads.main()->wait_for_search_finished();
and there is no more output on the console.
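To see at which position a run stopped without scrolling through the output, the last marker can be pulled out of a saved log. This is a small sketch, assuming the bench output (which the code above writes to cerr) was captured to a file, for example with ./stockfish.exe bench 2> bench.log; the function name last_position and the file name are illustrative.

```shell
# last_position: reads bench output on stdin and prints the last
# "Position: N/M" marker seen, i.e. the position being searched when
# the output stopped.
last_position() {
  grep -o 'Position: [0-9]*/[0-9]*' | tail -n 1
}
```

It would be called as last_position < bench.log.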
Can a developer help debug it, so that we know what happens here?
@vondele ?
@CoffeeOne I'm afraid I can't help with a windows specific problem, this needs some debugging with access to a system which reproduces the issue. I'm also traveling and will soon be without/sporadic internet for a couple of weeks.
Your debugging is a first step, but you're only seeing the output of the main thread, and I would suspect a threading issue (but that's just a guess). If you have access to gdb on the system, you could try to recompile with 'debug=yes optimize=no' and run under gdb to figure out where all threads are (./gdb stockfish and on the prompt write run bench. Once it hangs press CTRL+C and type thread apply all bt).
Unfortunately I have some problems with that ....
gdb can be easily installed in msys2, so I start the binary with gdb stockfish
When it hangs I cannot press CTRL+C anymore. I mean I can press CTRL+C, but nothing happens...
But what I see is something like this:
After Threads.main()->wait_for_search_finished()
Position: 6/42
After go(pos, is, state)[Thread 10012.0x1634 exited with code 0]
[Thread 10012.0x6a0 exited with code 0]
[Thread 10012.0x18a8 exited with code 0]
The 3 lines starting with [Thread appear about 1 minute after the hang.
I still see stockfish.exe under Details in the Windows Task Manager, and CTRL+C still does nothing.
Some minutes later I killed stockfish.exe with the Windows Task Manager.
Then I get 2 more lines:
[Thread 10012.0x6cc exited with code 1]
[Inferior 1 (process 10012) exited with code 01]
(gdb)
and I come back to the gdb prompt.
What is thread apply all bt supposed to do? I don't know how to get back to the gdb prompt without killing stockfish, and I guess the command makes no sense once it is killed.
In Linux and under gdb ctrl+c will interrupt (not kill) the process, and thread apply all bt will show the backtrace (stack) of all these active but interrupted threads. I'm afraid I can't help much further, but maybe google for some tricks?
I tried to narrow down this strange effect:
I checked out stockfish 10 code, and redid the test, same effect, ....
It's definitely a system issue. I would consider uninstalling and re-installing Mingw.
@CoffeeOne I don't understand how you can run stockfish in a normal cmd shell if you have:
libstdc++-6.dll => /mingw64/bin/libstdc++-6.dll (0x6fc40000)
libwinpthread-1.dll => /mingw64/bin/libwinpthread-1.dll (0x64940000)
libgcc_s_seh-1.dll => /mingw64/bin/libgcc_s_seh-1.dll (0x61440000)
It is possible only if you have gcc from msys2 or another mingw-gcc installation in the system or user PATH variable, so that those DLLs can be found.
@d3vv
That's simple and also described step by step here:
https://github.com/glinscott/fishtest/wiki/Building-stockfish-on-Windows
You just copy those files :D
Update: I edited the title; the bench also hangs in the normal Windows environment.
I tried to run bench 16 1 17 100 times in a row; it never gets through the whole series.
@vondele
When you have some time, please have a look at the issue I opened at MINGW-packages.
I could not solve the CTRL+C problem, but I can attach gdb in another msys2 shell.
The last backtrace is with a debug build of stockfish + debug package of the winpthread library.
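Attaching from a second shell can be scripted so that the backtrace of every thread ends up in a file. This is a sketch only: it uses gdb's -p, -batch and -ex options, and the function name dump_threads, the GDB override variable and the default file name are made up here. On MSYS2, ps lists both a PID and a WINPID column; which of the two a given gdb build expects is an assumption to check.

```shell
# dump_threads PID [OUTFILE]
# Attaches gdb to a running process, writes "thread apply all bt" output
# to OUTFILE and prints the file name. gdb detaches when batch mode ends.
# GDB can be set to point at a different gdb binary (hypothetical knob).
dump_threads() (
  pid=$1
  out=${2:-threads-$pid.txt}
  "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
  echo "$out"
)
```

With the hung process still running, dump_threads followed by its process ID writes the backtraces to threads-PID.txt in the current directory.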
@CoffeeOne I have one question: do u have an issue w/o pthreads with COMP=mingw (-static) build option?
@d3vv I can reproduce the hang with binaries compiled with both types of targets (COMP=mingw and COMP=gcc).
Now I need a developer to analyse the backtraces
@CoffeeOne MSYS2 came from Cygwin.. Just interesting, do u have an issue under pure cygwin shell?
https://www.cygwin.com/
And I don't understand about backtraces (backtrace log??) - do u have "core-dumps" under Windows???
As for me, I analyse Windows crashes this way:
https://helgeklein.com/blog/2018/10/creating-an-application-crash-dump/
Any progress on this? Just updated msys2 to get gcc 9.2.0 and ran into the bench problem as @CoffeeOne described. Was at 8.3.0 and did not run into this problem, but then again, I'm not sure how many times I ran bench in the past six months to trigger it.
No news from my side, I requested support, but the STOCKFISH developers/maintainers are not interested.
Do you have the same problem as I have?
You use msys2, right?
For me it's an issue with libwinpthread and not with the gcc version, so when I downgrade the lib to version 5325, the problem goes away.
Do the downgrade with:
pacman -U mingw-w64-x86_64-libwinpthread-git-7.0.0.5325.11a5459d-1-any.pkg.tar.xz mingw-w64-x86_64-winpthreads-git-7.0.0.5325.11a5459d-1-any.pkg.tar.xz
Is it the same for you?
I use msys2. I will do the downgrade as you suggest and hope it helps!
Update: had to do it manually as you suggested elsewhere (to 5325). Now it seems to work.
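After such a downgrade it can be useful to confirm which winpthreads revision is actually installed. The sketch below extracts the revision field from a version string in the format shown by pacman in this thread (for example 7.0.0.5325.11a5459d-1); 5325 is the revision reported as working above. The function name winpthreads_revision is hypothetical.

```shell
# winpthreads_revision VERSION: prints the revision field of a version
# string such as "7.0.0.5325.11a5459d-1" (the fourth dot-separated field).
winpthreads_revision() {
  echo "$1" | cut -d. -f4
}
```

Since pacman -Q prints the package name followed by the version, a call could look like winpthreads_revision "$(pacman -Q mingw-w64-x86_64-libwinpthread-git | cut -d' ' -f2)".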
You can try asking the msys2 maintainers, but in most cases they send everyone away (to the cygwin mailing list :))
A lot of people now cry "Bug, bug, there is a bug in a dll"
Nice, but what exactly is happening?
Unfortunately I have zero experience with and knowledge of multi-threaded applications written in C++, so I stopped at this point.
But I would be interested: How many threads should run in a one thread search of stockfish?
3? One for the UCI thread, one for the main thread, one for the worker thread???
See my gdb output in the issue opened in mingw, there were up to 7 threads running, how is this possible?
So it seems the function wait_for_search_finished does not return. OK, fine, but where exactly is the dead-lock?
From the output it seems that the deadlock appears at the very beginning of the search, because there is no output like
Position: 27/42
info depth 1 seldepth 2 multipv 1 score cp 276 nodes 64 nps 32000 tbhits 0 time 2 pv c5d3
info depth 2 seldepth 2 multipv 1 score cp 309 nodes 101 nps 50500 tbhits 0 time 2 pv c5d3 a5a6
info depth 3 seldepth 3 multipv 1 score cp 404 nodes 147 nps 73500 tbhits 0 time 2 pv e1e2 a5c5 e2b2
info depth 4 seldepth 4 multipv 1 score cp 290 nodes 385 nps 192500 tbhits 0 time 2 pv c5d3 b2a1 e1e6 g8f7
info depth 5 seldepth 5 multipv 1 score cp 290 nodes 513 nps 256500 tbhits 0 time 2 pv c5d3 b2a1 e1e6 g8f7 d3f4
info depth 6 seldepth 6 multipv 1 score cp 248 nodes 889 nps 444500 tbhits 0 time 2 pv c5d3 b2a1 e1e6 a1f6 d3f4 g8f7
info depth 7 seldepth 10 multipv 1 score cp 153 nodes 3114 nps 778500 tbhits 0 time 4 pv c5d3 b2a1 e1e6 f4f3 d3f4 a1e5 e6e5 a5e5 g2f3
info depth 8 seldepth 12 multipv 1 score cp 199 nodes 3708 nps 927000 tbhits 0 time 4 pv c5d3 b2a1 e1e6 f4f3 d3f4 a1e5
info depth 9 seldepth 12 multipv 1 score cp 250 nodes 4733 nps 946600 tbhits 0 time 5 pv c5d3 b2a1 e1e6 f4f3 e6c6 f3g2
info depth 10 seldepth 14 multipv 1 score cp 135 nodes 16902 nps 1300153 tbhits 0 time 13 pv c5d3 b2f6 e1e6 f4f3 g2f3 g8f7 e6c6 f7e7 d3c5 a5a3 c5e4 f6e5
info depth 11 seldepth 19 multipv 1 score cp 126 nodes 36362 nps 1298642 tbhits 0 time 28 pv c5d3 b2d4 e1e8 g8h7 e8e6 a5a1 g1h2 a1a3 d3f4 d4f2 f4e2 a3a2 g2g4 f2c5 h2g2
info depth 12 seldepth 19 multipv 1 score cp 146 nodes 45571 nps 1340323 tbhits 0 time 34 pv c5d3 b2d4 e1e8 g8f7 e8d8 d4f6 d8a8 a5a1 g1h2 f6d4 d3f4 d4f2 f4e2
info depth 13 seldepth 23 multipv 1 score cp 176 nodes 120459 nps 1400686 tbhits 0 time 86 pv e1e6 b2f6 c5d3 a5a3 e6d6 a3a2 f2f3 g8f7 d3f4 f6e5 d6d7 f7e8
bestmove e1e6 ponder b2f6
Position: 28/42
Here the program is deadlocking.
So in this example it hangs in position 28, so it seems like the search isn't started at all.
And last but not least, it would make sense to support the MINGW64 package maintainer in correcting the problem (if it is a problem; I am still only 99% convinced).
I don't think it's easy to provide a small sample program that also causes the hang, because the issue is timing dependent and does not happen on every Windows computer, even if the exact same toolchain is used.
But can somebody try to provide such a thing?
@CoffeeOne trying to follow the 3+1 issues/PRs we have open on this. AFAICT, the problem is an incorrect condition variable implementation in the threading library of mingw. This should have been corrected in:
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/330025c54b85512d54b6960fad07498365c8fee3/
If the condition variable implementation is wrong in the supporting library, this can show up as hangs in user code, which is what is happening. I think it would support the mingw64 package maintainer if you could test the version with the above commit, and indicate on the PR whether this solved the problem for you.
@vondele I rebuilt libwinpthread locally (no changes), and it seems to be good.
That was to be expected, because the changes in cond.c were reverted. So the fix is there, but a new package has not been created yet.
I will write something in the issue that I have opened at MINGW-packages.
The issue seems to be well understood now (see the related https://github.com/official-stockfish/Stockfish/issues/2291), so I shall close it. What remains to be done is to document the buggy pthread library problem very carefully in our Wiki.
@CoffeeOne
My apologies for failing to react fast enough when you first told us in July!