Zig: Compiler hangs, continues to hang every time it is invoked until full reboot of computer on Windows

Created on 21 Jun 2019 · 13 Comments · Source: ziglang/zig

If I attempt to compile this project on Windows, https://github.com/termhn/thunderclap, the compiler hangs completely while using almost no CPU time. If I stop it with Ctrl+C and then attempt to build any other zig project (even one known to compile and run properly before), that build hangs in the same manner. Not sure what this means, if I'm doing something wrong, or what :)

bug os-windows

All 13 comments

are all zig processes actually killed? if not this might be #859. what are you using to compile zig / what version are you using?

When it hangs, or when I try to manually kill it? When it hangs, no: there are still multiple zig processes, but they don’t seem to be doing anything. After I killed it, I didn’t check whether any were still running.

did you build zig yourself or did you download a build from the website?

Oh yea shoulda mentioned I installed latest version from swooter

Scoop** sorry, phone haha

Thanks for the report. Making a note that current master branch of that project is at https://github.com/termhn/thunderclap/commit/f70a9ea4236273989a7d008d5c4ad7a11e48b7f5 so that's the commit to checkout when debugging this.

this is really weird. as libc isn't linked anywhere, trying to build it should fail right away.

I am experiencing this bug as well and decided to investigate a bit. Debugging the compiler while it's hung shows that it's hanging out in the function os_file_open_lock_rw forever, continually retrying to exclusively open a cache file. However, looking at the process in Process Explorer shows that it already has a handle to that file.

I suspect this bug is related to different platform behavior with regards to file locking. The function os_file_open_lock_rw uses the posix function fcntl with F_SETLKW to ensure no other processes are using the file. I believe the posix behavior for subsequent calls with F_SETLKW in the same process is to succeed. However on Win32, opening a file for exclusive read/write causes subsequent calls to CreateFile() to fail, even when they're in the same process. I verified this behavior using these 2 simple programs:
file_open_experiment.zip
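
The POSIX half of that claim can be checked with a short C sketch. This is not the contents of file_open_experiment.zip; the function names open_and_lock and same_process_relock_succeeds are made up for illustration, and the code only assumes a POSIX system. It opens the same file twice in one process and takes a blocking exclusive lock through each descriptor. Because fcntl record locks belong to the process, the second F_SETLKW should return immediately instead of blocking.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read/write (creating it if needed)
 * and take a blocking exclusive lock on the whole file.
 * Returns the descriptor, or -1 on failure. */
int open_and_lock(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd == -1) return -1;

    struct flock lock = {0};
    lock.l_type = F_WRLCK;     /* exclusive (write) lock */
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;            /* 0 means "to end of file" */

    if (fcntl(fd, F_SETLKW, &lock) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if one process can lock the same file through two
 * descriptors, 0 if the second attempt fails, -1 if setup fails.
 * If the two locks conflicted, the second call would block forever. */
int same_process_relock_succeeds(const char *path) {
    int first = open_and_lock(path);
    if (first == -1) return -1;

    int second = open_and_lock(path);
    int ok = (second != -1);

    /* Note: closing either descriptor drops all of this process's
     * fcntl locks on the file. */
    if (second != -1) close(second);
    close(first);
    return ok;
}
```

Calling same_process_relock_succeeds on a scratch file is expected to return 1 on a POSIX system. The Win32 counterpart described above (a second exclusive CreateFile() on the same path) is not covered by this sketch.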

I'm happy to work on this, but I'm not sure what the desired behavior would be. Maybe the Win32 behavior should be to keep a map of filenames to handles opened for exclusive r/w and use that lookup first? In that case I believe it would need to evict the matching entry from ira->codegen->caches_to_release.
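
To make the proposal concrete, here is a rough C sketch of such a lookup table. Everything in it is hypothetical: the names acquire_cached and release_cached, the fixed table size, and the plain open() call standing in for whatever exclusive open the compiler would really perform. It only shows the "check the map first, reuse the handle if present" idea, not how the compiler's cache code was or should be written.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define MAX_LOCKED_FILES 16   /* arbitrary capacity for this sketch */

struct locked_file {
    char path[256];
    int fd;
    int in_use;               /* 0 marks a free slot */
};

static struct locked_file table[MAX_LOCKED_FILES];

/* Return the handle this process already holds for `path`, or open the
 * file and remember it. Returns -1 on failure or when the table is full.
 * Paths are compared as strings, so two spellings of the same file
 * would not match. */
int acquire_cached(const char *path) {
    assert(path != NULL);
    int free_slot = -1;
    for (int i = 0; i < MAX_LOCKED_FILES; i++) {
        if (table[i].in_use && strcmp(table[i].path, path) == 0)
            return table[i].fd;            /* reuse, do not reopen */
        if (!table[i].in_use && free_slot == -1)
            free_slot = i;
    }
    if (free_slot == -1 || strlen(path) >= sizeof table[0].path)
        return -1;

    /* Stand-in for the real exclusive open. */
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd == -1) return -1;

    strcpy(table[free_slot].path, path);
    table[free_slot].fd = fd;
    table[free_slot].in_use = 1;
    return fd;
}

/* Close the handle and forget the entry.
 * Returns 0 if the path was in the table, -1 otherwise. */
int release_cached(const char *path) {
    assert(path != NULL);
    for (int i = 0; i < MAX_LOCKED_FILES; i++) {
        if (table[i].in_use && strcmp(table[i].path, path) == 0) {
            close(table[i].fd);
            table[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}
```

With this shape, a second acquire_cached call for the same path returns the handle from the first call rather than attempting another exclusive open, and release_cached is the single place where the handle is closed and the entry evicted.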

disclaimer: Windows is not my primary or even secondary platform

When I build on Windows, it's through an MSYS2 environment. I ssh into the box to an MSYS2 sshd server and run my fav zsh shell.

If you're not running zsh, bash, csh, ksh or whatever, then the rest of this comment is not applicable.

Here's the important part:

Ctrl-C or a command-line kill <PID> is not delivered to non-MSYS2 processes. That's right: if you run zig.exe, the MSVC compiler, or any other such process from an MSYS2 shell, the signal never reaches it. What happens is that MSYS2 simply detaches the process from the foreground.

An MSYS2 ps -ef will not show these processes as still running. To see them, you need to use Windows Task Manager or ps -W from MSYS2. To kill the process, again use Windows Task Manager or, if you must, shunt through a native Windows shell with a command like powershell kill WINDOWS-PID.

That's been my experience -- thinking I've stopped building zig from sources meanwhile it keeps running, consuming resources. I presume locks are held exponentially longer as the system bogs down.

I suppose I should have clarified - I'm seeing zig hang forever, but the behavior doesn't go away on an OS reboot. I've also made sure no other zig processes are running. I can delete zig-cache, rebuild, and the problem will immediately repro again. Is this behavior different enough to warrant a new issue?

I'm reproing this behavior with commit https://github.com/rdunnington/zigtroids/commit/5924b0492dc6728520e137ac4e7f06cd8e504afd.

Sorry I didn't read your comment closer.

FYI, I just verified your belief with a small C program: fcntl(fd, F_SETLKW, ...) succeeds if the same open/fcntl sequence is repeated before the first descriptor is closed.

I believe this issue is now resolved, given that we are using @leroycep's windows locking code in the std lib + the self-hosted Cache implementation, and #6250 moved the stage1 cache system to become self-hosted. The C++ implementation that caused this bug is now deleted.
