The most recent nightly builds are failing at the dist_linux (and dist_linux32) step while compiling shards 0.8.1 (https://circleci.com/gh/crystal-lang/crystal/19143). macOS (dist_darwin) seems to be unaffected (https://circleci.com/gh/crystal-lang/crystal/19145).
The same build works fine with a Crystal 0.27.2 compiler. The last successful nightly build was on February 8 (https://circleci.com/gh/crystal-lang/crystal/18125) with 0da24ec and the first failing build was on February 11 (https://circleci.com/gh/crystal-lang/crystal/18365) with d956786. For some reason, there seem to have been no nightly builds on February 9 and 10.
Changes between these commits: https://github.com/crystal-lang/crystal/compare/0da24ec...d956786
The most likely candidate seems to be #6945, because its change is specific to static builds on musl (which fits the affected shards build) and doesn't affect macOS.
/cc @j8r
We should definitely revert #6945. I just saw it and it makes no sense at all. Require LLVM stuff in every program? Why? And how does that fix anything?
No, this doesn't require LLVM in every program, only in statically linked ones on musl. It moves this hack, which is in distribution-scripts, into mainstream.
This line in distribution-scripts can therefore be removed now.
Hmmm... okay. Now I'm not sure #6945 is the reason it's failing.
I can compile shards successfully on jrei/crystal-alpine, which has Crystal 0.27.2. The command I ran is crystal build --stats --target x86_64-linux-musl src/shards.cr -o shards --static ${release:+--release}
-D -lunwind has to be added at the compilation to avoid a segmentation fault when running ./shards.
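For anyone puzzled by the ${release:+--release} part of that command: it is standard POSIX shell parameter expansion, which yields --release only when the variable release is set and non-empty. A minimal sketch (the function name optional_release_flag is made up for illustration and is not part of any script mentioned here):

```shell
#!/bin/sh
# Hypothetical helper, only to illustrate the ${var:+word} expansion.
optional_release_flag() {
  release=$1
  # Expands to "--release" when $release is set and non-empty, otherwise to nothing.
  printf '%s' "${release:+--release}"
}
```

So with release unset or empty the build runs without --release, and with any non-empty value it is passed to crystal build.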
@j8r Yes, 0.27.2 is fine. The error was introduced in 0.28.0-dev somewhere between https://github.com/crystal-lang/crystal/commit/0da24ec631b99f75c74e330802122540f796b1cb and https://github.com/crystal-lang/crystal/commit/d95678626e6a079176e3bf1611d91bc7a6f7fcd5.
If you'd like to investigate, it should be very easy to identify the first bad commit using binary search (git bisect could help with that).
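A rough sketch of how such a bisect could be automated, assuming the failure can be reproduced outside CI at all. The script name bisect_step.sh and the function bisect_verdict are hypothetical; only the exit-code convention of git bisect run (0 = good, 125 = skip, other codes from 1 to 127 = bad) and the two commit hashes come from git and this thread.

```shell
#!/bin/sh
# bisect_step.sh: hypothetical script for `git bisect run` (a sketch, not project tooling).

# Translate the two build outcomes into the exit codes `git bisect run` expects:
#   0 = good commit, 1 = bad commit, 125 = this commit cannot be tested (skip).
bisect_verdict() {
  compiler_built=$1   # "yes" if the compiler at this commit could be built
  shards_built=$2     # "yes" if shards then compiled statically with that compiler
  if [ "$compiler_built" != "yes" ]; then
    echo 125
  elif [ "$shards_built" = "yes" ]; then
    echo 0
  else
    echo 1
  fi
}

# Intended use:
#   git bisect start d956786 0da24ec   # bad commit first, then good commit
#   git bisect run ./bisect_step.sh    # the script would exit with "$(bisect_verdict ...)"
#   git bisect reset
```

Skipping commits where the compiler itself fails to build keeps unrelated breakage from being blamed for the shards failure.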
-D -lunwind has to be added at the compilation to avoid a segmentation fault when running ./shards.
That's what #6945 fixed, and previously the hack in distribution-scripts, right?
In fact we can run ./shards, but after installing minitest it segfaults, which is strange. Not every application behaves like this.
I'll compile shards with Crystal master then.
We must revert #6945 anyway. I accepted the initial patch (link against libunwind) as a mitigation to the bug, but the libLLVM hack that came afterwards is horrible; it shouldn't even be in the distribution scripts. A proper solution would be to get libstdc++ to raise the exceptions.
@ysbaddaden I agree. Let's revert it. Maybe that will also solve this issue.
But it's still a compiler error and we'd need to investigate its causes even if it doesn't pop up any more.
A proper solution would be to get libstdc++ to raise the exceptions.
Any ideas how this can be achieved?
I've tested with Crystal 0.28.0-dev [4a5c263ba] (2019-03-05) in an Alpine container: .build/crystal build --static shards/src/shards.cr -o shard
No problems so far, either with the compilation or when running inside and outside the container.
Maybe it has to do with the Debian multi-stage build.
Yeah, I have not been able to reproduce the error locally (running the steps manually) either. It seems to depend somehow on the CI environment.
@straight-shoota Nevertheless, I was able to reproduce it with distribution-scripts. I strongly suspect that the Debian-Alpine multi-stage build has something to do with the issue.
I don't know whether it's caused by glibc-musl or something on the LLVM side.
After reverting the commit https://github.com/crystal-lang/crystal/commit/ead1b6b75eca42da56981474f242133e36ce89d0, the compilation succeeds when building on Alpine in the multi-stage build. shards is still segfaulting, even when using -lunwind (in both Crystal compilations and the shards compilation).
The early experiments in https://github.com/crystal-lang/crystal/pull/6945 about linking against lunwind to prevent segfaults won't be enough in this case.
Reverting the commit on both the Debian and Alpine Crystal compilations and using the previous hack solves this issue.
We can revert this commit, then find a proper solution after https://github.com/crystal-lang/crystal/pull/7479 is merged.
Thanks @j8r for investigating! So once we revert #6945, builds should be working again.
This issue is still about a compiler error, which reverting #6945 does not fix but merely avoids. However, this compiler error seems to be difficult to reproduce.
A proper solution to #6934 would be to fix #4276 as @ysbaddaden suggested.
Could this be given a better title?
Duplicate of #4276?
Should we assume this is fixed now? However, the original error Missing hash key: NoReturn (KeyError) makes me think this was actually unrelated to static linking.