Following from #9348, which exposes the fact that some language constructs don't codegen correctly on Windows: any use of the "special vars" $? and $~ fails to compile.
This is a minimal code sample that, like the specs, also produces the error below:
def foo(z)
  $~ = 1
end
foo(2)
p $~
If someone wants to look into this, the starting point (assuming you have #9348 merged or checked out) would be this diff:
(Even this wasn't easy to find, because without stack traces the nil assertion gives you no information about where it comes from.)
diff --git a/src/compiler/crystal/codegen/codegen.cr b/src/compiler/crystal/codegen/codegen.cr
index 09eef8806..d19c0d42d 100644
--- a/src/compiler/crystal/codegen/codegen.cr
+++ b/src/compiler/crystal/codegen/codegen.cr
@@ -1342,7 +1342,7 @@ module Crystal
obj_type = node.obj.type
type_id = type_id @last, obj_type
- filtered_type = yield(obj_type).not_nil!
+ filtered_type = yield(obj_type) || raise "BUG: filtering #{obj_type} #{node} produced nil"
@last = match_type_id obj_type, filtered_type, type_id
diff --git a/spec/compiler/codegen/special_vars_spec.cr b/spec/compiler/codegen/special_vars_spec.cr
index 14323c96d..280da6caa 100644
--- a/spec/compiler/codegen/special_vars_spec.cr
+++ b/spec/compiler/codegen/special_vars_spec.cr
@@ -2,7 +2,7 @@ require "../../spec_helper"
describe "Codegen: special vars" do
["$~", "$?"].each do |name|
- pending_win32 "codegens #{name}" do
+ it "codegens #{name}" do
run(%(
class Object; def not_nil!; self; end; end
@@ -15,7 +15,7 @@ describe "Codegen: special vars" do
)).to_string.should eq("hey")
end
- pending_win32 "codegens #{name} with nilable (1)" do
+ it "codegens #{name} with nilable (1)" do
run(%(
require "prelude"
@@ -35,7 +35,7 @@ describe "Codegen: special vars" do
)).to_string.should eq("ouch")
end
- pending_win32 "codegens #{name} with nilable (2)" do
+ it "codegens #{name} with nilable (2)" do
run(%(
require "prelude"
@@ -74,7 +74,7 @@ describe "Codegen: special vars" do
)).to_string.should eq("hey")
end
- pending_win32 "works lazily" do
+ it "works lazily" do
run(%(
require "prelude"
@@ -145,7 +145,7 @@ describe "Codegen: special vars" do
)).to_string.should eq("hey")
end
- pending_win32 "codegens after block" do
+ it "codegens after block" do
run(%(
require "prelude"
> crystal spec spec\compiler\codegen\special_vars_spec.cr -Di_know_what_im_doing -Dwithout_playground
...
BUG: filtering NilAssertionError _.responds_to?(:finalize) produced nil (Exception)
...
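For anyone unfamiliar with the idiom used in the codegen.cr change above, here is a minimal sketch of swapping `not_nil!` for `|| raise`, so that the failure says what produced nil. `first_even` and `first_even!` are hypothetical helpers made up for illustration; they are not compiler code.

```crystal
# Hypothetical helper, not compiler code: the first even number, or nil.
def first_even(numbers : Array(Int32)) : Int32?
  numbers.find(&.even?)
end

# With `.not_nil!` a failure would only report "Nil assertion failed".
# With `|| raise` the message carries the input that produced nil.
def first_even!(numbers : Array(Int32)) : Int32
  first_even(numbers) || raise "BUG: no even number in #{numbers}"
end

p first_even!([1, 4, 5]) # => 4

begin
  first_even!([1, 3])
rescue ex
  puts ex.message # => BUG: no even number in [1, 3]
end
```

The return type stays non-nilable because `raise` never returns, which is the same reason the patched line in codegen.cr still type-checks.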
Meanwhile this diff makes it work:
diff --git a/src/compiler/crystal/semantic/new.cr b/src/compiler/crystal/semantic/new.cr
index ddfe4c4e5..6724a1609 100644
--- a/src/compiler/crystal/semantic/new.cr
+++ b/src/compiler/crystal/semantic/new.cr
@@ -214,8 +214,6 @@ module Crystal
exps = Array(ASTNode).new(4)
exps << assign
exps << init
- exps << If.new(RespondsTo.new(obj.clone, "finalize").at(self),
- Call.new(Path.global("GC").at(self), "add_finalizer", obj.clone).at(self))
exps << obj
# Forward block argument if any
One more reason to remove finalize in favor of an explicit GC.add_finalizer, eh? :D
no
Mmmm, very interesting finding: these specs actually mostly pass; only those that require "prelude" fail. The same goes for non-spec usage, of course, as prelude is required by default.
Just a note that $~ doesn't depend on any platform specific code, it just uses regular LLVM code. So the issue might be somewhere else.
Right. Well, it looks like there's some universal latent issue that only gets triggered by a particular layout of prelude, which does differ significantly on Windows.
Wow that was a costly instance of tunnel vision.
This can be easily reproduced on any other OS, with cross-compilation, and the compiler there has stack traces!!
(_test.cr_ as in the original post)
$ bin/crystal build --cross-compile --target x86_64-pc-windows-msvc test.cr
Using compiled compiler at .build/crystal
Nil assertion failed (NilAssertionError)
from src/nil.cr:106:5 in 'not_nil!'
from src/compiler/crystal/codegen/codegen.cr:1345:23 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:849:7 in 'codegen_cond'
from src/compiler/crystal/codegen/codegen.cr:843:12 in 'codegen_cond_branch'
from src/compiler/crystal/codegen/codegen.cr:787:9 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:628:9 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/fun.cr:162:11 in 'codegen_fun'
from src/compiler/crystal/codegen/fun.cr:51:3 in 'codegen_fun'
from src/compiler/crystal/codegen/fun.cr:8:54 in 'target_def_fun'
from src/compiler/crystal/codegen/call.cr:426:12 in 'codegen_call'
from src/compiler/crystal/codegen/call.cr:37:7 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/call.cr:102:7 in 'prepare_call_args_non_external'
from src/compiler/crystal/codegen/call.cr:59:7 in 'prepare_call_args'
from src/compiler/crystal/codegen/call.cr:23:26 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/fun.cr:162:11 in 'codegen_fun'
from src/compiler/crystal/codegen/fun.cr:51:3 in 'codegen_fun'
from src/compiler/crystal/codegen/fun.cr:8:54 in 'target_def_fun'
from src/compiler/crystal/codegen/call.cr:426:12 in 'codegen_call'
from src/compiler/crystal/codegen/call.cr:37:7 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/call.cr:386:11 in 'codegen_dispatch'
from src/compiler/crystal/codegen/call.cr:15:7 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:964:9 in 'codegen_assign'
from src/compiler/crystal/codegen/codegen.cr:928:7 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:628:9 in 'visit'
from src/compiler/crystal/syntax/visitor.cr:27:12 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:2193:7 in 'accept'
from src/compiler/crystal/codegen/codegen.cr:67:7 in 'codegen'
from src/compiler/crystal/codegen/codegen.cr:65:5 in 'codegen:debug:single_module'
from src/compiler/crystal/compiler.cr:22:7 in 'codegen'
from src/compiler/crystal/compiler.cr:169:16 in 'compile'
from src/compiler/crystal/command.cr:289:3 in 'compile'
from src/compiler/crystal/command.cr:287:5 in 'compile'
from src/compiler/crystal/command.cr:180:5 in 'build'
from src/compiler/crystal/command.cr:72:16 in 'run'
from src/compiler/crystal/command.cr:49:5 in 'run'
from src/compiler/crystal/command.cr:48:3 in 'run'
from src/compiler/crystal.cr:11:1 in '__crystal_main'
from src/crystal/main.cr:105:5 in 'main_user_code'
from src/crystal/main.cr:91:7 in 'main'
from src/crystal/main.cr:114:3 in 'main'
from __libc_start_main
from _start
from ???
OK so the real difference is visible here:
https://github.com/crystal-lang/crystal/blob/c4b87dc4acec45cb9f4fcd08daff797ad2e9d373/src/compiler/crystal/codegen/fun.cr#L162
if I add p! target_def.body there.
On Linux I see:
target_def.body # => raise(NilAssertionError.new)
target_def.body # => _ = allocate
_.initialize
if false
  ::GC.add_finalizer(_)
end
_
On Windows I see:
target_def.body # => raise(NilAssertionError.new)
target_def.body # => _ = allocate
_.initialize
if _.responds_to?(:finalize)
  ::GC.add_finalizer(_)
end
_
So for some reason this branch is eliminated on Linux before it gets to codegen, while on Windows it isn't. If it weren't eliminated on Linux either, the same bug would probably manifest there. And the only thing that can prevent the elimination is some difference in the stdlib. So this is surely a compiler bug, but I still can't figure out what exact difference in the stdlib causes it.
I have quite extensively looked into the theory that maybe there exists an exception type that happens to have a finalizer on Windows, but that doesn't appear to be the case.
I also tried selectively excluding parts of prelude, but that's just not viable as it's so interconnected.
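For context, the `if _.responds_to?(:finalize)` in the generated body is the ordinary `responds_to?` check. Below is a minimal sketch of how it behaves in user code; `WithFinalizer`, `Plain` and `finalizable?` are hypothetical names made up for this example, not anything from the compiler or stdlib.

```crystal
# Hypothetical types for illustration only.
class WithFinalizer
  def finalize
  end
end

class Plain
end

# Same shape as the check in the generated `new`:
# the branch is only meaningful for types that define `finalize`.
def finalizable?(obj) : Bool
  if obj.responds_to?(:finalize)
    true
  else
    false
  end
end

p finalizable?(WithFinalizer.new) # => true
p finalizable?(Plain.new)         # => false
```

For a type like `Plain` the answer is known at compile time, which matches the `if false` seen in the Linux output.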
Further proof that it's a latent bug triggered only by a particular makeup of the standard library.
If the compiler is modified like this:
diff --git a/src/compiler/crystal/crystal_path.cr b/src/compiler/crystal/crystal_path.cr
index 9b2bdd982..414221599 100644
--- a/src/compiler/crystal/crystal_path.cr
+++ b/src/compiler/crystal/crystal_path.cr
@@ -33,7 +33,7 @@ module Crystal
end
private def add_target_path(codegen_target)
- target = "#{codegen_target.architecture}-#{codegen_target.os_name}"
+ target = "x86_64-windows-msvc"
@entries.each do |path|
path = File.join(path, "lib_c", target)
diff --git a/src/compiler/crystal/semantic/flags.cr b/src/compiler/crystal/semantic/flags.cr
index 76ecd26a5..62c3eaeab 100644
--- a/src/compiler/crystal/semantic/flags.cr
+++ b/src/compiler/crystal/semantic/flags.cr
@@ -29,25 +29,11 @@ class Crystal::Program
flags = Set(String).new
flags.add target.architecture
- flags.add target.vendor
- flags.concat target.environment_parts
flags.add "bits#{target.pointer_bit_width}"
- flags.add "armhf" if target.armhf?
-
- flags.add "unix" if target.unix?
- flags.add "win32" if target.win32?
-
- flags.add "darwin" if target.macos?
- if target.freebsd?
- flags.add "freebsd"
- flags.add "freebsd#{target.freebsd_version}"
- end
- flags.add "openbsd" if target.openbsd?
- flags.add "dragonfly" if target.dragonfly?
-
- flags.add "bsd" if target.bsd?
+ flags.add "win32"
+ flags.add "windows"
flags
end
Then it triggers the same bug on the example program even when compiling completely "normally" on Linux.
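The flags set in flags.cr is what the `flag?` macro method consults, so forcing "win32" there makes flag-guarded code take its Windows branches. A minimal sketch of such a guard (the constant `TARGET_KIND` is a made-up name for illustration):

```crystal
# `flag?` is evaluated at compile time against the target's flags.
{% if flag?(:win32) %}
  TARGET_KIND = "windows"
{% else %}
  TARGET_KIND = "other"
{% end %}

puts TARGET_KIND
```

With `--target x86_64-pc-windows-msvc` (or the patched compiler above) the first branch is the one that gets compiled.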
@oprypin Do you have a small snippet of Crystal code, a diff and instructions on how I can reproduce this on mac, without relating it to Windows?
@asterite yes! I had mentioned it.
Snippet of Crystal code (_test.cr_):
$~
Then run this:
crystal build --cross-compile --target x86_64-pc-windows-msvc test.cr
Alternatively (feel free to disregard):
Apply the diff from https://github.com/crystal-lang/crystal/issues/9439#issuecomment-646220570
Then run this:
crystal build test.cr
Running:
crystal build --cross-compile --target x86_64-pc-windows-msvc test.cr
gives me:
Nil assertion failed (NilAssertionError)
Is that the bug?
Yes, that's the bug: it's a nil assertion inside the compiler, not in user code.
Feel free to expand this error with this diff:
```patch
diff --git a/src/compiler/crystal/codegen/codegen.cr b/src/compiler/crystal/codegen/codegen.cr
index 09eef8806..d19c0d42d 100644
--- a/src/compiler/crystal/codegen/codegen.cr
+++ b/src/compiler/crystal/codegen/codegen.cr
@@ -1342,7 +1342,7 @@ module Crystal
obj_type = node.obj.type
type_id = type_id @last, obj_type
- filtered_type = yield(obj_type).not_nil!
+ filtered_type = yield(obj_type) || raise "BUG: filtering #{obj_type} #{node} produced nil"
@last = match_type_id obj_type, filtered_type, type_id
```
I don't understand why it breaks, but I found out how to fix it! :-D
Actually, I do understand why it breaks... I don't understand why it didn't break on non-Windows targets.
@asterite Thanks!
The closest I got to the difference is https://github.com/crystal-lang/crystal/issues/9439#issuecomment-640133327
I think the difference is that on non-Windows targets raise(NilAssertionError.new) is expanded earlier by the compiler, and then the _.responds_to?(:finalize) is expanded correctly. But on Windows it probably only happens in $~.not_nil!, where an expansion was missing, and so it broke. That's my guess, at least.
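For readers who haven't used the special variable, here is a minimal sketch of reading `$~` in ordinary code (the string and regex are made up for illustration). As I understand it, reading `$~` asserts that a match exists, which would be why `NilAssertionError` gets instantiated from it at all.

```crystal
# =~ stores its MatchData (or nil) in the special variable $~.
if "crystal 0.35" =~ /(\w+) (\d+)\.(\d+)/
  match = $~      # non-nil here, since the match succeeded
  p match[1]      # => "crystal"
  p match[2].to_i # => 0
end
```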