While working on #5303, I've noticed that formatting std.meta.Vector(N, bool) generates erroneous results. Take this snippet for example:
const std = @import("std");
pub fn main() !void {
const vs: std.meta.Vector(4, bool) = [_]bool{true, false, true, false};
std.debug.warn("{}\n", .{vs});
}
{ true, false, true, false }
{ true, true, true, true }
wasmtime as well as wasmer) which makes me believe it's something in Zig's compiler.Vector (or it wrapped inside some structure) to std.fmt. For example, when the elements of Vector are printed one-by-one, everything works as expected:const std = @import("std");
pub fn main() !void {
const vs: std.meta.Vector(4, bool) = [_]bool{true, false, true, false};
var i: usize = 0;
while (i < 4) : (i += 1) {
std.debug.warn("{}, ", .{vs[i]});
}
std.debug.warn("\n", .{});
}
true, false, true, false,
md5-bf63245378646c4c2d4613ddd184e6d4
{ false, false, false, false }
I tried to figure out where the source of this error might be coming from, but it seems I know too little about the structure of Zig's compiler. My hunch here is that this is something to do with the way comptime and var are handled in wasm32 as the target, and if anyone points me in the right direction, I'll be happy to investigate further and fix it.
Quick update:
I believe I've narrowed the issue down to storing/loading ZigTypeIdVector in function calls. So this:
const vs: std.meta.Vector(2, bool) = [_]bool{ true, false };
fn doit(vs: var);
doit(.{vs}); // seems OK
doit(vs); // stores everything equal to the first declared element, in this case, "true"
Hence, we observe the problem in std.debug.warn but not at this function callsite, but actually (if you trace along) at std.fmt.formatValue callsite in the body of std.fmt.formatType.
On the LLVM IR, the types for the element all seem fine and equal to i1 as they should be.
And here's the (close to) minimal working example:
const Vector = @import("lib/zig/std/meta.zig").Vector;
fn hello(vs: Vector(4, bool)) void {
var i: usize = 0;
while (i < 4) : (i += 1) {
_ = vs[i];
}
}
export fn hmm() void {
const vs: Vector(4, bool) = [_]bool{ false, false, true, false };
hello(vs);
}
pub fn panic(msg: []const u8, error_return_trace: ?*@import("builtin").StackTrace) noreturn {
while (true) {}
}
@mikdusan would you be able to point me in the right direction in the compiler source to track this down further? Apologies if I confused some bits here, still learning the internals of zig as well as LLVM itself, but I think the issue is somewhere in codegen.cpp where the var gets created and the pointer gets stored for later use. When we load the Vector back from the stored pointer the values are already corrupted, i.e., IrInstGenLoadPtr *instruction in ir_render_load_ptr @ codegen.cpp:3582 when dereferenced, already points at a vector with false elements only (for the example immediately above).
@kubkon, not really sure where to start on this except there seem to be some relevant issues open with Vector and various arch; see #4486 and #3317
@kubkon, not really sure where to start on this except there seem to be some relevant issues open with Vector and various arch; see #4486 and #3317
Ah, thanks for the links! Somehow it didn鈥檛 occur to me the same problem might appear on different target archs.
@andrewrk this might be of interest as it seems related to #3317, although I didn't have the time to compare both in detail yet.
OK, after a bit of tinkering, I think I've managed to confirm that this is also a bug with the LLVM itself, although I wouldn't mind another pair of eyes on this! But I'm getting ahead of myself. First of, here's the reduction that generates faulty load/store Wasm instructions. Oh, and make sure you compile with wasm32-freestanding target to avoid unnecessary bloat.
const std = @import("std");
const Vector = std.meta.Vector;
fn foo(vs: Vector(2, bool)) bool {
var a = vs[0];
var b = vs[1];
return b;
}
export fn entrypoint() bool {
const vs: Vector(2, bool) = [_]bool{ false, true };
return foo(vs);
}
pub fn panic(msg: []const u8, error_return_trace: ?*@import("builtin").StackTrace) noreturn {
while (true) {}
}
; Function Attrs: nobuiltin nounwind
define i1 @entrypoint() #1 {
Entry:
%result = alloca i1, align 1
%0 = call fastcc i1 @foo(<2 x i1> <i1 false, i1 true>)
store i1 %0, i1* %result, align 1
%1 = load i1, i1* %result, align 1
ret i1 %1
}
; Function Attrs: nobuiltin nounwind
define internal fastcc i1 @foo(<2 x i1> %0) unnamed_addr #1 {
Entry:
%result = alloca i1, align 1
%a = alloca i1, align 1
%b = alloca i1, align 1
%vs = alloca <2 x i1>, align 2
store <2 x i1> %0, <2 x i1>* %vs, align 2
%1 = load <2 x i1>, <2 x i1>* %vs
%2 = extractelement <2 x i1> %1, i32 0
store i1 %2, i1* %a, align 1
%3 = load <2 x i1>, <2 x i1>* %vs
%4 = extractelement <2 x i1> %3, i32 1
store i1 %4, i1* %b, align 1
%5 = load i1, i1* %b, align 1
store i1 %5, i1* %result, align 1
%6 = load i1, i1* %result, align 1
ret i1 %6
}
1 (module
2 (type (;0;) (func (result i32)))
3 (type (;1;) (func (param i32 i32) (result i32)))
4 (func $entrypoint (type 0) (result i32)
5 (local i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
6 global.get 0
7 local.set 0
8 i32.const 16
9 local.set 1
10 local.get 0
11 local.get 1
12 i32.sub
13 local.set 2
14 local.get 2
15 global.set 0
16 i32.const 1
17 local.set 3
18 i32.const 0
19 local.set 4
20 local.get 4
21 local.get 3
22 call $foo
23 local.set 5
24 i32.const 1
25 local.set 6
26 local.get 5
27 local.get 6
28 i32.and
29 local.set 7
30 local.get 2
31 local.get 7
32 i32.store8 offset=15
33 local.get 2
34 i32.load8_u offset=15
35 local.set 8
36 i32.const 16
37 local.set 9
38 local.get 2
39 local.get 9
40 i32.add
41 local.set 10
42 local.get 10
43 global.set 0
44 local.get 8
45 return)
46 (func $foo (type 1) (param i32 i32) (result i32)
47 (local i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
48 global.get 0
49 local.set 2
50 i32.const 16
51 local.set 3
52 local.get 2
53 local.get 3
54 i32.sub
55 local.set 4
56 i32.const 1
57 local.set 5
58 local.get 0
59 local.get 5
60 i32.and
61 local.set 6
62 local.get 1
63 local.get 5
64 i32.shl
65 local.set 7
66 local.get 6
67 local.get 7
68 i32.or
69 local.set 8
70 i32.const 3
71 local.set 9
72 local.get 8
73 local.get 9
74 i32.and
75 local.set 10
76 local.get 4
77 local.get 10
78 i32.store8 offset=10
79 local.get 4
80 i32.load8_u offset=10
81 local.set 11
82 local.get 11
83 local.get 5
84 i32.and
85 local.set 12
86 local.get 4
87 local.get 12
88 i32.store8 offset=14
89 local.get 4
90 i32.load8_u offset=10
91 local.set 13
92 i32.const 1
93 local.set 14
94 local.get 13
95 local.get 14
96 i32.and
97 local.set 15
98 local.get 4
99 local.get 15
100 i32.store8 offset=13
101 local.get 4
102 i32.load8_u offset=13
103 local.set 16
104 i32.const 1
105 local.set 17
106 local.get 16
107 local.get 17
108 i32.and
109 local.set 18
110 local.get 4
111 local.get 18
112 i32.store8 offset=15
113 local.get 4
114 i32.load8_u offset=15
115 local.set 19
116 local.get 19
117 return)
118 (table (;0;) 1 1 funcref)
119 (memory (;0;) 2)
120 (global (;0;) (mut i32) (i32.const 66560))
121 (export "memory" (memory 0))
122 (export "entrypoint" (func $entrypoint)))
If my maths is correct and provided I didn't do any silly mistake along the way anywhere, the last i32.store8_u offset=15 op in line 112 will store the value of 0 (which in Wasm is equivalent to false), and this value will then be stored in the function's local vector at index 19, loaded and returned. This is obviously incorrect as we'd expect the function to return anything but 0 (any other value would but 0 would be equivalent to true in Wasm AFAIK).
cc @sunfishcode if you have a spare couple of minutes, if you could have a look at the LLVM IR and WAT, that'd be great!
cc @andrewrk could you have a look and quickly double check this makes sense? If we agree this is indeed a bug with the LLVM, I'll file an issue in their upstream.
I've had an offline chat with @sunfishcode about this, and he pointed me to this long-standing LLVM issue: 1784 - vectors of i1 and vectors x86 long double don't work. I've done some more tests and I can confirm that for any "exotic" types such as i1 or u30, etc., this problem persists in Wasm, and vector gets miscompiled.
Given that the LLVM bug report has been there for some time (since 2007!), and a fix is not easily provided (I'm being told it would require reworking a lot of the codegen surface in LLVM), I'm trying to figure out if there's anything in the meantime we could do on our end. When it comes to handling vector of bool, I was thinking if, when targeting Wasm, we could represent bool as a type which is a multiple of a byte, such as i8. Then, the problem disappears. However, this still wouldn't fix the case of other exotic types such as u30 or whatnot. Perhaps in that case, we could throw a compilation error for now? @andrewrk what are your thoughts on this I wonder? Do you think we should try addressing it on our end, or perhaps wait it out until self-hosted arrives and handle it properly there?
Just some heads-up that after an even closer analysis this bug looks like it suffers from the same problem as #3317, and I've now submitted an upstream bug report which can be viewed here: 46127 - wasm miscompilation of vector instructions.
A quick update, I've been asked by LLVM devs to compile the generated *.ll with the latest upstream, and the output is correct with -O0 flag. This means when LLVM-11 is out, this bug will likely disappear, however, for the time being, I suggest leaving this issue as a reminder/tracker. See the LLVM upstream bug for more details.
Most helpful comment
A quick update, I've been asked by LLVM devs to compile the generated
*.llwith the latest upstream, and the output is correct with-O0flag. This means when LLVM-11 is out, this bug will likely disappear, however, for the time being, I suggest leaving this issue as a reminder/tracker. See the LLVM upstream bug for more details.