Zig: Miscompiled Vector(N, bool) in wasm32

Created on 14 May 2020  路  9Comments  路  Source: ziglang/zig

While working on #5303, I've noticed that formatting std.meta.Vector(N, bool) generates erroneous results. Take this snippet for example:

const std = @import("std");

pub fn main() !void {
    const vs: std.meta.Vector(4, bool) = [_]bool{true, false, true, false};
    std.debug.warn("{}\n", .{vs});
}

Expected result:

{ true, false, true, false }

Actual result:

{ true, true, true, true }

Some observations

  1. This is runtime agnostic (the same behaviour observed in wasmtime as well as wasmer) which makes me believe it's something in Zig's compiler.
  2. Somehow, this seems to be the issue with passing Vector (or it wrapped inside some structure) to std.fmt. For example, when the elements of Vector are printed one-by-one, everything works as expected:
const std = @import("std");

pub fn main() !void {
    const vs: std.meta.Vector(4, bool) = [_]bool{true, false, true, false};
    var i: usize = 0;
    while (i < 4) : (i += 1) {
        std.debug.warn("{}, ", .{vs[i]});
    }
    std.debug.warn("\n", .{});
}
true, false, true, false,



md5-bf63245378646c4c2d4613ddd184e6d4



{ false, false, false, false }

I tried to figure out where the source of this error might be coming from, but it seems I know too little about the structure of Zig's compiler. My hunch here is that this is something to do with the way comptime and var are handled in wasm32 as the target, and if anyone points me in the right direction, I'll be happy to investigate further and fix it.

arch-wasm32 bug upstream

Most helpful comment

A quick update, I've been asked by LLVM devs to compile the generated *.ll with the latest upstream, and the output is correct with -O0 flag. This means when LLVM-11 is out, this bug will likely disappear, however, for the time being, I suggest leaving this issue as a reminder/tracker. See the LLVM upstream bug for more details.

All 9 comments

Quick update:

I believe I've narrowed the issue down to storing/loading ZigTypeIdVector in function calls. So this:

const vs: std.meta.Vector(2, bool) = [_]bool{ true, false };

fn doit(vs: var);
doit(.{vs}); // seems OK
doit(vs); // stores everything equal to the first declared element, in this case, "true"

Hence, we observe the problem in std.debug.warn but not at this function callsite, but actually (if you trace along) at std.fmt.formatValue callsite in the body of std.fmt.formatType.

On the LLVM IR, the types for the element all seem fine and equal to i1 as they should be.

And here's the (close to) minimal working example:

const Vector = @import("lib/zig/std/meta.zig").Vector;

fn hello(vs: Vector(4, bool)) void {
    var i: usize = 0;
    while (i < 4) : (i += 1) {
        _ = vs[i];
    }
}

export fn hmm() void {
    const vs: Vector(4, bool) = [_]bool{ false, false, true, false };
    hello(vs);
}

pub fn panic(msg: []const u8, error_return_trace: ?*@import("builtin").StackTrace) noreturn {
    while (true) {}
}

@mikdusan would you be able to point me in the right direction in the compiler source to track this down further? Apologies if I confused some bits here, still learning the internals of zig as well as LLVM itself, but I think the issue is somewhere in codegen.cpp where the var gets created and the pointer gets stored for later use. When we load the Vector back from the stored pointer the values are already corrupted, i.e., IrInstGenLoadPtr *instruction in ir_render_load_ptr @ codegen.cpp:3582 when dereferenced, already points at a vector with false elements only (for the example immediately above).

@kubkon, not really sure where to start on this except there seem to be some relevant issues open with Vector and various arch; see #4486 and #3317

@kubkon, not really sure where to start on this except there seem to be some relevant issues open with Vector and various arch; see #4486 and #3317

Ah, thanks for the links! Somehow it didn鈥檛 occur to me the same problem might appear on different target archs.

@andrewrk this might be of interest as it seems related to #3317, although I didn't have the time to compare both in detail yet.

OK, after a bit of tinkering, I think I've managed to confirm that this is also a bug with the LLVM itself, although I wouldn't mind another pair of eyes on this! But I'm getting ahead of myself. First of, here's the reduction that generates faulty load/store Wasm instructions. Oh, and make sure you compile with wasm32-freestanding target to avoid unnecessary bloat.


Zig source code

const std = @import("std");
const Vector = std.meta.Vector;

fn foo(vs: Vector(2, bool)) bool {
    var a = vs[0];
    var b = vs[1];
    return b;
}

export fn entrypoint() bool {
    const vs: Vector(2, bool) = [_]bool{ false, true };
    return foo(vs);
}

pub fn panic(msg: []const u8, error_return_trace: ?*@import("builtin").StackTrace) noreturn {
    while (true) {}
}

Generated LLVM IR

; Function Attrs: nobuiltin nounwind
define i1 @entrypoint() #1 {
Entry:
  %result = alloca i1, align 1
  %0 = call fastcc i1 @foo(<2 x i1> <i1 false, i1 true>)
  store i1 %0, i1* %result, align 1
  %1 = load i1, i1* %result, align 1
  ret i1 %1
}

; Function Attrs: nobuiltin nounwind
define internal fastcc i1 @foo(<2 x i1> %0) unnamed_addr #1 {
Entry:
  %result = alloca i1, align 1
  %a = alloca i1, align 1
  %b = alloca i1, align 1
  %vs = alloca <2 x i1>, align 2
  store <2 x i1> %0, <2 x i1>* %vs, align 2
  %1 = load <2 x i1>, <2 x i1>* %vs
  %2 = extractelement <2 x i1> %1, i32 0
  store i1 %2, i1* %a, align 1
  %3 = load <2 x i1>, <2 x i1>* %vs
  %4 = extractelement <2 x i1> %3, i32 1
  store i1 %4, i1* %b, align 1
  %5 = load i1, i1* %b, align 1
  store i1 %5, i1* %result, align 1
  %6 = load i1, i1* %result, align 1
  ret i1 %6
}

Disassembled WAT

    1 (module
    2   (type (;0;) (func (result i32)))
    3   (type (;1;) (func (param i32 i32) (result i32)))
    4   (func $entrypoint (type 0) (result i32)
    5     (local i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
    6     global.get 0
    7     local.set 0
    8     i32.const 16
    9     local.set 1
   10     local.get 0
   11     local.get 1
   12     i32.sub
   13     local.set 2
   14     local.get 2
   15     global.set 0
   16     i32.const 1
   17     local.set 3
   18     i32.const 0
   19     local.set 4
   20     local.get 4
   21     local.get 3
   22     call $foo
   23     local.set 5
   24     i32.const 1
   25     local.set 6
   26     local.get 5
   27     local.get 6
   28     i32.and
   29     local.set 7
   30     local.get 2
   31     local.get 7
   32     i32.store8 offset=15
   33     local.get 2
   34     i32.load8_u offset=15
   35     local.set 8
   36     i32.const 16
   37     local.set 9
   38     local.get 2
   39     local.get 9
   40     i32.add
   41     local.set 10
   42     local.get 10
   43     global.set 0
   44     local.get 8
   45     return)
   46   (func $foo (type 1) (param i32 i32) (result i32)
   47     (local i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
   48     global.get 0
   49     local.set 2
   50     i32.const 16
   51     local.set 3
   52     local.get 2
   53     local.get 3
   54     i32.sub
   55     local.set 4
   56     i32.const 1
   57     local.set 5
   58     local.get 0
   59     local.get 5
   60     i32.and
   61     local.set 6
   62     local.get 1
   63     local.get 5
   64     i32.shl
   65     local.set 7
   66     local.get 6
   67     local.get 7
   68     i32.or
   69     local.set 8
   70     i32.const 3
   71     local.set 9
   72     local.get 8
   73     local.get 9
   74     i32.and
   75     local.set 10
   76     local.get 4
   77     local.get 10
   78     i32.store8 offset=10
   79     local.get 4
   80     i32.load8_u offset=10
   81     local.set 11
   82     local.get 11
   83     local.get 5
   84     i32.and
   85     local.set 12
   86     local.get 4
   87     local.get 12
   88     i32.store8 offset=14
   89     local.get 4
   90     i32.load8_u offset=10
   91     local.set 13
   92     i32.const 1
   93     local.set 14
   94     local.get 13
   95     local.get 14
   96     i32.and
   97     local.set 15
   98     local.get 4
   99     local.get 15
  100     i32.store8 offset=13
  101     local.get 4
  102     i32.load8_u offset=13
  103     local.set 16
  104     i32.const 1
  105     local.set 17
  106     local.get 16
  107     local.get 17
  108     i32.and
  109     local.set 18
  110     local.get 4
  111     local.get 18
  112     i32.store8 offset=15
  113     local.get 4
  114     i32.load8_u offset=15
  115     local.set 19
  116     local.get 19
  117     return)
  118   (table (;0;) 1 1 funcref)
  119   (memory (;0;) 2)
  120   (global (;0;) (mut i32) (i32.const 66560))
  121   (export "memory" (memory 0))
  122   (export "entrypoint" (func $entrypoint)))

If my maths is correct and provided I didn't do any silly mistake along the way anywhere, the last i32.store8_u offset=15 op in line 112 will store the value of 0 (which in Wasm is equivalent to false), and this value will then be stored in the function's local vector at index 19, loaded and returned. This is obviously incorrect as we'd expect the function to return anything but 0 (any other value would but 0 would be equivalent to true in Wasm AFAIK).

cc @sunfishcode if you have a spare couple of minutes, if you could have a look at the LLVM IR and WAT, that'd be great!

cc @andrewrk could you have a look and quickly double check this makes sense? If we agree this is indeed a bug with the LLVM, I'll file an issue in their upstream.

I've had an offline chat with @sunfishcode about this, and he pointed me to this long-standing LLVM issue: 1784 - vectors of i1 and vectors x86 long double don't work. I've done some more tests and I can confirm that for any "exotic" types such as i1 or u30, etc., this problem persists in Wasm, and vector gets miscompiled.

Given that the LLVM bug report has been there for some time (since 2007!), and a fix is not easily provided (I'm being told it would require reworking a lot of the codegen surface in LLVM), I'm trying to figure out if there's anything in the meantime we could do on our end. When it comes to handling vector of bool, I was thinking if, when targeting Wasm, we could represent bool as a type which is a multiple of a byte, such as i8. Then, the problem disappears. However, this still wouldn't fix the case of other exotic types such as u30 or whatnot. Perhaps in that case, we could throw a compilation error for now? @andrewrk what are your thoughts on this I wonder? Do you think we should try addressing it on our end, or perhaps wait it out until self-hosted arrives and handle it properly there?

Just some heads-up that after an even closer analysis this bug looks like it suffers from the same problem as #3317, and I've now submitted an upstream bug report which can be viewed here: 46127 - wasm miscompilation of vector instructions.

A quick update, I've been asked by LLVM devs to compile the generated *.ll with the latest upstream, and the output is correct with -O0 flag. This means when LLVM-11 is out, this bug will likely disappear, however, for the time being, I suggest leaving this issue as a reminder/tracker. See the LLVM upstream bug for more details.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

S0urc3C0de picture S0urc3C0de  路  3Comments

DavidYKay picture DavidYKay  路  3Comments

dobkeratops picture dobkeratops  路  3Comments

andersfr picture andersfr  路  3Comments

bheads picture bheads  路  3Comments