Emscripten: Using wasm_v32x4_load_splat triggers assertion in visitSIMDLoad

Created on 28 Oct 2019  路  9Comments  路  Source: emscripten-core/emscripten

When compiling with -munimplemented-simd128, I'm getting:

wasm-opt: /b/s/w/ir/cache/builder/emscripten-releases/binaryen/src/wasm-interpreter.h:1091: wasm::Flow wasm::ExpressionRunner::visitSIMDLoad(wasm::SIMDLoad *) [SubType = wasm::PrecomputingExpressionRunner]: Assertion `false' failed.
shared:ERROR: '/mnt/c/work/emsdk/upstream/bin/wasm-opt build/meshopt_decoder.wasm -o build/meshopt_decoder.wasm --post-emscripten --no-exit-runtime -O3 --low-memory-unused --strip-debug --strip-producers --pass-arg=emscripten-sbrk-ptr@1536 --pass-arg=emscripten-sbrk-val@26272 --mvp-features --enable-simd' failed (-6)

When using wasm_v32x4_load_splat.

Maybe it shouldn't work in the first place and the unimplemented is supposed to suggest this? :) Although https://github.com/WebAssembly/simd/blob/master/proposals/simd/ImplementationStatus.md page lists implementation status for many "unimplemented" intrinsics so I'm not sure. Feel free to close this if this is expected and there's some other issue tracking this.

All 9 comments

... additionally using a few other intrinsics that are available in "unimplemented" category leads to issues like this:

fatal error: error in backend: Cannot select: t207: v16i8 = WebAssemblyISD::LOAD_SPLAT t196
  t196: i32,ch = load<(load 1 from %ir.arrayidx5.i, !tbaa !5), zext from i8> t0, t141, undef:i32
    t141: i32 = add t170, t210
      t170: i32 = and t169, Constant:i32<255>
        t169: i32 = or t168, t161
          t168: i32 = or t167, t159
            t167: i32 = or t166, t157
              t166: i32 = or t165, t155
                t165: i32 = or t164, t153
...

This is without load_splat being present in the source so I suspect it's generated by the optimizer.

cc @tlively in case you want to reproduce this - https://github.com/zeux/meshoptimizer/tree/vertexcodec-simd has this code under -munimplemented-simd ifdef guards, so

emcc src/vertexcodec.cpp src/indexcodec.cpp -O3 -DNDEBUG -s EXPORTED_FUNCTIONS='["_meshopt_decodeVertexBuffer", "_meshopt_decodeIndexBuffer", "_sbrk"]' -s ALLOW_MEMORY_GROWTH=1 -s TOTAL_STACK=24576 -s TOTAL_MEMORY=65536 -o build/meshopt_decoder.wasm -munimplemented-simd128 -mbulk-memory

triggers the above.

Thanks, I can reproduce the instruction selection failure (second comment) locally. How can I reproduce the binaryen failure from the opening comment?

Oh, sorry - forgot about this 馃槗 this requires disabling the rest of the code that uses -munimplemented-simd128 and changing load to load_splat on the line 1003. I've attached the modified source file so that it's easier to reproduce, the command line is the same.

vertexcodec.zip

https://reviews.llvm.org/D69640 and https://reviews.llvm.org/D69604 fix the LLVM issues. Investigating Binaryen issue now!

Binaryen fix here: https://github.com/WebAssembly/binaryen/pull/2409. I'll close this issue now, but please do let me know if you find any more bugs!

Thank you! I will try this again once the LLVM fixes land. Excited to get SIMD results on this somewhat challenging example :) (doesn't map directly to 4-wide SIMD, uses almost all possible SIMD types, uses pshufb etc.)

It looks like setcc fix was accidentally reverted but not relanded? See https://github.com/llvm/llvm-project/commit/e5cae5692b5899631b5bfe5c23234deb5efda10c#diff-ab02df96323b1f9aa8168e78a46f029c

Oh wow I didn't notice that they reverted both my changes :upside_down_face: Should be fixed now.

Thanks! I've confirmed that all of this works and managed to get an implementation together that even works in Chrome Canary: https://github.com/zeux/meshoptimizer/pull/72. It's pretty hard to do right now because as it turns out Chrome Canary doesn't support load_splat (or v128.const or v128.andnot or a few other instructions) so the code I wrote tries to carefully sidestep all of this in codegen. But hey it works and it's pretty fast as a result :D

Was this page helpful?
0 / 5 - 0 ratings