Emscripten: PURE_WASI option

Created on 28 Aug 2020 · 7 Comments · Source: emscripten-core/emscripten

In #9479 it was mentioned that PURE_WASI might eventually be implemented in Emscripten to enable targeting WASI. That was a year ago. What is the current status? Is it still under consideration?

To give you some context, I ported LLVM's llc tool to WASI. I tried wasi-sdk first but ran into multiple issues, like the lack of atomics in the shipped C++ library, so I switched to Emscripten.

I implemented the subset of syscalls that llc depends on. Since my library goes first in the linker invocation, this alternative syscall implementation displaces Emscripten's.

So my problem is solved, but I wonder if there's any interest in having this code in Emscripten to jump-start a PURE_WASI mode.

Note: in the linked issue it was mentioned that Emscripten might eventually adopt a subset of WASI in the regular mode where that doesn't lead to regressions (e.g., UNIX permissions not being exposed). The problem here is that the WASI filesystem model does not assume the whole filesystem is exposed. You get a set of "mapped" directories available via pre-opened file descriptors. When accessing a file by path, one has to find the matching pre-opened directory, convert the path to one relative to that directory, and pass the directory descriptor and the relative path to the WASI call.

So in order to do anything with the filesystem in WASI you need this mapping logic. It is completely unnecessary when targeting non-WASI mode. Therefore I doubt that Emscripten's "regular" mode could adopt much from WASI.
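The mapping logic described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the code from the llc port: the `preopen` struct, the `resolve_preopen` function, and the idea of a fixed in-memory table are all hypothetical names introduced here. The sketch picks the pre-opened directory with the longest matching prefix and returns the remainder of the path relative to it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: one pre-opened directory descriptor and the
   absolute path prefix it is mapped to. A real runtime would discover
   these at startup instead of hard-coding them. */
struct preopen {
    int fd;             /* pre-opened directory descriptor */
    const char *prefix; /* absolute path prefix, e.g. "/work" */
};

/* Find the pre-open whose prefix is the longest match for abs_path.
   On success, store its descriptor in *dirfd and return the path relative
   to that directory ("." for the directory itself).
   Return NULL if no pre-open covers the path. */
const char *resolve_preopen(const struct preopen *table, size_t count,
                            const char *abs_path, int *dirfd)
{
    const char *best_rel = NULL;
    size_t best_len = 0;
    int found = 0;

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(table[i].prefix);

        /* Ignore trailing slashes so "/" compares as the empty prefix. */
        while (len > 0 && table[i].prefix[len - 1] == '/')
            len--;

        if (strncmp(abs_path, table[i].prefix, len) != 0)
            continue;

        /* The match must end on a path component boundary, so that
           "/work" does not claim "/workspace". */
        char next = abs_path[len];
        if (next != '\0' && next != '/')
            continue;

        /* Keep only the longest (most specific) prefix seen so far. */
        if (found && len <= best_len)
            continue;

        const char *rel = abs_path + len;
        while (*rel == '/')
            rel++;

        best_rel = (*rel == '\0') ? "." : rel;
        best_len = len;
        *dirfd = table[i].fd;
        found = 1;
    }
    return best_rel;
}
```

With a table that maps descriptor 3 to "/" and descriptor 4 to "/work", the path "/work/src/main.c" resolves to descriptor 4 with the relative path "src/main.c", while "/workspace/x" falls back to descriptor 3. The descriptor and relative path are what would then be handed to the WASI call.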

All 7 comments

@mejedi I think that would be good! It hasn't been a priority for me personally, but I'd like to see that happen.

Yes, the differences between WASI and POSIX would require that this be a separate mode. I'm not sure what the flag should be called or how it should work exactly, we need to think about that, but in general, yes, I'm guessing it would link in the code you mention and so override the normal POSIX-like behavior.

(In the long term I hope we can make emscripten always emit WASI, and maybe even switch to the wasi-libc, but that would require more work on the spec and toolchain to avoid regressions.)

I think that if you are targeting pure WASI and only pure WASI, you should really use the wasi-sdk and work with wasi-sdk to address any issues you have. I'm curious about the issues you faced, because others have been able to get clang and lld to compile and run successfully with wasi-sdk. Did I already point you to https://github.com/binji/wasm-clang?

@sbc100

> I'm curious about the issues you faced, because others have been able to get clang and lld to compile and run successfully with wasi-sdk.

The lack of atomics is a deal-breaker. It’s hard to tell which clang version was built for the web by the project you are referring to. llvm 10+ doesn’t compile without atomics.

Also wasi-sdk headers produce plenty of warnings during compilation. It’s tolerable but is a clear indication that the project is either in an early stage or is fine with doing things in a messy way.

> I think that if you are targeting pure WASI and only pure WASI, you should really use the wasi-sdk and work with wasi-sdk to address any issues you have.

I disagree. My agenda is rather pragmatic. For me wasm is a delivery medium for my work which saves me the trouble of building packages for multiple operating systems. I couldn’t care less if it is considered cool and hip at the moment; what I do care about is the user experience. It’s way more seamless with wasmer than asking people to, e.g., configure a custom Debian repository and apt install from it.

I don’t want to have “fix wasi-sdk” as a prerequisite to shipping my work. Until the changes are accepted, I’d have to maintain a forked C++ library and use it in my builds. Compare that to Emscripten. Even if it doesn’t accept my changes, it’s not a big deal: it is 400 LOC of C in my repository to augment libc at link time.

Most importantly, I have to be reasonably confident that the project compiled to wasm is as robust as when compiled to native code. Emscripten is way more trustworthy since it has been used to build high-profile projects for the web for years.

@sbc100 I agree that where the wasi-sdk is the right fit it's definitely the right choice. But this may be a case where it isn't, because of atomics as @mejedi said. The emsdk has had stable atomics support for a long time because atomics have been supported on the Web. (Unless there is some simple trick to avoid the issue in that codebase?)

In general I'd like to see the emsdk and wasi-sdk converge at some point. That's pretty far off from being practical to consider. But reducing the differences between them is the best way towards that. I think adding more complete WASI support in emscripten helps there, in parallel to improving the wasi-sdk.

> I implemented the subset of syscalls that llc depends on. Since my library goes first in the linker invocation, this alternative syscall implementation displaces Emscripten's.

Hi, I've tried to do the same with, e.g.

void (*__sys_getdents64(int sig, int i ))(int)
{
    return 0;
}

I put that in the first file passed to wasm-ld, but I still get it in the wasm imports:

  (import "env" "__sys_getdents64" (func $__syscall220 (type 6)))

Did you do something different?

If you want to implement this syscall on the native side, you would implement void __syscall220(long fd, long dirp, long count). If the linker sees that definition, it will choose it over importing __sys_getdents64.

Thanks, yes that works.
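As a rough illustration of the approach confirmed above, a link-time stub for this syscall might look like the sketch below. The symbol name and parameter list are taken from the reply; everything else is an assumption. In particular, the reply writes the return type as void, while this sketch assumes long so that the stub can report an error code in the usual negative-errno style. Check the declaration in your Emscripten version's libc headers before relying on either form.

```c
#include <errno.h>

/* Stub for the getdents64 syscall, defined under the symbol name the reply
   says the linker looks for. Assumption: the return type is long and
   failures are reported as a negative errno value. */
long __syscall220(long fd, long dirp, long count)
{
    /* Directory listing is not supported in this sketch. */
    (void)fd;
    (void)dirp;
    (void)count;
    return -ENOSYS;
}
```

Because the symbol is defined in an object file that is part of the link, the linker can resolve it locally instead of emitting an "env" import for __sys_getdents64, which is the behavior the thread describes.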
