Design: Way of validating imports and exports are as expected

Created on 27 Jun 2019  ·  15 Comments  ·  Source: WebAssembly/design

To help with linking, identifying, and verifying the ABIs of modules for our package manager, and to make our runtime easier to use in plugin systems (which expect modules to conform to their own custom, possibly hybrid interface), we've made something called "wasm contracts": an s-expression text format whose files can be merged together and used to validate a module.

The README linked above explains the details of it.
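
As a rough illustration of what "merged together" and "validate a module" could mean, here is a minimal Python sketch. The dictionary layout, the `merge` and `validate` names, and the conformance rule (a module may only import what the interface lists, and must export everything the interface lists) are assumptions made for this example, not the behaviour of the actual tool.

```python
# Hypothetical in-memory form: imports are keyed by (namespace, name), exports by name.
# A signature is (param types, result types), e.g. (("i32", "i32"), ("i32",)).

def merge(a, b):
    """Combine two interfaces; a shared entry must have the same signature in both."""
    merged = {"imports": dict(a["imports"]), "exports": dict(a["exports"])}
    for section in ("imports", "exports"):
        for key, sig in b[section].items():
            if merged[section].setdefault(key, sig) != sig:
                raise ValueError(f"conflicting signatures for {key!r}")
    return merged


def validate(module, interface):
    """Return a list of mismatches; an empty list means the module conforms."""
    errors = []
    # Assumed rule: every import the module declares must be listed in the interface.
    for key, sig in module["imports"].items():
        expected = interface["imports"].get(key)
        if expected is None:
            errors.append(f"import {key!r} is not offered by the interface")
        elif expected != sig:
            errors.append(f"import {key!r} has type {sig}, expected {expected}")
    # Assumed rule: every export the interface lists must be provided by the module.
    for name, expected in interface["exports"].items():
        sig = module["exports"].get(name)
        if sig is None:
            errors.append(f"export {name!r} is missing")
        elif sig != expected:
            errors.append(f"export {name!r} has type {sig}, expected {expected}")
    return errors


wasi = {"imports": {("wasi_unstable", "fd_close"): (("i32",), ("i32",))}, "exports": {}}
plugin = {"imports": {}, "exports": {"g": (("i32", "i32"), ("i32",))}}
hybrid = merge(wasi, plugin)  # a plugin system that also allows some WASI calls
```

A module that imports `fd_close` and exports `g` with these signatures would pass `validate(module, hybrid)`; a module exporting nothing would be reported as missing `g`.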

In some ways this is a subset of ⛄️ bindings, but it serves a different enough purpose, and is simple enough that we're able to use it immediately and benefit from it, that we thought we'd share it here.

We're looking for feedback on this -- please let us know your thoughts!

All 15 comments

Nice, this looks useful. I'd have some syntax nits, but structurally it makes sense to me. My main comment is that I'd recommend a rename, though. These are module interfaces/signatures, plain and simple. When you say "contract", many people will expect something way richer, especially in the S-expression community, where contracts are a very sophisticated thing.

Great feedback. We will rename from "wasm contracts" to "wasm interfaces".

You mentioned you had some syntax suggestions. We would love to get your input on this, as we keep iterating on it :)
Let us know!

I'd suggest to stick as close to WAT syntax as possible, to minimise confusion. That suggests not to use asserts but a module "pattern" as description. For example:

(module
   ;; imports
   (memory (import "m" "mem") 1 10)
   (func (import "m" "f") (param $x i32) (result i64))
   ;; exports
   (func (export "g") (param i32 i32) (result i32))
)

@rossberg Should we have a separate top-level identifier other than module, though, to clearly distinguish normal module definitions from signatures? Say, module-signature, signature (or sig ;)? For example, maybe we'd want to allow signatures in .wast along with some assert_module_has_signature statement for testing purposes.

Thanks for the suggestions!
We adapted the WebAssembly Interface syntax based on your feedback.

We thought about using module, but as @lukewagner suggested it might be better if it doesn't collide with the text-spec module... so we went with the most obvious name: interface.

Example

The WASI interface would look like:

(interface "wasi_unstable"
  ;; Here's a bunch of function imports!
  (func (import "wasi_unstable" "args_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "args_sizes_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "clock_res_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "clock_time_get") (param i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "environ_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "environ_sizes_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_advise") (param i32 i64 i64 i32) (result i32))
  (func (import "wasi_unstable" "fd_allocate") (param i32 i64 i64) (result i32))
  (func (import "wasi_unstable" "fd_close") (param i32) (result i32))
  (func (import "wasi_unstable" "fd_datasync") (param i32) (result i32))
  (func (import "wasi_unstable" "fd_fdstat_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_fdstat_set_flags") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_fdstat_set_rights") (param i32 i64 i64) (result i32))
  (func (import "wasi_unstable" "fd_filestat_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_filestat_set_size") (param i32 i64) (result i32))
  (func (import "wasi_unstable" "fd_filestat_set_times") (param i32 i64 i64 i32) (result i32))
  (func (import "wasi_unstable" "fd_pread") (param i32 i32 i32 i64 i32) (result i32))
  (func (import "wasi_unstable" "fd_prestat_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_prestat_dir_name") (param i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_pwrite") (param i32 i32 i32 i64 i32) (result i32))
  (func (import "wasi_unstable" "fd_read") (param i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_readdir") (param i32 i32 i32 i64 i32) (result i32))
  (func (import "wasi_unstable" "fd_renumber") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_seek") (param i32 i64 i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_sync") (param i32) (result i32))
  (func (import "wasi_unstable" "fd_tell") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "fd_write") (param i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_create_directory") (param i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_filestat_get") (param i32 i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_filestat_set_times") (param i32 i32 i32 i32 i64 i64 i32) (result i32))
  (func (import "wasi_unstable" "path_link") (param i32 i32 i32 i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_open") (param i32 i32 i32 i32 i32 i64 i64 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_readlink") (param i32 i32 i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_remove_directory") (param i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_rename") (param i32 i32 i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_symlink") (param i32 i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "path_unlink_file") (param i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "poll_oneoff") (param i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "proc_exit") (param i32))
  (func (import "wasi_unstable" "proc_raise") (param i32) (result i32))
  (func (import "wasi_unstable" "random_get") (param i32 i32) (result i32))
  (func (import "wasi_unstable" "sched_yield") (result i32))
  (func (import "wasi_unstable" "sock_recv") (param i32 i32 i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "sock_send") (param i32 i32 i32 i32 i32) (result i32))
  (func (import "wasi_unstable" "sock_shutdown") (param i32 i32) (result i32))
)

From this, someone could eventually use it in wast tests, doing something like:

(assert_module_has_interface "wasi_unstable")
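
Before an assertion like this can be checked, a tool has to read the interface text. The following Python sketch shows one way that could look; `parse_sexpr` and `read_interface` are hypothetical names, and the sketch only handles `func` entries and `;;` line comments, which is an assumption about the subset shown above rather than a description of any real parser.

```python
import re


def parse_sexpr(text):
    """Parse one s-expression into nested lists of string atoms."""
    text = re.sub(r";;[^\n]*", "", text)  # drop ';;' line comments
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("expected exactly one balanced s-expression")
    return stack[0][0]


def read_interface(text):
    """Collect function imports and exports from an (interface ...) form."""
    form = parse_sexpr(text)
    if not isinstance(form, list) or not form or form[0] != "interface":
        raise ValueError("not an interface form")
    imports, exports = {}, {}
    for item in form[1:]:
        if not isinstance(item, list) or not item or item[0] != "func":
            continue  # the interface name, or an entry kind this sketch ignores
        params, results, link = (), (), None
        for part in item[1:]:
            if not isinstance(part, list) or not part:
                continue
            if part[0] in ("import", "export"):
                link = part
            elif part[0] == "param":
                # Skip optional $identifiers, keep only the value types.
                params += tuple(p for p in part[1:] if not p.startswith("$"))
            elif part[0] == "result":
                results += tuple(part[1:])
        if link is None:
            continue
        names = tuple(s.strip('"') for s in link[1:])
        if link[0] == "import":
            imports[names] = (params, results)
        else:
            exports[names[0]] = (params, results)
    return imports, exports


sample = """
(interface "wasi_unstable"
  ;; two of the imports listed above
  (func (import "wasi_unstable" "fd_close") (param i32) (result i32))
  (func (import "wasi_unstable" "proc_exit") (param i32)))
"""
sample_imports, sample_exports = read_interface(sample)
```

Keeping the parsed form as plain dictionaries makes the later comparison against a module's actual imports and exports a simple lookup.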

Let us know if you have more feedback. We would love to keep iterating on this!

Here's the article that we created showcasing WebAssembly Interfaces, if you want to take a look!

https://medium.com/wasmer/introducing-webassembly-interfaces-bb3c05bc671

Sorry for the late reply.

@lukewagner, from my point of view, interface would be a category error, since this construct is not binding an interface but specifying a module. In more conventional syntactic terms, it means module $X : <sig>, not interface $X = <sig>. Using the name of the description's category would be like specifying functions with the keyword type instead of func.

It's better to keep these straight, since it may well be that you'd eventually want both, interface $X = <sig> and module $X : <sig>. That allows you to define an interface once and specify multiple modules with it (i.e., sig could be $Y, where $Y is bound as an interface).

What you seem to be getting at is that it might be useful to have interface X = <sig> in .wast files, and I agree. But as just said, that's really a different construct from what's proposed here.
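
The difference between the two constructs can be sketched in a few lines of Python. The `Env` class and its method names are hypothetical and only model the idea: binding gives a signature a name, ascription states the signature a particular module must have, and an ascription may refer to a bound name.

```python
class Env:
    """Toy environment separating interface bindings from module ascriptions."""

    def __init__(self):
        self.interfaces = {}  # name -> signature
        self.modules = {}     # module name -> signature it must satisfy

    def bind_interface(self, name, sig):
        """Models `interface $X = <sig>`: names a signature, no module involved."""
        self.interfaces[name] = sig

    def ascribe_module(self, name, sig):
        """Models `module $X : <sig>`; `sig` is inline or the name of a bound interface."""
        if isinstance(sig, str):
            sig = self.interfaces[sig]
        self.modules[name] = sig


env = Env()
env.bind_interface("$Y", {"g": (("i32", "i32"), ("i32",))})
env.ascribe_module("$A", "$Y")               # two modules specified by one interface
env.ascribe_module("$B", "$Y")
env.ascribe_module("$C", {"h": ((), ())})    # inline signature, nothing is bound
```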

@rossberg Actually, what I assumed we were talking about was just the <sig> itself, leaving it up to .wast to say how the <sig> shows up in statements of either the module $X : <sig> or interface $X = <sig> variety. For example, in your comment, I don't see the signature being ascribed to any module or bound to any name and thus the sig in a (sig ...) S-Expression would serve the same role as the sig keyword in a sig/end block in ML.

@lukewagner, well, as a file, it supposedly is the counterpart to a wat file, so it would conceptually be the $x : <sig> for that module, with the wat file being the $x = <mod>.

But even if you view it as just the <sig> itself it would be a stylistic mismatch IMO. To explain why, I need to make a short excursion into the two schools for type syntax. Lacking better names, let me call them as follows:

  • the "types-mirror-the-terms" school: e.g., tuple types are written (T,U,V) like tuple values. This is typically used in down-to-earth programming languages, since it requires inventing and memorising less syntax.

  • the "types-are-their-own-construct" school: e.g., tuple types are written as products T*U*V, different from the term syntax. That's typically chosen in more powerful, especially dependently typed systems, since using the same syntax for both types and terms would cause syntactic ambiguity.

For Wasm, we also have gone with the former: e.g., types of funcs, globals, etc (as occurring in imports) use the same keyword as the respective definitions ("terms"). To be consistent, that would suggest to analogously use the module keyword for module types.

Hope I'm making sense. :)

Hah, interesting point; I hadn't appreciated that distinction. So then I suppose the context tells you whether you're parsing a type or an expression (specifically, either the statement in .wast or the file extension when used standalone in a file).

Ok, so then if we wanted to spec this, could the diff to the spec more-or-less be:

  • add a moduletype to the "Validation" > "Types" section
  • add a production for moduletype in the "Text Format" > "Types" section
  • add a non-normative note in the "Text Format" > "Conventions" section, explaining that the intended "root" productions are: the existing one for module and the new one for moduletype as well as a suggested file extension for the latter (analogous to the .wat suggestion for the former).

?

It seems like moduletype could be useful in the future for defining first-class references to modules that didn't lose the static type info of the imports/exports, allowing an instantiation operation to avoid doing fallible dynamic string-matching. It also seems nice to express this concept as the more mundane wasm concept of "type" rather than inventing an open-ended "IDL".

So then I suppose the context tells you whether you're parsing a type or an expression (specifically, either the statement in .wast or the file extension when used standalone in a file).

Yes, that would be the idea.

If you wanted to add this notion to the spec itself you'd probably need to do slightly more:

  • define abstract syntax for moduletype,
  • define how the text format parses into it,
  • change validation to classify modules with moduletypes instead of the current [externtype] -> [externtype],
  • adjust the embedding appendix accordingly (and the JS API spec using it),
  • perhaps define subtyping on module types.
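
To make the last bullet concrete, here is a minimal Python sketch of what subtyping on module types might look like. It assumes the usual width rule (a subtype may import less and export more) and exact equality on entry types; both are assumptions for illustration, and `ModuleType` and `is_subtype` are hypothetical names rather than anything from the spec.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

FuncType = Tuple[Tuple[str, ...], Tuple[str, ...]]  # (params, results)


@dataclass
class ModuleType:
    """Named imports and exports, unlike the unnamed [externtype] -> [externtype]."""
    imports: Dict[Tuple[str, str], FuncType] = field(default_factory=dict)
    exports: Dict[str, FuncType] = field(default_factory=dict)


def is_subtype(sub: ModuleType, sup: ModuleType) -> bool:
    """True if a module of type `sub` can be used where `sup` is expected."""
    # Imports are contravariant: `sub` must not need anything `sup` does not supply.
    imports_ok = all(sup.imports.get(k) == t for k, t in sub.imports.items())
    # Exports are covariant: `sub` must provide everything `sup` promises.
    exports_ok = all(sub.exports.get(k) == t for k, t in sup.exports.items())
    return imports_ok and exports_ok


f = (("i32",), ("i64",))
g = (("i32", "i32"), ("i32",))
expected = ModuleType(imports={("m", "f"): f, ("m", "h"): ((), ())}, exports={"g": g})
actual = ModuleType(imports={("m", "f"): f}, exports={"g": g, "extra": ((), ())})
```

Here `actual` imports less and exports more than `expected`, so `is_subtype(actual, expected)` holds while the reverse does not.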

For the time being I'd be a bit hesitant to overload this with binding-related extensions, though, since it might be preferable to keep some layering separation. You may want to be able to talk about both, the "raw" core Wasm interface and the more high-level bindings interface, because a given bindings interface can be implemented by different raw Wasm modules, by ways of overlaying different binding expressions.

Yes, those bullet points make sense.

For the final point: I agree that there should be a layering separation. I was imagining that the core wasm spec defines a moduletype and the binding spec defines a, let's say, ☃moduletype which would have the same abstract syntax as moduletype but with ☃bindingtype in place of valtype in function signatures, etc. From an external perspective, when bindings are used, you can only see the ☃moduletype; the inner core moduletype should be fully encapsulated as well as the particular binding expressions used to map the moduletype to the ☃moduletype. Thus, both moduletype and ☃moduletype would serve the same purpose as defining the external interface of a wasm module. Does that make sense?

Yes, makes sense. Bindings-level module types would essentially be a (syntactic & semantic) super-set language of core module types.
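
The layering can be illustrated with a small Python sketch in which one bindings-level signature is carried by two different core signatures. The `string` binding type and both lowerings are hypothetical examples invented here; the thread does not define any concrete binding types or binding expressions.

```python
def lower_sig(sig, lowering):
    """Expand each bindings-level type into the core value types that carry it.

    Types without an entry in `lowering` are assumed to already be core valtypes.
    """
    params, results = sig

    def expand(types):
        return tuple(v for t in types for v in lowering.get(t, (t,)))

    return expand(params), expand(results)


# One bindings-level function signature: takes a string, returns an i32.
binding_sig = (("string",), ("i32",))

# Two hypothetical ways a core module might represent that string.
ptr_and_len = {"string": ("i32", "i32")}   # pointer plus length
nul_terminated = {"string": ("i32",)}      # pointer only

core_a = lower_sig(binding_sig, ptr_and_len)
core_b = lower_sig(binding_sig, nul_terminated)
```

Both core signatures sit behind the same externally visible bindings-level signature, which is the encapsulation described above.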

We voted to move this to stage 0, but I don't see that reflected at https://github.com/webassembly/proposals. Has a repo been set up for this proposal yet? I have an issue to raise if only I could find out where.

Oops, sorry! Created https://github.com/WebAssembly/module-types. It has a placeholder overview atm, but I'll try to get at least the motivation and an example written shortly.
