Rescript-compiler: Proposal to provide cleaner representation for poly variants

Created on 10 Apr 2020 · 14 Comments · Source: rescript-lang/rescript-compiler

One of the pain points of using bucklescript over typescript is the type representations when compiled to js. Having the unboxed option type turns out to be extremely useful both for targeting js (i.e. writing libs in bucklescript that can be used in js projects) and for ingesting js libs.

The proposal here is to provide a cleaner js representation for poly variants, or for both poly and regular variants. For example:

let a = `foo

becomes

const a = "foo"

and

let b = `bar 123

becomes

const b = { bar: 123 }

This would make it much easier to interact with js libs, making something like [@bs.unwrap] unnecessary. It would also make it easier to write libraries that can be used as both a bucklescript library and a plain js library, because the representations would compile to clean js code without the need for a compatibility module for js with jsConverters that could potentially lead to a really large bundle size.
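
From the JavaScript side, consuming the proposed representation might look like the TypeScript sketch below. The `Proposed` type and the `describe` function are hypothetical names used only for illustration; the shapes mirror the two examples above and are not an existing BuckleScript output format.

```typescript
// Hypothetical shapes for the proposed output: a payload-free variant is a
// plain string, a variant with a payload is a single-key object.
type Proposed = "foo" | { bar: number };

// An ordinary typeof check is enough to tell the two cases apart.
function describe(v: Proposed): string {
  if (typeof v === "string") {
    return `constant ${v}`;
  }
  return `bar carrying ${v.bar}`;
}

const a: Proposed = "foo";
const b: Proposed = { bar: 123 };
console.log(describe(a));
console.log(describe(b));
```

No converter module is needed in this sketch, which is the point of the proposal: the values are already idiomatic JS.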

enhancement

All 14 comments

This is a cool idea, but perhaps as an optional extension? I really like how poly variants are represented in JS at the moment: at a certain size, a binary tree lookup will be more efficient than a switch statement (although it would be nice if the numbers used were smaller to reduce bundle size).

What about tuples though?

let b = `bar(123, 456)

var b = { bar: [123, 456] } is still a bit weird.

There's also a question of whether this could be implemented in a way that would allow modeling of discriminated/tagged unions like we've seen from @ryyppy https://dev.to/ryyppy/tagged-unions-and-reasonml-variants-4428 (and @zth has mentioned this idea as well)

It might be out of scope for this idea, but it also might be a nice thing to consider to see if it's capable of being succinctly represented.

So in #3801 Bob is proposing to encode ordinary variants like Error "foo" ==> { tag: 1, _1: "foo" } (note numeric tag), I think it then makes sense to encode polyvariants like `Error "foo" ==> { tag: "Error", _1: "foo" } (note string tag).

In that case, you'd presumably want to do the equivalent of [@bs.discriminator_as "kind/tag/type"] ?

Polymorphic variants also have a numeric tag today, it's just a long generated number. That might make it quite difficult to use strings for the tag field.
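
As a rough illustration of where that number comes from, the sketch below ports the tag-hashing routine of the upstream OCaml compiler (`hash_variant`) to TypeScript. It assumes BuckleScript uses the same scheme as upstream; `hashVariant` is an illustrative name, not a BuckleScript API.

```typescript
// Port of OCaml's variant-tag hash, assuming BuckleScript follows upstream.
// Only ASCII tag names are considered here (OCaml hashes bytes, while
// charCodeAt returns UTF-16 code units; the two agree for ASCII).
function hashVariant(name: string): number {
  let accu = 0;
  for (let i = 0; i < name.length; i++) {
    // accu = 223 * accu + char code, reduced to the low 31 bits.
    accu = (Math.imul(223, accu) + name.charCodeAt(i)) & 0x7fffffff;
  }
  // Map the 31-bit value into a signed range, as the OCaml routine does.
  return accu > 0x3fffffff ? accu - 0x80000000 : accu;
}

console.log(hashVariant("foo")); // 5097222
console.log(hashVariant("bar")); // 4895187
```

Because the number is derived from the name but the name cannot be recovered from it, a string tag would have to be emitted in addition to, or instead of, the hash.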

I am thinking of making it like this: {hash: long_number, value: .. , name : "Error"}

@bobzhang Is that also for flat polymorphic variants?

To model a lot of the enums that underlying JavaScript uses strings for, it'd be great to be able to compile out to strings (especially if it also works with [@bs.as]):

module Column = {
  [@bs.deriving abstract]
  type props = {
    [@bs.optional]
    align: [ | `left | `right | [@bs.as "Center"] `center],
  };
};

Having left, right, and center compile out to "left", "right", and "Center" would really lighten the burden of good, safe interop.

For _interop_ I think what would be _really_ nice is:

  • Like @sgrove says, raw strings corresponding to the variant name for poly and regular variants that don't have a payload. Again, just like @sgrove says, this would make bindings _much_ cleaner.
  • For variants _with_ a payload, compiling them to objects with a (configurable?) discriminator key.

Let me expand a bit on that last point. As can be seen in the article from @ryyppy and from looking at how most ASTs (and unions overall) are represented in TS/JS, there's a whole host of tools that use a structure like this for unions in TypeScript/Flow:

interface VariableNode {
  kind: "VariableNode"; // String literal
  value: string; // ...any props specific to this particular node is just put right on the node itself
  name: string;
}

interface ArgumentNode {
  kind: "ArgumentNode"; // String literal
  argumentName: string;
}

type ast = VariableNode | ArgumentNode;

If we could get BuckleScript to somehow natively understand that type of structure without needing conversion, that would be very powerful. Some pseudo code:

type ast [@bs.discriminator "kind"] = VariableNode({ value: string, name: string }) | ArgumentNode({ argumentName: string });

This would mean that we could bind to a whole host of ASTs in JS with zero runtime costs for the developer (Babel, TypeScript, GraphQL are just a few I've seen that use that structure, but I think using kind as a discriminator is a convention that most use), which in turn would unlock some really interesting use cases since Reason is very well fit for working with ASTs and unions.
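
For comparison, this is roughly how such a union is consumed on the TypeScript side today. The sketch repeats the two interfaces from above so that it is self-contained; the `render` function and the sample node are hypothetical and only illustrate switching on the `kind` discriminator.

```typescript
interface VariableNode {
  kind: "VariableNode";
  value: string;
  name: string;
}

interface ArgumentNode {
  kind: "ArgumentNode";
  argumentName: string;
}

type Ast = VariableNode | ArgumentNode;

// Switching on `kind` narrows `node` to the matching interface in each case.
function render(node: Ast): string {
  switch (node.kind) {
    case "VariableNode":
      return `${node.name} = ${node.value}`;
    case "ArgumentNode":
      return `arg ${node.argumentName}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new
      // member is added to Ast without a matching case.
      const unreachable: never = node;
      throw new Error(`unexpected node: ${JSON.stringify(unreachable)}`);
    }
  }
}

const sample: Ast = { kind: "VariableNode", name: "x", value: "1" };
console.log(render(sample));
```

A variant compiled directly to this shape could be pattern matched in Reason and switched on in TS without a conversion step in between.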

Again, this is just what I think would be great for interop, no other cases taken into consideration here ;)

@zth yeah, similar to https://github.com/BuckleScript/bucklescript/pull/3801#issuecomment-615026999 . Fwiw I think for polymorphic variants (with payloads) a tag field with the actual variant tag as a string makes a lot of sense. It makes exporting to JS/TS/Flow simpler.

For polyvariants a [@bs.discriminator] etc. attribute makes less sense because polyvariant types are often not defined beforehand and their values are often not annotated anyway. That's part of their lightweight nature. Maybe a compiler flag would make sense (not sure).

@yawaramin I can see a use case for both forms:

type t = [ `foo | `bar of int ]
let a = `foo
let b = `bar 123
  1. Compiled to simple strings:
var a = "foo"
var b = { type: "bar", value: 123 }
  2. Compiled to uniform syntax so that you can 'pattern match' on it with a switch statement (with hopefully the value staying in a single key value for ergonomics):
var a = { type: "foo" }
var b = { type: "bar", value: 123 }
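
To make the difference between the two forms concrete, here is a TypeScript sketch of how a JS consumer would type and handle each. The names `Mixed`, `Uniform`, `toUniform`, and `show` are hypothetical and exist only for this illustration.

```typescript
// Form 1: payload-free tags are bare strings, tags with payloads are objects.
type Mixed = "foo" | { type: "bar"; value: number };

// Form 2: every value is an object carrying a `type` key.
type Uniform = { type: "foo" } | { type: "bar"; value: number };

// Form 1 can be lifted into form 2 with a single typeof check.
function toUniform(v: Mixed): Uniform {
  return typeof v === "string" ? { type: v } : v;
}

// Form 2 can be handled with one switch and no typeof check.
function show(v: Uniform): string {
  switch (v.type) {
    case "foo":
      return "foo";
    case "bar":
      return `bar(${v.value})`;
  }
}

console.log(show(toUniform("foo")));
console.log(show(toUniform({ type: "bar", value: 123 })));
```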

I don't think a compiler flag would fix all the issues, because you might want to mix and match styles. And I can also see there being a big split on this issue since there's a desire for both styles.

The thing is though that, unlike #1, the #2 form can be achieved with some reworking of the code:

type t = [ `foo of unit | `bar of int ]
let a = `foo ()
let b = `bar 123

compiling to:

var a = { type: "foo" }
var b = { type: "bar", value: 123 }

Which would make the #1 version preferable in my opinion, since #2 is still achievable and not unpleasant to work with.
And the same could be applied to ordinary variants
cc @bobzhang @chenglou

@Risto-Stevcev oh I didn't mean a compile flag for the simple string vs uniform shape outputs, I meant for the name of the type prop. Some people call it tag, others kind, etc. For polyvariants without a payload I kind of assumed a simple string is the best option.

As an intermediate step, in the next version, a non-nullable poly-variant such as
`bar 123 is going to be compiled into { hash : HASH_OF_BAR, value : 123, name : "bar" }.
I agree the data representation could be made even better; that's something we plan to do in the future.

Ah so you're keeping the hash for all, nice. At least my concern won't be a problem, even if it's still less than ideal. I can guess how hard it would be to not have the hash available at runtime.

so for non-debug mode it would be

{ HASH: xx, VAL : xx}

for debug mode, it is

{ HASH: xx,  VAL : xx, [Symbol.for("name")] : "bar"}

Note that the current encoding is better but still not ideal; we will stop here and revisit it when we upgrade the upstream compiler.
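
A TypeScript sketch of what reading this encoding from JS could look like follows. The `PolyVariant` type and `debugName` helper are hypothetical names, and the hash value is a placeholder; only the `HASH`, `VAL`, and `Symbol.for("name")` keys come from the comment above.

```typescript
// Shape described above: a numeric hash plus the payload.
type PolyVariant = { HASH: number; VAL: unknown };

// In debug mode the tag name is stored under a registry symbol.
// Symbol.for returns the same symbol for the same key every time it is called.
function debugName(v: PolyVariant): string | undefined {
  const name: unknown = Reflect.get(v, Symbol.for("name"));
  return typeof name === "string" ? name : undefined;
}

// Placeholder hash; a real value would be generated by the compiler.
const release: PolyVariant = { HASH: 4895187, VAL: 123 };
const debug: PolyVariant = Object.assign(
  { HASH: 4895187, VAL: 123 },
  { [Symbol.for("name")]: "bar" },
);

console.log(debugName(release));
console.log(debugName(debug));
// Symbol-keyed properties are skipped by JSON.stringify and Object.keys,
// so both values serialize identically.
console.log(JSON.stringify(release) === JSON.stringify(debug));
```

Keeping the name under a symbol means debug builds stay readable in a console while plain key enumeration sees the same two fields in both modes.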
