Language: Set literal

Created on 8 Oct 2018 · 16Comments · Source: dart-lang/language

Solution for #36, feature specification.

This issue is for discussing the pros and cons of this particular proposal.
General discussion about the issue should go in #36 where everybody can see it.

I propose to add set literals to Dart by using <int>{1, 2, 3} syntax. This is distinguishable from a map literal in every case except the empty set with not type variable, {}. In that case, we should use the context type to pick the correct Set or Map type, defaulting to a map if the context type is not prohibiting it from being a map.

(Historical design proposal).

feature

Source

lrhn

👍6

Most helpful comment

Closing; expected to launch in Dart 2.2

mit-mit on 22 Feb 2019

🎉5

All 16 comments

It has been brought up that this proposal intercepts with another proposed future feature: Collection spreads which allows you to write [1, 2, ...someListExpression, 5] as a literal. It works for both lists and maps, and should also work for sets.
Since the syntax for spreading an iterable in a list literal and a map in a map literal is the same, and it would be the same for set literals, it means that there are more ambiguous expressions than the empty map/set, like {... something}. That is, any map/set literal with no type arguments and containing only spreads needs to be disambiguated.

If there is a useful context type, then we can still use that to determine whether it's a set or map literal.
If the context type allows both sets and maps, then the fact that we have an exact type assumption on literals allows us to make that a compile-time error, so we can assume that either there is no context type, or it's a type that neither Iterable nor Map is assignable to.

If the literal is empty (zero spreads) it still needs to be a map. With no context type, that also cannot be an error.

If there is no useful context type and at least one spread, then we can either give up (make it a compile-time error) or use the static type of the spread expressions to determine whether it's a set or map literal.

Giving up is a valid and safe option - you want to write {... e}, you have to say whether it's a map or a set, since the syntax doesn't say anything.

Or we can look at the static type of e.

The complication here is that the static type of e depends on the context type for e, which again depends on whether the surrounding context is a set or a map, which is what we are trying to deduce.
In a set/list literal with static type Set<T>/List<T>, the context type of e in ...e is Iterable<T>.
We can even be in a position where we know that the context type is a set, but the type argument is still unknown (to-be-inferred), like

void foo<T>(Set<T> set) {}
main() {
   foo({... [1, 2]});  // context type of `[1, 2]` is Iterable<?>, it gets filled out with <int> by the list.
}

So, if we try to find the static type of the spread expressions, we do it with no context type, which is still possible, but may give a different result than what we would get if we had written the inferred type explicitly.
We then need to figure out how to deduce the set/map-ness of the container from those types.

I see three options:

If all spread expressions have static types assignable to Iterable<Object>, and none have static types assignable to Map<Object, Object>, it's a set. In the opposite case (all maps, no iterables), it's a map. Otherwise it's a compile-time error. That's probably unnecessarily harsh.
If all spread expressions have static types assignable to Iterable<Object>, and not all have static types assignable to Map<Object, Object>, it's a set. In the opposite case (all maps, not all iterables), it's a map. Otherwise it's a compile-time error (that is, either not all can be used in a set literal or not all can be used in a map literal, or all of them can be used in both - the first two cases are plain errors, and the last one is unavoidably ambiguous - and it can easily happen because dynamic is down-castable to both).
If all spread expressions have static types assignable to Iterable<Object>, and not all have static types assignable to Map<Object, Object>, it's a set. Otherwise it's a map.

The last case generalizes the default handling of the completely empty {} to all ambiguous literals that are not obviously sets. That may still cause a compile-time error if the spread expressions are not actually maps, it's just a different error message ("spreading non-map inside map literal" rather than "cannot determine if map or set literal"). It will allow more programs to compile, but perhaps not with the author's intended meaning.

An example of a pitfall using any of the above cases (assume it's generated code, not just me being randomly silly):

var setUnion = {...{1, 2}, ...{}, ...{5}};

With no context type, the ...{} will deduce that {} is a map, and since {1, 2} and {5} are clearly sets, the union itself is inconsistent.
There are many ways to disambiguate the code, but the fact that it's necessary, even though it's "obvious" to a human what it means, suggests that we have a dangerous design.

I'm leaning towards the second option - use the static type of the spread expressions, and if that gives a unique answer which ensures the code will compile, use that. If it doesn't give an answer, or doesn't give a unique answer, then make it an error. We will not make a guess that will only be proven wrong at run-time.

Another example to consider:

var x = {...{...{...{}}}};

This is a map. There is no context type on the way down, so the inner {} must be a map, and then the rest follows by propagating the choices back up.

(Also, note to self: Never make Map implement Iterable!)

lrhn on 9 Oct 2018

Here's an attempt to keep it simple.

The rule of thumb would be "When we have evidence for a set and no evidence for a map, it will be a set. Otherwise, it will be a map". This means that we have no ambiguity error, all ambiguous forms will be checked assuming that they specify a map, and we may then get errors because they are half-and-half, but the point is that we have a consistent and backward compatible default: "Otherwise, it will be a map".

In more detail, {} with no context type is a Map<dynamic, dynamic> (forced by existing code). With {...spread1 ...spread2} with no context, it is a Set<E> (with bottom-up inference of E, defaulting to dynamic) if one or more of the spread expressions has a static type which is a subtype of Iterable<dynamic>, and all of the spread expressions are not a subtype of Map<dynamic, dynamic>. Otherwise it is a Map<K, V> (where K and V are obtained from bottom-up inference, defaulting to dynamic).

This means that we require unambiguous evidence for sets (noting "Never make Map implement Iterable!"!), and otherwise insist that the analysis must be performed assuming a map.

Note that I'm using 'subtype' rather than 'assignable', because the latter will always apply for dynamic expressions (and similar stuff), and only the former will positively indicate that there is a request for making it a set or a map.

I think this is a rule which is easy to remember and understand, and this means that we don't need to cook up any complex ways to determine whether {...myList ...myMap ...myOtherList} is a map that wrongly got some lists added to it, or a list that wrongly got a map added to it.

eernstg on 9 Oct 2018

I don't actually think that makes it simpler. It provides a simple system, but you have to remember and understand the system.

I still think that assuming Map without evidence is a bad design.

If you write:

var x = { ... dynamicExpression };

we would then accept and compile that as a map literal, and potentially fail at run-time if dynamicExpression is a list.

I think the user would be better helped by us flagging the ambiguity at compile-time and refusing to continue.
That problem does not happen for the empty {} because it cannot be disproven as a map at run-time.

The example:

var x = { ... listAndMapExpression };

where the static type implements both Iterable and Map is possible (but exceedingly rare) and also ambiguous. That one can also not be disproven at run-time (both will work), but I again think the users should be notified that their code isn't clear.

I prefer not to guess at compile-time. If there is only one valid interpretation, pick that. If there is more than one (or definitely if there is less than one), fail visibly and early.

I think this is actually easy for users to understand: When it works, it works in the way they expect it to (they gave enough hints about what they meant). When it doesn't work, they are told early and can give the necessary hint. At no time do we get to run-time with an interpretation other than what they expected (except perhaps the empty set, where our hands are tied).
There is no system to remember except "if it can be both, it's not allowed". That is simpler than any system for making a guess, which users do need to understand, because their code might run with that guess.

lrhn on 9 Oct 2018

👍1

Right, the strict approach (where anything which is ambiguous is simply an error) is safer, in the sense that it will allow developers to catch unintended ambiguities.

The trade-off is that things like var x = { ...dynamicExpression }; will be an error, no exceptions, even though it could be a concise way (using my proposal) to get a Map<dynamic, dynamic>. I do recognize that the decision to make it a map rather than a set is based on the history rather than any inherent properties of maps and sets, but we do have the extreme case {} (with no context type) which _will_ be a map.

Other than that, I can live with any of these proposals. ;-)

eernstg on 9 Oct 2018

I definitely agree with the earlier comment that we should avoid magic and be predictable.

I wish there was some syntactic way we could be 100% unambiguous. With types given, it's unambiguous: <Foo>{ } is a set. We could use some other syntax, but I can't find any that is sane and not already used by something... <>{} looks ugly and is weird and inconsistent, [. .] doesn't work because of the spreading operator, {{ }} is ambiguous with maps and blocks, ({ }) and {( )} are ambiguous in other ways, *{ } looks okish but looks terrible with types: <Foo>*{}...

Hixie on 10 Oct 2018

About syntax we could also use a prefix (like raw string) : s[1, 2]

a14n on 10 Oct 2018

The s[1, 2] syntax works badly for singleton sets: s[0] already means something else. We'd need a non-expression as the prefix, and that probably won't improve readability.

lrhn on 10 Oct 2018

I wish there was some syntactic way we could be 100% unambiguous.

Me too. One option is {,}, but that's kinda hokey too.

munificent on 11 Oct 2018

We have now added a more precise feature specification based on the original design proposal.

Please take a look at this specification, and give feedback if it is unclear or if you anticipate any implementation issues. If there are no objections, we will schedule the design for implementation in Q4.

@kmillikin @sigmundch @a-siva @mraleph @vsmenon @johnniwinther @munificent @danrubel @jwren @jcollins-g

lrhn on 2 Nov 2018

Hi, thanks! Set literals will be a nice feature.

Examples seems to be missing code quotes starting with "var v16", making
them hard to read.

It would also be great if, for those of us who don't read language
specifications every day, the examples section would indicate
interpretations in Dart prior to _and_ after the feature (even if "prior
to" means "compilation error"). This will help to highlight where the
differences are in practical terms. Besides being a sort of TL;DR of the
feature specification, it'll also help verify that my understanding of what
you wrote in the specification matches your intention.

Janice

On Fri, Nov 2, 2018 at 5:14 AM Lasse R.H. Nielsen notifications@github.com
wrote:

We have now added a more precise feature specification
https://github.com/dart-lang/language/blob/master/accepted/future-releases/set-literals/feature-specification.md
based on the original design proposal.

Please take a look at this specification, and give feedback if it is
unclear or if you anticipate any implementation issues. If there are no
objections, we will schedule the design for implementation in Q4.

@kmillikin https://github.com/kmillikin @sigmundch
https://github.com/sigmundch@a-siva, @mraleph
https://github.com/mraleph @vsmenon https://github.com/vsmenon
@johnniwinther https://github.com/johnniwinther @munificent
https://github.com/munificent @danrubel https://github.com/danrubel
@jwren https://github.com/jwren @jcollins-g
https://github.com/jcollins-g

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/dart-lang/language/issues/37#issuecomment-435358419,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ANdn2z0cufvFDYXSjezWtv7JDdEn1caZks5urDXmgaJpZM4XMqoc
.

jcollins-g on 2 Nov 2018

I've fixed the formatting issue and added a "tl;dr" summary section.

All set literals would be compile-time errors prior to the feature.
All existing compiling code is unaffected by the feature.

lrhn on 5 Nov 2018

👍1

@lhrn - what valid rewrite transformation we can do in kernel for initial implementation (i.e., without direct backend support)?

E.g., the summary suggests this:

<T>{x, y, z} => Set<T>()..add(x)..add(y)..add(z)

Would this also work?:

<T>{x, y, z} => Set<T>.from([x, y, z])

vsmenon on 5 Nov 2018

Both should work, or perhaps Set<T>()..addAll([x, y, z]).

I'd probably use Set<T>.of(<T>[x, y, z]) rather than .from just to be sure to catch type errors early, and avoid unnecessary type checks at run-time.

For constant maps, that won't work. There is currently no constant map implementation in the SDK.
A possible rewrite is:

const <T>{c1, c2, c3} => const _SetWrapper<T>(<T, T>{c1: c1, c2: c2, c3: c3})

(after removing equal objects from c1..c3), where _SetWrapper is an internal set implementation, based on UnmodifiableSetBase (if we have that, otherwise it should be possible to make one) which forwards to methods on the map (we need the value to implement Set.lookup).

lrhn on 6 Nov 2018

Closing; expected to launch in Dart 2.2

mit-mit on 22 Feb 2019

🎉5

Maybe complete effective dart examples
https://www.dartlang.org/guides/language/effective-dart/usage#do-use-collection-literals-when-possible