Roslyn: C# Design Notes for May 3-4, 2016

Created on 10 May 2016 · 32 Comments · Source: dotnet/roslyn

C# Design Notes for May 3-4, 2016

This pair of meetings further explored the space around tuple syntax, pattern matching and deconstruction.

  1. Deconstructors - how to specify them
  2. Switch conversions - how to deal with them
  3. Tuple conversions - how to do them
  4. Tuple-like types - how to construct them

Lots of concrete decisions that allow us to make progress on implementation.

# Deconstructors

In #11031 we discussed the different contexts in which deconstruction should be able to occur, namely deconstructing _assignment_ (into existing variables), _declaration_ (into freshly declared local variables) and _patterns_ (as part of applying a recursive pattern).

We also explored the design space of how exactly "deconstructability" should be specified for a given type, but left the decision open - until now. Here's what we decided - and why. We'll stick to these decisions in initial prototypes, but as always are willing to be swayed by evidence as we roll them out and get usage.

_Deconstruction should be specified with an instance (or extension) method_. This is in keeping with other API patterns added throughout the history of C#, such as GetEnumerator, Add, and GetAwaiter. The benefit is that this leads to a relatively natural kind of member to have, and it can be specified with an extension method so that existing types can be augmented to be deconstructable outside of their own code.

The choice limits the ability of the pattern to later grow up to facilitate "active patterns". We aren't too concerned about that, because if we want to add active patterns at a later date we can easily come up with a separate mechanism for specifying those.

_The instance/extension method should be called Deconstruct_. We've been informally calling it GetValues for a while, but that name suffers from being in too popular use already, and not always for a similar purpose. This is a decision we're willing to alter if a better name comes along, and is sufficiently unencumbered.

_The Deconstruct method "returns" the component values by use of individual out parameters_. This choice may seem odd: after all we're adding a perfectly great feature called tuples, just so that you can return multiple values! The motivation here is primarily that we want Deconstruct to be overloadable. Sometimes there are genuinely multiple ways to deconstruct, and sometimes the type evolves over time to add more properties, and as you extend the Deconstruct method you also want to leave an old overload available for source and binary compat.

This one does nag us a little, because the declaration form with tuples is so much simpler, and would be sufficient in a majority of cases. On the other hand, this allows us to declare decomposition logic _for_ tuples the same way as for other types, which we couldn't if we depended on tuples for it!

Should this become a major nuisance (we don't think so) one could consider a hybrid approach where both tuple-returning and out-parameter versions were recognized, but for now we won't.

All in all, the deconstructor pattern looks like one of these:

``` c#
class Name
{
    public void Deconstruct(out string first, out string last) { first = First; last = Last; }
    ...
}
// or
static class Extensions
{
    public static void Deconstruct(this Name name, out string first, out string last) { first = name.First; last = name.Last; }
}
```
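
To make the overloading motivation concrete, here is a minimal sketch of the pattern in use. It assumes a C# 7.0 or later compiler, where deconstruction shipped in roughly this form. The `Middle` property, the three-component overload, and the `UriExtensions` class are hypothetical names added for illustration; they are not part of the notes.

``` c#
using System;

class Name
{
    public string First { get; }
    public string Middle { get; }
    public string Last { get; }

    public Name(string first, string middle, string last)
    {
        First = first;
        Middle = middle;
        Last = last;
    }

    // Original two-component deconstructor, kept for source and binary compat.
    public void Deconstruct(out string first, out string last)
    {
        first = First;
        last = Last;
    }

    // Hypothetical overload added later, once the type grew a Middle property.
    public void Deconstruct(out string first, out string middle, out string last)
    {
        first = First;
        middle = Middle;
        last = Last;
    }
}

static class UriExtensions
{
    // Extension deconstructor for a type we do not own (illustrative only).
    public static void Deconstruct(this Uri uri, out string scheme, out string host)
    {
        scheme = uri.Scheme;
        host = uri.Host;
    }
}

static class Program
{
    static void Main()
    {
        var name = new Name("Ada", "King", "Lovelace");

        var (first, last) = name;        // binds to the two-out overload
        var (f, middle, l) = name;       // binds to the three-out overload
        var (scheme, host) = new Uri("https://example.com/path");

        Console.WriteLine($"{first} {last}");
        Console.WriteLine($"{f} {middle} {l}");
        Console.WriteLine($"{scheme}://{host}");
    }
}
```

The number of variables on the left selects the overload, which is why out parameters allow several deconstructors to coexist where a single tuple-returning method could not.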

# Switch conversions

Switch statements today have a wrinkle where they will apply a unique implicit conversion from the switched-on expression to a (currently) switchable type. As we expand to allow switching on any type, this may be confusing at times, but we need to keep it at least in some scenarios, for backwards compatibility.

``` c#
switch (expr) // of some type Expression, which "cleverly" has a user defined conversion to int for evaluation
{
    case Constant(int i): ... // Won't work, though Constant derives from Expression, because expr has been converted to int
    ...
}
```

Our current stance is that this is fringe enough for us to ignore. If you run into such a conversion and didn't want it, you'll have to work around it, e.g. by casting your switch expression to object.

If this turns out to be more of a nuisance we may have to come up with a smarter rule, but for now we're good with this.
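
The cast-to-object workaround might look like the following sketch. The `Expression`, `Constant` and `Negate` classes are hypothetical stand-ins for the types alluded to above, and the cases use the simple type-pattern form (`case Constant c:`) accepted by C# 7.0 and later rather than the positional syntax shown in the notes.

``` c#
using System;

abstract class Expression
{
    public abstract int Evaluate();

    // The "clever" user-defined conversion to int.
    public static implicit operator int(Expression e) => e.Evaluate();
}

sealed class Constant : Expression
{
    public int Value { get; }
    public Constant(int value) { Value = value; }
    public override int Evaluate() => Value;
}

sealed class Negate : Expression
{
    public Expression Operand { get; }
    public Negate(Expression operand) { Operand = operand; }
    public override int Evaluate() => -Operand.Evaluate();
}

static class Describer
{
    public static string Describe(Expression expr)
    {
        // Casting to object sidesteps the implicit conversion to int,
        // so the cases are matched against the original Expression instance.
        switch ((object)expr)
        {
            case Constant c:
                return $"constant {c.Value}";
            case Negate n:
                return $"negation of ({Describe(n.Operand)})";
            default:
                return "unknown expression";
        }
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(Describer.Describe(new Negate(new Constant(5))));
    }
}
```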

# Tuple conversions

In #11031 we decided to add tuple conversions, which essentially convert tuples whenever their elements convert - unlike the more restrictive conversions that follow from ValueTuple<...> being a generic struct. In this we view nullable value types as a great example of how to imbue a language-embraced special type with more permissive conversion semantics.

As a guiding principle, we would like tuple conversions to apply whenever a tuple can be deconstructed and reassembled into the new tuple type:

``` c#
(string, byte) t1 = ...;
(object, int) t2 = t1; // Allowed, because the following is:
(var a, var b) = t1; // Deconstruct, and ...
(object, int) t2 = (a, b); // reassemble
```

One problem is that nullable value type conversions are rather complex. They affect many parts of the language. It'd be great if we could make tuple conversions simpler. There are two principles we can try to follow:
1. A tuple conversion is a specific _kind_ of conversion, and it allows specific _kinds_ of conversions on the elements
2. A tuple conversion works in a given setting if all of its element conversions would work in that setting

The latter is more general, more complex and possibly ultimately necessary. However, somewhat to our surprise, we found a definition along the former principle that we cannot immediately poke a hole in:

> An _implicit tuple conversion_ is a standard conversion. It applies between two tuple types of equal arity when there is _any_ implicit conversion between each corresponding pair of types. 

(Similarly for explicit conversions).
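
As a rough illustration of the definition, the sketch below converts between two tuple types using only built-in element conversions. It assumes a C# 7.0 or later compiler with `System.ValueTuple` available; the `TupleConversions` class and its method names are made up for the example. It deliberately avoids user defined element conversions, since those are the part the notes flag as uncertain.

``` c#
using System;

static class TupleConversions
{
    // Implicit tuple conversion: string -> object and byte -> int are both implicit.
    public static (object, int) Widen((string, byte) source) => source;

    // The same conversion spelled out as deconstruct-and-reassemble.
    public static (object, int) WidenByHand((string, byte) source)
    {
        var (a, b) = source;
        return (a, b);
    }

    // The reverse direction needs a cast, because object -> string
    // and int -> byte are explicit conversions.
    public static (string, byte) Narrow((object, int) source) => ((string, byte))source;
}

static class Program
{
    static void Main()
    {
        (string, byte) t1 = ("answer", 42);

        var t2 = TupleConversions.Widen(t1);
        var t3 = TupleConversions.WidenByHand(t1);
        var t4 = TupleConversions.Narrow(t2);

        Console.WriteLine(t2.Equals(t3));
        Console.WriteLine(t4.Equals(t1));
    }
}
```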

The interesting part here is that it's a standard conversion, so it is able to be composed with user defined conversions. Yet, its elements are allowed to perform their own user defined conversions! It feels like something could go wrong here, with recursive or circular application of user defined conversions, but we haven't been able to pinpoint an example.

A definition like this would be very desirable, because it won't require so much special casing around the spec.

We will try to implement this and see if we run into problems.
# Tuple-like construction of non-tuple types

We previously discussed to what extent non-tuple types should benefit from the tuple syntax. We've already decided that the deconstruction syntax applies to any type with a deconstructor, not just tuples. So what about construction?

The problem with allowing tuple literal syntax to construct any type is that _all_ types have constructors! There's no opt-in. This seems too out of control. Furthermore, it doesn't look intuitive that any old type can be "constructed" with a tuple literal:

``` c#
Dictionary<int, string> d = (16, EqualityComparer<int>.Default); // Huh???
```

This only seems meaningful if the constructor arguments coming in through a "tuple literal" are actually the constituent data of the object being created.

Finally, we don't have syntax for 0 and 1-tuples, so unless we add that, this would only even work when there's more than one constructor argument to the target type.

All in all, we don't think tuple literals should work for any types other than the built-in tuples. Instead, we want to brush off a feature that we've looked at before; the ability to omit the type from an object creation expression, when there is a target type:

``` c#
Point p = new (3, 4); // Same as new Point(3, 4)
List<string> l1 = new (10); // Works for 0 or 1 argument
List<int> l2 = new (){ 3, 4, 5 }; // Works with object/collection initializers, but must have parens as well.
```

Syntactically we would say that an object creation expression can omit the type when it has a parenthesized argument list. In the case of object and collection initializers, you cannot omit both the type and the parenthesized argument list, since that would lead to ambiguity with anonymous objects.

We think that this is promising. It is generally useful, and it would work nicely in the case of existing tuple-like types such as System.Tuple<...> and KeyValuePair<...>.
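
For the tuple-like types just mentioned, a sketch of how the type-omitting form could read is shown below. It assumes a C# 9.0 or later compiler, where a feature along these lines shipped as target-typed `new`; the variable names and values are purely illustrative.

``` c#
using System;
using System.Collections.Generic;

static class Program
{
    static void Main()
    {
        // The declared type on the left supplies the type to construct.
        KeyValuePair<string, int> pair = new("apples", 3);
        Tuple<int, string> tuple = new(1, "one");

        // With a collection initializer the parentheses must still be present.
        List<int> numbers = new() { 3, 4, 5 };

        Console.WriteLine($"{pair.Key}={pair.Value}");
        Console.WriteLine($"{tuple.Item1}, {tuple.Item2}");
        Console.WriteLine(numbers.Count);
    }
}
```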


All 32 comments

That proposed syntax for using tuple-smelling syntax to create arbitrary objects just seems awful to me. First, it feels backwards given the general direction of inference in the C# language. This feels much more like Java, and not in any good way. Second, I fear the type of "short-hand" it will encourage:

Foo(new (x), new (1, 2, 3), new (), new (), new("foo", "bar"));

A possible optimization for deconstruction is to serialize the target property of out parameters in metadata so the compiler doesn't need to actually call Deconstruct method with potentially unused variables (in case of wildcards). Since the implementation is straightforward, I think it'd be nice if the compiler generates it. For example, assuming Deconstruct(out this.First, out this.Last) {},

if (obj is Name (*, "Last"))
// would be equivalent to
if (obj is Name name && object.Equals(name.Last, "Last" ))

  1. Since the Deconstruct method is not called, all properties are lazily evaluated.
  2. There is no need to emit locals for each out parameter even if we don't need them (wildcards) because we are directly using the property itself.
  3. More concise, but this needs a totally new mechanism to be added to out parameter declarations.

@HaloFour In my opinion that's a matter of preference. Your example is as unreadable as Foo(1, null, false, "foo"); you are free to use named arguments here or not. Even var x = Foo(); is not universally accepted, but still.

I do not believe it's worth implementing tuple-like construction for all types. Why not simply implement implicit conversions to and from Tuple and KeyValuePair to ValueTuple? This will easily cover ~95% of all use cases.

@HaloFour What if it's limited to inline declaration+assignment scenario only? Exactly like the original sample.

Slightly related to #2319

@alrz That's not exactly true. In your example, you can tell all argument types except the null case (and any implicit user conversions). In HaloFour's example, you can't tell anything about the types being used/created and named arguments will not help with that either.

In the var x = Foo(); example you can't see the type either. The clue is in the method name itself. So named arguments are enough to express the intention, but not necessarily required, IMO. I expect some code style preferences to suggest omitting the type only in certain places, such as where the type is actually visible (e.g. field initializers, return, etc.), excluding method arguments.

@HaloFour I strongly agreed with your sentiment of let the developer decide. I see parallels.

@jnm2 That's quite a stretch. My comment is in regards to the implementation details of a specific type that happens to be used in a feature. And I am only really parroting the reasoning given by the design team themselves. But that detail doesn't affect the syntax of the feature, or whether or not the feature exists at all. I don't think that you can argue that every (mis)feature should be crammed into the language based on the fact that developers may choose to ignore them. The feature needs to be considered on its own merits.

These notes seem to consider this short-hand construction within the context of tuple conversion. I guess this is based on the fact that both happen to contain parentheses? But in the "Huh?" example provided, neither the behavior nor the syntax is improved by slapping a new in front of that expression.

@HaloFour That's because you are seeing a tuple with a new keyword added, not a type-inferred object creation expression :smile: I think all of this is due to the fact that C# has a C in it. C doesn't have any type inference mechanism (default int doesn't count) and all this is "slapped" onto the language. For example, var is a _placeholder_ for a type (ouch); perhaps new var (1, 2) would be more consistent in this regard.

@alrz No, I do see it as inferred object creation. My point is that it is unrelated to tuples and it doesn't make sense to propose this as an alternative to tuple conversion to arbitrary concrete types. But regardless of how you consider it, adding new still doesn't make that example any less dirty than it is.

@HaloFour Agreed, there are a lot of "little" features that could be way more useful than an early type inference improvement, it seems that syntactical similarities are the reason behind this being considered with tuples at the same time. Anywho, since fields don't allow type inference you might actually find new() syntax quite useful. Re "dirty" I'm still waiting for tuple casts ((int, int)), or switch((1, 2)), tuple type itself doesn't really look right in (int, int) field; or (int a, int b)Method(int a, int b) declarations.

The syntax List<int> l2 = new (){ 3, 4, 5 }; feels awkward to me... why not instead have a rule like "If a tuple literal is assigned to an IEnumerable, rewrite the tuple literal as a collection initializer." That way you could do: List<int> l2 = (3, 4, 5); This gives much of the benefit we would get from having a list literal. Possibly the rule could be extended to say that assigning a non-literal tuple to an IEnumerable gets rewritten as something like l2 = new List<int>(); foreach($item in tuple) { l2.Add($item); }

Regarding nuples/oneples, I think they're worth it, even if the syntax is awkward. In Python you can express the oneple as (42,) where the comma is required. If you make a trailing comma legal for all tuple literals (there's precedent with collection initializers), then it isn't really _that_ weird.

I for one like Foo(new (x), new (1, 2, 3), new (), new (), new("foo", "bar"));

It clearly states that you are new-ing up a bunch of things that have types that don't matter to the caller and are thus providing no benefit beyond line noise.

Point p = new (3, 4); 
List<string> l1 = new (10); 
List<int> l2 = new (){ 3, 4, 5 };

:+1: if you like my variation :smile::

``` c#
var p = new Point(3, 4);
var l1 = new List<string>(10);
var l2 = new List<int>{ 3, 4, 5 };
```

@MadsTorgersen, is this really ambiguous?

List<int> l2 = new { 3, 4, 5 };

This one probably is:

``` c#
List<string> strs = new { person.FirstName, person.LastName, person.FullName };
```

Why, @omariom?

@paulomorgado
On second thought I changed my mind - the type is clearly defined.

Providing C# syntax for functional concepts (e.g. adding matching within the switch syntax) may make it more approachable for C# programmers, but you should also consider providing alternative functional syntax (i.e more terse syntax) for those with exposure to other functional languages.

For the case of collection initialisers, can't you just omit the new, type name and parens?

List<int> list = { 3, 4, 5 };

This would be more in keeping with array initialisers.

Not sure omitting the type from an object creation expression in general is that useful... we already have var right? IMO, at construction it's more trivial to write the constructor with the parameters/initializers other than specifying the type on the left... as for reading/modifying, we have intellisense :)

:+1: @drewnoakes (although, it would be possible only for variable declarations I guess?)

Also, I think the ability to omit the type parameter list (aka diamond syntax) can be a good addition to the type inference system (omitting the whole type might not be desirable most of the time).

Some thoughts on this in a separately created issue: #11910

I don't understand the purpose of Deconstruction, or rather, I don't understand how it is a Language Feature. Is it just providing a standard interface to "read" instances of types?

@asibahi deconstruction as mentioned here is not a language feature, it is a library pattern which future language features might depend on. The language feature might look like this in usage:

(var x, var y) = GetCoordinates();

where GetCoordinates() is:

Coordinate GetCoordinates() { ... }

With a well defined set of methods for the language to look for, this deconstruction idiom can be compiled.

I don't think custom deconstruction needs any support other than the language features we already have...

If someone wants to write a type that can be deconstructed then they can simply write an implicit cast to a tuple type:

class MyType
{
    public string Part1 { get; set; }
    public string Part2 { get; set; }

    public static implicit operator (string, string)(MyType @this)
    {
        return (@this.Part1, @this.Part2);
    }
}

@Richiban but what if you want to disambiguate between Foo(double, double) and Bar(double, double)? There's also the case of fallible deconstructors that your approach doesn't cover.

@orthoxerox Foo and Bar are types, right? so I don't see any ambiguities. Nullable tuples, e.g. (string, string)? can be used for fallible patterns but I'm not sure if they're any better than out parameters.

@Richiban

As an operator you'd lose the ability to override, overload and extend, all of which are mentioned in the above notes.

Update: Oops, you wouldn't lose the ability to overload in that case. But you would lose the ability to override and extend.

@alrz

Let's say I want to write if (obj is Foo(var bar, var baz)) {...}. How will this work with tuple conversions? Let's say the type of obj is not a supertype of Foo, but I want a custom deconstructor.

@orthoxerox That sounds like an active pattern, something that is coming (I think) but not necessarily in scope of this discussion?


The LDM notes are now available on the csharplang repo. For May 3-4 2016, see https://github.com/dotnet/csharplang/blob/master/meetings/2016/LDM-2016-05-03-04.md

I'll go ahead and close the present issue. Thanks
