There is an unfortunate ambiguity in parsing an array type introduced by the addition of the nullable feature. This prefix of an expression
``` c#
a ? x is A[][] ? b : c
```
can be considered to mean
``` c#
a ? (x is A[][]? b) : c
```
But it could also be followed by a couple of additional tokens (`: d`) and taken to mean
``` c#
a ? ((x is A[][]) ? b : c)
  : d
```
We would have to know the context (of the enclosing expression) in order to know how to parse the array type deep down inside of it. I don't see a straightforward way to disambiguate at parse time. The latter interpretation was the correct interpretation in C# 5 code. The parser changes for nullable types therefore broke compatibility here: the parser now attempts to parse this code using the first interpretation, and fails.
Similarly, the expression
``` c#
x is A[][] ? b && c
```
can be considered to mean
``` c#
(x is A[][]? b) && c
```
But it could also be followed by a couple of additional tokens (`: d`) and taken to mean
``` c#
(x is A[][]) ? b && c
  : d
```
Similarly, the latter interpretation was the correct interpretation in C# 5 code but is now rejected in the parser due to parsing changes for nullable types.
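To make the ternary (C# 5) reading of both examples concrete, here is a minimal sketch that spells it out with explicit parentheses, so the grouping does not depend on how the parser resolves the `?`. The class `A`, the helper names, and the string and boolean operands are hypothetical and chosen only for illustration.

```csharp
using System;

class A { }

static class ExplicitGrouping
{
    // Ternary reading of `a ? x is A[][] ? b : c : d`, fully parenthesized.
    public static string Nested(bool a, object x, string b, string c, string d)
        => a ? ((x is A[][]) ? b : c) : d;

    // Ternary reading of `x is A[][] ? b && c : d`, fully parenthesized.
    public static bool WithAnd(object x, bool b, bool c, bool d)
        => (x is A[][]) ? (b && c) : d;

    static void Main()
    {
        object jagged = new A[0][];   // an empty A[][] instance

        Console.WriteLine(Nested(true, jagged, "b", "c", "d"));   // b
        Console.WriteLine(Nested(true, "text", "b", "c", "d"));   // c
        Console.WriteLine(Nested(false, jagged, "b", "c", "d"));  // d
        Console.WriteLine(WithAnd(jagged, true, false, true));    // False
        Console.WriteLine(WithAnd("text", true, false, true));    // True
    }
}
```

Parenthesizing the type test this way sidesteps the question of whether `?` belongs to the array type or to the conditional operator.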
See also #32025 and #31911 which are symptoms of this problem.
Note that we would not have this ambiguity if we had made a different decision on how to parse the nullable annotations on an array type. For example, if we parsed array types using the natural order rather than the inside-out order, the first interpretation would not be a valid parse because we would never accept a nullable type as the type of a declaration pattern. I suggest that this should motivate the LDM to reconsider how nullable annotations of array types are parsed.
> should motivate the LDM to reconsider how nullable annotations of array types are parsed.

Is that really an option? Array specifiers are already in reverse order, so `C[]?[,] a = new C[0]?[,]` reads oddly: there the `?` is prepended to an array specifier, while this wouldn't be the case for `C[]?`.
@alrz Yes, it is an option. The `?` annotation would no longer be part of the array specifier. It would be appended to an array type to form a type, and that type could then be an array element type. What that means is that in some rare cases you would have to change the order of the array specifiers to annotate them. For example, to annotate:
``` c#
C[][,] a = new C[0][,];
```
you would write
``` c#
C[,]?[] a = new C[,]?[0];
```
But those are the most rare of the situations in which you'd be changing code. In the more common situations, things would just (in my opinion) be more intuitive. In any case, they would no longer be syntactically ambiguous.
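As background for the ordering discussion, the following sketch checks how C# reads unannotated rank specifiers today: `C[][,]` is a one-dimensional array whose elements are two-dimensional arrays. The class `C`, the helper names, and the array sizes are placeholders picked for the example; it does not use any nullable annotations.

```csharp
using System;

class C { }

static class RankOrder
{
    // Builds a C[][,] with three slots and fills the first one.
    public static C[][,] Build()
    {
        C[][,] a = new C[3][,];   // outer array: one dimension, length 3
        a[0] = new C[2, 4];       // each element: a two-dimensional array
        return a;
    }

    static void Main()
    {
        C[][,] a = Build();

        Console.WriteLine(a.Rank);        // 1
        Console.WriteLine(a.Length);      // 3
        Console.WriteLine(a[0].Rank);     // 2
        Console.WriteLine(a[0].Length);   // 8

        // The element type of C[][,] is C[,].
        Console.WriteLine(typeof(C[][,]).GetElementType() == typeof(C[,]));  // True
    }
}
```

In other words, the first rank specifier after the element type describes the outermost array, which is why appending a further specifier (or a `?`) does not simply wrap the type written so far.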
Note that the change I advocate is actually what is specified in https://github.com/dotnet/csharplang/blob/master/proposals/nullable-reference-types-specification.md where nullable annotations are not part of the array specifiers.
The current approach of specifying nullable annotations on array types is something like this:

    array_type
        : non_array_type rank_specifier+
        ;
    nullable_type
        : non_array_or_nullable_type '?'
        ;
    rank_specifier
        : '[' dim_separator* ']' '?'?
        ;
    dim_separator
        : ','
        ;

The proposed approach is something like this:

    array_type
        : non_array_type rank_specifier+
        ;
    nullable_type
        : non_nullable_type '?'
        ;
    rank_specifier
        : '[' dim_separator* ']'
        ;
    dim_separator
        : ','
        ;

Assigned to @gafter. Let me know if that's ok. Thanks
We have a decision from LDM 1/7. Such scenarios should prefer the ternary interpretation. The user can add parens to disambiguate towards an array type with a nullable annotation.
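As an illustration of what that decision means for user code, here is a small sketch of the unparenthesized forms that are expected to parse as conditional expressions. It assumes a compiler that implements the decision; the class `A`, the method names, and the result strings are hypothetical.

```csharp
using System;

class A { }

static class TernaryPreferred
{
    // With the ternary interpretation preferred, the `?` after `A[]` starts a
    // conditional expression instead of annotating the array type.
    public static string Describe(object x) => x is A[] ? "A[]" : "other";

    // The nested shape from the original report, without extra parentheses.
    public static string DescribeNested(bool a, object x)
        => a ? x is A[][] ? "A[][]" : "other" : "skipped";

    static void Main()
    {
        Console.WriteLine(Describe(new A[0]));                 // A[]
        Console.WriteLine(Describe(42));                       // other
        Console.WriteLine(DescribeNested(true, new A[0][]));   // A[][]
        Console.WriteLine(DescribeNested(true, "text"));       // other
        Console.WriteLine(DescribeNested(false, new A[0][]));  // skipped
    }
}
```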
A simpler example is `x is A[] ? b : c`, which has been reported many times by external users as being broken (it no longer parses).
@gafter, why wasn't this an issue before with nullable types? For example, this:
`i is int? "1" : "2"`
or
`i is int? ? "1" : "2"`
is considered a valid ternary operator with the `int` and `int?` types.
What about applying the same rule? I mean, the ternary operator is always preferred.
@QtRoS Because that breaks the interpretation of this existing code: `i is string[][]? a : b`
You see, that type in there used to (before this change) be interpreted as an "array of nullable array of strings". It isn't a nullable type. And so it is a valid type for a declaration pattern. That means that the current compiler broke this existing, well-behaved code, which before C# 8 was a plain old ternary expression and which the current compiler attempts (and fails) to parse as a declaration pattern.
@gafter if I understood you correctly, the code `i is string[][]? a : b` worked before (in C# 7) as "ternary operator after a `string[][]` type check" and will keep working (in C# 8) if you just give priority to the ternary operator. If `i is string[][]? a : b` is read as an "array of nullable array of strings" followed by _part of a ternary operator_ (without the question mark), it won't compile anyway, so I still don't get why

> the current compiler broke this existing, well-behaved code

What type of parser is used? Can you share the grammar (in Backus–Naur form or something similar)?
> if you just give priority to ternary operator

Then how do I write a declaration pattern with the type "array of nullable array of strings"?
The parser is hand-written recursive descent. You can read it yourself. Start with `LanguageParser.cs`.
@gafter Ah I thought it is LALR for which this particular case doesn't seem to be a problem. Thank you for your explanation, I've read your post carefully and got the point about "the inside-out order" for arrays.
Just as an idea: what about putting the question mark inside the brackets, like `i is string[][?] ? a : b` or `i is string[?] ? a : b`? It may help clearly specify which array in a complex type is nullable (but this way is a bit radical).
@QtRoS We did consider that in the LDM and we rejected it. We prefer the simple position that any type can be made nullable by appending `?` to it.
@gafter It works now, thank you very much!