Docs: null-forgiving operator can have effect at run time

Created on 21 Apr 2020 · 32Comments · Source: dotnet/docs

Quote from the current doc:

The null-forgiving operator has no effect at run time. It only affects the compiler's static flow analysis by changing the null state of the expression. At run time, expression x! evaluates to the result of the underlying expression x.

The document currently focuses primarily on the usage of suppressing warnings of code flow analysis.

But when combining the null-forgiving operator with a leading null-conditional member-access operator the compiler will indeed emit different code and this will have effect at run time -- which comes unexpected.

Example:
``C# void Main() { var names = T1(); //names == null` now
var names2 = T2(); // throws
}

IEnumerable T1()
{
FileInfo? fi = null;
return fi?.Directory?.EnumerateFileSystemInfos().Select(fsi => fsi.Name);
}

IEnumerable T2()
{
FileInfo? fi = null;
return fi?.Directory?.EnumerateFileSystemInfos()!.Select(fsi => fsi.Name);
}
```

This happens because in T2() the code actually compiles to return (fi?.Directory?.EnumerateFileSystemInfos()).Select(fsi => fsi.Name); (notice the additional parentheses).

I'm sure this is by design and is covered by the sentence "expression x! evaluates to the result of the underlying expression x". But since any article or documentation I have read about the null-forgiving operator so far, focused only about getting rid of warnings, this (change in) behavior was very unexpected, at least to me.

Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

ID: 0471f98c-f5cd-00d2-a705-396682f73bcb
Version Independent ID: a24b862d-c462-a040-8470-9771c74c4e6f
Content: ! (null-forgiving) operator - C# reference
Content Source: docs/csharp/language-reference/operators/null-forgiving.md
Product: dotnet-csharp
Technology: csharp-language-reference
GitHub Login: @BillWagner
Microsoft Alias: wiwagn

Area - C# Guide product-question

Source

springy76

👍2

Most helpful comment

Thanks for all the discussion. I'll make the following proposal:

In light of dotnet/csharplang#3393, dotnet/csharplang#3297 and dotnet/roslyn#43659 which open possible changes o the spec and implementation, I'd like to re-focus this on explaining the current behavior. (Thanks @AntonLapounov @agocke and @jcouv for driving those discussions).

In the short term, I think the best solution is Option 1 in https://github.com/dotnet/docs/pull/18037/files#r414319212 as @AntonLapounov says. In addition, we should add a note as previously discussed. I suggest the following edit to the previous proposal:

[!NOTE]
In C# 8, The expression x?.y!.z is parsed as (x?.y)!.z. Due to this interpretation (x?.y)! is considered not null. Therefore, if either x or y is null, the expression throws a NullReferenceException.

I specifically mention C# 8 to take into account change that might come from dotnet/csharplang#3393.

In the longer term, as the standardization effort catches up, we should remove this table from the language reference section, and refer to the corresponding table in the spec.

Can we give this is thumbs up or down as a proposed solution?

BillWagner on 28 Apr 2020

👍2

All 32 comments

@springy76 thanks!

I've simplified your example a bit:

using System;
using System.Collections.Generic;
using System.Linq;
#nullable enable

class Program
{
    private static void Main()
    {
        try
        {
            T1(null);
        }
        catch (Exception)
        {
            Console.WriteLine("T1 throws");
        }
        try
        {
            T2(null);
        }
        catch (Exception)
        {
            Console.WriteLine("T2 throws");
        }
    }

    private static IEnumerable<int> T1(int[]? numbers)
    {
        return numbers?.ToArray().ToArray();
    }

    private static IEnumerable<int> T2(int[]? numbers)
    {
        return numbers?.ToArray()!.ToArray();
    }
}

The above program produces the following output:

T2 throws

That program in sharplab

Note that T1 is converted into the following code:

    private static IEnumerable<int> T1([System.Runtime.CompilerServices.Nullable(2)] int[] numbers)
    {
        if (numbers == null)
        {
            return null;
        }
        return Enumerable.ToArray(Enumerable.ToArray(numbers));
    }

while T2:

    private static IEnumerable<int> T2([System.Runtime.CompilerServices.Nullable(2)] int[] numbers)
    {
        return Enumerable.ToArray((numbers != null) ? Enumerable.ToArray(numbers) : null);
    }

@BillWagner is that expected behavior?

pkulikov on 21 Apr 2020

👍1

Let's ask @jcouv for some thoughts on this, thank you @pkulikov and @springy76.

IEvangelist on 21 Apr 2020

Yeah, I see it as:

private static IEnumerable<int> T1(int[]? numbers)
{
    return numbers?.ToArray().ToArray();
}

private static IEnumerable<int> T2(int[]? numbers)
{
    return (numbers?.ToArray()).ToArray();
}

It is a really odd use case, it's literally like expressing that something could be null, and then something else chained off of that could be null and then saying that the last thing is definitely not null. It could be null, it could be null, it's not null.

IEvangelist on 21 Apr 2020

I can't resist a good puzzle. I think the behavior is correct, and is covered in operator precedence and in the nullable reference type spec on the null forgiving operator, and finally in the C# 6.0 spec on the null conditional operator:

the null conditional operator is a unary operator, which has lower precedence than primary expressions.
the null forgiving operator is a primary expression.

I think that means the left operand of ! must be considered as non-null, so execute the ToArray() method regardless of the earlier left operand(s) of any ?. operator.

What's equally interesting is that the statement "The null forgiving operator has no runtime effect" is still strictly true (and was taken directly from the feature spec referenced above). It isn't very helpful, though. The code generated is based on the precedence of the operators, and the fact that they are left-associative.

I'm still not sure how best to fix this in the docs, but you've found a super interesting edge case @springy76 We'll keep noodling.

BillWagner on 21 Apr 2020

👍1

Precedence. Does that mean that numbers?.ToArray()!.ToArray() is interpreted as (numbers?.ToArray()!).ToArray() because of precedence?

pkulikov on 21 Apr 2020

👍2

Sort of. It's both precedence and associativity. The ! operator is a post-fix unary operator. Its operand is whatever is left of it. Because ToArray() needs a receiver, its operand becomes (numbers?.ToArray()). Because ! has higher precedence than . (primary vs. unary), the expression numbers?.ToArray()! is considered non-null and then becomes the receiver to the final ToArray.

It took my a few reads of the sections I cited above, and I'm still not completely certain I'm analyzing it correctly. (I first thought it might be a compiler bug. Now I don't think so.) I'd like to wait for @jcouv to weigh in as well before we suggest a fix.

BillWagner on 21 Apr 2020

👀1

ok, then we need to write such code (it compiles!) :)

numbers?.ToArray()!?.ToArray();

pkulikov on 21 Apr 2020

😄2

Thanks @BillWagner for the analysis (which I think is correct).

@gafter, Would this warrant changing the precedence of post-fix ! (suppression)?
The precedence of ! causes x?.ToArray()!.ToArray() to parse with an implicit grouping (x?.ToArray()!).ToArray(), which affects semantics...

jcouv on 21 Apr 2020

👍1

I don't see what other precedence it could be...

agocke on 21 Apr 2020

@BillWagner thank you for the analysis; it helped. However, it's hard to think in terms of precedence because the spec on that _doesn't_ mention null-conditional operators. And it's possible to think that the precedence of ?. is the same as precedence of ., which is primary. So, I'll try to make my own analysis to understand this issue better.

Given (I assume the following two statements are true):

x! is a primary expression
the C# 6 draft spec for the null-conditional operators is still true for C# 8.0. In particular, the following grammar is valid:

null_conditional_expression
    : primary_expression null_conditional_operations
    ;

null_conditional_operations
    : null_conditional_operations? '?' '.' identifier type_argument_list?
    | null_conditional_operations? '?' '[' argument_list ']'
    | null_conditional_operations '.' identifier type_argument_list?
    | null_conditional_operations '[' argument_list ']'
    | null_conditional_operations '(' argument_list? ')'
    ;

numbers?.ToArray().ToArray() is a null-conditional expression with numbers as primary_expression and ?.ToArray().ToArray() as null_conditional_operations

We can't do the similar split for numbers?.ToArray()!.ToArray() because ! cannot be included into null_conditional_operations (it's not identifier, I guess). That means that numbers?.ToArray()!.ToArray() is NOT a null-conditional expression. According to the grammar, only numbers?.ToArray() is a null-conditional expression, let's mark it A. Then, we get A!.ToArray(), which is equivalent to A.ToArray(), which is equivalent to (numbers?.ToArray()).ToArray()

Furthermore, numbers?.ToArray()!?.ToArray() is a null-conditional expression with numbers?.ToArray()! as primary_expression and ?.ToArray() as null_conditional_operations.

So, if this issue to be fixed, the definition of a null-conditional expression should be updated, not only the precedence of null-forgiving (to unary??).

@BillWagner @jcouv is that correct analysis?

pkulikov on 22 Apr 2020

Let me go into detail how I (we) spotted this problem yesterday, because it's exactly the other way round than the example I gave in the original post:

It began with a collegue who claimed that inserting a "!" somewhere fixed a NRE. I protested and bet a staple of crates of beer that the null-forgiving operator won't fix any code but only give the code analyzer a hint: "I know it better, just be calm and stop warning me".
But he proved me wrong.

In our case it was exactly the other way round (code without "!" threw NRE) because the LINQ extension method in our code (I used Enumerable.Select(fsi => fsi.Name) in the sample above) was named EmptyIfNull and as the name implies is designed to handle null for input (and will return Array.Empty<T>() in this case):

```C#
var data = mightBeNullObject?.ArrayTypeProperty.EmptyIfNull().Select(x => x.Prop);
hashset.UnionWith(data); // ArgumentNullException when mightBeNullObject is null (data is then null, too)


So in our case the "?." stripped away the call to `.EmptyIfNull<T>()` and the resulting IEnumerable variable was still null and with the additional "!" the call to `.EmptyIfNull<T>()` was forced just the same way as putting the left side of the expression into parenthesis:

```C#
  var data = mightBeNullObject?.ArrayTypeProperty!.EmptyIfNull().Select(x => x.Prop);
  hashset.UnionWith(data); // data is never null now

equals

C# var data = (mightBeNullObject?.ArrayTypeProperty).EmptyIfNull().Select(x => x.Prop); hashset.UnionWith(data); // data is never null now

springy76 on 22 Apr 2020

👍1

It has no effect at run time; however, it may affect how the expression is parsed at compile time:
C# int[] a = null; string s1 = a?.First().ToString(); // null: ToString() is not called string s2 = a?.First()!.ToString(); // Empty string: ToString() is called on default(int?)

AntonLapounov on 23 Apr 2020

👍2

Yes, if you decompile it (using ILSpy for example) you'll see that the compiler emitted different code (by adding parenthesis or maybe an intermediary variable): string s2 = ( a?.First() ).ToString();

I just did not expect the compiler to behave different in any way because I only expected reducing of false-positive-warnings.

As already stated in the original post: I don't think the compiler is doing something wrong, it just came unexpected due to the lack of explanation or any mentioning in all the samples.

springy76 on 23 Apr 2020

@BillWagner @IEvangelist @springy76 what about adding the following note to the article about the null-forgiving operator:

[!NOTE]
The compiler interprets expression x?.y!.z as (x?.y)!.z. Because of that, at run time x?.y!.z is evaluated as (x?.y).z.

pkulikov on 23 Apr 2020

👍1

@BillWagner @IEvangelist @springy76 what about adding the following note to the article about the null-forgiving operator:

[!NOTE]
The compiler interprets expression x?.y!.z as (x?.y)!.z. Because of that, at run time x?.y!.z is evaluated as (x?.y).z.

I think that note would be appropriate, I'm curious what @BillWagner would think? I'd probably reword it a little (let me know your thoughts):

[!NOTE]
The compiler interprets the x?.y!.z expression as (x?.y)!.z. Due to this interpretation and the operator precedence, at runtime x?.y!.z is evaluated as (x?.y).z.

IEvangelist on 23 Apr 2020

@IEvangelist I would omit the words "the operator precedence": the parentheses in the first sentence are enough. Also to me, "operator precedence" here is more confusing than explaining (however, I don't know how it is to the remaining audience).

By the way, "runtime" is the noun that is in "Common Language Runtime", while "at run time/at compile time".

pkulikov on 23 Apr 2020

👍1

I first thought it might be a compiler bug. Now I don't think so.

Have we clarified with the compiler team that it is indeed the intended behavior here? If not, I would create a Roslyn issue asking for clarification. The present behavior is counterintuitive. We might consider changing the grammar to allow null-forgiving operators in null_conditional_operations.

AntonLapounov on 23 Apr 2020

👍1

@BillWagner @IEvangelist I've thought more about it and can explain why mentioning precedence confuses me (and can confuse others).

We say that a + b * c is evaluated as a + (b * c) because * has a higher precedence. So, a?.b!.c is evaluated as (a?.b)!.c. That looks as if ?. has a higher precedence because it's evaluated still first.

However, I don't think we need to mention precedence here at all. Precedence, for a given operand, decides to which operator the operand binds: to the one with a higher precedence. For example, in a + b * c, b binds to * because * has higher precedence than +.

That said, in expression a?.b.c.d!.e, all operators are unary postfix. Each operand binds to an operator on its right, there is no choice (thus, there is no need in precedence). Thus, all operators are evaluated from left to right in order. So, the compiler tries to evaluate the first ?.. To do that, I guess, it needs to form a null-conditional expression. A null-conditional expression can contain only (null-conditional) member and index access operations. The ! operator breaks that sequence of member and index accesses. That's why a?.b.c.d!.e is interpreted as (a?.b.c.d)!.e.

There are two solutions: (1) to allow ! in a null-conditional expression, (2) update the docs

As for the docs, currently they say:

The null-forgiving operator has no effect at run time. It only affects the compiler's static flow analysis by changing the null state of the expression. At run time, expression x! evaluates to the result of the underlying expression x.

Maybe we should say something like that:

The null-forgiving operator has no effect at run time. Expression x! evaluates to the result of the underlying expression x. However, at compile time, the null-forgiving operator affects the compiler's static flow analysis by changing the null state of the expression. In the case of a null-conditional expression, that changes the way how it's evaluated when you add the ! operator to it. The compiler interprets a?.b!.c as (a?.b)!.c, which evaluates to (a?.b).c at run time.

pkulikov on 24 Apr 2020

👍2

That said, in expression a?.b.c.d!.e, _all_ operators are unary postfix.

@pkulikov I am not sure how you came to this conclusion. Expressions a?. and a. have no meaning by themselves, because those operators require both _left-_ and _right-_ hand sides. For instance, the C++ reference describes the _first_ and the _second_ operands of the a.b member access operator. If you do not wish to use the word _operand_ for syntax constructions, we might agree to call them _lhs_ and _rhs_. The operators would still be binary though.

Having said that, I do not think it is strictly necessary to distinguish priorities of those special operators (implied by the grammar) in the table in question. Its main idea is that primary expressions are evaluated first (they have the highest precedence), then all other 'normal' operators are evaluated according to their priorities. For that purpose, it seems OK to extend primary expressions with null-conditional operators even if the outdated formal spec is worded slightly differently.

Maybe we should say something like that:

The null-forgiving operator has no effect at run time. Expression x! evaluates to the result of the underlying expression x. However, at compile time, the null-forgiving operator affects the compiler's static flow analysis by changing the null state of the expression. In the case of a null-conditional expression, that changes the way how it's evaluated when you add the ! operator to it. The compiler interprets a?.b!.c as (a?.b)!.c, which evaluates to (a?.b).c at run time.

The words "that changes" imply it is the same effect; however, those are two separate effects:

The ! operator may change the null state of the expression. That may suppress a warning.
The ! operator may change how the primary expression is parsed and, therefore, evaluated.

AntonLapounov on 24 Apr 2020

👍1

I am not sure how you came to this conclusion.

@AntonLapounov this is how I understand the language spec. According to it, a null-conditional expression and a member access expression (as it's a primary expression) are unary expressions. That means they have one operand. Indeed, they require right-hand part, but it's not operand (it can be a member identifier or a list of member and indexer access operations). For example, the spec on null-conditional says:

The null-conditional operator applies a list of operations to its operand...

So, there is a distinction between "operand" and "rhs". This distinction is important, because precedence concerns operands.

pkulikov on 24 Apr 2020

👍1

The words "that changes" imply it is the same effect; however, those are two separate effects:

@AntonLapounov thanks for that remark. I agree my suggestion can be improved like that:

At compile time, the null-forgiving operator affects the compiler's static flow analysis by changing the null state of the expression. At run time, expression x! evaluates to the result of the underlying expression x. Typically, the introduction of the ! operator doesn't affect run time evaluation. However, if you introduce the ! operator in a null-conditional expression, you might observe the change in behavior at run time. That happens because the compiler interprets a?.b!.c as (a?.b)!.c, which evaluates to (a?.b).c at run time.

@springy76 @IEvangelist @BillWagner what do you think?

pkulikov on 24 Apr 2020

👍2

@pkulikov I agree with your analysis, and I like the updated text you're proposing. The last sentence was tripping me up a bit (this is a minor).

That happens because the compiler interprets a?.b!.c as (a?.b)!.c, which evaluates to (a?.b).c at run time.

Does this read a bit better written this way?

The reason for this, is that the compiler interprets a?.b!.c as (a?.b)!.c, which evaluates to (a?.b).c at run time.

I am thoroughly enjoying this thread, being new to this team - I'm learning a lot from it. Thank you

IEvangelist on 24 Apr 2020

@IEvangelist I would make the intro even shorter (is that grammatically correct?):

That is because the compiler...

pkulikov on 24 Apr 2020

BTW: Not to oversee what happens on nullable structs while talking about nullable reference types:

C# static System.TimeSpan T1(DateTime? dt) => dt?.TimeOfDay!.GetValueOrDefault(Timeout.InfiniteTimeSpan);

Without ! the member GetValueOrDefault of the intermediary Nullable<T> struct would not be valid at this place at all since TimeOfDay itself is just a struct (TimeSpan).

springy76 on 24 Apr 2020

@IEvangelist I would make the intro even shorter (is that grammatically correct?):

That is because the compiler...

I'm good with that, I think the "That happens because" was throwing me off...

IEvangelist on 24 Apr 2020

this is how I understand the language spec. According to it, a null-conditional expression and a member access expression (as it's a primary expression) are _unary_ expressions. That means they have one operand.

That is not correct interpretation of the spec. While every primary_expression matches the unary_expression production, that does not mean that every operator within a primary_expression has to be a unary operator. Consider, for example, this unary_expression: a[1+2].b. It contains three operators:

Binary arithmetic operator + taking arguments 1 and 2.
Binary indexing operator [] taking arguments a and 3 (the result of 1+2).
Binary access operator . taking arguments a[3] and b.

While you might claim that b is not a 'real' operand of the last operator and name it _rhs_ instead, it would be difficult to deny that 3 is a real operand of a[3], which is certainly a unary_expression.

For example, the spec on null-conditional says:

The null-conditional operator applies a list of operations to its operand...

And that list of operations works as the second argument/operand of the operator. That is commonly used terminology. Look, for example, at Safe navigation operator Wiki page:

In object-oriented programming, the safe navigation operator (also known as optional chaining operator, safe call operator, null-conditional operator) is a binary operator that returns null if its first argument is null; otherwise it performs a dereferencing operation as specified by the second argument (typically an object member access or an array index).

So, there is a distinction between "operand" and "rhs". This distinction is important, because precedence concerns operands.

Operator precedence rules dictate how an expression is interpreted by the compiler and translated into the expression tree. That includes determining arguments of each operator. It does not matter whether you call them operands or lhs/rhs — you still have to determine their textual boundaries. If some part of the grammar is more complicated and cannot be expressed as an operator-precedence grammar, that is fine, but then you should avoid assigning precedence to operators in that part of the grammar. That is what you did in #18001 and I came here to explain why it was wrong:

moved x?.y and x?[y] (null-conditionals) from the primary precedence row to the unary precedence row (putting it to the primary expressions group was not correct)

AntonLapounov on 25 Apr 2020

Thanks for all the discussion. I'll make the following proposal:

[!NOTE]
In C# 8, The expression x?.y!.z is parsed as (x?.y)!.z. Due to this interpretation (x?.y)! is considered not null. Therefore, if either x or y is null, the expression throws a NullReferenceException.

I specifically mention C# 8 to take into account change that might come from dotnet/csharplang#3393.

In the longer term, as the standardization effort catches up, we should remove this table from the language reference section, and refer to the corresponding table in the spec.

Can we give this is thumbs up or down as a proposed solution?

BillWagner on 28 Apr 2020

👍2

Therefore, if either x or y is null, the expression throws a NullReferenceException.

Nitpicking, it may throw, but not necessary if z is a method call:
C# int[] a = null; // s2 is the empty string: ToString() is called on default(int?) string s2 = a?.First()!.ToString();
What's about the following?

In C# 8 the expression x?.y!.z is parsed as (x?.y)!.z. Due to this interpretation z is evaluated even if x is null, which may result in a NullReferenceException.

AntonLapounov on 28 Apr 2020

👍1

BTW: What is inside LINQ expressions? ?. isn't allowed there but ! is allowed -- but seems to have no impact on the expression.

springy76 on 28 Apr 2020

I opened #18124 to address this. Thanks for all the discussion.

@springy76 Can you open a new issue for the question on LINQ expressions? I'm not sure what you mean, and I'd like to start a new discussion for that concern.

BillWagner on 28 Apr 2020

@BillWagner

I just wanted to note that

Expression<Func<DirectoryInfo, string>> expr = di => di.Parent?.Name;

does not compile at all (CS8072 An expression tree lambda may not contain a null propagating operator) but

Expression<Func<DirectoryInfo, string>> expr = di => di.Parent!.Name;

is no problem, seems to be ignored by (or transparent to) the expression code builder.

springy76 on 28 Apr 2020

👍1

Thanks @springy76 That's a different issue. (And we don't document it well.)

There are a number of new syntax elements that aren't representable in LINQ expression trees. I'll create an issue and we'll address that as well.

BillWagner on 28 Apr 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Folder Encrypting

Manoj-Prabhakaran · 3Comments

Guidance on whether to indicate CLS compliance

Eilon · 3Comments

Business logic inside queries

gmatv · 3Comments

Example has invalid value warning

ite-klass · 3Comments

Please explain your over-engineered crap.

LJ9999 · 3Comments

Docs: null-forgiving operator *can have* effect at run time

Document Details

Most helpful comment

All 32 comments

Related issues

Docs: null-forgiving operator can have effect at run time