Roslyn: Using null conditional with IEnumerable fails with a compiler error CS0023

Created on 14 Mar 2018 · 16 Comments · Source: dotnet/roslyn

Version Used: VS 2017 15.6 and 15.7 Preview 1

Steps to Reproduce:

Put the following code into a program.

Scenario 1

static T GetFirstItem<T> ( ) where T: new()
{
    var results = GetItems<T>();

    return results?.FirstOrDefault() ?? default(T);
}

static IEnumerable<T> GetItems<T> () where T: new()
{
    return new[] { new T(), new T() };
}

Scenario 2

static T GetFirstItem<T> ( ) where T: new()
{
    IEnumerable results = GetItems<T>();

    return results?.OfType<T>().FirstOrDefault() ?? default(T);
}

Expected Behavior:

The type of results is IEnumerable<T> in Scenario 1. That would make it a reference type. Since the type is nullable then ?. should properly evaluate the value and call the subsequent member if not null.

In Scenario 2 the type is IEnumerable which eliminates the generic type altogether.

Actual Behavior:

The compiler reports an error on the ?. operator. The actual error is:

CS0023 Operator '?' cannot be applied to operand of type 'T'

For some reason the compiler is confused about the type of results. Switching from var to an explicit type doesn't change anything.

Scenario 2 generates the same error even though we've removed the generic type from the declaration of results, so the compiler is still seeing it as T.

If you change to a different interface (e.g. IFoo) and then try the null conditional then the code compiles correctly. There is something special about IEnumerable but I don't know what.

Area-Compilers Bug Concept-Diagnostic Clarity

Source

CoolDadTx

Most helpful comment

in this context it basically means: not constrained to a reference or value type.

as such ?. is not legal for it as it doesn't know what T will finally be. If it knew it would be a value type, then it would know to create the "T?" type. If it knew it would be a reference type, it would just keep it as the "T" type. Because the type has neither the "class" nor the "struct" constraint, you get the error. "unconstrained" is just a simple way of describing that.

CyrusNajmabadi on 16 Mar 2018

👍2

All 16 comments

While the error message is very confusing, this is the expected behavior. The problem is that ?. behaves differently with reference types and value types.

Specifically, in your example, what should be the type of results?.FirstOrDefault()? If T is a reference type, then it's T. But if T is a value type, then it's T?. But the compiler doesn't know what kind of type T is, so it can't compile your method.

The spec explicitly calls out that this case is not allowed:

If T0 is a type parameter that is not known to be a reference type or a non-nullable value type, a compile-time error occurs.

The above implies that one workaround to this problem is to specify which kind of type T is by using either where T: class, new() or where T: struct.
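As a minimal sketch of that workaround (the class name Workarounds and the two method names are illustrative assumptions; GetItems<T> is copied from the original report), constraining T either way lets the compiler settle on a single result type for results?.FirstOrDefault():

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Workarounds
{
    // Same helper as in the original report.
    public static IEnumerable<T> GetItems<T>() where T : new()
    {
        return new[] { new T(), new T() };
    }

    // T is known to be a reference type, so results?.FirstOrDefault() has type T.
    public static T GetFirstClass<T>() where T : class, new()
    {
        var results = GetItems<T>();
        return results?.FirstOrDefault() ?? default(T);
    }

    // T is known to be a non-nullable value type, so results?.FirstOrDefault()
    // has type T? (Nullable<T>), and ?? unwraps it back to T.
    public static T GetFirstStruct<T>() where T : struct
    {
        var results = GetItems<T>();
        return results?.FirstOrDefault() ?? default(T);
    }
}
```

The struct constraint already implies a public parameterless constructor, which is why GetFirstStruct can call GetItems<T> without also listing new().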

Now, back to the error message. Maybe it could be changed to something like:

Cannot use operator '?' in an expression of type 'T', because it is an unconstrained type parameter.

Do you think that this makes the problem clear? Do you have suggestions on how that error message could be improved even further?

svick on 14 Mar 2018

I see what you're saying. Had I chosen a different method, like Any, then it would compile, but since the compiler cannot determine the type of results?.FirstOrDefault(), it fails because it could be T or T?. Specifying a constraint of class does allow it to compile.

The compiler message doesn't really convey this to me. I think an error more akin to what happens when you try this with a conditional expression makes more sense.
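For comparison, here is a small sketch of the Any case (the class AnyExample and the method HasItems are hypothetical names, and passing results as a parameter is an assumption made to keep the sample short). Any returns bool, which the compiler knows is a value type, so results?.Any() is simply bool? regardless of what T turns out to be:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyExample
{
    // Compiles without a class/struct constraint on T: the result type of
    // results?.Any() is bool?, which does not depend on T.
    public static bool HasItems<T>(IEnumerable<T> results) where T : new()
    {
        return results?.Any() ?? false;
    }
}
```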

static T GetFirstItem<T> () where T : new()
{
    var results = GetItems<T>();

    return results.Any() ? new T() : null;
}

Generates: CS0173 Type of (conditional) expression cannot be determined...

CoolDadTx on 15 Mar 2018

@CoolDadTx that last example though is by design. There is simply no common type between T and null here due to the possibility that T could be instantiated as a struct. This could be fixed in two ways:

Add the class constraint to T
Use default instead of null
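A rough sketch of both fixes, applied to the conditional example above (ConditionalFixes and its method names are illustrative assumptions, and the sequence is passed in as a parameter rather than fetched with GetItems<T>):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConditionalFixes
{
    // Fix 1: with the class constraint, null converts to T,
    // so both branches of the conditional have a common type.
    public static T NewOrNull<T>(IEnumerable<T> results) where T : class, new()
    {
        return results.Any() ? new T() : null;
    }

    // Fix 2: default(T) is valid for any T, constrained or not.
    public static T NewOrDefault<T>(IEnumerable<T> results) where T : new()
    {
        return results.Any() ? new T() : default(T);
    }
}
```

Which one fits depends on whether the method is ever meant to be called with value types; only the second version allows that.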

jaredpar on 15 Mar 2018

Agreed, but the error message doesn't convey that information to me. If the compiler is failing the call because it cannot decide between two types it would be nice if it reported that similar to how conditional expression does so the dev knows there is an ambiguity in the types.

Your proposed error message change may be technically more accurate, but I think if you poll devs on what an 'unconstrained type parameter' is then you'll probably find that it doesn't mean anything to them. An error similar to the conditional expression one (perhaps with a mention of the unconstrained type parameter, or even just listing the 2 types it is struggling with) may be more helpful. Just my opinion though.

CoolDadTx on 15 Mar 2018

"... is then you'll probably find that it doesn't mean anything to them."

The problem then becomes: what audience are you creating your error messages for? There are novices using the language. There are experts. There are people who come from all sorts of different language paths, and you have to come up with a single message that somehow is suitable for all of them.

My preference is that the language error messages speak in the terms of the language itself. It means there is something you can actually read in the spec related to these concepts and that all the error messages are consistent with themselves and the actual specification.

On top of that, i think it's enormously important that you be able to get from an error message to useful documentation explaining it better. That documentation can then fill in the gaps of that audience spectrum. In this case, it seems unfortunate that CS0023 is being used for all these messages. Having a dedicated error code would be beneficial as that could help lead people to a dedicated page discussing the error as it pertains to these types (unconstrained type parameters) and these operators ('?').

This puts a higher burden on the compiler. i.e. the need to internally break what is one error message into many. But it serves to better drive users toward understanding what is going wrong, which is ultimately the most important goal of errors in the first place. After all, we would never accept a compiler that quit out, saying 'something is wrong with your program'. :)

CyrusNajmabadi on 15 Mar 2018

👍1

"The problem then becomes: what audience are you creating your error messages for?"

Agreed.

"My preference is that the language error messages speak in the terms of the language itself. It means there is something you can actually read in the spec related to these concepts and that all the error messages are consistent with themselves and the actual specification."

I completely disagree here. Only language people who understand grammars, rvalues and implementation details look at the specs. Everybody else will take the error code and paste it into their search engine. They will then read the official docs from the compiler to understand the message. That means the docs associated with that error code have to be clear about what the error is (in all combinations). Compared to just making the error message more readable this seems like needless work.

Messages should be targeted somewhere between the language expert (who can go to the spec and read more if they are interested) and the beginner (who knows little about the language). The goal of a good message is to provide enough information to tell the dev what went wrong and (ideally) how to fix the issue without sending them to external resources.

"In this case, it seems unfortunate that CS0023 is being used for all these messages"

Agreed that the compiler should perhaps have different error messages for problems that have different solutions. However, having too many error messages is just as bad. Perhaps a happy medium is error code groups and subgroups. For example, in CS1234, CS would indicate the compiler, 1200 would indicate some category of errors akin to what the compiler generates today, and 34 would be a specific scenario with a small set of fixes. Then it becomes easier to clarify that 1200 indicates the general problem whereas 34 indicates the specific problem/solution. Of course, such a design is beyond the scope of the issue raised in this thread.

CoolDadTx on 15 Mar 2018

"Messages should be targeted somewhere between the language expert (who can go to the spec and read more if they are interested) and the beginner (who knows little about the language)."

In this case, i think that message follows. It's right in the middle. It's referring to simple concepts like operators and types. It's not trying to go super simple. Nor is it trying to go into spec legalese.

"The goal of a good message is to provide enough information to tell the dev what went wrong and (ideally) how to fix the issue without sending them to external resources."

This is true. However it means you've now defined a goal in terms of all users. i.e. "tell the dev". Which dev? All of them?

"how to fix the issue without sending them to external resources."

This is noble. However, in practice, i think this is simply not actually achievable. Why? Because 'fix the issue' is far too broad to have meaning. The narrowest view is "what minimal change would satisfy the compiler". However, that's really likely not the right thing for the user. What is correct is ultimately domain specific. It might involve narrow targeted changes (i.e. 'add the "class" constraint', or 'use "default" instead'), or it might be hugely broad: i.e. totally refactoring your class design so that this type does not flow here in the first place.

External resources are necessary as a compiler message is invariably going to need to condense all the info into a couple of lines at most, whereas the subjects and solutions can invariably be quite involved and require a lot of information. This is why things like docs (and stackoverflow and the like) exist in the first place. Because it really is necessary at times to have to learn deeply about what the issue is and how you can apply that knowledge to your domain to create the best solution possible.

CyrusNajmabadi on 15 Mar 2018

We'll agree to disagree. The error message is not clear as to what the problem is. But if you want to take a poll as to whether the error makes sense to anyone else then go ahead.

CoolDadTx on 15 Mar 2018

Seems clear to me. You have a T. You can't use ? on it. A bit of supplementary text might be useful, like "Unconstrained type parameters may be value types, and ? is only valid on reference types". However, then you might have someone who doesn't know what an 'unconstrained type parameter' is. Or people who don't know what value-types or reference-types are. etc. etc. There is always a point at which people will have to read more and learn more.

Like i said, the compiler could do better here. But that's true for me for pretty much every error message :)

CyrusNajmabadi on 15 Mar 2018

@CyrusNajmabadi

"Seems clear to me. You have a T. You can't use ? on it."

Except that's not what people see. If you have results?.FirstOrDefault(), then you might consider that this is the unary ? operator applied to the expression results.FirstOrDefault() of type T. But I think that most people will see the binary ?. operator applied to the expression results of type IEnumerable<T>, and the name FirstOrDefault. Which is one reason why the error message is confusing.

svick on 15 Mar 2018

That's fair.

CyrusNajmabadi on 15 Mar 2018

@svick
Could you please elaborate on "unconstrained type parameter"? I couldn't find a simple, relevant example of this terminology.

akashkc on 16 Mar 2018

Type parameters (i.e. T in void Foo<T>()) can have constraints on them. For example:

InterfaceName/ClassName constraints: void Foo<T>() where T : IEnumerable. It means it can only be instantiated with a type that is, inherits from, or implements the named class or interface.
class/struct constraint: void Foo<T>() where T : class or void Foo<T>() where T : struct. It means this can only be instantiated with reference or value types respectively.
'new' constraint. void Foo<T>() where T : new(). It means the type must have a public no-arg constructor on it.
'unmanaged' constraint. We'll ignore that for now. It's very new.

"Unconstrained type parameter" simply means: "a type parameter without a constraint provided".
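To make the list above concrete, here is a small sketch with one method per kind of constraint, plus an unconstrained one (ConstraintExamples and all of its method names are made up for illustration):

```csharp
using System.Collections;

public static class ConstraintExamples
{
    // Interface constraint: T must implement IEnumerable, so foreach is allowed.
    public static int CountItems<T>(T items) where T : IEnumerable
    {
        int count = 0;
        foreach (object item in items) count++;
        return count;
    }

    // class constraint: T is a reference type, so comparing against null is meaningful.
    public static bool IsNull<T>(T value) where T : class
    {
        return value == null;
    }

    // struct constraint: T is a non-nullable value type, so T? (Nullable<T>) exists.
    public static T? Wrap<T>(T value) where T : struct
    {
        return value;
    }

    // new() constraint: T has a public parameterless constructor.
    public static T Create<T>() where T : new()
    {
        return new T();
    }

    // No constraint at all: T may be any type, reference or value.
    public static string Describe<T>(T value)
    {
        return typeof(T).Name;
    }
}
```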

CyrusNajmabadi on 16 Mar 2018

👍1

@CyrusNajmabadi
Thank you for the quick response. As you said in the comment above:

"Having a dedicated error code would be beneficial as that could help lead people to a dedicated page discussing the error as it pertains to these types (unconstrained type parameters) and these operators ('?')."

You are treating T as an unconstrained type parameter in the OP example (static T GetFirstItem<T> ( ) where T: new()), but it looks to me like it has the "new" constraint described in your 3rd point. Is T still considered an unconstrained type parameter if we only constrain it to have a parameterless constructor (new())?

akashkc on 16 Mar 2018

in this context it basically means: not constrained to a reference or value type.

CyrusNajmabadi on 16 Mar 2018

👍2

Thank you @CyrusNajmabadi, I got it now.

akashkc on 16 Mar 2018
