Roslyn: [Question] Should compiler allow code that uses a random interface type as variable type in foreach statement?

Created on 27 Feb 2018  ·  17Comments  ·  Source: dotnet/roslyn

Version Used: 2.4.0.62225

Steps to Reproduce:

I am not sure if this is by design or a bug. Here's the code, please see comments inside.

```C#
namespace ConsoleApp1
{
class SomeType { }

class DifferentType { }

interface SomeRandomInterface { }

class Program
{
    static void Main(string[] args)
    {
        SomeType[] arrayOfSomeType = new SomeType[]
        {
            new SomeType(),
        };

        IEnumerable<SomeType> enumerableOfSomeType = arrayOfSomeType;

        //  This compiles, but throws InvalidCastException at run time (and puzzels me)
        foreach (SomeRandomInterface item in enumerableOfSomeType) { }

        //  This won't compile (thankfully)
        //foreach (DifferentType item in enumerableOfSomeType) { }
    }
}

}
```

Expected Behavior:
I would expect that the compiler won't allow declarations of enumeration variable of a type that is not assignable from the type of elements that the respective enumerator represents. Therefore, in the above example I would expect the compiler to forbid any types except Object, types on the inheritance hierarchy of SomeType, and interface types implemented by SomeType, either directly of indirectly.

Actual Behavior:
In reality, the compiler allows usage of interface types that are not related to SomeType which results in a run time error, although it forbids class types that are not on SomeType's inheritance hierarchy. I might be missing something, but I was not able to come up with a scenario where this might be useful (and not lead to an exception at run time).

Area-Language Design Question Resolution-Answered

All 17 comments

I think a simple case would be:

class DerivedType : SomeRandomInterface
{}

Having elements of DerivedType in the collection will make it not throw a runtime exception.

❗️ If you seal SomeType, it'll correctly report a compiler error.

I have brought this up before (#20314) and it is by design. I later suggested an analyzer be made to catch these cases (dotnet/roslyn-analyzers#1494).

@Therzok you're right I thought of that too, but strictly speaking it won't throw if all items in the collection are of type DerivedType in which case you'd most likely have a collection of DerivedType not its base type. The whole idea of working with items in a collection via some base type is that you expect that the actual type of the element may be different which means that the above foreach statement may fail.

Making SomeType sealed indeed produces a compiler error. Thanks for pointing that out.

Thank you @Joe4evr, I was not able to find that issue myself.

I still sounds counter intuitive to me. I don't see how the above code is semantically different from the following

for (int i = 0; i < arrayOfSomeType.Length; i++)
{
    SomeRandomInterface item = arrayOfSomeType[i];
}

The objects it the array might actually be of DerivedType and indeed implement SomeRandomInterface or they may not. The compiler simply doesn't know. That's why, in the latter case it would produce an error because implicit casts are forbidden (which is one of the keys to type safety - be explicit whenever you're doing something potentially dangerous). Yet in the former example the compiler allows an implicit cast.

As I speculated over in the Analyzer thread, it likely has to do with C# not having generic collections in its first release, so anything that iterates over a non-generic IEnumerable only gets something of type object. You can then while you're iterating using foreach specify the type that you expect and the compiler inserts that cast on your behalf.

Yeah, the old non-generic collections used to be a complete PITA but what's even worse, so many years later we still have to deal with the consequences. I wish we haven't end up with generic collections supporting the non-generic IEnumerable. Introducing an explicit Cast<> to support iterating over non-generic collections would have been much safer.

@eduard-malakhov

Introducing an explicit Cast<> to support iterating over non-generic collections would have been much safer.

Except, AFAIK, foreach already existed in C# 1.0 and was pattern-based. I don't see how what you're proposing could have been introduced in C# 2.0 without it being a breaking change.

Of course, the point is moot, changing this would be a breaking change now, so it's not going to happen.

That would still be a breaking change in the compiler, making previously legal code illegal (which has been and is still a big no-no for the teams). It's simply a case of "suck it up". If you want the compiler to not do something wrong, just use var and all your problems are solved. ¯\_(ツ)_/¯

@svick , @Joe4evr which change are you referring here to as "breaking"? I think that the behavior I've described is due to IEnumerable<> that inherits IEnumerable which has the object-typed GetEnumerator() which the compiler picks up to do the implicit cast. I don't see how IEnumerable<> not inheriting IEnumerable is breaking in any way.

@eduard-malakhov

which change are you referring here to as "breaking"? I think that the behavior I've described is due to IEnumerable<> that inherits IEnumerable which has the object-typed GetEnumerator() which the compiler picks up to do the implicit cast.

IEnumerable<T> inheriting IEnumerable is not really related to this. In fact, foreach mostly doesn't care about interfaces, as long as it finds GetEnumerator() (and it doesn't see the explicitly implemented GetEnumerator() from IEnumerable, so that has no effect on this).

For example, this code still works (read: compiles and throws InvalidCastException at runtime), even though it doesn't use IEnumerable<T> or IEnumerable:

```c#
class InfiniteCollection // no interfaces
{
public Enumerator GetEnumerator() => new Enumerator();

public class Enumerator
{
    public bool MoveNext() => true;
    public SomeType Current => new SomeType();
}

}

class SomeType {}
interface ISomeRandomInterface { }

class Program
{
static void Main()
{
foreach (ISomeRandomInterface item in new InfiniteCollection())
{
}
}
}
```

You're right that changing IEnumerable<T> to not inherit from IEnumerable would not be a breaking change, if it was done when IEnumerable<T> was first being added. But it would not solve the issue: the syntax you dislike would still work for collections that implement the modified IEnumerable<T>.

@svick I know that the compiler does not rely on interfaces, it finds a proper GetEumerator() method and emits a call to it, but I thought it finds the non-generic IEnumerable.GetEumerator() which returns object and therefore tries to cast it. But you're right, the compiler indeed emits a call to IEnumerable<>.GetEumerator(), here's MSIL code for the above example

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       64 (0x40)
  .maxstack  4
  .locals init (class ConsoleApp1.SomeType[] V_0,
           class [System.Runtime]System.Collections.Generic.IEnumerable`1<class ConsoleApp1.SomeType> V_1,
           class [System.Runtime]System.Collections.Generic.IEnumerator`1<class ConsoleApp1.SomeType> V_2,
           class ConsoleApp1.SomeRandomInterface V_3)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  newarr     ConsoleApp1.SomeType
  IL_0007:  dup
  IL_0008:  ldc.i4.0
  IL_0009:  newobj     instance void ConsoleApp1.SomeType::.ctor()
  IL_000e:  stelem.ref
  IL_000f:  stloc.0
  IL_0010:  ldloc.0
  IL_0011:  stloc.1
  IL_0012:  nop
  IL_0013:  ldloc.1
  IL_0014:  callvirt   instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<class ConsoleApp1.SomeType>::GetEnumerator()
  IL_0019:  stloc.2
  .try
  {
    IL_001a:  br.s       IL_002a
    IL_001c:  ldloc.2
    IL_001d:  callvirt   instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<class ConsoleApp1.SomeType>::get_Current()
    IL_0022:  castclass  ConsoleApp1.SomeRandomInterface
    IL_0027:  stloc.3
    IL_0028:  nop
    IL_0029:  nop
    IL_002a:  ldloc.2
    IL_002b:  callvirt   instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
    IL_0030:  brtrue.s   IL_001c
    IL_0032:  leave.s    IL_003f
  }  // end .try
  finally
  {
    IL_0034:  ldloc.2
    IL_0035:  brfalse.s  IL_003e
    IL_0037:  ldloc.2
    IL_0038:  callvirt   instance void [System.Runtime]System.IDisposable::Dispose()
    IL_003d:  nop
    IL_003e:  endfinally
  }  // end handler
  IL_003f:  ret
} // end of method Program::Main

But I still don't understand, what the "breaking change" would be. Regardless of how the non-generic collections used to be handled, the new generic collections could be treated differently. In fact, the compiler does treat them differently. Here's a piece of MSIL code for the same code as above except for IEnumerable<SomeType> enumerableOfSomeType being replaced by Array enumerableOfSomeType. As you can see, compiler emits a castclass instruction.

    IL_001a:  br.s       IL_002a
    IL_001c:  ldloc.2
    IL_001d:  callvirt   instance object [System.Runtime]System.Collections.IEnumerator::get_Current()
    IL_0022:  castclass  ConsoleApp1.SomeRandomInterface
    IL_0027:  stloc.3
    IL_0028:  nop
    IL_0029:  nop
    IL_002a:  ldloc.2
    IL_002b:  callvirt   instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
    IL_0030:  brtrue.s   IL_001c
    IL_0032:  leave.s    IL_0049

However here's MSIL code for a version of code where we iterate over the enumerable using a variable of type SomeType
```C#
IEnumerable enumerableOfSomeType = arrayOfSomeType;

foreach (SomeType item in enumerableOfSomeType) { }


```MSIL
    IL_001a:  br.s       IL_0025
    IL_001c:  ldloc.2
    IL_001d:  callvirt   instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<class ConsoleApp1.SomeType>::get_Current()
    IL_0022:  stloc.3
    IL_0023:  nop
    IL_0024:  nop
    IL_0025:  ldloc.2
    IL_0026:  callvirt   instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
    IL_002b:  brtrue.s   IL_001c
    IL_002d:  leave.s    IL_003a

As you see, there's no castclass. So some optimization to handle generic collections was made.

the new generic collections could be treated differently

Probably. I'm not sure if you could make it consistent with C# 1.0 or if it would work well for types that implement IEnumerable<T>, but also have a custom GetEnumerator() (like List<T>). But yes, I think you're right that it could be done without a breaking change. Though I don't see much point in long discussions about what could have been.

the compiler does treat them differently

Not really, it just behaves the same as for any other cast that it knows doesn't do anything.

@svick , I guess you're right, this is not going to change anyways .

This is as required by the language specification. See https://github.com/dotnet/csharplang/blob/master/spec/statements.md#the-foreach-statement

If you want to request a change, that is the correct repo to file an issue.

Was this page helpful?
0 / 5 - 0 ratings