Roslyn: C# Compiler emits calls via the _least_ derived override. This is unintuitive and contradicts the model presented by the IDE.

Created on 19 Jan 2018  路  10Comments  路  Source: dotnet/roslyn

@yaakov-h commented on Sun Jan 14 2018

My understanding of the C# spec is that ref and out are treated separately for method overload resolution:

For other purposes of signature matching (e.g., hiding or overriding), ref and out are considered part of the signature and do not match each other. - Signatures and overloading

Given the following code, Baz.M(ref object) should override Foo.M(ref object), but at runtime it appears to override Bar(out object) instead.

"Go To Definition" agrees with me, or has made the same misunderstanding (see the comment below).

using System;

public static class Program
{
    public static void Main()
    {
        object o = null;
        new Baz().M(ref o); // Should print "Baz"
        new Baz().M(out o); // Should print "Bar"
        new Baz().N(0); // Should print "Baz"
    }
}

class Foo
{
    public virtual void M(ref object o)
    {
        Console.WriteLine("Foo");
    }

    public virtual void N(int i)
    {
        Console.WriteLine("Foo");
    }
}

class Bar : Foo
{
    // New method, does not override Foo.M()        
    public virtual void M(out object o)
    {
        o = null;
        Console.WriteLine("Bar");
    }

    // New method, does not override Foo.N()
    public virtual void N(int i)
    {
        Console.WriteLine("Bar");
    }
}

class Baz : Bar
{
    // Should override Foo.M()
    public override void M(ref object o)
    {
        // If you uncomment this line, then Go To Definition goes to Foo.M()
        // base.M(ref o);

        Console.WriteLine("Baz");
    }

    // Should override Bar.N()
    public override void N(int i)
    {
        Console.WriteLine("Baz");
    }
}

Expected Output:

Bar
Baz

Actual Output:

Baz
Baz

Is this:

  • A complete misunderstanding on my part?
  • A bug in Go To Definition?
  • An error in the language spec?
  • A bug in Roslyn?
  • A bug in the .NET Framework and .NET Core runtimes?

Thanks.


@Thaina commented on Sun Jan 14 2018

Interesting. I also play with it a little more and found out that

image

Might be related


@yaakov-h commented on Sun Jan 14 2018

Yep, the spec mentions that too, right before the bit above:

Although out and ref parameter modifiers are considered part of a signature, members declared in a single type cannot differ in signature solely by ref and out. A compile-time error occurs if two members are declared in the same type with signatures that would be the same if all parameters in both methods with out modifiers were changed to ref modifiers.


@Thaina commented on Sun Jan 14 2018

@yaakov-h If so then I don't think this is a bug. public virtual void M(out object o) in Bar just has the effect as putting new keyword to replace the original Foo.M because it has exactly same signature

Maybe a bug in compiler warning that it not warn anything about this


@yaakov-h commented on Sun Jan 14 2018

Maybe, except the spec explicitly says that for the purposes of hiding or overriding - which is exactly what we are doing here - "ref and out are considered part of the signature and do not match each other."


@Thaina commented on Sun Jan 14 2018

@yaakov-h I see then I think it was a bug from the point that Bar.M was not error from being hiding Foo.M


@MkazemAkhgary commented on Mon Jan 15 2018

have you tried this on older c# compilers? I'm wondering for how long it was unnoticed.


@yaakov-h commented on Mon Jan 15 2018

According to a coworker this behaviour has been around for years.


@stakx commented on Mon Jan 15 2018

have you tried this on older c# compilers? I'm wondering for how long it was unnoticed.

I just tried this with Mono's C# compiler, and with http://rextester.com which claims to use C# compiler version 4.0.30319.17929. (If I understand correctly, this is a pre-Roslyn version of the C# compiler.) They all produce the same output as shown above. So this behaviour has probably been around for a long time.

It appears to me that when it comes to "signatures", the C# spec differs from the underlying platform's notion of signatures. As far as I can tell, CLI signatures (at least the standalone ones) do not distinguish between ref and out—there are both simply by-ref parameters. (See ECMA-335, 搂22.2.10.) The CLI considers optional or required modifiers as being part of a signature, but C# doesn't use these to mark ref or out parameters. And custom attributes aren't considered part of a signature, so distinguishing using the [Out] pseudo-attribute is out of the question, too.

The C# spec OTOH explicitly states (as mentioned in the initial post) that ref and out don't match each other with regard to overriding / method hiding.

Perhaps the C# compiler simply takes a shortcut here and implements its overloading / method hiding rules by directly employing the CLI's hidebysig mechanism unaltered (which might be good enough in practice, but not really sufficient given the differences in the respective specifications)? The C# spec suggests just that in 搂3.6 Signatures and overloading:

This restriction [not being able to overload based on ref or out alone in the context of a single type] is to allow C# programs to be easily translated to run on the Common Language Infrastructure (CLI), which does not provide a way to define methods that differ solely in ref and out.
...
F(ref int) and F(out int) cannot be declared within the same interface because their signatures differ solely by ref and out.

Could it be that the spec writers simply forgot to consider cases where such signature "collisions" appear, not in a single type, but in a type hierarchy?


@brian-reichle commented on Mon Jan 15 2018

have you tried this on older c# compilers? I'm wondering for how long it was unnoticed.

I first noticed it around the .NET 4.0/4.5 days, I'm fairly sure I also tested it on a machine that only had .NET 3.5 and VisualStudio 2008 installed (I don't recall if csc was distributed with the framework or visual studio back then). My interpretation of ECMA-334 at the time was that having the methods in the same type was prohibited but different types in the same inheritance chain was ok (if you don't mind crazy code).

As @stakx pointed out, the ref and out are distinguished by an attribute and attributes are not considered part of the signature by the runtime, though I have seen various tools behave as though it were.

I believe this could be resolved with a MethodImpl record, but I think that might be heading down a problematic path and I would be happy if the C# spec were simply 'clarified' to prohibit this :)


@stakx commented on Mon Jan 15 2018

the ref and out are distinguished by an attribute

(@brian-reichle - I need to correct myself there: AFAIK, it isn't actually OutAttribute itself. That is only a pseudo-custom attribute and doesn't even get emitted as such; it is turned into a particular flag (ParamAttributes.Out) in the Params metadata table, see ECMA-335 搂20.2.1 and 搂22.1.12. The information is there in IL metadata, but the runtime probably doesn't use it. I cannot find the exact definition of the hidebysig signature matching rules right now, but I suspect it doesn't take any ParamAttributes into account, so the end result is the same.)


@brian-reichle commented on Mon Jan 15 2018

@stakx, I think you might be right, but either way it's not using a modopt or modreq so the end result is the same. IIRC, matching (for the purpose of overriding) is based only on the method name, and signature blob, ParamAttributes.Out is definitely not part of the signature blob.

hidebysig is ignored by the runtime (Partition II section 15.4.2), I believe it's supposed to provide guidance for tools so they can match language behaviour.


@brian-reichle commented on Tue Jan 16 2018

Just for the record, here is a/the error report on connect from 2014

https://connect.microsoft.com/VisualStudio/feedback/details/815996/c-compiler-out-vs-ref-override-added-to-wrong-vtable-slot


@tannergooding commented on Wed Jan 17 2018

FYI. @vsadov, @jcouv


@jcouv commented on Fri Jan 19 2018

From the readonly-ref spec:

It is not permitted to overload on ref/out/in differences.
It is permitted to overload on ordinary byval and in differences.

Moving to roslyn repo as a compiler bug.

Area-Language Design

Most helpful comment

This is a long standing issue with a design choice that virtual calls in C# are emitted via "least derived" instead of the "most derived" method.

It is confusing in cases where inheritance hierarchy as seen by CLR is not the same as seen by C#.
Here CLR does not understand ref/out differences, but C# does.

So when user sees call to Baz.M, which is intuitive and is supported by the VS and GTD, in reality a call to Foo.M is emitted.
Essentially C# traces the target method back to its least derived override, hoping that CLR will trace it back to the most derived one in the context of the call. However, since C# and CLR disagree on the hierarchy, CLR arrives to to a different method and calls that.

With introduction of named parameters, optional parameters, tuples and other features where the information from the most derived statically known method wins over, such behavior is getting more and more confusing.

Every time we see a bug like this we consider a change where we emit calls via the _most_ derived override. In most cases the difference would be unobservable, but the confusing cases like this would be "fixed" and that in itself could be considered a breaking change.

I think the evidence is mounting that we should make the fix.

This should go back to the dotnet/csharplang.

CC:@gafter

All 10 comments

This is a long standing issue with a design choice that virtual calls in C# are emitted via "least derived" instead of the "most derived" method.

It is confusing in cases where inheritance hierarchy as seen by CLR is not the same as seen by C#.
Here CLR does not understand ref/out differences, but C# does.

So when user sees call to Baz.M, which is intuitive and is supported by the VS and GTD, in reality a call to Foo.M is emitted.
Essentially C# traces the target method back to its least derived override, hoping that CLR will trace it back to the most derived one in the context of the call. However, since C# and CLR disagree on the hierarchy, CLR arrives to to a different method and calls that.

With introduction of named parameters, optional parameters, tuples and other features where the information from the most derived statically known method wins over, such behavior is getting more and more confusing.

Every time we see a bug like this we consider a change where we emit calls via the _most_ derived override. In most cases the difference would be unobservable, but the confusing cases like this would be "fixed" and that in itself could be considered a breaking change.

I think the evidence is mounting that we should make the fix.

This should go back to the dotnet/csharplang.

CC:@gafter

I think I've run into this issue as well (see below). I originally found out about it in https://stackoverflow.com/q/16419168/240733.

@VSadov, if the behavior were changed from "least derived" to "most derived" method, would this extend to how the C# compiler captures methods as MethodInfo in LINQ expression trees?

I recently dealt with a bug over moq/moq4#497 that seems directly related to the "least derived" behavior. I added reflection code in https://github.com/moq/moq4/pull/497/commits/f9969156c28ca347489d49f56d2ae60b2ac80256 to find the "most derived" method given the base declaration; this would perhaps become obsolete after the change you describe?

If the change is made, it might be particularly noticeable in reflection-heavy code such as Moq 4, where it could surface as new bugs. I'm not opposing the change (it would perhaps be for the better), I'd just like to mention LINQ expression trees and reflection in general (and perhaps even Roslyn analyzers?) as places where people might have to modify their code after the change to maintain the previous behavior.

If the change is not made, perhaps the current behaviour could be documented somewhere?

/cc @kzu

I'll re-open the csharplang issue with added comments, and close the present issue.

@stakx - the expression tree situation would be more directly observable than the change in IL.
I doubt that expression trees would change.

@VSadov - In a way this would be a pity, as it would mean that expression trees get even more out of sync with the C# language as whole. Thanks for taking the time to reply, though!

Technically C# spec specifies binding to be happening in terms of least derived methods. That is not very visible and does not arguably prescribe how actual calls are emitted.

Indeed - The actual method that gets to run may be neither the _least_ nor the _most_ derived (as we see statically), since there could be even more derived overrides that we do not see at compile time.
As long as the method that gets to run matches what spec imply - the IL is correct.
From that point, emitting a call to the _most_ derived is actually more representative of the binding choices since there are fewer chances that CLR will mess up something and an unexpected method gets called.

On the other hand the expression tree, exists to represents the "bound tree". One could argue that the tree must respect the binding spec and literally have a call to the least derived method, while IL should match the execution semantics, where most derived could yield more stable behavior.
(the actual shape of the IL is not bound by particular contracts as long as the right methods get called)

What I mean here is that - an argument could be made for the IL to change, while expression trees to stay the same.

In reality the decision around the expression trees will be driven by the backward compatibility.

Changing shape of trees often breaks people in unexpected ways. The value vs. risk will not be in the favor of the change there.

I totally appreciate backward compatibility concerns and am content to leave it at that.

I do not fully understand your previous posts however, I'd be grateful if you could make it clearer to me:

the actual shape of the IL is not bound by particular contracts as long as the right methods get called

Here you seem to be saying that the spec doesn't proscribe the exact IL that needs to be emitted; the exact IL is an implementation detail (i. e. in Roslyn's territory).

Every time we see a bug like this we consider a change where we emit calls via the most derived override.

This suggests that you could just change the IL without having the spec changed. Yet, moving the issue back to dotnet/csharplang suggests to me that you propose a change to the spec instead (regarding the part that defines how method binding should work).

Technically C# spec specifies binding to be happening in terms of least derived methods. [...] On the other hand the expression tree, exists to represents the "bound tree". One could argue that the tree must respect the binding spec and literally have a call to the least derived method [...]

Given the above, this argument doesn't make sense to me: If you're going to change the spec so binding happens in terms of "more derived" instead of "least derived", then couldn't a case be made with the same argument, i. e. that expression trees should respect the binding spec and therefore represent "more derived" methods?

So in order to keep backward compatibility in expression trees, you'd have to disregard the binding spec, if it gets changed?

This was discussed before. It is not a straightforward issue.

  • It is not clear whether the change requires spec adjustments or not. I think it does not, but I know that there are other opinions. We may not change the spec at the end, but there must be an agreement on not changing ;-) . If we do the change at all.
  • There would be an observable change in behavior. It must be examined from the compatibility standpoint.

So in order to keep backward compatibility in expression trees, you'd have to disregard the binding spec, if it gets changed?

Or amend the spec in a way that expression trees do not need to change.
Or make a note in compiler spec/implementation that spec is intentionally not followed for compat reasons.

Compatibility is treated very seriously and expression trees is one of the areas historically known to be very sensitive to changes.

I see. Thank you for spelling it out so clearly for me. :+1:

Was this page helpful?
0 / 5 - 0 ratings