Roslyn: Allow by-ref Extension methods.

Created on 29 Jan 2015  路  25Comments  路  Source: dotnet/roslyn

Currently C# allows methods that are public and static on static classes to be extension methods by augmenting the first parameter with the this keyword.

This is all well and good, but structs cannot really benefit from this feature as much as struct method invocation is basically always by-ref. e.g. Calling 3.GetHashCode() actually translates in il like...

ldc.i3
stloc.0
ldloca.s 0
call Int32 GetHashCode

(Forgive me if that's wrong its from memory.) The basic gist here is that we actually get a reference to the memory location and then invoke the method. My proposal is to allow a user to declare extension methods with a by ref first parameter if it is a struct. Thus:

public static void MyMethod(this ref int x){...} // allowed
public static void MyMethod<T>(this ref T x) where T:struct //allowed
public static void MyMethod(this ref string x) // not allowed.

Technically, I suppose we don't need to restrict it. But I think the behavior isn't as necessary for class types.

This would be a large benefit for larger structs as copying them is something you always would want to avoid.

3 - Working Area-Language Design Feature Request Language-C# New Language Feature - Ref Locals and Returns

Most helpful comment

The following is valid code:

public struct SomeStruct
{
    public void SomeMethod()
    {
        this = new SomeStruct();
    }
}

The following should be valid as well, with _exactly_ the same result:

public static void SomeMethod(this ref SomeStruct value)
{
    value = new SomeStruct();
}

All 25 comments

Interestingly, Visual Basic already supports ByRef extension methods

ref extension methods should only allowed when used together with readonly (as in #115) to prevent a reassignment of x inside the extension method.

@fubar-coder No, that would _completely_ defeat the purpose of this proposal. The purpose here is to be able to create an extension method for a struct type which behaves like an instance method of that type. Instance methods of a struct type pass the instance by reference; extension methods look the same in code but always pass the instance by value.

Keep in mind that when you call an instance method on a struct which is marked readonly, a local variable is allocated to hold a copy of that struct, and a reference to that local copy is passed to the method. This means even the memory allocation benefits of pass-by-reference would be lost if you required it only be used with readonly.

Oh there is an interesting question to ask.
Currently new int().GetHashCode() is legal, but calling a method like this is not SomeMethod(ref new int()). If SomeMethod was now an extension method would new int().SomeMethod() be legal?

@mburbea I would imagine yes. In new int().GetHashCode(), the int is getting implicitly passed by reference.

I guess that you misunderstood me. Just take a look at this example:

public static void MyExtensionMethod(this ref SomeStruct x) {
    x = new SomeStruct();
}

This should not be allowed, because it produces an unpredictable side effect.

When I wrote about readonly, I meant that x cannot b be reassigned, but x will still be mutable. What you meant was a const value that x points to, which might be useful too, but wasn't the problem I wrote about. Therefore my resolution: ref extension methods must not allow a reassignment of x, which matches the readonly parameter proposal.

EDIT: Even when the value of x is immutable, then this doesn't require a copy of the value. readonly for the parameter and/or value are just contracts that guarantee that x doesn't get replaced with a different instance and/or that the value of x remains unchanged.

The following is valid code:

public struct SomeStruct
{
    public void SomeMethod()
    {
        this = new SomeStruct();
    }
}

The following should be valid as well, with _exactly_ the same result:

public static void SomeMethod(this ref SomeStruct value)
{
    value = new SomeStruct();
}

I think this would be a great idea. But I would not restrict it to structs, I think it has merits for classes as well, for example:

c# public static void Release<T>(this ref T disposable) where T : class, IDisposable { if (disposable != null) { disposable.Dispose(); disposable = null; } }

Indeed. Visual Basic already allows exactly that:

<Extension>
Public Sub Release(Of T As {Class, IDisposable})(ByRef disposable As T)
    If disposable IsNot Nothing Then
        disposable.Dispose()
        disposable = Nothing
    End If
End Sub

This feature would only encourage mutable structs, which are almost always a bad idea.

@thomaslevesque Indeed, I would like to combine this with readonly (with the compiler enforcing the immutable contract). See Issue #115. You get something similar to C++'s const T& x where the compiler enforces 'constness' and you avoid copying which, for large structs will be a performance win.

@dfkeenan The primary reason why I'm in favor of this is it allows extension methods to provide the same semantics as other instance methods when the receiver is a value type. For the same reason, I strongly oppose a move to allow reference types to be passed by reference as the receiver of an extension method. Not only would this be unexpected behavior of an instance method, but it is actually introducing a semantic concept which has never before existed in C# and provides no hint of the difference at the call site during code reviews.

@sharwell You make a pretty good point. The dangers probably outway the benefits for reference types.

Personaly I would probably only ever use it for internal helper methods because I am lazy and want to save key strokes from things like DisposerHelper.Release(ref disposable);. :smile:

@sharwell

I strongly oppose a move to allow reference types to be passed by reference as the receiver of an extension method

How would you achieve that when it comes to generics? A variable of type T might be a value type or it might be a reference type, so what should the compiler do in such case?

Here I'm simulating ref extension methods using normal static ref methods.

``` c#
static void Intermediary() where T : I, new()
{
T x = new T();
// imagine those are called as extension methods
ByReferenceExtension1(ref x);
ByReferenceExtension2(ref x);
}

// you want to allow this one for value types
static void ByReferenceExtension1(ref T x) where T : I
{
x.M1();
}

// but forbid this one for reference types
static void ByReferenceExtension2(ref T x) where T : I
{
x = x.M2();
}
```

Full code

@svick Since my goal is to allow a developer to write an extension method that both looks and behaves like an instance method, there is no need to allow a generic extension method to use a by-ref receiver. Instance methods of a type are always associated with a specific type.

It might be possible to allow a generic extension method to use a by-ref receiver as long as the generic type constraint included struct:

private static void ByReferenceExtension1<T>(ref T x)
  where T : struct, I<T>
{
  x.M1();
}

+1

I鈥檇 like to write an extension method to use on an enumeration, I don鈥檛 think I could get the effect I want without being able to use this ref T x.

VB allows extensions on byref receiver. That does allow some neat scenarios around structs. It also results in some difficult behaviors around generics. There are some features that do not compose well with that.
For example ?. conditional accesses combined with extensions on byref generic receivers lead to difficulties figuring whether sideeffects should be able to mutate receiver or not and whether we should nullcheck the reciever or capture it in a temp first. Combine that with async and you are in a serious trouble.

IMO. plainly allowing extensions with byref receivers would allow for too many complicated corner cases.

However, if receiver is required to be a struct, it might actually avoid most of the difficult scenarios while still being useful.

Interesting suggestion to think about.

@sharwell Mutable struct is not bad idea at all. Struct is meant to be mutated for performance. And we must mark it as readonly to force immutable, that was the purpose of readonly from the start

Surely readonly struct will not be able to call ref extension method. And it will compile error like you change variable of readonly struct object

Just think about Matrix.Rotate()

@Thaina

Surely readonly struct will not be able to call ref extension method. And it will compile error like you change variable of readonly struct object.

The readonly keyword would have no impact on the ability to call by-ref extension methods for value types. I'll demonstrate with each of the ways you can use readonly with value types:

Readonly field in a value type

public struct SomeStruct
{
  private readonly int value;

  public SomeStruct(int value) { this.value = value; }
}

The above struct is a pretty straightforward immutable struct. However, the following unit test will pass just fine.

[Fact]
public void TestMutability()
{
    SomeStruct someStruct = new SomeStruct(1);
    Assert.Equal(1, someStruct.Value);

    someStruct.SetValue(3);
    Assert.Equal(3, someStruct.Value);
}

public struct SomeStruct
{
    private readonly int value;

    public SomeStruct(int value)
    {
        this.value = value;
    }

    public int Value => value;

    public void SetValue(int value) { this = new SomeStruct(value); }
}

It would therefore make sense to be able to define the following extension method:

public void SetValueExtension(this ref SomeStruct obj, int value)
{
  obj = new SomeStruct(value);
}

With that extension in place, the test would work as follows:

[Fact]
public void TestMutability()
{
    SomeStruct someStruct = new SomeStruct(1);
    Assert.Equal(1, someStruct.Value);

    someStruct.SetValue(3);
    Assert.Equal(3, someStruct.Value);

    someStruct.SetValueExtension(4);
    Assert.Equal(4, someStruct.Value);
}

Readonly struct within another type

In this case, the struct does not have readonly fields, but the field holding the struct value is declared readonly. The following example shows another passing test, where the field normalValue of a struct type is not marked readonly, and the field readonlyValue of the same struct type is marked readonly.

[Fact]
public void TestOtherMutability()
{
    SomeClass obj = new SomeClass(1);
    Assert.Equal(1, obj.NormalValue);
    Assert.Equal(1, obj.ReadonlyValue);

    obj.NormalValue = 3;
    Assert.Equal(3, obj.NormalValue);

    obj.ReadonlyValue = 3;
    Assert.Equal(1, obj.ReadonlyValue);
}

public class SomeClass
{
    private SomeOtherStruct normalValue;
    private readonly SomeOtherStruct readonlyValue;

    public SomeClass(int value)
    {
        normalValue = new SomeOtherStruct(value);
        readonlyValue = new SomeOtherStruct(value);
    }

    public int NormalValue
    {
        get { return normalValue.Value; }
        set { normalValue.SetValue(value); }
    }

    public int ReadonlyValue
    {
        get { return readonlyValue.Value; }
        set { readonlyValue.SetValue(value); }
    }
}

public struct SomeOtherStruct
{
    private int value;

    public SomeOtherStruct(int value) { this.value = value; }

    public int Value => this.value;

    public void SetValue(int value) { this.value = value; }
}

The reason this behaves as it does is due to the way the compiler handles calls to SomeOtherStruct.SetValue (which takes the receiver by reference since it's an instance method). When calling the method with normalValue, it simply passes a reference to the normalValue field. However, when calling the method with readonlyValue it first copies the field value into a temporary local variable, and then it passes a reference to the temporary variable instead of the original field when calling SetValue.

Calls to by-ref extension methods would behave _exactly_ the same way.

@sharwell Well what I mean is to set readonly on the struct, not the internal variable of that struct

Also that why I was support another feature (#115) about readonly keyword at function parameter and local var. So we can pass struct by reference without mutate it if we could be used readonly at this for extension method

But still your case is fine by itself. The extension method, from the user perspective, they know what extension method they created
So to make

public void SetValueExtension(this ref SomeStruct obj, int value)
{
  obj = new SomeStruct(value);
}

is the same as

var someStruct = new SomeStruct(1);
someStruct = new SomeStruct(2);

There are also a proposal about extension constructor and extension method to dispose and null at the same time. So it is useful to allow mutate extension function like this

Remember C# is still object orient with functional programming feature. Not a pure functional programming language. So the mutation is useful in OOP and structural programming point of view. Sometimes we want object from one class to be managed by other class and just use the result. That's why we have Factory

So my point is, to call function of struct is already known that it could mutate that struct. And that is on caller's own risk. Extension method should be treated in the same manner as normal function, no exception. So if readonly variable in struct cannot protect it from function that mutate whole struct, so do extension method will not. And you need readonly (normal readonly / readonly parameter / readonly local var //// => #115 ) To protect your struct

What about changing where the this is for byref to reduce confusion e.g.

ByVal

public static void MyMethod(this int val) {...}
public static void MyMethod<T>(this T val) {...}
public static void MyMethod(this string val) {...}

val. used in method body

ByRef

public static void MyMethod(ref int this) {...}
public static void MyMethod<T>(ref T this) {...}
public static void MyMethod(ref string this) {...}

this. used in method body

Also work with ref returns for chaining/pipelining/tailcalls

public static ref int MyMethod(ref int this) {...}
public static ref T MyMethod<T>(ref T this) {...}
public static ref string MyMethod(ref string this) {...}

@benaadams How does that reduce confusion? I think it would increase it, because it makes the language less consistent. Especially since it would mean that this can now sometimes be null.

I can't find origin but I see this syntax in some roslyn issue

He propose it like this

```C#
public static class AnyExt
{
public void int.DoSomeThing() {...}
public T T.ReturnSelf() { return this; }
public ref T T.ReturnRefSelf() { return ref this; } // Always pass by ref
public static void int.DoSomeThingStatic() {...}

public T T.Swap<T>(ref T another)
{
    var tmp = another;
    another = this;
    this = tmp;
}

}

///
0.DoSomeThing();
var zero = 0.ReturnSelf();
ref var tmp = zero.ReturnRefSelf();
var one = 1;
zero.Swap(ref one);
int.DoSomeThingStatic();
```

I think this is interesting syntax

@benaadams That syntax is so satisfying, I wish we could change it for byval as well.

This is now tracked by https://github.com/dotnet/csharplang/issues/186. It is championed by @MadsTorgersen .

Was this page helpful?
0 / 5 - 0 ratings