(Note: this proposal was briefly discussed in #98, the C# design notes for Jan 21, 2015. It has not been updated based on the discussion that's already occurred on that thread.)
Since the first release of C#, the language has supported passing parameters by reference using the 'ref' keyword. This is built on top of direct support in the runtime for passing parameters by reference.
Interestingly, that support in the CLR is actually a more general mechanism for passing around safe references to heap memory and stack locations; that could be used to implement support for ref return values and ref locals, but C# historically has not provided any mechanism for doing this in safe code. Instead, developers that want to pass around structured blocks of memory are often forced to do so with pointers to pinned memory, which is both unsafe and often inefficient.
The language should support the ability to declare ref locals and ref return values. We could, for example, now declare a function like the following, which not only accepts 'ref' parameters but which also has a ref return value:
``` C#
public static ref TValue Choose<TValue>(
    Func<bool> condition, ref TValue left, ref TValue right)
{
    return condition() ? ref left : ref right;
}
```
With a method like that, one can now write code that passes two values by reference, with one of them being returned based on some condition:
``` C#
Matrix3D left = …, right = …;
Choose(chooser, ref left, ref right).M20 = 1.0;
```
Based on the function that gets passed in here, a reference to either 'left' or 'right' will be returned, and the M20 field of it will be set. Since we’re trading in references, the value contained in either 'left' or 'right' is updated, rather than a temporary copy being updated, and rather than needing to pass around big structures, necessitating big copies.
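For readers who want to try this out, here is a minimal sketch of the same idea. It assumes a compiler that supports ref returns and conditional ref expressions (C# 7.2 or later), where the syntax is `return ref (...)` rather than the form shown in this proposal. The two-field-free `Matrix3D` struct and the `RefChooseDemo` class are hypothetical stand-ins, not types from the proposal.

``` C#
using System;

// Hypothetical stand-in for the Matrix3D struct used above; only M20 matters here.
public struct Matrix3D
{
    public double M20;
}

public static class RefChooseDemo
{
    // Returns a reference to one of the caller's variables; no copy is made.
    public static ref TValue Choose<TValue>(
        Func<bool> condition, ref TValue left, ref TValue right)
    {
        // 'return ref' plus a parenthesized conditional ref expression.
        return ref (condition() ? ref left : ref right);
    }

    public static void Main()
    {
        Matrix3D left = new Matrix3D(), right = new Matrix3D();

        // The condition is false, so the reference to 'right' is returned and written.
        Choose(() => false, ref left, ref right).M20 = 1.0;

        Console.WriteLine(left.M20);  // 0
        Console.WriteLine(right.M20); // 1
    }
}
```

Because the call expression is itself a variable reference, the assignment to `M20` lands in the caller's `right`, not in a temporary.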
If we don't want the returned reference to be writable, we could apply 'readonly' just as we were able to do earlier with ‘ref’ on parameters (extending the proposal mentioned in #115 to also support return refs):
``` C#
public static readonly ref TValue Choose<TValue>(
    Func<bool> condition, ref TValue left, ref TValue right)
{
    return condition() ? ref left : ref right;
}
…
Matrix3D left = …, right = …;
Choose(chooser, ref left, ref right) = new Matrix3D(...); // Error: returned reference is read-only
```
Note that when referencing the 'left' and 'right' ref arguments in the Choose method’s implementation, we used the 'ref' keyword. This would be required by the language, just as it’s required to use the ‘ref’ keyword when passing a value to a 'ref' parameter.
## Solution: ref locals
Once you have the ability to receive 'ref' parameters and to return ‘ref’ return values, it’s very handy to be able to define 'ref' locals as well. A 'ref' local can be set to anything that’s safe to return as a 'ref' return, which includes references to variables on the heap, 'ref' parameters, 'ref' values returned from a call to another method where all 'ref' arguments to that method were safe to return, and other 'ref' locals.
``` C#
public static ref int Max(ref int first, ref int second, ref int third)
{
    ref int max = first > second ? ref first : ref second;
    return max > third ? ref max : ref third;
}
…
int a = 1, b = 2, c = 3;
Max(ref a, ref b, ref c) = 4;
Debug.Assert(a == 1); // true
Debug.Assert(b == 2); // true
Debug.Assert(c == 4); // true
```
We could also use ‘readonly’ with ref on locals (again, see #115), to ensure that the ref variables don’t change. This would work not only with ref parameters, but also with ref locals and ref returns:
``` C#
public static readonly ref int Max(
    readonly ref int first, readonly ref int second, readonly ref int third)
{
    readonly ref int max = first > second ? ref first : ref second;
    return max > third ? ref max : ref third;
}
```
If I recall, Eric Lippert blogged about this some years back and the response in the comments was largely negative.
I do not like this feature for C#. The resulting code is like an uglier version of C++, and code written with it takes longer to reason about and understand. The use-cases are not particularly compelling, and I have never run into a situation where I wished I had `ref` locals or return values.
Yes, I know very well that mutable structs should be avoided. Still, one interesting use case would be lists of mutable structs. Consider:
``` C#
struct MutableStruct { public int X { get; set; } }
MutableStruct[] a = ...
List<MutableStruct> l = ...
a[3].X = 5; // changes the value of X of the struct in the array
l[3].X = 5; // compile time error
```
If the indexer of the `List<T>` class would return the value stored in the list by reference, the code above would compile, making the use of mutable structs less surprising. It is probably even more efficient as the (potentially large) struct no longer has to be copied out from the list.
Unfortunately, I doubt that the return type of `List<T>`'s indexer can be changed for backwards compatibility reasons.
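To make the difference concrete, here is a rough sketch of a collection whose indexer returns by reference. `RefList<T>` is a hypothetical type invented for this illustration (it is not part of the BCL), and the code assumes a compiler with ref returns (C# 7.0 or later).

``` C#
using System;
using System.Collections.Generic;

public struct MutableStruct
{
    public int X { get; set; }
}

// Hypothetical minimal collection (not a BCL type) whose indexer returns by reference.
public sealed class RefList<T>
{
    private T[] _items = new T[4];
    private int _count;

    public int Count => _count;

    public void Add(T item)
    {
        if (_count == _items.Length)
            Array.Resize(ref _items, _items.Length * 2);
        _items[_count++] = item;
    }

    // Ref-returning indexer: callers get the storage location itself, not a copy.
    public ref T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_count)
                throw new ArgumentOutOfRangeException(nameof(index));
            return ref _items[index];
        }
    }
}

public static class RefListDemo
{
    public static void Main()
    {
        var list = new List<MutableStruct> { new MutableStruct() };
        // list[0].X = 5;             // compile-time error: list[0] is a copy, not a variable
        MutableStruct copy = list[0];
        copy.X = 5;                   // modifies only the local copy
        Console.WriteLine(list[0].X); // 0

        var refList = new RefList<MutableStruct>();
        refList.Add(new MutableStruct());
        refList[0].X = 5;                // writes into the backing array
        Console.WriteLine(refList[0].X); // 5
    }
}
```

One caveat with this design: a reference obtained from the indexer keeps pointing at the old backing array if a later `Add` triggers a resize, so callers should not hold such references across mutations of the collection.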
Disclaimer: I work on a game engine, so I am probably not the typical user.
One use case where this could really help us is this one:
``` C#
MyHugeStruct[] data; // we use a struct to improve data locality and reduce GC pressure
// Ideally, we would like to be able to use List<T>, but then we can't take a ref
for (int i = 0; i < data.Length; ++i)
{
    // Option 1: make a local copy (slow)
    var item = data[i];
    // Option 2: to avoid making a stack copy of MyHugeStruct,
    // we have to defer to an inner loop function
    MyLoopBody(ref data[i]);
    // Option 3: using the new proposal, this would be much better:
    ref MyHugeStruct itemRef = ref data[i];
}
```
We end up making a separate function for the loop body, and in the case of a tight loop this can end up being quite bad:
Nice to have:
Extra (probably impossible without changing BCL):
What happens with this?
``` C#
var data = GetData();
...
ref SomeStruct GetData()
{
    var ss1 = new SomeStruct();
    var ss2 = new SomeStruct();
    return ref Choose(ref ss1, ref ss2);
}
ref SomeStruct Choose(ref SomeStruct ss1, ref SomeStruct ss2)
{
    return whatever ? ref ss1 : ref ss2;
}
```
`GetData` might not be aware that `Choose` is returning one of its variables and returns to the caller a reference to it.
Does the value still exist after exiting `GetData`?
@paulomorgado You would not be allowed to return a ref to a local variable or parameter.
@gafter, the only difference between my `Choose` method and @stephentoub's one is that mine does not have the selector passed as a delegate. Did I miss something here?
@paulomorgado, the compiler would only let you return a ref to something that it knew was either on the heap or that came from the caller. In my example, the ref inputs to the Choose method were all from ref parameters (or ref locals to ref parameters), so the compiler would conclude that the result of the Choose method met the criteria and would allow its returned ref to be returned. But in your example, the refs passed to Choose were not from the caller nor from the heap, such that the compiler couldn't be sure that the result of Choose was allowed to be returned, and it would error out.
@stephentoub, forget my `Choose` method. Yours is the best that can be done, and you just published it to NuGet and I added it to my project. How can the compiler know where the return value of `Choose` is coming from? My `GetData` is just complying with the contract of `Choose` to get its result and pass it along, as all the code written so far and to be written in the future does.
What you're saying is that publicly exposed methods can't return `ref`s, which reduces the usage to only private methods.
@paulomorgado, I understand the confusion, but that's not what I'm saying.
There would be some rules about what it would be safe to return, e.g.
Forget the implementation of Choose here. Assuming Choose abides by these rules (which the compilation of Choose would enforce), in my example all of the inputs to Choose were valid to be returned, therefore the result of Choose could be returned. In your example, at least one of the inputs to Choose wasn't valid to be returned, therefore the result of Choose could not be returned. The compiler can validate that.
@stephentoub, what I'm having trouble with is understanding how those rules can be effectively enforced.
And a proposal should have an example that works under the proposal.
@paulomorgado, how does my example not work under the proposal? And why do you believe the rules can't be enforced?
@stephentoub, either that or I totally missed everything.
My understanding is that there's no way the caller can take the result of your `Choose` method as safe to return as reference. Is there? If so, how?
@paulomorgado, in this example:
``` C#
public static ref TValue Choose<TValue>(
    Func<bool> condition, ref TValue left, ref TValue right)
{
    return condition() ? ref left : ref right;
}
```
left and right are both safe to return because they came from the caller.
In this example:
``` C#
public static ref int Max(ref int first, ref int second, ref int third)
{
    ref int max = first > second ? ref first : ref second;
    return max > third ? ref max : ref third;
}
```
first, second, and third are all safe to return because they all came from the caller. max is safe to return because the only refs it's possibly assigned to are those which are safe to return.
If I as a caller wanted to use Choose, e.g.
``` C#
public static ref TValue ChooseByTime<TValue>(
    ref TValue left, ref TValue right)
{
    return Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
}
```
Both left and right are safe to return because they came from the caller. Therefore all of the ref inputs to Choose are safe to return. Therefore the resulting ref from Choose is also safe to return. I don't need to worry about the implementation of Choose, because the compiler is enforcing all of these same rules on the implementation of Choose.
> Both left and right are safe to return because they came from the caller. Therefore all of the ref inputs to Choose are safe to return. Therefore the resulting ref from Choose is also safe to return. I don't need to worry about the implementation of Choose, because the compiler is enforcing all of these same rules on the implementation of Choose.

But ChooseByTime is returning neither left nor right. It's returning the return value of Choose. Nothing but the implementation details of Choose is saying its return value is the same as one of its parameters. What if Choose is an implementation of an interface?
You're restricting the use of Choose to cases where it works without any safeguards or proof that it's safe.
My example shows the opposite.
@paulomorgado, your example wouldn't compile... the compiler would error out exactly because it doesn't abide by the rules: your call to Choose is passed ref values that are not safe to return, therefore the result of your call to Choose is not safe to return. I'm sorry if I'm not explaining this well; not sure how to convey it differently.
> Nothing but the implementation details of Choose is saying its return value is the same as one of its parameters.

Ah, maybe this is the point of confusion. The implementation doesn't matter because the compiler assumes the worst: regardless of how a parameter is actually used, if any argument isn't safe to return, then the result of the call isn't safe to return. The compiler is conservative in that regard.
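As a rough illustration of that conservative rule, the sketch below uses the ref-return syntax of C# 7.0 and later. The names `SafeToReturnDemo`, `ChooseFirst` and `Broken` are made up for this example, and `Choose` takes a plain `bool` instead of a delegate to keep it short.

``` C#
using System;

public static class SafeToReturnDemo
{
    public static ref T Choose<T>(bool first, ref T left, ref T right)
    {
        if (first) return ref left;
        return ref right;
    }

    // Compiles: both ref arguments came from this method's caller,
    // so whatever Choose returns is safe to return as well.
    public static ref T ChooseFirst<T>(ref T left, ref T right)
    {
        return ref Choose(true, ref left, ref right);
    }

    // Rejected by the compiler if uncommented: 'local' dies when Broken returns,
    // and the compiler assumes the call might return a reference to it,
    // regardless of what Choose actually does with that argument.
    // public static ref T Broken<T>(ref T left)
    // {
    //     T local = default(T);
    //     return ref Choose(true, ref left, ref local);
    // }

    public static void Main()
    {
        int a = 1, b = 2;
        ChooseFirst(ref a, ref b) = 10;
        Console.WriteLine(a); // 10
        Console.WriteLine(b); // 2
    }
}
```

The check only looks at the arguments at the call site, which is why `ChooseFirst` never needs to know how `Choose` is implemented.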
A conservative compiler that assumes the worst cannot assume the return value of Choose is safe to return.
Is this what you're proposing?
``` C#
public static ref TValue ChooseByTime<TValue>(
    ref TValue left, ref TValue right)
{
    TValue result = Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
    if (result == left) return ref left;
    else if (result == right) return ref right;
    else throw new Exception("Invalid value.");
}
```
Why do you say that? What specifically about this example do you believe is problematic?
Let's try something else: can you construct an implementation of Choose that will compile based on the aforementioned rules/explanations but where the caller of the method could not assume its return value was safe to return?
No I can't. Because I haven't been able to understand how this would work.
I can understand how, in your implementation of Choose, it is safe to return that reference.
What I can't understand is why its callers can safely return the same reference without intimately knowing its internals.
Because it wouldn't be allowed to return anything that's not safe in the case where the caller assumes it is safe. If the only thing the caller passes in are refs that are safe to return, then what could this method return?
Etc.
So, this wouldn't be safe, right?
``` C#
public static ref TValue ChooseByTime<TValue>(
    ref TValue left)
{
    ref TValue right = default(TValue);
    return Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
}
```
Correct, that would not compile.
Beautiful solution, I've wondered why this couldn't be done before.
@MgSam [The resulting code is like an uglier version of C++]
Because of sentiments like this (i.e. 'anything I don't personally use should never be part of the language _for anybody else either_, even though the CLR itself has this capability'), it means our language is needlessly crippled in places where a very easy and beautiful solution like this gives us such a capability. As the gamer showed in the comment above, this can be a big performance win in some cases.
:+1:
Anytime I can pass a pointer instead of performing a value copy, I'm all for it. Are there good reasons to pass memory by value-copy? Yes. Should it always be the case? Absolutely not.
> The resulting code is like an uglier version of C++

I agree, it is not pretty but it is very descriptive. It would be nice if the `ref` keyword could be replaced with syntax we're all used to. Perhaps we could use `*` in place of `ref` because `int* foo;` is "cleaner" and "easier" to read than `ref int foo;`. I put "cleaner" and "easier" in quotes because it is incredibly subjective.
Yes, I know that `*` is generally reserved for `unsafe` but there's no reason the symbol cannot be reused, so long as one is reserved for a "safe" context and the other for an "unsafe" context.
Given the limitations listed above imposed to maintain a safe context I'm having a hard time envisioning the use cases for this feature. The real gains would seem to be in how structs can be used throughout the BCL with arrays, lists or other collection types.
> Given the limitations listed above imposed to maintain a safe context I'm having a hard time envisioning the use cases for this feature. The real gains would seem to be in how structs can be used throughout the BCL with arrays, lists or other collection types.

Agreed. This is, in my opinion, a small step in the right direction though.
Would this implementation allow for `ref int[] intRefs = new ref int[512];`?
If it doesn't, then I am less excited than I originally was. If it does, reading `ref struct[]` is difficult. Is it a reference to an array of structures or an array of structure references?
Better to use `struct*[]` in my opinion.
I don't disagree that `ref something` is unattractive, however your use of `*` is already legal C# syntax and implies an unsafe context. I'm sure that you know that, but I thought it warranted mention.
I imagine that the array scenario would likely depend on the proposal for fixed-size buffer enhancements, #126. Once the size is determined and allocated I believe that would behave the same as a field or as a local.
> I don't disagree that ref something is unattractive, however your use of * is already legal C# syntax and implies an unsafe context. I'm sure that you know that, but I thought it warranted mention.

I do. I also know that `*` is only legal within an `unsafe` block. Thus, the compiler could assume that `*` needed to be "safe" unless in an `unsafe` block. Therefore operations like `int* p = ...; p++;` would not be legal; instead `int* p` would have to point to safely referenced memory.
Yes, there would be complexities if devs started an `unsafe` block, but rules can be established on how this would work, etc.
FYI: the PR for the initial commit of a prototype #4042
I support ref return
And can we have ref parameter in lambda?
@Thaina You can use `ref` parameters in lambdas today as long as the signature of the target delegate defines those parameters as `ref`:
``` C#
public delegate void RefAction<T>(ref T arg);
// The parameter type must be written out when a modifier such as 'ref' is used.
RefAction<string> action = (ref string value) => { value = "Hello World!"; };
string x = "";
action(ref x);
Console.WriteLine(x);
```
@HaloFour: What @Thaina probably means is that you can't capture a ref-parameter in a lambda.
@HaloFour Sorry, I didn't know that. In which version can we use ref lambdas?
I have been using Unity for such a long time that I haven't kept up with new C# features much.
@axel-habermaier Maybe. That wouldn't be my first guess given the proposal they posted under, but it is terribly unspecific. IIRC `ref` parameter capture would be wading too close to `unsafe` territory since you'd basically have to stuff the address of a variable in the state machine class and the compiler could no longer control its lifetime.
@Thaina C# has always supported `ref` and `out` parameters for anonymous delegates and lambdas.
Oh... I never knew that we just can't write `(ref i) => {}`; I just need to write `(ref int i) => {}`.
Thanks for pointing that out.
Sorry for the necropost, but I have a question. I found that ref properties will be supported, but only for getters. Why couldn't this be resolved for setters too? I mean, if we have
``` C#
class Foo{
    public ref int Number{
        get;
        set;
    }
}
```
it could be resolved to `public ref int get_Number(){...}` and `public void set_Number(ref int){...}`.
And if there is a reason for abandoning setters, why not do it like this:
``` C#
class Foo{
    public string Description{
        ref get;
        set;
    }
}
```
so that we would still be able to have a setter and a getter in one property (or is this already the case?)
@BreyerW the main reason for disallowing setters in byref properties and indexers is that they would not be very useful. While you can make a ref for a field or an array element and return that from the getter, you cannot go the other way in the setter.
If some use pattern is discovered, restriction on byref setters can be relaxed later, so it was decided to start with not allowing them.
Thanks for the reply. I wonder: isn't avoiding copies of value types when passing them to a setter method a good thing? And next: is allowing a non-ref setter alongside a ref getter impossible, like I showed in the second example?
EDIT:
Ah, and if there isn't any dangerous situation with a ref setter, I don't see why we have to be so strict about this. If someone finds a pattern for ref setters, then you won't have to cook up a special C# version in the future; it would already be enabled. And ref could be defined per accessor, not per property. Obviously you are the designers, not I, so possibly there is something subtle I don't know ;).
@BreyerW An important part here is that byref properties and indexers have assignable getters.
If a type has a byref indexer, you already can read and write elements without redundant copying.
What would be the "obvious purpose" of a setter if property is already assignable via its getter?
In particular byval setter next to a byref getter would actually make assignments ambiguous.
``` C#
obj.Description = "aaa";
```
Is this an assignment to the getter or an invocation of the setter?
There are short and long term costs of adding language features and it is next to impossible to remove them. That motivates the design team to resist features with unclear utility or confusing behavior.
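A small sketch of the point about assignable getters, assuming a compiler with ref returns (C# 7.0 or later). The `Foo` class and its `_number` field are illustrative only.

``` C#
using System;

public class Foo
{
    private int _number;

    // Getter-only property that returns by reference; no setter is declared.
    public ref int Number => ref _number;
}

public static class RefPropertyDemo
{
    public static void Main()
    {
        var foo = new Foo();
        foo.Number = 42;               // assigns through the returned reference
        foo.Number++;                  // read-modify-write on the same field
        Console.WriteLine(foo.Number); // 43
    }
}
```

The assignment compiles even though there is no `set` accessor, which is why a separate by-value setter on the same property would make `foo.Number = 42` ambiguous.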
Oh, I think I understand now why you abandoned the setter: one of the reasons is that a ref getter can work like a setter thanks to returning a ref, so there is no point in having a setter? If that's true, then I completely see why you abandoned this. Thanks for the clarification; now I feel a bit dumb, I obviously overlooked that.
The only thing that might be missing is that with a ref getter there might be a problem with firing events like OnBeforeValueChange, but this is a feature of ref itself, not a flaw of the C# design.
BTW, don't forget to update PropertyInfo somehow so that CanWrite returns true if there is only a ref getter, or add a new property that signals there is a ref getter. I mention this because I use this class.
Nothing has been mentioned about `foreach`. It would be nice to be able to write `foreach(ref Struct item in arr)`.
@alrz Your suggestion would be impossible with the current foreach implementation. foreach uses the IEnumerable interface to return Current, which is not returned by ref.
It needs to work the opposite way. We should have an IEnumerableByRef to override Current, and let foreach check whether the collection is an IEnumerableByRef; if so, it would return the item as a ref automatically.
Or maybe the ref keyword should be enabled in generics, so that we could use IEnumerable<ref Struct>.
It would be best if MS made everything that implements IEnumerable (everything in System.Collections) also implement IEnumerableByRef once the feature is finished.
@Thaina How about this?
When iterating over an array (known at compile-time) the compiler can use a loop counter and compare with the length of the array instead of using an IEnumerator
@alrz Only arrays are possible with that kind of foreach, and I think it should not be a different workflow. Instead, arrays should implement IEnumerableByRef if C# gets one.
It can simply be allowed only for arrays, and then translated to `for( .. ) { ref T item = arr[i]; ... }`. I don't think that something like `<ref Struct>` would be possible, because it ultimately lets the reference outlive the local object, which is not supported by the CLR, AFAIK.
@alrz I apologize, but I am very much against the idea of making arrays a special thing again. Actually, I am against the idea of making anything a special case. We have had this special problem from the start, that only arrays have an indexer that returns by ref, and now that we are trying to fix it, everything should have an indexer that returns by ref as arrays do.
Yeah, I think `<ref Struct>` is overkill too. Just IEnumerableByRef is enough.
@Thaina I think nothing's wrong with special cases. `foreach` already has, though unobservable, a special case for arrays to make it faster, and `ref` locals also help to make things faster (avoid copying), so combining these two in a use case like this would be nice.
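For comparison, here is a sketch of how by-reference iteration can be written, assuming C# 7.3 or later and a runtime that provides `Span<T>` (for example .NET Core 2.1+). There, `foreach (ref ...)` is accepted over a `Span<T>` because its enumerator's `Current` returns a reference; iterating an array directly by ref is not accepted, so the array is wrapped with `AsSpan()`. The `MyHugeStruct` fields are invented for the example.

``` C#
using System;

public struct MyHugeStruct
{
    public double A, B, C, D; // hypothetical payload to make the struct "large"
    public int Hits;
}

public static class RefLoopDemo
{
    public static void Main()
    {
        var data = new MyHugeStruct[3];

        // Ref local in an ordinary for loop: no copy of the element is made.
        for (int i = 0; i < data.Length; ++i)
        {
            ref MyHugeStruct item = ref data[i];
            item.Hits = i;
        }

        // foreach by reference over a Span<T> view of the same array.
        foreach (ref MyHugeStruct item in data.AsSpan())
        {
            item.Hits += 10;
        }

        Console.WriteLine(data[2].Hits); // 12
    }
}
```

Both loops mutate the array elements in place, so neither needs a separate loop-body method or a copy of the struct.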
:+1:
GameDev often involves initializing large, structured data to some default values. Therefore, I suggest the following pattern for consideration:
``` C#
ref largeData[i] = initData;
```
This avoids the otherwise needless (in this scenario):
``` C#
ref data = largeData[i];
data = initData;
```
Granted, it only saves an extra line. But that extra line really doesn't have a reason to exist, and being able to assign directly to a reference to a location in an array, list or etc would bring it in line with assigning to a value location in an array and so on.
If it's too much work, it can certainly wait, of course; it's far from critical.
@MouseProducedGames I don't understand, how is that different from just `largeData[i] = initData;`?
@svick: It's an assignment to a ref, not a copy-and-assign to value.
@MouseProducedGames Are you saying that `largeData` is an array of `ref`s? Because I don't think that's proposed here, and I believe the CLR can't support that.
If it's not an array of `ref`s, just a normal array of `struct`s, then you do need to copy the data.
@svick You're missing the point, and misunderstanding it.
This:
``` C#
ref data = largeData[i];
data = initData;
```
Is getting a reference to a location in an array and assigning data to that reference.
Likewise, this:
``` C#
ref largeData[i] = initData;
```
Is also getting a reference to a location in an array and assigning data to that reference. Just without one extra, and unnecessary, line.
The bottom example is simply replicating the code in the top example, only without the need for an explicit ref variable.
@MouseProducedGames
That wouldn't be possible unless the array was itself of `ref`s or pointers. You can't avoid the copy, so `largeData[i] = initData` is as good as it gets.
@HaloFour Would it help if I gave you the equivalent C++ syntax?
Here:
(&values[i]) = initData;
I guarantee, this works exactly as I've actually stated.
And no, you don't need an array of refs to get a reference to a location in an array.
...Oh, I see the problem. You've latched on to this statement: "It's an assignment to a ref, not a copy-and-assign to value."
Look, that was in response to this statement: "@MouseProducedGames I don't understand, how is that different from just largeData[i] = initData;?"
And assigning to an indexer often has an extra copy into the setter of the indexer. Since you can ref return from a "this[index]" indexer, this avoids copying into the setter of the indexer.
You instead deal with the ref directly.
Does that help?
@MouseProducedGames So it's not about arrays, only about custom collections? For those, I believe that just having a `ref`-returning getter (and no setter) would achieve what you want, no need for additional syntax.
@MouseProducedGames
Ah, so `largeData` is some type with a custom indexer property, not an array? That's where I think we were getting it confused.
While @svick does mention using a "readonly" property with a `ref` return as the type, I am curious if C# would allow treating that as if it were a settable property. Otherwise you would probably need to assign the `ref` to a local before you then assigned the value to the address stored in that local. And you're right that this is what the C# compiler would have to emit anyway.
@svick There's two threads of conversation here. One is about shortening this:
``` C#
ref data = largeData[i];
// Do something with data.
```
to this:
``` C#
ref largeData[i] // Do something with data without the need to copy to an explicit variable, when you don't need an explicit variable;
```
The other is clarifying that assigning to ref can save you a copy to setter, to explain one reason why you'd want to operate on the ref directly.
Edit: Sorry for the confusion.
Another reason for using a ref directly would be something like this:
``` C#
ref data = largeData[i];
data.oneFloat = 5f;
```
Which is a bit needlessly long-winded compared to:
``` C#
ref largeData[i].oneFloat = 5f;
```
@HaloFour The original proposal above seems to imply this would be allowed (in the context of methods, but I assume it would work on indexers too).
@MouseProducedGames I still think you don't need additional syntax for that. If `largeData[i]` is a `ref`-returning indexer with only a getter, then `largeData[i] = someValue;` and `largeData[i].oneFloat = 5f;` will do what you want. There is no need for the `ref largeData[i]` syntax.
@svick: Ah, I misunderstood your point. I thought you were talking about returning a copy from a getter, which left me quite confused.
However, if there's both a copy getter and a ref getter (for example, if the implementer wanted to make the usage explicit) then `ref values[i]` could make it explicit which one you're using. Other than that, though, I think you're right on that one, given that clarification.
@svick Final update, I think. Looks like the devs came to the same syntax you did; I think I was tripped up by trying to apply the same sort of syntax as I would in C++.
Namely:
``` C#
using System;
namespace TossCSharp7b
{
    class Program
    {
        static void Main(string[] args)
        {
            LargeData[] largeData = new LargeData[3];
            largeData[1].floatValue = 5f;
            foreach (var data in largeData)
                Console.WriteLine(data);
            Console.ReadKey(true);
        }
    }
    struct LargeData
    {
        public double doubleValue;
        public int intValue;
        public float floatValue;
        public char charValue;
        public bool boolValue;
        public override string ToString()
        {
            return string.Format("double: {0}, int: {1}, float: {2}, char: {3}, bool: {4}", doubleValue, intValue, floatValue, charValue, boolValue);
        }
    }
}
```
Output:
```
double: 0, int: 0, float: 0, char: , bool: False
double: 0, int: 0, float: 5, char: , bool: False
double: 0, int: 0, float: 0, char: , bool: False
```
Overall, perhaps allowing a "ref" qualifier, even if pointless, might ease a transition from C++; OTOH, the above format is exactly what I tried to do with array variables when I first learned C# with, IIRC, either v1.1 or v2. Probably v1.1 on XP.
A ref-getter is a very welcome feature. It will help things like array slices work exactly like real arrays. Of course, something like `IRefList<T>` will be required to represent this behaviour.
Why doesn't the code example work in the VS '15' preview? It looks like the ternary operator doesn't support it. I'm also interested in how to reassign a reference variable (ref local), as in the following code, which doesn't compile:
``` C#
public static ref int Max(ref int first, ref int second, ref int third) {
    ref int max = ref first;
    if (second > max) {
        ref max = ref second;
    }
    if (third > max) {
        ref max = ref third;
    }
    return ref max;
}
```
Seeing as readonly refs are basically controlled-immutability managed pointers, the following construct may be possible:
``` C#
public static readonly ref T Unbox<T>(object inst) where T : struct
{
    return ref (T)inst;
}
```
Returning an object interior reference.
Also reference list and enumerables would be a nice addition.
@fsacer ref reassignments are not allowed in the current implementation. The main reasons for that are:
ref max = ref second;
or
max = ref second;
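For reference, the second spelling is the one that later compilers accept: from C# 7.3 onward a ref local can be reassigned with `max = ref second;`. A minimal sketch of the `Max` example in that form, assuming such a compiler (the `RefReassignDemo` class name is made up):

``` C#
using System;

public static class RefReassignDemo
{
    public static ref int Max(ref int first, ref int second, ref int third)
    {
        ref int max = ref first;
        if (second > max)
        {
            max = ref second; // ref reassignment: 'max' now aliases 'second'
        }
        if (third > max)
        {
            max = ref third;  // 'max' now aliases 'third'
        }
        return ref max;
    }

    public static void Main()
    {
        int a = 1, b = 2, c = 3;
        Max(ref a, ref b, ref c) = 4;
        Console.WriteLine($"{a} {b} {c}"); // 1 2 4
    }
}
```

Note that plain `max = second;` (without `ref`) would instead copy the value of `second` into whatever `max` currently refers to, so the two forms do very different things.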
How does this impact platform invocation, if at all? For example, could I do the following?
``` C#
[DllImport("foo.dll")]
static extern ref int Foo();
```
With the expectation that this is a "safe" way of doing `static extern unsafe int* Foo()`?
That's not safer in any way. By choosing "ref int" instead of "int*" you will allow the GC to know of the pointer, but it still wouldn't be allowed to do anything about it, due to it being located outside the managed memory (assuming the pointer was created there in the first place).
On the other hand, how would reflection handle ref returns? Maybe using boxable TypedReferences? ☺
@IllidanS4
> On the other hand, how would reflection handle ref returns?

Don't have the bits in front of me to test, but I'll assume that it's the same as `ref` parameters. The `Type` of the return parameter would be a byref type, meaning `Type.IsByRef` would return `true` and `Type.GetElementType()` would return `System.Int32`. The CLR has always supported declaring variables and the like as byref types, it's just never been exposed to C#.
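That expectation can be checked with a few lines of reflection, assuming a compiler with ref returns (C# 7.0 or later). The `RefReflectionDemo` class and its `GetValue` method are hypothetical names for this sketch, which only inspects the return type and does not invoke the method.

``` C#
using System;
using System.Reflection;

public static class RefReflectionDemo
{
    private static int _value = 7;

    // A ref-returning method to inspect.
    public static ref int GetValue()
    {
        return ref _value;
    }

    public static void Main()
    {
        MethodInfo method = typeof(RefReflectionDemo).GetMethod(nameof(GetValue));
        Type returnType = method.ReturnType;

        Console.WriteLine(returnType.IsByRef);                         // True
        Console.WriteLine(returnType.GetElementType() == typeof(int)); // True
    }
}
```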
@HaloFour Of course the return type could be represented, but what if I wanted to invoke the method dynamically? Would it simply return the value in the variable?
@IllidanS4
> Of course the return type could be represented, but what if I wanted to invoke the method dynamically? Would it simply return the value in the variable?

Looks like the CLR doesn't support it, at least with .NET 4.6.1:
```
System.NotSupportedException was unhandled:
ByRef return value not supported in reflection invocation.
```
The only way that I could imagine the CLR supporting this would be to automatically dereference the ref and then box the result.
@stephentoub In your original example with Choose function, how would I get to know _which_ Matrix3D object had been changed?
As previously said, ref properties won't have a setter because the ref getter already does this (and a pure ref property with a setter might lead to some subtle bugs), which might be slightly misleading at first glance.
In this case I'm wondering: wouldn't a readonly (implicitly or explicitly) ref getter with a normal (or ref, if possible) setter be better (or at least making the readonly ref getter optional, which would enable a normal/ref setter)?
If I understand correctly, readonly should ensure that there will be no subtle bugs, because you won't be able to modify the ref on get. When the compiler sees that you are trying to modify the property, it resolves to the setter if present and possible in the current context, and otherwise emits an error (since the getter will be readonly, there shouldn't be any question about what we want to call).
There was an example showing that a byref getter with a byval/byref setter might be ambiguous, but with a readonly getter this problem probably disappears:
``` C#
class Obj{
    private string description="test";
    public ref string Description{
        readonly get{ return ref description;}
        set{
            OnPropertyChange(this, nameof(description), value);
            description=value;
        }
    }
}
var obj=new Obj();
obj.Description = "aaa"; //since getter is readonly ref we cannot use getter we have to resolve to setter
```
I mention this because sometimes there is a need to avoid copying a value field (and instead get a ref to it) AND to be able to fire an event on property change reliably.
Please never allow foo.GetByRef("x") = 42;
This code reads that you're somehow assigning 42 as if GetByRef was a Func being assigned (s) => 42
The strange unicode quotes aren't valid syntax, but allowing ref returns to be lvalues is kind of exactly the point here. From the first comment:
int a = 1, b = 2, c = 3;
Max(ref a, ref b, ref c) = 4;
Debug.Assert(a == 1); // true
Debug.Assert(b == 2); // true
Debug.Assert(c == 4); // true
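
`Max` itself is not shown in this excerpt. As a minimal sketch of how such a method could look with ref returns (the class name and parameter names are assumptions made for illustration):

``` C#
using System;

static class RefMaxDemo
{
    // Returns a reference to whichever argument holds the largest value.
    public static ref int Max(ref int first, ref int second, ref int third)
    {
        if (first >= second && first >= third) return ref first;
        if (second >= third) return ref second;
        return ref third;
    }

    static void Main()
    {
        int a = 1, b = 2, c = 3;
        Max(ref a, ref b, ref c) = 4;      // assigns through the returned reference
        Console.WriteLine($"{a} {b} {c}"); // prints "1 2 4"
    }
}
```

The assignment lands in `c` because the method hands back an alias to the caller's variable rather than a copy of its value.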
If I were writing, say, a video game, or some other set of algorithms that required me to go out of my way to avoid GC (such as a web server), being able to write code like this can be very useful.
We aren't looking to make this kind of thing pretty, merely for it to perform better than what you currently have to do (either add another method overload to do something and the IL instructions to add the new parameters, or mess around with pointers in C# unsafe code, or write the CIL yourself without the help of the C# language).
I think some people here are too enthusiastic about immutability. But that defeats the purpose of this request in the first place, which is performance when controlling the value of a struct (performance is the reason we have structs in the first place).
If you want your struct to be immutable, you should explicitly use readonly or const.
Also, I would suggest that in addition to ref returns and ref locals, it would be useful to have const returns, const locals, and const parameters:
``` C#
int[] numbers = new int[]{0,1,2};
public const int ConstValue(const int index)
{
    return const numbers[index];
}
public ref int RefValue(const int index)
{
    return ref numbers[index];
}
```
A const parameter would be passed by ref, but no code could be written to modify it, which is useful if we have a large immutable struct.
@Thaina
What's the point of passing `int` by `const` reference? Or is this intended to be used for large `struct`s?
Also, how would this be represented in IL? AFAIK, CLR has no support for constant managed references, so this would have to be enforced by the C# compiler, which is not ideal.
@svick Yes, it is intended to be used on larger structs (Matrix4x4, for example). I just used `int` as a sample.
And I think it is fine to enforce this only in the compiler if we can't add any more features to the CLR. But ideally it would be best if it were added to the CLR.
Will this allow ugliness like
ref (ref (ref xx)) = GetData();
Also, I'm not sure I like syntax where ref is evidently used but is not explicitly visible in the immediate code, like the previously mentioned
foo.GetByRef("x") = 42
Would it be possible to have some ref there as well?
Or is there some way to use something like 'as ref', e.g.
ref int myInteger = 45 as ref;
// =
var myInteger = 45 as ref;
similarly
var data = GetData() as ref;
data[4] = 5;
As far as mutable structs are concerned: They are the ONLY way to get large mutable arrays with good locality, period. When it comes to performance, locality of reference often dominates all else.
Just curious: would it be possible for Task and async/await to return a ref once the task completes?
@Thaina The CLR doesn't permit using byref types as generic type arguments. Attempts to use a type like `Task<ref int>` would produce a `BadImageFormatException`.
@HaloFour Could we solve the problem by introducing `TaskRef`?
@Thaina Maybe. Given the limitations enforced by the compiler as to what you can take a `ref` to, I kind of doubt that there would be a safe way to implement an `async` method that returns a `ref`, though.
A ref result on a tasklike would not fit the pattern for `await`. See section 7.7.7.3 of the specification (I suppose it could be changed):
• Either immediately after (if `b` was `true`), or upon later invocation of the resumption delegate (if `b` was `false`), the expression `(a).GetResult()` is evaluated. If it returns a value, that value is the result of the _await-expression_. Otherwise the result is nothing.
I am not sure what a ref tasklike would accomplish. The resumption delegate is evaluated with a different callstack so the ref must already be in the heap somewhere, why not return the instance of whatever object is holding the ref?
edit: fooling around on Try Roslyn it seems apparent that a ref tasklike would make a copy of the value anyway to store it in a backing location (which seems obvious to me but perhaps I'm missing something).
edit 2: I don't think that code should emit CS0649. Does it still do so in a newer build and/or is there an issue entered?
@bbarry Because what we want to return may be a value type.
The async method may do the work of loading and initializing a large struct, and after it finishes it will just return that struct. So the struct may live in something on the heap, but we don't want to give the client access to that class. We want to return the value, but by reference for performance reasons.
``` C#
struct UserData { /* Very Large detailed information */ }
class UserManager
{
UserData[] array;
public static async ref UserData LoadOrGet(int n)
{
// First find value in array, return ref array[n] if exist
// await for load from server, set to array[n + 1] and return after loaded
}
}
```
I see I was thinking about the problem wrong.
@Thaina, you could return the index to the array or some other handle but it gets old really fast since the implementation doesn't seem to permit `ref` locals inside async state machines at all. So you are left with:
struct UserData
{ /* Very Large detailed information */
}
class UserManager
{
UserData[] array;
//idea: can return index and work with that as a handle to the data
public static async Task<int> LoadOrGet(int n)
{
await Task.Delay(1);
return n;
}
public ref UserData Get(int n)
{
return ref array[n];
}
public static UserManager Instance {get; set; } //mock singleton
}
public class Foo
{
public async Task Bar()
{
int n = await UserManager.LoadOrGet(0);
//doesn't compile:
//{
// ref UserData d = ref UserManager.Instance.Get(n);
// ...
//}
//instead:
SomeWork(n);
await Task.Delay(1);
MoreWork(n);
}
void SomeWork(int n)
{
ref UserData d = ref UserManager.Instance.Get(n);
//...
}
void MoreWork(int n)
{
ref UserData d = ref UserManager.Instance.Get(n);
//...
}
}
Please do reconsider CS8932 at least when the ref local does not spill across awaits.
@Thaina I think combining `async` with `ref` returns won't be possible. At first, I was considering that a design that is tied to arrays and indexes could work:
``` c#
class ArrayRefTask<T>
{
private T[] array;
private int index;
private bool isCompleted;
private Action continuation;
public Awaiter GetAwaiter() => new Awaiter(this);
public void SetResult(T[] array, int index)
{
this.array = array;
this.index = index;
isCompleted = true;
continuation?.Invoke();
}
public class Awaiter : INotifyCompletion
{
private ArrayRefTask<T> myRefTask;
public Awaiter(ArrayRefTask<T> myRefTask)
{
this.myRefTask = myRefTask;
}
public bool IsCompleted => myRefTask.isCompleted;
// this is the important part:
public ref T GetResult() => ref myRefTask.array[myRefTask.index];
public void OnCompleted(Action continuation)
{
if (IsCompleted)
continuation();
else
myRefTask.continuation = continuation;
}
}
}
```
But it won't work, because:
- For `ArrayRefTask<T>`, the compiler will require `SetResult(T)`, not `SetResult(T[], int)` or `SetResult(ref T)` (which wouldn't work anyway).
``` c#
static ArrayRefTask<int> Produce()
{
var task = new ArrayRefTask<int>();
Task.Delay(10000).ContinueWith(_ => task.SetResult(new[] { 42 }, 0));
// or just:
//task.SetResult(new[] { 42 }, 0);
return task;
}
static async Task Consume()
{
Console.WriteLine(await Produce());
}
```
But, more importantly, code like this doesn't work:
``` c#
static async Task Consume()
{
    ref int i = ref await Produce();
    Console.WriteLine(i);
}
```
The compilation fails with:
CS8942 Async methods cannot have by reference locals
And I don't think that error is fixable, since `async` locals are stored as fields in the state machine, which means they can't be `ref`.
One approach is to make `async` a feature of the runtime, such that the CLR itself knows about async methods. Alternatively, unsafe pointers could be used.
@bbarry @svick If it is not permitted today, it could be if we find a way to implement it.
What I think is possible is to introduce a class `TaskRef<T>` in addition to class `Task<T>`, where `TaskRef<T>` has `public ref T Results { get; }` and a constructor `public TaskRef(FuncRef<T>)`, and to implement `async`/`await` so that it selects `Task<T>` or `TaskRef<T>` based on the return type of the `async` function.
And I'm fine with a ref local not being able to cross an await. But a ref local should be allowed to be used after the await it was obtained from, up to the next await.
Code like this should actually work, and CS8942 should be removed for this case:
``` C#
async ref Product Produce(int i) { }
async Task Consume()
{
    ref Product X = await Produce(0);
    Console.WriteLine(X); // Still live in the block of await it got value from so no error
    ref Product Y = await Produce(1);
    Console.WriteLine(Y);
}
```
But code like this should not work; CS8942 would be used in this case:
``` C#
async Task Consume()
{
    ref Product X = await Produce(0);
    ref Product Y = await Produce(1);
    Console.WriteLine(X); // Throw CS8942 here
    Console.WriteLine(Y);
}
```
What I am talking about is not whether it works with the current await. I'm saying that it should also work once we introduce the ref return feature, and if that needs more implementation work then we should figure out how to do it. What I am asking about is technical possibility, not the limits of the current implementation.
Still, a ref is different from a pointer; technically it should be safe. It must not be allowed across async blocks, but it should be allowed within the same async block.
@Thaina I think `async ref T Method()` is best left to a separate spec. It is different from both #10902 (in that the return is `ref T`, not some tasklike) and this issue (in that such a spec would need to entail a state machine and some transport type `TaskRef<T>` or something like that), but at the same time it is clearly something that will be influenced by both issues.
As nice as production of an async ref method may be, it could be sidestepped for the immediate future of getting `ref` returns and `ref` locals out the door. This code:
async Task Consume()
{
ref Product X = await Produce(0);
...
}
could be written without an additional allocation this way:
async Task Consume()
{
var handle = await ProduceHandle(0);
ref Product X = ref Resolve(handle);
...
}
Or in the slightly more frustrating way the current implementation works:
async Task Consume()
{
var handle = await ProduceHandle(0);
DoWork(handle);
}
void DoWork(int handle)
{
ref Product X = ref Resolve(handle);
...
}
And I'm not saying this shouldn't be done. But I will say it will take considerable time to get done and should not be a cause for delaying C#7.
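
To make the helper-method workaround concrete, here is a self-contained sketch. `Product`, `Store`, `ProduceHandle`, and `Resolve` are hypothetical stand-ins for the names used above, and the handle is assumed to be a plain array index:

``` C#
using System;
using System.Threading.Tasks;

struct Product { public int Quantity; }

static class Store
{
    static readonly Product[] products = new Product[4];

    // Hypothetical async step: yields an index (a "handle") instead of a ref.
    public static async Task<int> ProduceHandle(int n)
    {
        await Task.Yield();
        return n;
    }

    // Turns a handle back into a reference to the stored struct.
    public static ref Product Resolve(int handle) => ref products[handle];

    // Synchronous helper: the ref local lives entirely outside the async state machine.
    static void DoWork(int handle)
    {
        ref Product x = ref Resolve(handle);
        x.Quantity += 1; // mutates the array element in place
    }

    public static async Task<int> Consume()
    {
        int handle = await ProduceHandle(0);
        DoWork(handle);
        return products[handle].Quantity;
    }

    static void Main() => Console.WriteLine(Consume().GetAwaiter().GetResult());
}
```

Only the integer handle is stored across the `await`, so nothing of a byref type ever needs to become a field of the state machine.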
Is it correct to understand that only references to local variables are not safe to return? And if so, why not just insert a runtime check in the caller that throws an exception if the returned reference points to something that lies in the discarded stack frame? That shouldn't be too much of a performance hit, and it can be optimized away if the compiler can prove that the returned reference is safe.
@kldf At the CLR level - yes we cannot return references to locals and byval parameters. At the C# level the situation is a bit more complex because of lexical scoping.
According to the language semantics a new set of local variables is created when control flow enters {}. As a result you should not be able to access the variable via a reference when outside of {}. From the language semantics the variable does not exist; from an implementation perspective the IL slot of the variable could be reused for something else.
So, once we have a reference to a local variable, we would not only have problems with returning it. Using it as a source of byref assignments could also be problematic and cannot rely on runtime checks.
One simple solution to this is to make ref locals single-assignment and require the ref assignment to happen at declaration only.
Also it is generally preferable to have compile-time errors over run time failures - you do not want to ship something and only later discover that some scenarios may lead to crashes.
> One simple solution to this is - make ref locals single-assignment and require ref assignment to happen at declaration only.
Doesn't feel like C#, and what about ref parameters? Maybe just forbidding assignments to ref variables from outer scope will be enough? And, most importantly, the compiler is easy to satisfy in this case.
> Also it is generally preferable to have compile-time errors over run time failures
No arguments about that. But I don't see how the current limitations on ref locals and ref returns can work in any reasonably complex scenario, with lots of structs passed around and assigned to locals, where just one not-safe-to-return struct 'taints' the returned reference and ref locals can only be used as aliases to the parameters.
One more thought: the second best thing after a compile-time error is failing fast, and that's what a check after return does, whereas currently the compiler can be satisfied by promoting a local variable to a field. It's a straightforward thing to do and is not too ugly, so it is likely to become an accepted practice. But if there is a logical error and a reference to this field is returned, the program just moves on.
This proposal would increase the complexity of the language, without a significant use case or benefit to offset that complexity. I'm not convinced.
@JeroMiya I have heard words like yours from so many people who have been spoiled by fast computers and don't know what `new object` does to memory, how the GC works or how it impacts performance, and how structs could be used. People like these always make every little thing a class, generate lots of little garbage in memory, and box things out of ignorance.
Would this compile?
``` c#
class Base {
public virtual readonly ref int Foo(readonly ref int bar) { ... }
}
class Derived : Base {
public override ref int Foo(ref int bar) { ... }
}
```
In other words: Is `readonly` part of the method signature or not?
And does `readonly ref` accept an r-value?
Will ref return allow properties with code to return refs to structs inside classes or structures? For example, given `Nullable<T>`:
struct Nullable<T>
where T : struct
{
T m_value;
bool m_hasValue;
ref T Value
{
get
{
if (!m_hasValue)
throw new Exception();
return m_value;
}
set { /* ... */ }
}
}
@ceztko Inside classes - Yes. Inside structs - No.
That is to prevent cases like
``` C#
ref var refToNowhere = ref new Nullable<int>().RefValue; // reference outlives the referent
```
If a struct field needs to be directly modifiable, it would be safer and likely more efficient to just make such a field public instead.
Nullable, in particular, would not allow ref Value for a different reason though. Nullable values are intentionally readonly. Value and HasValue are loosely coupled and are not supposed to be modified separately from each other.
@VSadov Fair enough for Nullable, I tend to forget that ref allows full assignment of structures. Generically speaking about ref properties in struct, what is the semantics of the second "ref" before "new Nullable" in your example? Is it necessary or could it be omitted?
@joeante The `ref` prefix doesn't just indicate that the value can be changed, it also represents a different data type with different semantics both for the language and the CLR. I think hiding that fact would be more confusing than not.
It's also important to note that since `ref` indicates a different CLR type, the CLR permits overloading `ref` and by-value methods:
public void Foo(int x) { ... }
public void Foo(ref int x) { ... }
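
As a small runnable illustration of that overload pair (the string return values and the class name are added here only so the chosen overload is visible; they are not part of the snippet above):

``` C#
using System;

static class OverloadDemo
{
    public static string Foo(int x) => "by value";
    public static string Foo(ref int x) => "by ref";

    static void Main()
    {
        int n = 0;
        Console.WriteLine(Foo(n));     // prints "by value"
        Console.WriteLine(Foo(ref n)); // prints "by ref"
    }
}
```

The `ref` keyword at the call site is what selects the second overload, which is why the two signatures can coexist.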
As for `readonly ref`, only the `ref` part is relevant to the CLR and to the caller. The `readonly` keyword would be lost during compilation since there is no CLR metadata to actually encode it or enforce it. The caller would never know if a method was `readonly ref` vs. just `ref`.
I can understand the desire to have both performant and attractive code utilizing `ref` locals/returns. What about `ref` flavors of the operators?
// Transforms a [[Vector4]] by a matrix.
static public ref Vector4 operator *(ref Matrix4x4 lhs, ref Vector4 v, ref Vector4 res)
{
res.x = lhs.m00 * v.x + lhs.m01 * v.y + lhs.m02 * v.z + lhs.m03 * v.w;
res.y = lhs.m10 * v.x + lhs.m11 * v.y + lhs.m12 * v.z + lhs.m13 * v.w;
res.z = lhs.m20 * v.x + lhs.m21 * v.y + lhs.m22 * v.z + lhs.m23 * v.w;
res.w = lhs.m30 * v.x + lhs.m31 * v.y + lhs.m32 * v.z + lhs.m33 * v.w;
return ref res;
}
@VSadov still on your example, I have the feeling that it could be solved with rules on stack allocation and ref variables scope/initialization:
``` c#
{
    // refToProp must be stack allocated and initialized only in this block,
    // and not on any nested blocks. Structure1 must be stack allocated
    // and can't be deallocated before the end of the block
    ref var refToProp = ref new Structure1().RefProp;
}
```
Am I missing something else, or is this already in violation of other rules? I would be interested in looking at some discussions on the topic, if there are any.
@HaloFour Of course there's CLR metadata suited to encoding `readonly` (apart from attributes). It's called `modopt` and `modreq`, which are part of a signature. They're used heavily in C++/CLI. As a matter of fact, `readonly` is quite similar to `const` in C++, thus it would be suitable to use `System.Runtime.CompilerServices.IsConst` on parameters as C++/CLI does.
@IllidanS4
Possibly, but `modopt` and `modreq` have the problem of needing to be encoded into the call-site. This makes them unsuitable for use with non-nullable references, and I believe makes them just as unsuitable for use with `readonly` parameters. Marking a parameter as `readonly` shouldn't cause existing consumers to fail, which is what will happen if it is encoded via `modopt`/`modreq`.
@HaloFour Good point. However, making it part of a signature has both pros and cons. On one hand, `readonly ref` guarantees the value won't be modified, so a change in a library from `readonly ref` to just `ref` should be a breaking one, because it could significantly affect dependent code. On the other hand, a change from `ref` to `readonly ref` is just an additional contract in the method code, like `[Pure]`. It would require modifying the CLR signature resolution rules to be more lenient in handling type modifiers, maybe differentiating between `opt` and `req` more significantly.
I have to reiterate, any solution needs to prevent:
``` c#
foo.GetByRef("x") = 42;
```
It should always require capturing the ref as a local to allow modification.
@dotnetchris
What about with properties?
public struct Foo {
private int x;
public ref int X {
get { return ref this.x; }
}
}
var foo = new Foo();
foo.X = 123;
> It should always require capturing the ref as a local to allow modification.
Why?
Regarding the desire to return references to List<> elements. Doesn't this open up a new set of potential bugs for users? Or are we just assuming people wouldn't do something like:
{
var items = new List<Item>();
items.Add( new Item( "ABC", 123 ) );
var item0 = ref items[0]; // (is ref before var needed? hopefully not)
items.RemoveAt( 0 );
items.Add( new Item( "XYZ", 456 ) );
// item0 now references {"XYZ", 456}
}
This can be an annoying source of bugs if the modifications to the list are done in a function call after the item0 assignment (not easily visible).
The performance penalty mentioned should mostly come from this List<> style operator[] access where a copy is returned and it does not seem like a 'safe' thing to add 'ref operator[]' to List<>. But if this proposal is added without ref operator [], is everyone that cares about performance just going to reimplement List<> and add it themselves?
@Ziflin this is why developers need to understand concurrency. If you want immutable data structures, use read-only or immutable variants.
Regardless, if `items[0]` returns a reference, then your scenario won't happen because `item0` would continue to point to the returned reference, not the `items[0]` which is an indexer (basically a function).
My question (to the designers) would be: what happens when `T : struct`? Is the `ref T` a pointer to an address, and if so, what happens when the `List<T>` resizes, or is just updated?
Is retaining the `ref T` advised? Seems to me that it would be invalidated quickly, but C#/CLR has some fancy memory tricks it can play.
@Ziflin
I can't see that being remotely possible, for a couple of reasons.
From a language perspective it's not possible to overload properties (indexer or otherwise) based on their return value, so you couldn't add an indexer that returns a `ref` while the existing indexer is defined.
From an implementation point of view, internally `List<T>` will dispose of its underlying array anytime it needs to grow. There's no way for `List<T>` to safely return a `ref` into its underlying storage.
@whoisj If List's operator[] returned a reference, it would return a reference to an element of the underlying Array of items. This seems valid according to this spec and the desired behavior in @xen2 comment. Based on how List<>'s Add/Remove are implemented, my example should be the result.
@HaloFour List<>'s operator[] could in theory be changed to only return a `readonly ref T`. I'm not saying it would, just that this seems to be the desire in those performance-minded comments.
Do you mean Dispose() of? Otherwise it's just letting the GC collect the old array and this is no different than any other 'ref' to a member of a heap object (which is allowed in the spec).
I'm not really trying to argue against the points you're making, but I think this is something people that worry about performance are going to try to do. So if anything I just wanted to bring up the issues that they'd have.
@Ziflin
> List<>'s operator[] could in theory be changed to only return a `readonly ref T`.
This would be a breaking change. `T` and `ref T` are completely different types according to the CLR and they require different IL in order to work with them.
> Do you mean Dispose() of? Otherwise it's just letting the GC collect the old array and this is no different than any other 'ref' to a member of a heap object (which is allowed in the spec).
"Discard" is probably the better word here. `List<T>` doesn't do anything to try to clear the memory of the array, it just stops using it. But what this means is that `ref` values pointing to the underlying array of the list may go stale as the `List<T>` grows:
var list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
ref int first = ref list[0];
list.Add(5); // Adding a fifth element triggers the List<T> to grow and change arrays
list[0] = 6; // Updates the new array
Debug.Assert(first == 1); // Ref still points to the old array
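
Since `List<T>` has no ref-returning indexer, the snippet above is hypothetical. The same effect can be reproduced with a hand-written list; the following is a sketch under that assumption, where `RefList` and `StaleRefDemo` are made-up names and the growth policy is simplified:

``` C#
using System;

// Hypothetical list type (not List<T>) whose indexer returns a ref into its backing array.
class RefList
{
    int[] items = new int[4];
    int count;

    public ref int this[int index] => ref items[index];

    public void Add(int value)
    {
        if (count == items.Length)
            Array.Resize(ref items, items.Length * 2); // allocates a new array and copies the elements
        items[count++] = value;
    }
}

static class StaleRefDemo
{
    public static int Run()
    {
        var list = new RefList();
        for (int i = 1; i <= 4; i++) list.Add(i);

        ref int first = ref list[0]; // points into the current backing array
        list.Add(5);                 // fifth element forces a move to a new array
        list[0] = 6;                 // writes to the new array
        return first;                // still reads the old array
    }

    static void Main() => Console.WriteLine(Run()); // prints 1
}
```

The old array stays alive for as long as the ref does, so this is memory-safe, but the ref silently stops tracking the list.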
@HaloFour - Yes, I agree with you :) I just wanted to point these issues out to anyone that might try to make their own "OptimizedList<>". It's basically possible, but would have some hidden issues.
Closing as this is now implemented.
@jaredpar so does the finalized implementation allow or reject foo.GetByRef("x") = 42;
@dotnetchris yes. it should even allow
foo.GetByRef("x") = foo.GetByRef("x")
@jaredpar that's very saddening, i just have to pray 99% of people don't touch this feature.
@dotnetchris I think its very good!
However a `readonly` type extension to both ref arguments and ref locals and returns probably would also be a useful addition (e.g. for larger structs where they are passed by ref due to size, but not for modification).
foo.GetByRef("x") = 42;
Looks like elegant code to me.
> Please never allow
> foo.GetByRef("x") = 42;
> This code reads that you're somehow assigning 42 as if `GetByRef` was a `Func` being assigned `(s) => 42`
> ...
That's a lambda, not a simple equals sign. So they are clearly distinguishable. I've been on the other side of this equation before, where what I wanted wasn't done in the end, so I don't mean to rub it in. I'm just personally very glad these restrictions weren't made.
Do you prefer marking every variable with `var`? I don't either. So just because some people do this (even in cases where the type on the right side isn't evident, therefore making the code less clear), does that mean an arbitrary restriction should be made on the usage of `var`? Likewise with the examples some gave above of ugly usage, yes, you can always make ugly code. But the snippet example above is still elegant in my view, though it does signal of course a whole new world for C#, ref returns and locals, a very exciting new future.
So I'm sorry, I'm still unclear on what is now possible. In C++ the more practical uses would be to set/get large value types by `const T&`. So is it possible to do `readonly ref`, or did that not make it in? If not, I don't exactly see this being very useful as it appears to break encapsulation, or I'm missing some good use cases.
@Ziflin it improves encapsulation.
Currently if you want to return a reference to an array element (say array of structs) you need to return the whole array and an index to reference into it. So the entire array and all its data needs to be exposed for a single access.
With this change only a single element needs to be exposed.
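
A minimal sketch of that encapsulation point, assuming a hypothetical `ParticleSystem` class that keeps its array private and exposes single elements by reference:

``` C#
using System;

struct Particle { public int X, Y; }

class ParticleSystem
{
    private readonly Particle[] particles = new Particle[16];

    // Exposes one element by reference; the array itself stays private.
    public ref Particle At(int index) => ref particles[index];
}

static class EncapsulationDemo
{
    static void Main()
    {
        var system = new ParticleSystem();
        ref Particle p = ref system.At(3);
        p.X = 7;                           // updates the element in place, no copy
        Console.WriteLine(system.At(3).X); // prints 7
    }
}
```

Callers can read and write the one element they asked for, but they never obtain the array or its length.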
Yes, but assuming you have some class that contains the array, that class is still not able to return a reference to an element in a way that prevents you from modifying it, or in a way that lets it know the element was modified. So this feature doesn't seem 'complete' without that functionality.
@Ziflin
> that class is still not able to return a reference to an element in a way that prevents you from modifying it
Correct me if I'm wrong, but isn't the only way that the C++ compiler enforces this via the `const` modifier? If you were to hand a library to someone with a header and they stripped that modifier then they could modify the reference at will, right?
I think that's the issue here with C#. The CLR itself offers no concept of a `ref` to a `readonly`. At best the C# compiler could attach its own metadata to the return value and hope that other compilers will understand and support it. That seems like a relatively fragile solution.
@VSadov @jaredpar can you use this (or coupled with Tuples) to present a structure of arrays as an array of structures? (or could it do in future?)
e.g.
struct DataElement
{
public ref int data0;
public ref int data1;
public ref Vector4 data2;
public ref Vector4 data3;
public ref Vector4 data4;
}
class Data
{
private int[] data0;
private int[] data1;
private Vector4[] data2;
private Vector4[] data3;
private Vector4[] data4;
// Either
public DataElement this[int i]
{
get
{
return new DataElement()
{
data0 = ref data0[i],
data1 = ref data1[i],
data2 = ref data2[i],
data3 = ref data3[i],
data4 = ref data4[i]
};
}
}
// Or
public (ref int, ref int, ref Vector4, ref Vector4, ref Vector4) this[int i]
{
get
{
return (ref data0[i],
ref data1[i],
ref data2[i],
ref data3[i],
ref data4[i]);
}
}
}
Structure of arrays for efficient vector processing; array of structures for easy programmability.
@HaloFour Sure, it's possible to force just about anything in C++, but returning a const T& is still more restrictive than returning a T&. For those wanting a performance benefit for methods previously returning a T by value, a readonly ref is the closest match.
I was just hoping it would be done _more_ correctly in C# or at least support similar use cases. But it seems this is mostly just improving cases where you were already wanting a modifiable reference.
@Ziflin
It would certainly be possible by going the same route as C++/CLI and placing a `modreq(IsConst)` on the return parameter of the method to mark it as a constant. However without explicit support for it any compiler could simply ignore that modifier when consuming your assembly and overwrite your data. I would assume that the vast majority of compilers would need to be modified in order to properly support ref returns anyway so maybe that's not a big deal.
@dotnetchris
> that's very saddening,
Why? That behavior is consistent with every other feature in C# which can return a location. Consider for instance if that code returned an array instead of a `ref` value:
int[] GetArray(string s)
This method gives virtually the same behavior as the one I listed:
GetArray("x")[0] = GetArray("x")[0]
> Why?
Because you've broken the illusion of immutability? 😏
@benaadams
> However a readonly type extension to both ref arguments and ref locals and returns probably would also be a useful addition (
I agree. But for it to be useful you need to take it one step further. Consider for example this code:
void M(readonly ref BigStruct s)
{
Console.WriteLine(s.ToString());
}
In this case the argument is taken by `ref` presumably to avoid copying a large struct. However in order to execute the `ToString` call the compiler will fully copy the value to the stack. Oops :frowning:
This is the behavior of C# when you call a struct method on a `readonly` location. Without a copy it would be possible for the struct to violate `readonly` by modifying its state within the method.
This logic doesn't just apply to methods, but to properties as well. Hence passing a struct by `readonly ref` is only advantageous compared to passing by value if you read fields off of it. Any use of methods or properties and you're better off passing it the standard way.
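
The copy-on-call behavior can already be observed with a `readonly` field, which is one kind of readonly location. This is a small sketch with made-up type names, not the `readonly ref` parameter syntax discussed here:

``` C#
using System;

struct Counter
{
    public int Value;
    public void Increment() => Value++; // mutates 'this'
}

class Holder
{
    public readonly Counter ReadOnlyCounter;
    public Counter MutableCounter;

    public void BumpBoth()
    {
        ReadOnlyCounter.Increment(); // runs on a hidden copy; the field is unchanged
        MutableCounter.Increment();  // runs on the field itself
    }
}

static class DefensiveCopyDemo
{
    static void Main()
    {
        var holder = new Holder();
        holder.BumpBoth();
        Console.WriteLine(holder.ReadOnlyCounter.Value); // prints 0
        Console.WriteLine(holder.MutableCounter.Value);  // prints 1
    }
}
```

The compiler cannot tell that a method is non-mutating, so it copies before every call on the readonly field; for a large struct that copy is exactly the cost being avoided.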
In order to get around this we need to be able to mark struct methods in such a way that the compiler knows they aren't mutating. That way it can invoke the method directly vs. having to go through a copy on the stack.
There are two proposals for how to do that:
- `readonly` structs: ability to tag an entire `struct` as `readonly`. For such structs the type of `this` in non-constructor members would be `readonly ref T` instead of `ref T`.
- `readonly` members on structs: ability to tag a struct member as `readonly`. For that member the type of `this` would be `readonly ref T`.
> I agree. But for it to be useful you need to take it one step further.
Yes, definitely a different issue; `readonly` structs are problematic. See https://github.com/dotnet/roslyn/issues/115 (main), with additional discussion in https://github.com/dotnet/roslyn/issues/12364 and https://github.com/dotnet/roslyn/issues/3202
ref returns and locals as they stand are an amazing addition! Thank you very much!
@benaadams - ref fields are not allowed in CLR with few very special exceptions which are ref-like structs that are stack only.
So, the structure as you suggest is not currently possible. It might be possible in theory, if concept of ref-like stack/only is more general.
@jaredpar Yes, `readonly` structs would be nice. We actually had that in our C#-like language that compiled to C++ and it worked well. We also cheated slightly and had the C++ compiler automatically treat them as 'const T&' parameters.
However, how does a `readonly struct` prevent the encapsulation issue of:
transform.GetPositionByRef() = position;
In this case, having the position (say a Vector3) type be `readonly` does not help prevent the assignment to the transform. Can this be fixed without resorting to a C++ like `const` modifier: `const ref Vector GetPositionByRef() const {...}`? (We did not want to do this in our language as it seemed to greatly increase the learning curve.)
@Ziflin
> However, how does a readonly struct prevent the encapsulation issue of:
> transform.GetPositionByRef() = position;
It doesn't. The proposal about `readonly` structs only refers to the ability of the `struct` to modify itself via instance methods. The mechanism for doing so is changing `this` to be typed as `readonly ref T` instead of the normal `ref T`.
The `GetPositionByRef` method though can control whether or not callers can assign into the returned value. Using `readonly ref` as the return prevents assignment irrespective of whether or not the struct itself is `readonly`.
> The GetPositionByRef method though can control whether or not callers can assign into the returned value. Using readonly ref as the return prevents assignment irrespective of whether or not the struct itself is readonly.
Ok, so then there is a proposal / feature planned for returning a `readonly ref`? That's mostly what I've been trying to figure out. And can this be used by properties?
@Ziflin I think its captured in https://github.com/dotnet/roslyn/issues/115 though it deals with ref parameters so may need to be extended for ref returns, now they are a thing.
@Ziflin
As @benaadams pointed out, #115 has it a bit. But it does need to be extended for `readonly` structs and refs to be complete. It's on my list of items to write up.
@jaredpar @benaadams Ok great. I'm definitely with @joeante (in #115) in his desire to see C# perform as well as it can and this seems like one of the last issues I've had with moving to C# from C++ for game engine development. I guess keeping C# as clean a language as possible with that is the hard part.
> In this case the argument is taken by ref presumably to avoid copying a large struct. However in order to execute the ToString call the compiler will fully copy the value to the stack. Oops

@jaredpar, I'm not sure I understand why `readonly ref` should be interpreted as a `ref readonly`... Couldn't we have the ability to have:
- a `readonly ref`, i.e. you can do whatever you want on the struct behind the ref, but you can't modify the ref
- a `ref readonly`, i.e. you cannot modify the struct behind the ref and you can't modify the ref (this would be different from a C++ `const`). This would typically allow passing a ref to a readonly field (something we can't do today)

I'm really missing the `readonly ref` behavior there...
@xoofx please, can we avoid the taint of C here with its `const int const *ptr` nonsense?
I'm having a difficult time thinking of a real scenario where anyone would want a mutable pointer to an immutable object. The function could too easily just replace the object with its own, mutate as it sees fit, then return control to the caller, who would then have a new struct without realizing it. Seems rife with misuse and danger.
> a real scenario where anyone would want a mutable pointer to an immutable object

No scenario, I don't propose this (as I said above, the pointer in a `ref readonly` is not mutable).
In the C# world a ref to a struct is not a pointer. It is the object itself. It can only be an immutable pointer to a mutable object.
And by the same standard that makes `static internal protected` the same as `protected internal static`, `ref readonly` and `readonly ref` must be the same.
@xoofx
> I'm really missing the readonly ref behavior there...

Reading your comment I think there may be a bit of a terminology difference. Let me elaborate a bit on the operations for a `ref` that could be affected by `readonly`:
- `ref` is really just a pointer that is safe. Hence just like you can change the address a pointer refers to, you could also change the location a `ref` points to.
- Changing the value a `ref` points to. In the case of a `class` it would be changing it to refer to a new instance (or null). In the case of a `struct` though it's mutating the contents directly.

Attaching `readonly` semantics to a `ref` could choose to affect one, or both of these operations.
When I say `readonly ref` I'm referring to protecting against mutating the target. I definitely understand the inclination to say that syntactically the `readonly` modifies the `ref`, so perhaps it should be guarding against re-pointing.
At this time though the language doesn't allow for re-pointing of `ref` values. I have a lot of skepticism that it would ever be allowed. Mostly because it is of fairly limited use. Midori made heavy use of `ref` locals / returns and there was only one case in our extremely large code base where we ever wanted to allow for a re-point operation. Additionally allowing for re-pointing complicates the lifetime rules around `ref` locals significantly. Hence it's low use, extra complication ... less likely to happen.
My skepticism aside though, assume we did desire both re-pointing and the ability to guard against it. That would be in addition to guarding against mutating the target (a very good case can be made for this feature). That means logically variables can now be defined as `readonly ref readonly`. While that is logically correct and meaningful it probably makes most developers go "huh?".
But if we did go with this feature I'm sure we'll spend plenty of time debating `ref readonly` vs. `readonly ref`. Hard to pass up a good naming / syntax debate :smile:
@jaredpar Ah, sorry, maybe I have not been clear enough. I'm not proposing the idea of re-pointing the ref (I have never had a need for this, but hey, the idea could grow on me 😄), but to disallow the variable (and the struct behind of course) to be re-assigned entirely.
Let me take an example for a `readonly ref` scenario:
``` c#
struct MyStruct
{
    public readonly int X;
    public int Y;
}

public void Process(readonly ref MyStruct val)
{
    // This would not compile.
    // In this case, we also disallow the field X to be modified,
    // while with a regular ref we could modify it indirectly with the following code:
    val = new MyStruct();

    // We cannot do this
    val.X++;

    // But we can do this:
    val.Y++;
    ....
}
```
It typically allows protecting the variable and the readonly fields behind it, which is nice behavior as it allows partial immutability of a ref struct. If the caller of the method passes this struct, it can ensure that the callee will not be able to modify its readonly fields (or even private ones).
On the other hand, `ref readonly` would allow passing a readonly field or variable to another method:
``` c#
class MyClass
{
    public static readonly MyStruct MyField;
}

public static void Process(ref readonly MyStruct val)
{
    // We cannot do this:
    val = new MyStruct();

    // And also we cannot do this:
    val.Y++;
}

Process(ref MyClass.MyField); // It would be possible
```
Hope it makes more sense 😅
It'll be difficult to make a solid case for why we need "immutable references to mutable structs", "mutable references to immutable structs", "immutable references to immutable structs", and "mutable references to mutable structs".
Seems to me `(ref struct)` and `(readonly ref struct)` are all we need. One allows for mutation, the other is immutable. This is a far simpler set of things to understand, and the lost "flexibility" closes a lot of holes for bugs to sneak in through.
IMO `(readonly ref struct)` should be the same as `(ref readonly struct)`, given C#'s historical laziness in keyword order enforcement.
> It'll be difficult to make a solid case for why we need "immutable references to mutable structs", "mutable references to immutable structs", "immutable references to immutable structs", and "mutable references to mutable structs".

@whoisj, I have been abusing structs for years in C#, because they are lightweight objects, interop nicely with native code and substantially lower the pressure on the GC. And while using them a lot, I have been facing many problems, not only related to performance but also to their safety. Being a heavy user of structs makes me look forward to more powerful abilities (e.g. ref locals/returns... but I have so many other things in mind that would probably make you roll your eyes 😋) and stronger options for safety (readonly, more control over immutability). So yes, the cases you are listing as if they were small side things (e.g. who cares about safety or immutability?) are essential for me. I'm not talking from a "nice to have" place but from a "real-world usage" place, like yours, but with a different "is all we need" world if you prefer... 😉
@whoisj I see two variants, which @xoofx covers.
Pass-by-value semantics with pass-by-ref cost, which I think was the `ref readonly` example (good for large structs and read-only use):
``` c#
public static void Process(ref readonly MyStruct val)
{
    // We cannot do this:
    val = new MyStruct();

    // And also we cannot do this:
    val.Y++;

    // However we can do this as it creates a copy; though introduces a byval cost
    var newVal = val;
    newVal.Y++;
}
```
By-ref where you want to allow modifications to the original but not allow overwriting of properties, which is the first `readonly ref` example (semi-mutable structs):
``` c#
public void Process(readonly ref MyStruct val)
{
    // This would not compile.
    // In this case, we also disallow the field X to be modified,
    // while with a regular ref we could modify it indirectly with the following code:
    val = new MyStruct();

    // We cannot do this
    val.X++;

    // But we can do this:
    val.Y++;
}
```
Because if you can do the `new MyStruct();` then you can overwrite the readonly properties on it with the `.ctor`.
+1 to `readonly` ref!
When you deal with large structs and want to avoid copies (think `Matrix`), ref makes a lot of sense.
And of course, we want to have predefined values as `static readonly` (e.g. `Matrix.Identity`).
The problem is we can't use any of the `Matrix` methods that take a ref with those static readonly values (e.g. `Matrix.Multiply(ref Matrix.Identity, ref matrix2)`). The only way is to make a full copy beforehand, or to get rid of the readonly (bringing lots of safety issues).
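A sketch of how this scenario can be written with the `in` parameter modifier that shipped later, in C# 7.2. The 2x2 `Matrix` type here is a hypothetical stand-in rather than any particular library's matrix, and the `Multiply` signature is an assumption made for illustration:

``` c#
using System;

struct Matrix
{
    public float M11, M12, M21, M22;

    public Matrix(float m11, float m12, float m21, float m22)
    {
        M11 = m11; M12 = m12; M21 = m21; M22 = m22;
    }

    public static readonly Matrix Identity = new Matrix(1, 0, 0, 1);

    // 'in' passes both operands by reference without allowing writes through them.
    public static void Multiply(in Matrix a, in Matrix b, out Matrix result)
    {
        result = new Matrix(
            a.M11 * b.M11 + a.M12 * b.M21, a.M11 * b.M12 + a.M12 * b.M22,
            a.M21 * b.M11 + a.M22 * b.M21, a.M21 * b.M12 + a.M22 * b.M22);
    }
}

static class Program
{
    static void Main()
    {
        var m = new Matrix(2, 3, 4, 5);

        // A static readonly field is a valid 'in' argument; 'ref Matrix.Identity' is not.
        Matrix.Multiply(in Matrix.Identity, in m, out Matrix product);
        Console.WriteLine(product.M12); // 3
    }
}
```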
Well, I remember there is an argument about `readonly ref`. It is about the problem that a struct behind a readonly ref may not be able to call methods (or properties), because internally a struct method could modify its value. So it cannot call any method at all.
Even a get-only property can modify the struct too.
Maybe we also need `readonly function` and `readonly get/set` to make it compatible?
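A small sketch of that concern, assuming the behaviour C# eventually adopted: with `in` parameters (C# 7.2) the compiler runs a non-readonly method on a hidden copy, and C# 8 added `readonly` members so a method can promise not to mutate. The `Counter` type and the helper method are hypothetical:

``` c#
using System;

struct Counter
{
    public int Value;

    // Ordinary struct method: 'this' is a writable ref, so it may mutate the instance.
    public void Increment() { Value++; }

    // C# 8 'readonly' member: promises not to mutate, so no defensive copy is needed.
    public readonly int Peek() => Value;
}

static class Program
{
    public static int IncrementThroughIn(in Counter c)
    {
        c.Increment();   // runs on a hidden copy; the caller's value is untouched
        return c.Peek(); // called directly on the original
    }

    static void Main()
    {
        var counter = new Counter { Value = 10 };
        Console.WriteLine(IncrementThroughIn(in counter)); // 10
        Console.WriteLine(counter.Value);                  // 10
    }
}
```

The silent copy keeps the read-only guarantee intact, at the cost of exactly the copying that passing by reference was meant to avoid.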
@xoofx
> but to disallow the variable (and the struct behind of course) to be re-assigned entirely.

Gotcha. That is absolutely the intent of `readonly ref`. It's a way to safely take a `ref` to a `struct` that lives in a `readonly` location. It effectively disallows all mutations, including assignments.
> the intent of `readonly ref` [...] It effectively disallows all mutations, including assignments.

That's a `ref readonly` in my terminology. 😅 A `readonly ref` would allow partial/controlled mutation but not full assignment (see my example above where `val.Y++;` is possible for a `readonly ref`), which is important when you want to make sure that a callee cannot modify the private/readonly fields/state of the mutable struct, and can only go through its public mutable API.
@xoofx
I feel like you're trying to draw a distinction that doesn't really exist though. There is no real difference between mutating the `public` and non-public / `readonly` portions of a struct. A struct is either mutable in its entirety or not mutable at all.
This example is clearer if you consider method calls. Take for example the following, completely legal, method:
``` c#
struct S
{
    public readonly int X;
    public int Y;
    private int Z;

    public void M()
    {
        this = new S();
    }
}
```
In your design would a `readonly ref` be able to call `M` without introducing a copy? In order to maintain the proposed semantics of `readonly ref` the answer must be no. This means then that `readonly ref` is only a useful distinction for accessible, mutable fields of a struct. I don't think that's enough of a benefit for the extra complexity.
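A runnable sketch of the point about `M`: through an ordinary writable `ref`, a call to `M` replaces every field of the struct, including the `readonly` one. The two-argument constructor is an addition made here purely so the effect is observable, and the private field `Z` is omitted:

``` c#
using System;

struct S
{
    public readonly int X;
    public int Y;

    public S(int x, int y) { X = x; Y = y; }

    // Legal in a non-readonly struct: replaces every field, readonly ones included.
    public void M() { this = new S(); }
}

static class Program
{
    static void Main()
    {
        var s = new S(1, 2);
        ref S r = ref s;   // ordinary writable reference to the local
        r.M();
        Console.WriteLine($"{s.X}, {s.Y}"); // 0, 0
    }
}
```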
@xoofx for params I could see the semi-mutable variant working as an `in` parameter (due to the loose keyword ordering of C# on `ref` and `readonly`):
``` c#
// passed by val (or register)
void Process(MyStruct val)
// fully mutable including assignment
void Process(ref MyStruct val)
// readonly struct; no assignment, no method calls (get props allowed?)
void Process(ref readonly MyStruct val)
// must be assigned in function
void Process(out MyStruct val)
// semi-mutable struct, no assignment, but method calls & non readonly assignment fields allowed
void Process(in MyStruct val)
```
However, the `in` parameter wouldn't make sense for a return/local maintaining the same semantics:
``` c#
// value/register struct
MyStruct val0 = val;
// fully mutable including assignment
ref MyStruct val0 = ref val;
// readonly struct; no assignment, no method calls (get props allowed?)
ref readonly MyStruct val0 = ref val;
// semi-mutable struct, no assignment, but method calls & non readonly assignment fields allowed
// Not sure what would match for local
```
> In your design would a readonly ref be able to call M without introducing a copy?

@jaredpar Yes. The struct itself knows its state and is the owner of the implementation details. It prevents the callee from breaking anything that is not exposed by the public API of the struct, but the implementation in the struct can choose whatever is needed. Again, the `readonly ref` is just saying that the ref is not assignable by the callee, not that the struct behind it is readonly. But I understand that the keyword could be misleading (though if we are introducing it for other locals/params, it feels more natural to me but well...)
As @benaadams is suggesting, another keyword would be something like `in ref` or `refin`, basically a `ref` that cannot be `out` (assigned entirely by the callee).
@xoofx
The ability to assign to a struct location and call methods without a copy are equivalent operations. Adding protection for one without protection for the other is just lulling developers into a false sense of confidence about their code.
This all has to do with how `this` is modeled. In a `struct` the type of `this` is `ref T`. Hence whenever you call a method on a `struct` the target must be convertible to `ref T`. That is why it's wrong from a language correctness standpoint to allow `readonly ref` to call a method without a copy. It's implying there is a conversion between `readonly ref T` and `ref T`.
Well, if a library A provides a struct (that can be created in a valid state only by lib A using some internal constructors) and an interface with a `readonly ref` method, this can guarantee to an end user implementing it that they cannot modify the struct in unexpected ways that lib A hasn't covered. It provides confidence for users of lib A, but sure, it doesn't save the developer of lib A from making mistakes internally. I can't really adhere to the idea of a strict equality in behavior between assignment to a struct location and calling a method on it... and like the transient for stackalloc for class, it could be possible to detect struct methods that are making such a "violation" and the compiler could report it...
But, seeing how controversial this idea is, we can forget it; a good sign that it is not a good idea after all... 😉
I'm testing ref locals and I've encountered the following limitation. I declare a `ref int` variable as a reference to the first element in an array. How can I change where this variable points? Let's say I want to change it to point to the last element, but it's not possible? (see my attempts below)
``` c#
static int[] data = new int[] { 0, 1, 2, 3, 4 };

unsafe static void Main(string[] args)
{
    ref int slot = ref data[0];
    slot = data[4]; // This stores value "4" into data[0], I don't want that

    // This does not work
    //ref int slot = ref data[4]; // This would be it, except variable is already declared
    //slot = ref data[4];
    //ref slot = ref data[4];

    slot = 99; // When it works, this would overwrite last element

    foreach(var item in data)
    {
        Console.WriteLine(item);
    }
    Console.ReadKey();
}
```
I thought I'd try to compare the performance of this approach in my tree-like collection implemented on an array. Currently when looking for an element to add/remove, I have locals like `int currentIndex`, `int parentIndex`. With this I thought I would use `ref Node current`, `ref Node parent`, but when it's not possible to modify `current` in a while loop, it won't work.
@OndrejPetrzilka: AFAIK, that's unsupported. You can't reassign references in C++ either.
@axel-habermaier: That makes sense, otherwise it would probably be much harder, if not impossible, for the compiler to detect invalid use. I'm not happy about it though. Is it possible to reassign a reference in IL?
It is possible to assign a managed pointer in IL, but it is not possible to reset a ref local or parameter in C#. Not in C# 7.
Safety of use is indeed an issue to solve here.
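For reference, ref reassignment did arrive later, in C# 7.3, using the `slot = ref data[4];` form from the attempts above. A minimal sketch, assuming a C# 7.3 or newer compiler; the array is made a public `Data` field here only so the result is easy to inspect:

``` c#
using System;

static class Program
{
    public static readonly int[] Data = { 0, 1, 2, 3, 4 };

    public static void Main()
    {
        ref int slot = ref Data[0];
        slot = ref Data[4]; // C# 7.3+: re-points the ref local, writes nothing
        slot = 99;          // plain assignment: writes through to Data[4]

        Console.WriteLine(string.Join(", ", Data)); // 0, 1, 2, 3, 99
    }
}
```

The compiler only accepts a reassignment whose target lives at least as long as the ref local, which is how the safety concern raised here is handled.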