Sometimes you need to create a copy of an array. For example, one reason is arrays are mutable and sometimes you need to create a defensive copy, or you may want to modify some of the indices in it but keep the original array intact. There are a couple of ways of doing this right now:
var array = new T[original.Length];
Array.Copy(original, 0, array, 0, original.Length);
return array;
The downside of this approach is that it's verbose (compared to the second approach below), and can't be used in things like expression-bodied methods where you want a one-liner to clone an array. It is, however, fast since we go straight to allocating an array and calling Copy.
return (T[])original.Clone();
While this can be used in expression-bodied methods, the programmer has to specify T[] again to cast, which is prone to error since Clone returns an object. In addition, Clone is slightly slower than Copy and we have to make that extra typecast.
return original.ToArray();
This can also be used in expression-bodied methods and is type-safe, however it is slow. Several casts are made by ToArray, then it makes virtual method calls to Count/CopyTo which do the same thing as approach 1 under the hood.
Add a static Array.Clone method:
namespace System
{
public abstract class Array : ...
{
public static T[] Clone<T>(T[] array);
}
}
Usage:
class Customer
{
Order[] _orders;
public Order[] GetOrders() => Array.Clone(_orders);
}
The implementation would be trivial and do the same thing as example 1:
public static T[] Clone<T>(T[] array)
{
var array = new T[original.Length];
Array.Copy(original, 0, array, 0, original.Length);
return array;
}
Sounds reasonable to me. Not sure how many people would use it though ...
Let's get API review feedback on the proposal.
Makes sense.
What are the semantics when the array is null? Should it return null or raise an exception? If only non-null values are permitted, then rather than static, this could be an instance method. That way the null check as well as potential exception could be avoided.
I know accepting null as the this to an extension isn't the common pattern, but I think it would be more useful.
To be clear, Array.Clone would be a static method and not an extension method. It can't be an extension method because an instance method with the same signature exists and thus will be preferred by the compiler.
I think accepting and returning null would be useful.
@JonHanna, @terrajobst Why do you think returning null for null would be useful? If this is meant to supplant usages of (T[])array.Clone(), and if that throws a (nullref) exception today if array is null, then wouldn't it make sense for Array.Clone(array) to also throw if array is null?
I was thinking of cases where I've implemented Clone() for compound objects. It's much easier if you can simply call the Array.Clone(array) method without having to worry whether the array is null or not. Cloning a null array is well-defined by returning null.
Yes, generally how I'm going to treat null falls into one of the following:
More philosophically, I prefer things to work with null if it makes any sort of conceptual sense. To my mind it's perfectly sensible that a clone of null is null.
@terrajobst @JonHanna Makes sense. I'll update the PR to return null, then.
I'll throw in a dissenting opinion here and say that null shouldn't be accepted as input. It's a small distinction and obviously we could define the semantics/document it however we want, but it seems semantically and conceptually cleaner to require the parameter to be a real, non-null array.
To my mind it's perfectly sensible that a clone of null is null.
This is wandering into personal preference territory, I guess, but I feel sort of the opposite. null isn't actually an array, so passing it to a method taking an array doesn't make conceptual sense to me.
@jkotas @stephentoub Do you have an opinion on this?
@mellinoe
To me, the higher order bit to consider is whether a given enforcement of non-null is helping or hurting. The case where missing validation is hurting is when it hides bugs. For example, operations that have side effects should generally validate arguments instead of silently no-oping.
Operations like Array.Clone that return null values are hardly hiding bugs. If subsequent code can't deal with null, the user gets a clean null-reference exception.
On the other hand, accepting and returning null avoids having to special case situations that would be perfectly well defined anyways.
Sure, we can define the semantics however we want. And like I said, it's a pretty small distinction anyways. But to me, Clone()ing a null object doesn't make conceptual sense. If I saw some code relying on that behavior, I might be slightly confused for a bit, and have to re-evaluate my assumptions. I'm just throwing my opinion in here, if others feel differently then it's no big deal.
If subsequent code can't deal with null, the user gets a clean null-reference exception.
I dunno; maybe. That's rather simplifying things. Perhaps your code can handle a large variety of circumstances involving null, empty, various-sized arrays, and you make the assumption that Array.Clone doesn't give you back a null value. Maybe you're in a weird state now that you didn't anticipate. I think it's more nuanced than either "everything works fine" or "you get a clean, easy-to-track-down NRE right where you expect it".
Anyways, I'm happy to agree with whatever others feel is the right call; just wanted to get my thoughts out there.
Can we take a step back? Why are we adding this method? At this point, its implementation is just going to call the instance Clone and do a cast, so the only value the method provides is doing the cast for you. But it also has to do a null check, it incurs generic overheads, and as a new method it's not portable. Doesn't seem like enough value to me.
All that said, if everyone else still wants to add it, I'd vote for null being an error.
Yes, the primary value this method offers is syntactic sugar by preserving the type of the array instead of returning object.
syntactic sugar
In this particular case, it is a syntactic sugar mixed with salt because of it is not extension method (for source compat) that it should really be for good flow. I agree with @stephentoub that it does not have enough value.
Why do you say that? Especially on array instance methods aren't common. Most of the functionality is provided as static methods on Array.
@jkotas, if we were to add an extension method, does array.Duplicate sound like a good name?
Why do you say that? Especially on array instance methods aren't common
Clone is existing instance method on array and many other types. If you got int[] a and start typing a.Clone(), the intelisense will give you the existing one that returns object. The static Clone would be pretty hard to discover.
The existing Clone() with cast does not look that bad, considering all alternatives. There are number of instance Clone() methods in the framework that return less derived type, and it is not uncommon to see them together with cast.
The array surface was designed in v1 before framework designed guidelines, generics, extension methods, and other goodies. I believe it would have been quite different if it is designed today.
does
array.Duplicatesound like a good name
It has the same discoverability problem.
@jkotas
If you got
int[]a and start typinga.Clone(), the intelisense will give you the existing one that returnsobject.
I get that. But what I'm trying to say is that we've trained developers to look for static methods on Array. Of course, an instance method would probably be more discoverable, but at the same time I think not having to type the extra parentheses and the type names is still worth it -- you just have to learn which method you need to call. Less ideal, but given how important arrays are not outrageously bad either.
Does
array.Duplicatesound like a good name?
We should most certainly not introduce a method using a different name. Being consistent with terminology is key. We've learned that the hard way with things like Close vs. Dispose.
In well written code, a developer should know if a value could be null.
It's a nice goal to aspire to, but real world code is not always so clean as it should be.
If this static method does not accept nulls, I imagine that it will generally need to be used as follows:
var copy = array != null ? Array.Clone(array) : null;
It's not too bad--and there are shorter ways to write it--, but I see it as advantageous to avoid this this boiler-plate code and simply write the following, knowing full well that null is possible and handle it later if/when needed:
var copy = Array.Clone(array);
Essentially, I want a clone of whatever I currently have. If it is null, then so be it. I'll figure it out later.
@stephentoub and @jkotas
Any push back on the API? I know you guys said you don't think it adds enough value, but it' clear folks disagree. So unless you strongly think we shouldn't add this API I'm inclined to accept this API as proposed.
Any push back on the API?
I still think it's silly to add. Any new API we add increases the differences between the various .NET implementations. It's one thing to do so for things that really move the needle; this doesn't. There's also the extra costs involved in it being a generic method. There's potential confusion around instance vs static Clone. Etc. Maybe these are minor concerns, but the value of the API is also so incredibly minor, just doing a cast for you.
I do not think we should add this API.
I'm really interested if we can get a real boost (which if possible would likely need to be at the level of the array instance method C++ code). Otherwise I'd rather have the syntactic vinegar of an explicit cast to the sugar of a hidden cast.
@jkotas As @JonHanna mentions, it doesn't have to be a wrapper around the existing Clone. I don't know as much about the current Clone as you do, but presumably there are things like checks in the C++ code to handle multi-dim arrays, non-zero based arrays, etc. If we added this API, we could remove those checks since we know the input is an SZ array. Also, since this would be implemented in C++ we would have direct access to the Array.Copy implementation for example, and be able to shave off a few method calls.
If the implementation can be made meaningfully faster (and the instance method can't), then I'd understand the method being added when those improvements come, not before.
Agree.
IMHO, it would pretty hard to demonstrate that the generic API has significant performance advantage over the non-generic one, everything else being equal.
If we added this API, we could remove those checks since we know the input is an SZ array
The existing non-generic implementation does not have these checks already. Also, the existing non-generic implementation can be optimized quite a bit.
this would be implemented in C++
C++ does not help you much here. And we want to move to a place where this is implemented in C# for the most part anyway.
BTW: This shares similar set of problems as the generic version of Array.Copy. I believe it is hard to demonstrate significant performance advantage for the generic Array.Copy for similar reasons.
@jkotas
I believe it is hard to demonstrate significant performance advantage for the generic Array.Copy for similar reasons.
Is it? GetLowerBound(0) removed?
You can restructure the implementation a bit to make the GetLowerBound(0) call go away, without changing the public surface.
Better performance is really the only good reason to add this API, but there is no evidence that it really does that. We can reconsider if the generic implementation is proved to provide significant performance advantage over the existing non-generic one (and equivalent optimization cannot be reasonably applied to the existing non-generic implementation).
It sounds like while the API looks fine, the implementation would not provide any performance improvements over the current solution.
I am going to close this for now but feel free to reopen if you come up with a way to make the new method faster than the existing solution.
Most helpful comment
@mellinoe
To me, the higher order bit to consider is whether a given enforcement of non-
nullis helping or hurting. The case where missing validation is hurting is when it hides bugs. For example, operations that have side effects should generally validate arguments instead of silently no-oping.Operations like
Array.Clonethat returnnullvalues are hardly hiding bugs. If subsequent code can't deal withnull, the user gets a clean null-reference exception.On the other hand, accepting and returning
nullavoids having to special case situations that would be perfectly well defined anyways.