Currently, C# only supports `params T[]`; however, it is rare that I actually need an array for my method. More often than not, I merely want iteration or to perform some sort of LINQ operation against the parameter. In fact, it is common for libraries to declare a version of the method that takes an `IEnumerable<T>`.

The implementation detail could still be that an array is made at the call site and sent to the method; the benefit is simply avoiding having to create the extra overload.
This feature was already proposed for C# 6, but didn't make it due to scheduling. So my guess is that it's pretty likely it will make it into C# 7.
If we are going to invest in a new `params` type I would like to see us invest in one that is more efficient than either `T[]` or `IEnumerable<T>` but equally inclusive.
Today low level APIs which expose a `params` method typically end up exposing several other non-params overloads in order to avoid call site allocation of the array for small numbers of arguments. It's not uncommon at all to see the following:

```csharp
WriteLine(params object[] args)
WriteLine(object arg)
WriteLine(object arg1, object arg2)
...

// The above pattern lets the following bind without the object[] allocation
WriteLine("dog");
```
Switching to `IEnumerable<T>` would not help here and would in fact make it worse, because it creates two allocations for a small number of arguments.
An idea we've sketched around a few times is a struct of the following nature:
```csharp
struct Arguments<T> : IEnumerable<T>
{
    T _arg1;
    T _arg2;
    IEnumerable<T> _enumerable;
    int _count;

    // struct based enumerator
    // indexer which intelligently switches between T fields and enumerable
}

WriteLine(params Arguments<T> args)
```
The `Arguments<T>` struct represents a collection of arguments. It can be built off of individual arguments or a collection. In the case of a small number of individual arguments, which for many APIs is the predominant use case, it can do so allocation free. The compiler would be responsible for picking the best method of constructing `Arguments<T>` at the call site.
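For concreteness, here is one possible fleshing out of that sketch. Everything here (field layout, constructors, the enumerator) is an illustrative guess at the shape, not a committed design:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch only: names and layout are illustrative guesses.
public struct Arguments<T> : IEnumerable<T>
{
    private readonly T _arg1;
    private readonly T _arg2;
    private readonly IEnumerable<T> _enumerable;
    private readonly int _count; // negative means "backed by _enumerable"

    public Arguments(T arg1) : this() { _arg1 = arg1; _count = 1; }
    public Arguments(T arg1, T arg2) : this() { _arg1 = arg1; _arg2 = arg2; _count = 2; }
    public Arguments(IEnumerable<T> enumerable) : this() { _enumerable = enumerable; _count = -1; }

    // Public struct-returning GetEnumerator: foreach over Arguments<T>
    // binds here, so the small-argument path allocates nothing.
    public Enumerator GetEnumerator() { return new Enumerator(this); }
    IEnumerator<T> IEnumerable<T>.GetEnumerator() { return GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public struct Enumerator : IEnumerator<T>
    {
        private readonly Arguments<T> _args;
        private readonly IEnumerator<T> _inner;
        private int _index;
        private T _current;

        internal Enumerator(Arguments<T> args) : this()
        {
            _args = args;
            _inner = args._count < 0 ? args._enumerable.GetEnumerator() : null;
            _index = -1;
        }

        public T Current { get { return _current; } }
        object IEnumerator.Current { get { return Current; } }

        // Switches between the inline fields and the wrapped enumerable.
        public bool MoveNext()
        {
            if (_inner != null)
            {
                if (!_inner.MoveNext()) return false;
                _current = _inner.Current;
                return true;
            }
            _index++;
            if (_index >= _args._count) return false;
            _current = _index == 0 ? _args._arg1 : _args._arg2;
            return true;
        }

        public void Dispose() { if (_inner != null) _inner.Dispose(); }
        public void Reset() { throw new NotSupportedException(); }
    }
}
```

With this shape, `Method(1, 2)` could lower to `Method(new Arguments<int>(1, 2))` with no heap allocation, while `Method(someEnumerable)` would simply wrap the sequence.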
@jaredpar It sounds like you want to use the `calli` and `arglist` IL instructions in C# code... I could see a signature like this working pretty well:

```csharp
WriteLine(params ArgIterator args)
```
@sharwell yes and no. Those definitely achieve part of the problem: an efficient way of calling a method with an arbitrary number of arguments. It doesn't hit the other part though, which is easy interaction with .NET collections.
I actually see your concern as a separate issue from `params IEnumerable<T>`. This topic (`IEnumerable<T>`) is about exposing a convenient method for simplified APIs, where memory allocation characteristics are often (but not always) much less of a concern.
Having yet another separate class to represent `params` defeats most of the purpose of this feature, which is to allow an API to expose a method that accepts `IEnumerable<T>` and support using `params` without having to explicitly write two separate methods.
Also, the performance of using `foreach` over an `IEnumerable<T>` which happens to be an instance of `T[]` is quite efficient, generally on par with (if not faster than) using `for`, as unintuitive as that sounds.
@HaloFour @sharwell
The intent of `params Arguments<T>` is to accept both individual arguments and `IEnumerable<T>` values equally well. There would be no need for `params IEnumerable<T>` in this scenario because it would already be covered.
```csharp
void Method(params Arguments<int> args) { ... }

void Example(IEnumerable<int> e)
{
    Method(42); // binds to Method(new Arguments<int>(42))
    Method(e);  // binds to Method(new Arguments<int>(e))
}
```
@jaredpar
I'm still not seeing any advantage to having an `Arguments` intermediary. If an API has specific optimized paths for dealing with a small number of arguments then even with `Arguments` they would have to be coded separately anyway, so why is `Arguments` an improvement over having the overloads? The separate overloaded methods already provide the appropriate separation for the different algorithms, and passing the arguments as individual values on the stack is more efficient than copying those values into a struct and copying that struct. If the API doesn't have specialized paths and will just enumerate the arguments via `for` or `foreach` then there is no performance benefit to this struct, even with 2 or fewer arguments.
@HaloFour
The advantage is simply avoiding the allocation for the collection at the call site. It's not about specific optimizations within the method. The allocation may seem small, and typically is in isolated scenarios. Added up over an entire application, though, the array allocations become significant (and quickly).
I've worked on several projects where we've measured significant performance improvements by adding a few overloads to `params` methods that avoid call site allocations. The implementation of the non-params overload had no special optimizations. The algorithms, minus the loop, were identical to the `params` overload.
This is why I don't see the value in adding `params IEnumerable<T>`. It is solving only one of the current issues with `params` (not being inclusive to all collection types). I'd much rather invest in a solution that solves all of them. A solution like `Arguments<T>` has the possibility of doing so because it can accept individual arguments or any `IEnumerable<T>` as input (which in turn includes `T[]`), and because it can avoid the call site allocations that `params T[]` methods incur today.

> The algorithms, minus the loop, were identical to the params overload.
Which is a different algorithm by definition since you're treating the arguments differently. You're better off keeping the specialized overloads since bouncing through a struct intermediary will be slower, even if you reference the fields containing the values directly. If you have to instead bounce through an indexer it will be significantly slower, on par with the speed of just working with an array.
The purpose of supporting `params IEnumerable<T>` is for the cases where performance isn't important, because the performance will absolutely be slower even compared to enumerating over `params T[]`. It's simply to eliminate the need to write that additional overload, as this has become a common pattern:
```csharp
public void Foo(params int[] args) {
    Foo((IEnumerable<int>)args);
}
public void Foo(IEnumerable<int> args) {
    // enumerate here
}
```
For those cases where the performance is important and you can behave differently given a small number of arguments the compiler story is already quite good through overload resolution. I see no need to complicate that and force a single code path which couldn't be a single code path anyway.
> You're better off keeping the specialized overloads since bouncing through a struct intermediary will be slower, even if you reference the fields containing the values directly
Completely disagree. We've tested out solutions in the past and found that the allocation benefit dominates any sort of indirection you would get from the struct.
Note: `IEnumerable<T>` adds even more indirection than `T[]` given that it will be
> It's simply to eliminate the need to write that additional overload as this has become a common pattern:
The `Arguments<T>` solution would fix the exact same scenario. It can accept individual arguments and `IEnumerable<T>` values.
Why push for a solution that is slower and allocates more memory over one which solves the same problem plus additional scenarios?
> Why push for a solution that is slower and allocates more memory over one which solves the same problem plus additional scenarios?
Because those additional scenarios don't benefit without adding onus onto the developer of the method to write even more code than they need to today.
The following is slower than `params T[]`:
```csharp
void Foo(params Arguments<int> args) {
    for (int i = 0; i < args.Count; i++) {
        int arg = args[i];
        // do stuff here
    }
}
```
And the following is slower than overloads:
```csharp
void Foo(params Arguments<int> args) {
    if (args.Count == 2) {
        // assuming accessible public readonly fields; using indexers here is significantly slower
        int x = args.arg1;
        int y = args.arg2;
    }
    // need to handle other possibilities here as well
}
```
The one place where this could be nominally faster is in the case of `params IEnumerable<T>` where optimized `IEnumerable<T>` implementations can be provided for specific argument counts, which is something that the C# team could do when consuming `params IEnumerable<T>` rather than just emitting an array.
@HaloFour the `Arguments<T>` type implements `IEnumerable<T>` using a struct based enumerator. I would expect the vast majority of consumers to use a `foreach` loop.
*edit*: typo
@HaloFour talk about a bad time for typos. I meant to say the exact opposite of that :( Edited the comment.
> I would expect the vast majority of consumers to use a foreach loop.
I agree, and in those cases I do think that it would be worthwhile for the C# team to emit specialized implementations of `IEnumerable<T>` and `IEnumerator<T>` to improve performance, but that is a detail that could be hidden. The one gotcha there is if the method is written to also check whether the `IEnumerable<T>` is a `T[]`, as that would no longer be true.
Hmm, and if you want more than 2 arguments what happens? Will the compiler go and create an array and pass it to `Arguments<T>`? And if you need arguments of different types and they're value types you still end up allocating due to boxing; the only T you can use in such cases is `Object`. I'm a bit of a performance freak myself but this particular optimization attempt seems a bit overdone.
And I can't help not to notice that the allocation cost problem comes up quite often and every time we end up with solutions that are either incomplete or have some other not so great effects. Back in .NET 2.0 generic collections got struct enumerators, great, allocations avoided. And then you look at the JIT generated code and go hrmm. It looks like RyuJIT will produce better code but it took "only" 10 years.
Maybe at some point we need to accept that the system works the way it works and it has certain performance characteristics. If you want to make it better, well, use .NET Native, add escape analysis and stack allocation to it and call it a day.
Just my 2 cents.
@mikedn

> Hmm, and if you want more than 2 arguments what happens? Will the compiler go and create an array and pass it to Arguments<T>?

Yes.

> And if you need arguments of different types and they're value types you still end up allocating due to boxing, the only T you can use in such cases is Object

Correct, but this is not a new problem. It already happens today with `params`.
It is definitely something I would love to see solved. So far though an elegant solution hasn't presented itself. If it did though I would likely be very interested in that as well.
> Back in .NET 2.0 generic collections got struct enumerators, great, allocations avoided.
Struct based enumerators have been around since 1.0. It was the original way to have type safe, non-allocating enumeration. I do share the frustrations on enumerators though and I've written some thoughts about it here: http://blog.paranoidcoding.com/2014/08/19/rethinking-enumerable.html
> Maybe at some point we need to accept that the system works the way it works and it has certain performance characteristics.
Speaking as someone who's worked on a lot of perf sensitive applications over the years: allocations matter much more than most developers give them credit for. Most performance investigations end up doing little more than trying to reduce the GC time which translates into curbing unnecessary allocations.
Any time we create a feature in the language that has unnecessary allocations, it's a feature that will likely be avoided by perf sensitive applications. I'd much rather focus on features that are applicable to all types of programs.
@jaredpar
I do like elements of your rethinking on `IEnumerable<T>`. Without `IEnumerator<T>` don't you lose the capacity for generic variance? Also, and probably a fairly minor tweak, I'd prefer `TEnumerator` to have a generic constraint of `IDisposable`, although I guess the compiler could just emit a `try/finally` which would check to see if `TEnumerator` was disposable and, if so, call `Dispose()`. Variance aside, I love the idea of `current` being an `out` parameter. `TEnumerator` being a `ref` seems a tad weird but I get why you do it.
In the end, though, I think I'd rather be stuck with one slightly-less-perfect method than have a bunch of disparate but similar methods. `IEnumerable<T>` is still better than Java's `Iterable<T>` or `Enumeration<T>`.
@HaloFour, but at the same point, unless you are writing some very highly specialized code like LINQ where you have optimized paths for different types of `IEnumerable<T>`, it's very uncommon to actually inspect the underlying type of an `IEnumerable<T>`.
@mikedn, but boxing is always a problem if you have incompatible types. If you need different types, and presumably need different handling for said types, you have no choice but to box. As for the rest of your points I'm not sure.
I like @jaredpar's idea as it helps solve the extra overloads problem that my original suggestion was getting at.
e.g.

```csharp
void SomeMethod(T item1, T item2)
void SomeMethod(params T[] items)
void SomeMethod(IEnumerable<T> items)
```
Under Jared's model this would be one method with one implementation. I also agree that `IDisposable` is very useful on enumerators, especially those generated from `yield return`. Sometimes you want to tie resource lifecycles to iteration.
I think `TryGetNext` isn't bad but I'd probably want to still keep the abstraction of an `Enumerator`.
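For readers who haven't followed the blog link, here is a rough paraphrase of the `TryGetNext` shape being discussed. The interface name, members and the array-backed implementation are all illustrative, not the post's actual API:

```csharp
using System;

// Illustrative only: a TryGetNext-style enumeration interface where the
// enumerator state travels by ref and the element comes back as out.
public interface ISequence<TElement, TEnumerator> where TEnumerator : struct
{
    TEnumerator Start { get; }
    bool TryGetNext(ref TEnumerator enumerator, out TElement value);
}

// An array-backed implementation where the enumerator is just an index,
// so enumeration needs no heap allocation and no separate enumerator object.
public sealed class ArraySequence<T> : ISequence<T, int>
{
    private readonly T[] _items;
    public ArraySequence(T[] items) { _items = items; }

    public int Start { get { return 0; } }

    public bool TryGetNext(ref int index, out T value)
    {
        if (index < _items.Length)
        {
            value = _items[index++];
            return true;
        }
        value = default(T);
        return false;
    }
}
```

A loop then becomes `var cursor = seq.Start;` followed by `while (seq.TryGetNext(ref cursor, out item)) { ... }`, which is the shape a compiler could emit for `foreach`.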
@mburbea Just supporting `params IEnumerable<T>` also solves the additional overloads "problem" and lets you have one method with one implementation. You could then opt into the additional overloads to provide optimized scenarios for accepting arrays or specific numbers of arguments, but only if you wanted to. If your one implementation is just going to be using `foreach` or LINQ then it doesn't really matter, as you're using the most expensive implementation anyway. But if you know that you can have an optimized path given exactly one or two arguments then it always makes sense to have the overloads, since that already works very well with overload resolution and provides the least expensive path for passing the values.
@jaredpar Actually struct enumerators were added in 2.0, check ArrayList for example, it has a class enumerator and its GetEnumerator returns IEnumerator. And yes, I read your blogpost and I know you worked with Joe Duffy on a certain project :smile:.
Oh well, I suppose it makes sense for the compiler and framework to strive to minimize allocations. The main problem is how the indexer will deal with `IEnumerable<T>`:

```csharp
T this[int index] {
    get {
        if (_count > 2) {
            var list = _enumerable as IList<T>;
            if (list == null) {
                list = _enumerable.ToArray();
                _enumerable = list;
            }
            return list[index];
        }
        ...
    }
}
```
As for boxing, that's likely unavoidable. You'd need something similar to C++'s variadic templates to make it work.
@mikedn guess you learn something new every day. I would have sworn struct enumerators were in 1.0. :)
@HaloFour, when I'm at the point where I'm taking variadic arguments I rarely will be doing much different with one argument or N arguments, e.g. `DoSomething(arg1)` vs `foreach (var arg in args) { DoSomething(arg); }`. I'm after that easier calling convention. Perhaps I could avoid the loop, but this type of code usually isn't the performance bottleneck for me.
I probably would use it like `params IEnumerable<T>` as long as `Arguments` implements `IEnumerable<T>` and the compiler would handle the conversion itself. Can the JIT put structs like this into registers or is that a bridge too far?
The JIT may place a struct in a register if it contains only one field that fits in a register (a field of type `int` for example), so a multi-field `Arguments` would not qualify.
@mburbea In which case you'd likely not be writing those additional overloads anyway as they would serve no purpose for you and having an intermediate struct would be of no benefit, either syntactically or performance-wise.
I believe where you would see the improvement of an intermediate struct would be in having the compiler emit such a struct as the `IEnumerable<T>` rather than just creating a `T[]`, as the former can be more optimized than the default array enumeration.
@HaloFour What do you mean by "the former can be more optimized than the default array enumeration"? Array enumeration is as fast as it gets; there's nothing more optimized than that. Maybe you meant the opposite, or did I read it wrong?
I think he means array enumerators aren't the fastest. e.g.

```csharp
void SomeFunc(T[] args) {
    foreach (var a in args) { DoSomething(a); }
}
```

will actually get compiled as a `for` loop to avoid creating an enumerator. If you instead change the signature to `IEnumerable<T>` it'll actually be slower than if you passed it a `List<T>`. This has something to do with the `SzArrayHelper` JIT stuff, last I read about it on Stack Overflow.
It's too late now but is there any reason that the array enumerator isn't a struct one?
@mikedn The compiler will automatically convert a `foreach` to a `for` over an array, which is definitely a lot faster. If you cast the array to an `IEnumerable<T>` prior to `foreach` the enumeration is a good 5-6 times slower. Custom enumerators can beat that by a good 20% margin or so.
@HaloFour Exactly, the `foreach` is converted to `for`, so what's this 20% faster custom enumerator thing? An enumerator that tries to cast the `IEnumerable<T>` back to an array?
@mburbea I know that and HaloFour knows it too; the misunderstanding is somewhere else. And a struct array enumerator wouldn't help, as it would end up boxed to `IEnumerator`.
@mikedn This feature request is to support `params IEnumerable<T>`, which is what makes this relevant. In terms of the existing `params T[]` support, you're correct, although if people are just passing that array to an overload that accepts `IEnumerable<T>` then the benefit is lost.
In the C# 6.0 timeframe the proposed solution was to support `params IEnumerable<T>` and to have the caller emit `new T[] { 1, 2, 3 }`, so we're back to enumerating via `IEnumerable<T>` unless that method converts/casts to an array, which would kind of defeat the point of supporting `params IEnumerable<T>`. My statement is that if the compiler instead emitted a `struct` which had a better `IEnumerable<T>`/`IEnumerator<T>` implementation then it could edge out the performance of the array enumeration in that case.
C# 5.0:

```csharp
public void Foo(params int[] args) {
    Foo((IEnumerable<int>)args);
}
public void Foo(IEnumerable<int> args) {
    // do stuff with args
}
...
Foo(1, 2, 3); // -> Foo(new int[] { 1, 2, 3 });
```
C# 6.0 proposal which was cut due to schedule:

```csharp
public void Foo(params IEnumerable<T> args) {
    // do stuff with args
}
...
Foo(1, 2, 3); // -> Foo(new int[] { 1, 2, 3 });
```
Alternate proposal:

```csharp
public void Foo(params IEnumerable<T> args) {
    // do stuff with args
}
...
Foo(1, 2, 3); // -> Foo(new CompilerGeneratedArgsStruct(1, 2, 3));
```
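One guess at what such a compiler-generated struct might look like (the name comes from the snippet above; everything else here is illustrative):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Illustrative guess at a compiler-generated three-argument struct.
public struct CompilerGeneratedArgsStruct : IEnumerable<int>
{
    private readonly int _a, _b, _c;
    public CompilerGeneratedArgsStruct(int a, int b, int c) { _a = a; _b = b; _c = c; }

    // Public struct-returning GetEnumerator so foreach avoids boxing when
    // the concrete type is known; the interface versions box as usual.
    public Enumerator GetEnumerator() { return new Enumerator(this); }
    IEnumerator<int> IEnumerable<int>.GetEnumerator() { return GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public struct Enumerator : IEnumerator<int>
    {
        private readonly CompilerGeneratedArgsStruct _args;
        private int _index;
        internal Enumerator(CompilerGeneratedArgsStruct args) { _args = args; _index = -1; }

        public int Current
        {
            get { return _index == 0 ? _args._a : _index == 1 ? _args._b : _args._c; }
        }
        object IEnumerator.Current { get { return Current; } }
        public bool MoveNext() { return ++_index < 3; }
        public void Dispose() { }
        public void Reset() { _index = -1; }
    }
}
```

Note that a callee which only receives it as `IEnumerable<int>` would still box the struct and its enumerator.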
@HaloFour OK, so it's like I said in my previous post: the claimed performance improvement would come from an enumerator which tries to cast back to array (or `IList<T>`). All clear.

As I showed in a previous post, something like this also has to happen in the indexer. Let's not forget that methods like `String.Format` do not need to enumerate the params; they need to access them by index, and they may access an index more than one time.
@mikedn Well, the performance improvement is because I can write a better case-specific implementation than `SzArrayEnumerator`. :smile:

I do think that the indexer case is already covered; just continue to use the existing `params T[]`. You can then opt in to a non-`params` overload that accepts an `IEnumerable<T>` which would cast/convert to an array before passing it to the `params` overload.
Actually, from just a quick set of trials, Array seems really fast when passed to a method that takes an `IEnumerable<T>`. Even for a small number of arguments (I tried 2, 3 and 4) it won every time. I suppose it may not be the best from a memory perspective but it's hard to beat unless you change the signature from `params IEnumerable<T>`.
@mburbea
I was able to handily beat the array enumerator by using the following:
I ran that through the following test methods:
```csharp
public int Test1(CompilerGeneratedArgs3 args) {
    int tally = 0;
    foreach (int value in args) {
        tally += value;
    }
    return tally;
}
public int Test2<TParams>(TParams args) where TParams : struct, IEnumerable<int> {
    int tally = 0;
    foreach (int value in args) {
        tally += value;
    }
    return tally;
}
public int Test3(IEnumerable<int> args) {
    int tally = 0;
    foreach (int value in args) {
        tally += value;
    }
    return tally;
}
```
With 10,000,000 hard-loop iterations the performance using .NET 4.5.2 was as follows:
```
Test1: 00:00:08.3613572
Test2: 00:00:08.6540710
Test3: 00:00:09.2925556
Test3/Array: 00:00:11.3223353
```
Here's the source if you want to give it a spin:
You didn't beat the array enumerator, your code simply avoids an allocation, the array. If you hoist the array allocation out of the loop in Test3/Array then you'll get similar results:
```
Test1: 00:00:04.3801256
Test2: 00:00:04.2367054
Test3: 00:00:04.7327582
Test3/Array: 00:00:04.8988729
```
Do you mind using a gist? I did the following and I came to the conclusion that arrays seem to be a winner most of the time. I wrote these quickly so they may not be the best struct enumerators possible.
```
CallWithArray : 527ms
CallWithList : 1301ms
CallWithArg2 : 677ms
CallWithArg2Unsafe: 638ms
```

My times with this gist: https://gist.github.com/mburbea/683f74ff5cc589d512c5
This is doing 1<<24 iterations so quite a good bit. Every method calls the sum method taking 2 bytes and using an enumerator to sum them. Even an unsafe buffer was slower than an array.
@mikedn Sure, but that's not what `params` is going to be doing. The allocation is a big part of it, but even without that the custom struct does edge out an array.
Sheesh, this is getting complicated and I'm not sure why.
@HaloFour But your example isn't what the proposed `params` is going to be doing either. `Arguments<T>` will need to also be able to wrap an existing `IEnumerable<T>`, and as a result the code will be more complex and probably slightly slower. And your example fails to actually make use of the `struct` enumerator because it boxes it to `IEnumerator<T>`, and that means that `Arguments<T>` will actually be faster. So the only claim that you can make is that you beat the array enumerator, and it turns out that this doesn't have anything to do with the enumerator itself and has everything to do with allocation.
@mburbea Trying List
The differences between array, arg2 and arg2unsafe appear to be order dependent, if you run CallWithArg2 first then you'll see that it's faster than CallWithArray. That's probably a side-effect of GC and that means you need to take the differences with a grain of salt. Doing benchmarks that involve allocations is problematic. Take out GC.Collect and run the tests multiple times to get a more realistic picture.
@mikedn
I agree, this is getting complicated.
I meant that `params` wouldn't be caching the array off, so claiming that doing so evens out the benchmark is entirely pointless. I don't cache the initialization of the structs in my tests either, because that's not realistic use.
What stands as the "proposal" at this point? I've seen what @jaredpar wants to do (and I know that he's on the Roslyn team) but I've not seen anything "official" beyond what was on CodePlex, which was, simply, `params IEnumerable<T>` and the compiler emitting a normal array allocation like it does today.
I do agree that going straight struct is faster, but how does that play out with something like `Arguments<T>`? A single struct can't represent the wide range of possibilities that a compiler-emitted struct could represent, adapting to the appropriate count of arguments rather than carrying the extra baggage of representing all of those alternative cases.
And does going with `Arguments<T>` actually meet the needs of the request? If the point is to be able to have a single method accept an `IEnumerable<T>` as well as being `params`, then having yet another type doesn't satisfy that desire. You end up with two methods, one accepting `params Arguments<T>` and another accepting `IEnumerable<T>`. Sure, the developer could manually call the one accepting the `Arguments<T>`, but does it make sense to use a construct for variadic arguments when you're not supplying variadic arguments?
Also, wouldn't `Arguments<T>` necessitate the release of a new framework to even permit this capability? The `IEnumerable<T>`/array based approach at least would work on every version of .NET going back to 1.0 (assuming you could use the non-generic version).
Structs are faster; I just don't think enough so to matter.
@mikedn
> ... but I've not seen anything "official" beyond what was on CodePlex which was, simply, params IEnumerable and the compiler emitting a normal array allocation like it does today.
The `IEnumerable<T>` version emits an array and an enumerator allocation.
> You end up with two methods, one accepting params Arguments and another accepting IEnumerable.
Why? If `Arguments<T>` has an implicit conversion from `IEnumerable<T>` then there is no need for an overload.
> structs are faster, I don't think enough so to matter
I strongly disagree. It matters quite a bit.
Not saying this based on personal belief. Saying this based off of real world profiling in large systems. The allocations matter quite a bit when considered across the system.
@mikedn, I was trying with permutations on the orders of the Tests array and I was consistently getting that CallWithArray would be the fastest no matter what. I tried with and without the GC.Collects.
As for the `Arguments<T>` business, as I understood it, it would be a new struct that would probably have an implicit conversion operator from `IEnumerable<T>`, and the compiler would stick a small number of arguments into reified fields while for larger amounts it would just allocate an array. Since the arguments would be passed as an explicit argument, a struct enumerator would be beneficial. However, as I suggested on UserVoice a billion years ago, I'd be 100% happy with what was originally suggested for C# 6.
@jaredpar Well, if that's a separate option from `params IEnumerable<T>` then it at least hits all of this feature request.
How about also handling the following signature?

```csharp
public void Foo<TParams>(params TParams args) where TParams : struct, IEnumerable<int> { ... }
```
My primitive testing put it on par with a specialized `struct`, and if the compiler generated the implementation it could be tailored to the type and number of arguments, which should be a little more efficient.
@jaredpar You replied to the wrong person :smile:
@mburbea I tried your code again on an x64 machine with RyuJIT; in that case the array test is consistently faster. But the code in Arguments2 is bad: the call to the indexer from Current isn't inlined, and using switch is a bad idea. Fix it and it will be slightly faster than the array test. I'm not sure what's going on with the unsafe thing; that one is slow for no obvious reason.
@HaloFour Mixing generics with structs can be a problem. If you create CompilerGeneratedArgs1, CompilerGeneratedArgs2 and CompilerGeneratedArgs3 then you'll end up with 3 native versions of `Foo`.
@mikedn blarg. Need to drink more coffee :(
@HaloFour
For me the mainline performance case is avoiding creating arrays for calls that have 1-3 arguments, in particular avoiding allocating the memory for those elements on the heap.

The solution you are proposing won't help that because it will still end up allocating memory for the elements on the heap. Even if the collection type is generated by the compiler as a value type it still has this problem, because it must produce an `IEnumerator<T>` instance. This enumerator must necessarily refer to the original elements, which in turn forces them to be on the heap somewhere.
@mikedn That would be a one-time cost though and shouldn't really amount to much. The advantage is that the generated structs wouldn't need all of this additional general purpose code or additional fields to represent the other use cases. Having these multiple types doesn't seem to be a problem where anonymous types are concerned.
@jaredpar Wouldn't that be true of `Arguments<T>` as well if all you're doing is using `foreach` over it or using it with some LINQ extension methods?
@HaloFour Maybe it doesn't look like much but I don't think the compiler guys will easily agree to enforcing a pattern that can lead to native code bloat. And anonymous types have nothing to do with this, they're reference types so even when used with generics they don't force native code unsharing.
And to spare Jared Parsons the trouble: no, if you use foreach with a proper struct enumerator there's no allocation. The problem with your suggestion is that it forces the code in Foo to access the enumerator via `IEnumerable<T>.GetEnumerator` and that will cause boxing.
As for using LINQ extension methods, yes, that will cause allocations no matter what you do. In this regard `Arguments<T>` will probably be worse than an array.
@HaloFour
It wouldn't be true for `foreach`. Because `Arguments<T>` is a concrete type, the `foreach` loop can take advantage of the `struct` enumerator on the type. This avoids the heap allocation.
If you used LINQ methods then yes, that would cause the allocation. But that is true for every collection type due to the architecture of the enumerator pattern.
@mikedn Anonymous types themselves are reference types, but every different variation that happens to contain a value type would result in a new version generated by the runtime, no? Same as how `List<object>`, `List<string>`, `List<int>` and `List<double>` result in three different flavors of the `List<T>` class actually in use.
@jaredpar Ah, I see what you mean. The struct could implement `IEnumerable<T>` explicitly and then offer a public `GetEnumerator()` method that returns a concrete struct type that implements `IEnumerator<T>`, and the compiler will never box that struct. I actually didn't know that the compiler could do that. I guess that does preclude the ability to use a compiler-generated struct.
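This is the same pattern the BCL already uses for `List<T>`: its public `GetEnumerator()` returns the struct `List<T>.Enumerator`, while the interface implementation boxes that same struct. A quick illustration of the two foreach bindings:

```csharp
using System.Collections.Generic;

var list = new List<int> { 1, 2, 3 };

// foreach over the concrete type binds to List<int>.GetEnumerator(), which
// returns the struct List<int>.Enumerator: no heap allocation.
int sum1 = 0;
foreach (var x in list) sum1 += x;

// foreach over the interface goes through IEnumerable<int>.GetEnumerator(),
// which returns IEnumerator<int>: the struct enumerator gets boxed.
IEnumerable<int> seq = list;
int sum2 = 0;
foreach (var x in seq) sum2 += x;
```

Same results, but the second loop pays for a boxed enumerator and interface dispatch on every `MoveNext`/`Current`.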
I tossed that into my tests and the performance difference is pretty staggering:
```
struct enumerable with struct enumerator: 4.7830485 sec
struct enumerable with boxed struct enumerator: 8.4173345 sec
generic enumerable with struct constraint: 8.1790828 sec
boxed enumerable with boxed struct enumerator: 9.0478730 sec
array: 14.0110617 sec
```
You learn something new every day.
@HaloFour
So, because `TParams` is a `struct` you'll get one native version for every TParams type that you are using; that can easily get out of hand. You don't know how big your Foo method will be in practice; maybe someone will write a 1000-line method and want it to have `params`.
Anonymous types have little code. In particular, the property getters are trivial and candidates for inlining. Additionally, anonymous types fulfill a certain practical need but your proposition doesn't do that. It's an attempt at optimization that risks being a deoptimization if the code gets large.
Does the proposed struct `Arguments<T>` _always_ require the implementer to check for the number of arguments passed and treat them differently based on whether it's one, two or many arguments? Or does it allow you to enumerate over it directly?

```csharp
void Foo<T>(params Arguments<T> args) { foreach (var arg in args) { } }
```
I believe the idea is you can just foreach it and it will do the right thing (TM). The main optimization is apparently around 3 or fewer arguments, which is the most common use of params anyway, from what I understand of Jared's idea. If you are writing code that may want to specialize on a smaller number of arguments then the indexer will be exposed, but if you just want nicer calling conventions then it's fine to just foreach it.
@MgSam the `Arguments<T>` type implements `IEnumerable<T>`, hence it can be used with `foreach` exactly as you stated.
While the performance implications of each feature proposal are quite important, my understanding was that the params IEnumerable<T>
feature was put forth simply to allow API developers to more easily expose cleaner and simpler code to consumers. Optimized overloads can still be provided.
The original feature, as it was proposed for C# 6, has plenty of use cases. For example, I often find myself writing overloads of constructors or factory methods that take params T[]
in order to provide a more readable syntax for use when manually writing test data. I dislike having to have multiple overloads simply to provide syntactic convenience to calling code, but I find it necessary. This is a nice simple feature proposal and it provides value.
I think @HaloFour sums this up nicely in his/her initial three posts in this discussion.
Couldn't we just
void Func(params items)
and
void Func<T>(params<T> items)
@Thaina What would the type of items
be in those cases? You're not specifying the element type of items
, let alone how the multiple values should be represented.
@HaloFour IEnumerable and IEnumerable<T>. It's just dropping the full params IEnumerable items in favor of the short params. We can't do params List or params Dictionary anyway, so just params to represent IEnumerable:
params items — element type is object
params<T> items — element type is T
done
@Thaina
Per this proposal you would be able to pass either params T[] items
or params IEnumerable<T> items
. You may have had an argument before this proposal, since you could only pass arrays, but I would argue that specifying that the type is an array is still more appropriate, since the parameter type actually is an array (and you can pass a raw array); the params is just metadata to let compilers know to let their callers treat the method as vararg.
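That metadata is concrete: params compiles down to System.ParamArrayAttribute on the parameter, which caller-side compilers recognize and expand variable arguments for. A small illustration (method name hypothetical):

``` C#
using System;

static class Printer
{
    // 'params' here is emitted as [System.ParamArrayAttribute] on 'args'.
    // (C# won't let you apply the attribute explicitly; it forces the keyword.)
    static void WriteAll(params object[] args)
    {
        foreach (var arg in args) Console.WriteLine(arg);
    }

    static void Demo()
    {
        WriteAll("a", "b");                  // compiler supplies new object[] { "a", "b" }
        WriteAll(new object[] { "a", "b" }); // array passed through unchanged
    }
}
```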
@HaloFour That's the point. I want to leave params T[] items as it is, so it stays backward compatible, but introduce params items instead of params IEnumerable items, because we shouldn't need params AnyTypeInTheFuture items anyway.
And with ref returns, IEnumerable should also support ref yield so we could foreach with i++ instead of relying on an array.
And we could cast items as T[] if we really, really need an array.
Just curious, what's the status of this now we're in 2016?
The change of year hasn't had any effect on this.
lol. Oh I don't know @gafter. It'll be Spring before you know it.
@gafter Lol, good one.
Seriously, though. This is on the list of things we'd be happy to see get into C# 7 if time permits. A community member's contribution of an implementation along with tests would go a long way toward making that happen (hint hint).
@Thaina the syntax you suggest seems highly irregular as it is a keyword followed by a generic type parameter. Also why do you assume that there would never be params IDictionary<TKey, TValue> items
? Although it is not currently proposed, why rule it out?
@gafter I'm honestly a little scared of writing code for Roslyn (I'm not even familiar with the API), so I don't know if I'd be the right choice haha. Would be really awesome if someone else could start work on it though.
@aluanhaddad The syntax would be
void M(params IEnumerable<string> strings) ...
I know what the spec would look like for that (the compiler would continue to create an array for the arguments, as it does today). I have no idea what spec you have in mind for other types. I don't know that it has been ruled out so much as it has not yet been proposed or considered.
@gafter I don't have any spec in mind nor am I proposing anything. I was just saying that I dislike the idea of using the syntax params<string> items
instead of params IEnumerable<string> items
because it would rule out using other types.
@aluanhaddad params IDictionary<TKey, TValue> items
is params IEnumerable<KeyValuePair<TKey, TValue>> items
Has anyone considered what would/should happen if a developer passes a single IEnumerable<T>
as the params
parameter's value? Should the source IEnumerable<T>
be iterated (say, to create the array) when it is passed? What happens if the source IEnumerable<T>
is some kind of stream that has no end? Shouldn't the receiving method initiate the iteration and not the method call itself?
@nathan-alden Depending on the type it will create an array, just like existing params
on arrays.
F( "s" ); // creates an array
F(new string[] { "s" }); // pass
G(new string[] { "s" }); // creates another array
void F(params string[] a){}
void G(params string[][] a){}
@alrz I'm not sure I understand your answer. Allow me to demonstrate with an example. Currently, this is what we all have to write in our libraries and frameworks:
public void MyMethod(params object[] x)
{
MyMethod((IEnumerable<object>)x);
}
public void MyMethod(IEnumerable<object> x)
{
}
In cases where the developer calls the IEnumerable<object>
overload, the underlying enumerable won't be iterated until MyMethod
iterates it (if it ever does). With some of the above comments, it seems like a new array would first be populated, then passed to the params IEnumerable<object> x
parameter:
public void MyMethod(params IEnumerable<object> x)
{
}
IEnumerable<object> expensiveToIterate = GetExpensiveEnumerable();
MyMethod(expensiveToIterate); // The above proposals would call for this enumerable to be iterated and thrown into an array
See the issue?
@nathan-alden The compiler isn't going to materialize the enumerable into an array. It will only create a new array in the case of trying to call the method with multiple arguments:
IEnumerable<object> expensiveToIterate = GetExpensiveEnumerable();
MyMethod(expensiveToIterate); // passes the enumerable as-is
MyMethod("a", "b", "c"); // passes new object[] { "a", "b", "c" }
@nathan-alden Note that this proposal exists largely to eliminate the pattern that you're following, where you want to accept an IEnumerable<T>
but you have an overload that accepts params T[]
just to enable calling the method with variable arguments. Going forward you could just declare a single method accepting params IEnumerable<T>
and get the benefits of both.
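In other words, a hypothetical sketch of the pattern being eliminated (names invented, not the proposed implementation):

``` C#
using System;
using System.Collections.Generic;

static class Log
{
    // Today: two overloads are needed to support both calling conventions.
    public static void Write(params object[] items)
    {
        Write((IEnumerable<object>)items);
    }

    public static void Write(IEnumerable<object> items)
    {
        foreach (var item in items) Console.WriteLine(item);
    }

    // Under this proposal the pair collapses into a single method:
    //   public static void Write(params IEnumerable<object> items) { ... }
    // Write("a", "b");        // compiler supplies new object[] { "a", "b" }
    // Write(lazyEnumerable);  // passed as-is, never materialized
}
```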
Yep yep, cool. Your first reply answers my question. Thanks! :+1:
I would like to:
``` C#
public void Foo1(params T[] args) { ... }
public void Foo2(params IEnumerable<T> args) { ... }
public void Foo3(params IReadOnlyCollection<T> args) { ... }
public void Foo4(params IReadOnlyList<T> args) { ... }
```
Method calls:
``` C#
FooX(a, b, c);
```
are translated as:
``` C#
FooX(new T[] { a, b, c });
```
@SergeyZhurikhin
Might as well include IList<T> and ICollection<T> to complete the list of generic interfaces implemented by a single-dimension array, no?
@HaloFour
Absolutely not!
For example:
``` C#
public void Foo4(params IList<T> args) { args.Add(t); ... }
```
That would be a runtime error, just like it would be today if you passed an array manually.
@HaloFour
So I'd make it a compile-time error.
I agree with @SergeyZhurikhin. A method getting an IList<T>
is likely to call Add
, so it is not a good idea to throw exceptions with normal usage of a language feature.
@alrz
Any such method is already taking that risk if they don't first check the IsReadOnly
property. The IList
interface makes no claim that the implementation is writable. Arrays have always implemented these interfaces.
This would also provide an option to pass an interface of an indexable list with params
when targeting a framework older than 4.5.
Not that I care all that much either way.
@HaloFour, @alrz
``` C#
public void Foo4(params IList<int> args) { args[1] = 5; }
...
Foo4(1, 2, 3); // - Tolerated!
var d = new[] { 1, 2, 3 };
Foo4(d);
Print(d); // int[3] { 1, 5, 3 } - Nightmare!!
```
@SergeyZhurikhin
That's already perfectly legal with params
:
public void Foo4(params int[] args) {
args[1] = 5;
}
...
var d = new [] { 1, 2, 3 };
Foo4(d);
Debug.Assert(d[1] == 5);
@HaloFour
Legal, yes, but legality doesn't help people avoid mistakes; we shouldn't build new reefs for ourselves to run aground on.
@HaloFour
Any such method is already taking that risk if they don't first check the IsReadOnly property. The IList interface makes no claim that the implementation is writable. Arrays have always implemented these interfaces.
And this is one of my least-favorite features of the BCL. The standard collection hierarchy is so ugly, and it seems it will stay that way forever, because backward compatibility is the holy cow for Microsoft. It's bizarre that array implements IList, yet half of the methods throw NotSupportedException. Is it a statically typed language or not?
@Pzixel compatibility is actually a really important feature but that's another topic.
The good news is that we as developers have access to modern, typesafe and well specified interfaces such as IEnumerable<T>
and IReadOnlyList<T>
.
@aluanhaddad the problem with IEnumerable is that it can be a query, which implies multiple consequences.
IReadOnlyList<T>
is good enough, but sometimes I need to modify something within.
A proper collection hierarchy was proposed several years ago on SO:
- Just Enumeration IEnumerable
- Readonly but no indexer (.Count, .Contains,...)
- Resizable but no indexer, i.e. set like (Add, Remove,...) current ICollection
- Readonly with indexer (indexer, indexof,...)
- Constant size with indexer (indexer with a setter)
- Variable size with indexer (Insert,...) current IList
Because now if I want to create my own collection with a Count property, I have to provide multiple methods which I really do not need; I just want to tell the user: "hey, this is a materialized collection with N elements, don't worry about multiple query executions".
IMHO the BCL team should provide a new, well-elaborated hierarchy of interfaces and insert it into the current hierarchy. It won't break anything; we'll just have new interfaces (for example, as shown above).
For example, split ICollection<T>
on ICommonReadonlyCollection<T>
and ICommonCollection<T>
, and then ICollection<T> : ICommonReadonlyCollection<T>, ICommonCollection<T>
(names are just for example).
So we'd have compatibility with a much better hierarchy. But it seems that the BCL team doesn't think the current situation is bad in any sense. It obviously is, because of the IsReadOnly
and IsFixedSize
properties. Why should I check those before interacting with a collection? Where is the polymorphism? If a method could be invalid in some situation, then the interface shouldn't contain it.
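A sketch of the split described above (all names are the invented examples from the comment, purely illustrative):

``` C#
using System.Collections.Generic;

// Read-only half: a materialized collection with a known Count.
public interface ICommonReadonlyCollection<T> : IEnumerable<T>
{
    int Count { get; }
    bool Contains(T item);
}

// Mutable half: adds modification without promising an indexer.
public interface ICommonCollection<T> : ICommonReadonlyCollection<T>
{
    void Add(T item);
    bool Remove(T item);
}

// The existing ICollection<T> could then be retrofitted to extend these,
// so old code keeps compiling while new code targets the precise contract.
```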
Just asking, would this support generic params?
I mean something like this
``` C#
void Iter<E, T>(params E collections) where E : IEnumerable<T>
{
    // iterate T
}
```
@Thaina you can replace IEnumerable
@Pzixel so you just want an interface with a Count
property. It will need to extend IEnumerable<T>
to be useful in most situations, so it's really just IReadOnlyList<T>
without the indexer. It doesn't seem like much of a problem. All that would need to be done would be to extract that interface into the BCL.
@aluanhaddad hmm, we have such an interface, IReadOnlyCollection<T>
, but the problem is that its name does not reflect its purpose. If a collection has a Count
property, that doesn't mean it's read-only. But at least we have it...
@jaredpar I wanted to weigh in on your Arguments
Is there any reason why you only have _one_ Arguments
_(Supporting the last one is important, because for many of us, the whole point of this request is to avoid having two method signatures for params vs BCL collections.)_
This will allow a smaller footprint too (the struct only stores the exact data that it needs, and only performs the exact logic it needs to in the intelligent indexers, etc.).
You could make an interface for the structs -- something like IArguments
Personally, I would prefer that our method signatures be _allowed_ to use IEnumerable
@BrainSlugs83
The point of the single Arguments
struct is avoiding having to hide them behind an interface. Doing so requires boxing the struct and performing virtual dispatch which would eliminate the performance benefit of going with an Arguments
struct in the first place.
@HaloFour I believe I addressed the boxing issue in my comment.
@BrainSlugs83
(If IArguments won't work because of boxing or some such, consider using function pointers to hook up the "intelligent" logic in the struct to emulate OOP behavior)
Care to expand that into something that applies to the CLR? IIRC, the only way you could do this would be to force the method to be generic, where the params
argument is of a generic type parameter constrained to IArguments
and struct
, which would allow for the IL constrained.
prefix and permit calling the members without boxing. But that would change the contract of the method.
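Roughly, the generic shape being described would look like this (IArguments&lt;T&gt; and the method are hypothetical):

``` C#
using System;

// Hypothetical interface the Arguments struct would implement.
public interface IArguments<T>
{
    int Count { get; }
    T this[int index] { get; }
}

static class Writer
{
    // Because TArgs is constrained to struct, the JIT specializes this method
    // per value type, and the interface calls are emitted with the
    // 'constrained.' prefix: no boxing, but the signature is now generic.
    public static void WriteAll<T, TArgs>(TArgs args)
        where TArgs : struct, IArguments<T>
    {
        for (int i = 0; i < args.Count; i++)
            Console.WriteLine(args[i]);
    }
}
```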
@HaloFour sure; it's a pretty common thing though (in non-OOP languages -- the idea is you at least get to avoid the speed/complexity hit of determining which block of code to run at runtime).
Here's some pseudo code in the context of the original struct:
public struct Arguments<T> : IEnumerable<T>
{
    // All values assigned at creation time.
    private T _arg1;
    private T _arg2;
    private IEnumerable<T> _enumerable;
    private Func<int, T> _getter;
    private Func<int> _count;

    public int Count { get { return _count(); } }

    public T this[int idx]
    {
        get { return _getter(idx); }
    }

    public Arguments(T arg) : this()
    {
        _arg1 = arg;
        _getter = _getSingle; // method not shown for brevity
        _count = () => 1;
    }

    public Arguments(T arg1, T arg2) : this()
    {
        _arg1 = arg1;
        _arg2 = arg2;
        _getter = _getDouble; // method not shown for brevity
        _count = () => 2;
    }

    public Arguments(IEnumerable<T> args) : this()
    {
        var source = args; // captured as a local; struct lambdas can't capture 'this'
        _enumerable = source;
        _getter = _getEnumerable; // method not shown for brevity
        _count = () => source.Count(); // avoids iterating until necessary
    }
}
@BrainSlugs83
Adding delegates to the mix would only add to the overhead since delegate invocation is not cheap. That still doesn't solve the issue of what you'd be passing to this method. At best you'd pass the two delegates and incur the penalty of _n_+1 delegate invocations. But that would mean that you're no longer passing a single params
argument and the developer would have to learn how to write methods that accept these delegates. Or the method would accept an interface containing those two delegates, which brings us right back to the boxing issue.
Given that the entire point of Arguments<T>
is to avoid the overhead associated with boxing and dispatch, going with delegates is completely pointless. The cost of copying a struct that happens to be larger than it needs to be would be significantly less.
We are now taking language feature discussion on https://github.com/dotnet/csharplang for C# specific issues, https://github.com/dotnet/vblang for VB-specific features, and https://github.com/dotnet/csharplang for features that affect both languages.
This request corresponds to https://github.com/dotnet/csharplang/issues/179