Currently, C# only supports `params T[]`; however, it is rare that I actually need an array for my method. More often than not, I merely want iteration or to perform some sort of LINQ operation against the parameter. In fact, it is common for libraries to declare a version of the method that takes an `IEnumerable<T>`.

The implementation detail could still be that an array is made at the call site and sent to the method; the benefit is simply avoiding having to create the extra overload.
This feature was already proposed for C# 6, but didn't make it due to scheduling. So my guess is that it's pretty likely it will make it into C# 7.
If we are going to invest in a new `params` type I would like to see us invest in one that is more efficient than either `T[]` or `IEnumerable<T>` but equally inclusive.
Today low level APIs which expose a `params` method typically end up exposing several other non-params overloads in order to avoid call site allocation of the array for small numbers of arguments. It's not uncommon at all to see the following:

```csharp
WriteLine(params object[] args)
WriteLine(object arg)
WriteLine(object arg1, object arg2)
...

// The above pattern lets the following bind without the object[] allocation
WriteLine("dog");
```
Switching to `IEnumerable<T>` would not help here and would in fact make it worse, because it creates two allocations for a small number of arguments.
An idea we've sketched around a few times is a struct of the following nature:
```csharp
struct Arguments<T> : IEnumerable<T>
{
    T _arg1;
    T _arg2;
    IEnumerable<T> _enumerable;
    int _count;

    // struct based enumerator
    // indexer which intelligently switches between T fields and enumerable
}

WriteLine(params Arguments<T> args)
```
The `Arguments<T>` struct represents a collection of arguments. It can be built off of individual arguments or a collection. In the case of a small number of individual arguments, which for many APIs is the predominant use case, it can do so allocation free. The compiler would be responsible for picking the best method of constructing `Arguments<T>` at the call site.
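For concreteness, here is one possible fleshing out of that sketch. Everything here (field layout, constructors, the enumerator) is an illustrative guess at the shape, not a committed design:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch only: names and layout are illustrative guesses.
public struct Arguments<T> : IEnumerable<T>
{
    private readonly T _arg1;
    private readonly T _arg2;
    private readonly IEnumerable<T> _enumerable;
    private readonly int _count; // negative means "backed by _enumerable"

    public Arguments(T arg1) : this() { _arg1 = arg1; _count = 1; }
    public Arguments(T arg1, T arg2) : this() { _arg1 = arg1; _arg2 = arg2; _count = 2; }
    public Arguments(IEnumerable<T> enumerable) : this() { _enumerable = enumerable; _count = -1; }

    // Public struct-returning GetEnumerator: foreach over Arguments<T>
    // binds here, so the small-argument path allocates nothing.
    public Enumerator GetEnumerator() { return new Enumerator(this); }
    IEnumerator<T> IEnumerable<T>.GetEnumerator() { return GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public struct Enumerator : IEnumerator<T>
    {
        private readonly Arguments<T> _args;
        private readonly IEnumerator<T> _inner;
        private int _index;
        private T _current;

        internal Enumerator(Arguments<T> args) : this()
        {
            _args = args;
            _inner = args._count < 0 ? args._enumerable.GetEnumerator() : null;
            _index = -1;
        }

        public T Current { get { return _current; } }
        object IEnumerator.Current { get { return Current; } }

        // Switches between the inline fields and the wrapped enumerable.
        public bool MoveNext()
        {
            if (_inner != null)
            {
                if (!_inner.MoveNext()) return false;
                _current = _inner.Current;
                return true;
            }
            _index++;
            if (_index >= _args._count) return false;
            _current = _index == 0 ? _args._arg1 : _args._arg2;
            return true;
        }

        public void Dispose() { if (_inner != null) _inner.Dispose(); }
        public void Reset() { throw new NotSupportedException(); }
    }
}
```

With this shape, `Method(1, 2)` could lower to `Method(new Arguments<int>(1, 2))` with no heap allocation, while `Method(someEnumerable)` would simply wrap the sequence.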
@jaredpar It sounds like you want to use the `calli` and `arglist` IL instructions in C# code... I could see a signature like this working pretty well:

```csharp
WriteLine(params ArgIterator args)
```
@sharwell yes and no. Those definitely achieve part of the problem: an efficient way of calling a method with an arbitrary number of arguments. It doesn't hit the other part though, which is easy interaction with .NET collections.
I actually see your concern as a separate issue from `params IEnumerable<T>`. This topic (`IEnumerable<T>`) is about exposing a convenient method for simplified APIs, where memory allocation characteristics are often (but not always) much less of a concern.
Having yet another separate class to represent `params` defeats most of the purpose of this feature, which is to allow an API to expose a method that accepts `IEnumerable<T>` and support using `params` without having to explicitly write two separate methods.
Also, the performance of using `foreach` over an `IEnumerable<T>` which happens to be an instance of `T[]` is quite efficient, generally on par with (if not faster than) using `for`, as unintuitive as that sounds.
@HaloFour @sharwell
The intent of `params Arguments<T>` is to accept both individual arguments and `IEnumerable<T>` values equally well. There would be no need for `params IEnumerable<T>` in this scenario because it would already be covered.
```csharp
void Method(params Arguments<int> args) { ... }

void Example(IEnumerable<int> e)
{
    Method(42); // binds to Method(new Arguments<int>(42))
    Method(e);  // binds to Method(new Arguments<int>(e))
}
```
@jaredpar
I'm still not seeing any advantage to having an `Arguments` intermediary. If an API has specific optimized paths for dealing with a small number of arguments then even with `Arguments` they would have to be coded separately anyway, so why is `Arguments` an improvement over having the overloads? The separate overloaded methods already provide the appropriate separation for the different algorithms, and passing the arguments as individual values on the stack is more efficient than copying those values into a struct and copying that struct. If the API doesn't have specialized paths and will just enumerate the arguments via `for` or `foreach` then there is no performance benefit to this struct, even with 2 or fewer arguments.
@HaloFour
The advantage is simply avoiding the allocation for the collection at the call site. It's not about specific optimizations within the method. The allocation may seem small, and typically is in isolated scenarios. Added up over an entire application, though, the array allocations become significant (and quickly).
I've worked on several projects where we've measured significant performance improvements by adding a few overloads to `params` methods that avoid call site allocations. The implementation of the non-params overload had no special optimizations. The algorithms, minus the loop, were identical to the `params` overload.
This is why I don't see the value in adding `params IEnumerable<T>`. It is solving only one of the current issues with `params` (not being inclusive to all collection types). I'd much rather invest in a solution that solves all of them. A solution like `Arguments<T>` has the possibility of doing so because it can accept individual arguments or any `IEnumerable<T>` as input (which in turn includes `T[]`), and because it can avoid the call site allocations that `params T[]` methods incur today.

> The algorithms, minus the loop, were identical to the params overload.
Which is a different algorithm by definition since you're treating the arguments differently. You're better off keeping the specialized overloads since bouncing through a struct intermediary will be slower, even if you reference the fields containing the values directly. If you have to instead bounce through an indexer it will be significantly slower, on par with the speed of just working with an array.
The purpose of supporting `params IEnumerable<T>` is for the cases where performance isn't important, because the performance will absolutely be slower even compared to enumerating over `params T[]`. It's simply to eliminate the need to write that additional overload, as this has become a common pattern:
```csharp
public void Foo(params int[] args) {
    Foo((IEnumerable<int>)args);
}
public void Foo(IEnumerable<int> args) {
    // enumerate here
}
```
For those cases where the performance is important and you can behave differently given a small number of arguments the compiler story is already quite good through overload resolution. I see no need to complicate that and force a single code path which couldn't be a single code path anyway.
> You're better off keeping the specialized overloads since bouncing through a struct intermediary will be slower, even if you reference the fields containing the values directly
Completely disagree. We've tested out solutions in the past and found that the allocation benefit dominates any sort of indirection you would get from the struct.
Note: `IEnumerable<T>` adds even more indirection than `T[]` given that it will be
> It's simply to eliminate the need to write that additional overload as this has become a common pattern:
The `Arguments<T>` solution would fix the exact same scenario. It can accept individual arguments and `IEnumerable<T>` values.
Why push for a solution that is slower and allocates more memory over one which solves the same problem plus additional scenarios?
> Why push for a solution that is slower and allocates more memory over one which solves the same problem plus additional scenarios?
Because those additional scenarios don't benefit without adding onus onto the developer of the method to write even more code than they need to today.
The following is slower than `params T[]`:
```csharp
void Foo(params Arguments<int> args) {
    for (int i = 0; i < args.Count; i++) {
        int arg = args[i];
        // do stuff here
    }
}
```
And the following is slower than overloads:
```csharp
void Foo(params Arguments<int> args) {
    if (args.Count == 2) {
        // assuming accessible public readonly fields; using indexers here is significantly slower
        int x = args.arg1;
        int y = args.arg2;
    }
    // need to handle other possibilities here as well
}
```
The one place where this could be nominally faster is in the case of `params IEnumerable<T>` where optimized `IEnumerable<T>` implementations can be provided for specific argument counts, which is something that the C# team could do when consuming `params IEnumerable<T>` rather than just emitting an array.
@HaloFour the `Arguments<T>` type implements `IEnumerable<T>` using a struct based enumerator. I would expect the vast majority of consumers to use a `foreach` loop.
*edit*: typo
@HaloFour talk about a bad time for typos. I meant to say the exact opposite of that :( Edited the comment.
> I would expect the vast majority of consumers to use a foreach loop.
I agree, and in those cases I do think that it would be worthwhile for the C# team to emit specialized implementations of `IEnumerable<T>` and `IEnumerator<T>` to improve performance, but that is a detail that could be hidden. The one gotcha there is if the method is written to also check whether the `IEnumerable<T>` is a `T[]`, as that would no longer be true.
Hmm, and if you want more than 2 arguments what happens? Will the compiler go and create an array and pass it to `Arguments<T>`? And if you need arguments of different types and they're value types you still end up allocating due to boxing; the only T you can use in such cases is `Object`. I'm a bit of a performance freak myself but this particular optimization attempt seems a bit overdone.
And I can't help not to notice that the allocation cost problem comes up quite often and every time we end up with solutions that are either incomplete or have some other not so great effects. Back in .NET 2.0 generic collections got struct enumerators, great, allocations avoided. And then you look at the JIT generated code and go hrmm. It looks like RyuJIT will produce better code but it took "only" 10 years.
Maybe at some point we need to accept that the system works the way it works and it has certain performance characteristics. If you want to make it better, well, use .NET Native, add escape analysis and stack allocation to it and call it a day.
Just my 2 cents.
@mikedn

> Hmm, and if you want more than 2 arguments what happens? Will the compiler go and create an array and pass it to Arguments<T>?

Yes.

> And if you need arguments of different types and they're value types you still end up allocating due to boxing, the only T you can use in such cases is Object

Correct, but this is not a new problem. It already happens today with `params`.
It is definitely something I would love to see solved. So far though an elegant solution hasn't presented itself. If it did though I would likely be very interested in that as well.
> Back in .NET 2.0 generic collections got struct enumerators, great, allocations avoided.
Struct based enumerators have been around since 1.0. It was the original way to have type safe, non-allocating enumeration. I do share the frustrations on enumerators though and I've written some thoughts about it here: http://blog.paranoidcoding.com/2014/08/19/rethinking-enumerable.html
> Maybe at some point we need to accept that the system works the way it works and it has certain performance characteristics.
Speaking as someone who's worked on a lot of perf sensitive applications over the years: allocations matter much more than most developers give them credit for. Most performance investigations end up doing little more than trying to reduce the GC time which translates into curbing unnecessary allocations.
Any time we create a feature in the language that has unnecessary allocations, it's a feature that will likely be avoided by perf sensitive applications. I'd much rather focus on features that are applicable to all types of programs.
@jaredpar
I do like elements of your rethinking on `IEnumerable<T>`. Without `IEnumerator<T>` don't you lose the capacity for generic variance? Also, and probably a fairly minor tweak, I'd prefer `TEnumerator` to have a generic constraint of `IDisposable`, although I guess the compiler could just emit a `try/finally` which would check to see if `TEnumerator` was disposable and, if so, call `Dispose()`. Variance aside, I love the idea of `current` being an `out` parameter. `TEnumerator` being a `ref` seems a tad weird but I get why you do it.
In the end, though, I think I'd rather be stuck with one slightly-less-perfect method than have a bunch of disparate but similar methods. `IEnumerable<T>` is still better than Java's `Iterable<T>` or `Enumeration<T>`.
@HaloFour, but at the same point, unless you are writing some very highly specialized code like LINQ where you have optimized paths for different types of `IEnumerable<T>`, it's very uncommon to actually inspect the underlying type of an `IEnumerable<T>`.
@mikedn, but boxing is always a problem if you have incompatible types. If you need different types, and presumably need different handling for said types, you have no choice but to box. As for the rest of your points I'm not sure.
I like @jaredpar's idea as it helps solve the extra overloads problem that my original suggestion was getting at.
e.g.

```csharp
void SomeMethod(T item1, T item2)
void SomeMethod(params T[] items)
void SomeMethod(IEnumerable<T> items)
```
Under Jared's model this would be one method with one implementation. I also agree that `IDisposable` is very useful on enumerators, especially those generated from `yield return`. Sometimes you want to tie resource lifecycles to iteration.
I think `TryGetNext` isn't bad but I'd probably want to still keep the abstraction of an `Enumerator`.
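For readers who haven't followed the blog link, here is a rough paraphrase of the `TryGetNext` shape being discussed. The interface name, members and the array-backed implementation are all illustrative, not the post's actual API:

```csharp
using System;

// Illustrative only: a TryGetNext-style enumeration interface where the
// enumerator state travels by ref and the element comes back as out.
public interface ISequence<TElement, TEnumerator> where TEnumerator : struct
{
    TEnumerator Start { get; }
    bool TryGetNext(ref TEnumerator enumerator, out TElement value);
}

// An array-backed implementation where the enumerator is just an index,
// so enumeration needs no heap allocation and no separate enumerator object.
public sealed class ArraySequence<T> : ISequence<T, int>
{
    private readonly T[] _items;
    public ArraySequence(T[] items) { _items = items; }

    public int Start { get { return 0; } }

    public bool TryGetNext(ref int index, out T value)
    {
        if (index < _items.Length)
        {
            value = _items[index++];
            return true;
        }
        value = default(T);
        return false;
    }
}
```

A loop then becomes `var cursor = seq.Start;` followed by `while (seq.TryGetNext(ref cursor, out item)) { ... }`, which is the shape a compiler could emit for `foreach`.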
@mburbea Just supporting `params IEnumerable<T>` also solves the additional overloads "problem" and lets you have one method with one implementation. You could then opt into the additional overloads to provide optimized scenarios for accepting arrays or specific numbers of arguments, but only if you wanted to. If your one implementation is just going to be using `foreach` or LINQ then it doesn't really matter, as you're using the most expensive implementation anyway. But if you know that you can have an optimized path given exactly one or two arguments then it always makes sense to have the overloads, since that already works very well with overload resolution and provides the least expensive path for passing the values.
@jaredpar Actually struct enumerators were added in 2.0, check ArrayList for example, it has a class enumerator and its GetEnumerator returns IEnumerator. And yes, I read your blogpost and I know you worked with Joe Duffy on a certain project :smile:.
Oh well, I suppose it makes sense for the compiler and framework to strive to minimize allocations. The main problem is how the indexer will deal with `IEnumerable<T>`:

```csharp
T this[int index] {
    get {
        if (_count > 2) {
            var list = _enumerable as IList<T>;
            if (list == null) {
                list = _enumerable.ToArray();
                _enumerable = list;
            }
            return list[index];
        }
        ...
    }
}
```
As for boxing, that's likely unavoidable. You'd need something similar to C++'s variadic templates to make it work.
@mikedn guess you learn something new every day. I would have sworn struct enumerators were in 1.0. :)
@HaloFour, when I'm at the point where I'm taking variadic arguments I rarely will be doing much different with one argument or N arguments, e.g. `DoSomething(arg1)` vs `foreach (var arg in args) { DoSomething(arg); }`. I'm after that easier calling convention. Perhaps I could avoid the loop, but this type of code usually isn't the performance bottleneck for me.
I probably would use it like `params IEnumerable<T>` as long as `Arguments` implements `IEnumerable<T>` and the compiler would handle the conversion itself. Can the JIT put structs like this into registers or is that a bridge too far?
The JIT may place a struct in a register if it contains only one field that fits in a register (a field of type `int` for example), so a multi-field `Arguments` would not qualify.
@mburbea In which case you'd likely not be writing those additional overloads anyway as they would serve no purpose for you and having an intermediate struct would be of no benefit, either syntactically or performance-wise.
I believe where you would see the improvement of an intermediate struct would be in having the compiler emit such a struct as the `IEnumerable<T>` rather than just creating a `T[]`, as the former can be more optimized than the default array enumeration.
@HaloFour What do you mean by "the former can be more optimized than the default array enumeration"? Array enumeration is as fast as it gets; there's nothing more optimized than that. Maybe you meant the opposite, or did I read it wrong?
I think he means array enumerators aren't the fastest. e.g.

```csharp
void SomeFunc(T[] args) {
    foreach (var a in args) { DoSomething(a); }
}
```

will actually get compiled as a `for` loop to avoid creating an enumerator. If you instead change the signature to `IEnumerable<T>` it'll actually be slower than if you passed it a `List<T>`. This has something to do with the `SzArrayHelper` JIT stuff, last I read about it on Stack Overflow.
It's too late now but is there any reason that the array enumerator isn't a struct one?
@mikedn The compiler will automatically convert a `foreach` to a `for` over an array, which is definitely a lot faster. If you cast the array to an `IEnumerable<T>` prior to `foreach` the enumeration is a good 5-6 times slower. Custom enumerators can beat that by a good 20% margin or so.
@HaloFour Exactly, the `foreach` is converted to `for`, so what's this 20% faster custom enumerator thing? An enumerator that tries to cast the `IEnumerable<T>` back to an array?
@mburbea I know that and HaloFour knows it too; the misunderstanding is somewhere else. And a struct array enumerator wouldn't help, as it would end up boxed to `IEnumerator`.
@mikedn This feature request is to support `params IEnumerable<T>`, which is what makes this relevant. In terms of the existing `params T[]` support, you're correct, although if people are just passing that array to an overload that accepts `IEnumerable<T>` then the benefit is lost.
In the C# 6.0 timeframe the proposed solution was to support `params IEnumerable<T>` and to have the caller emit `new T[] { 1, 2, 3 }`, so we're back to enumerating via `IEnumerable<T>` unless that method converts/casts to an array, which would kind of defeat the point of supporting `params IEnumerable<T>`. My statement is that if the compiler instead emitted a `struct` which had a better `IEnumerable<T>`/`IEnumerator<T>` implementation then it could edge out the performance of the array enumeration in that case.
C# 5.0:

```csharp
public void Foo(params int[] args) {
    Foo((IEnumerable<int>)args);
}
public void Foo(IEnumerable<int> args) {
    // do stuff with args
}
...
Foo(1, 2, 3); // -> Foo(new int[] { 1, 2, 3 });
```
C# 6.0 proposal which was cut due to schedule:

```csharp
public void Foo(params IEnumerable<T> args) {
    // do stuff with args
}
...
Foo(1, 2, 3); // -> Foo(new int[] { 1, 2, 3 });
```
Alternate proposal:

```csharp
public void Foo(params IEnumerable<T> args) {
    // do stuff with args
}
...
Foo(1, 2, 3); // -> Foo(new CompilerGeneratedArgsStruct(1, 2, 3));
```
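One guess at what such a compiler-generated struct might look like (the name comes from the snippet above; everything else here is illustrative):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Illustrative guess at a compiler-generated three-argument struct.
public struct CompilerGeneratedArgsStruct : IEnumerable<int>
{
    private readonly int _a, _b, _c;
    public CompilerGeneratedArgsStruct(int a, int b, int c) { _a = a; _b = b; _c = c; }

    // Public struct-returning GetEnumerator so foreach avoids boxing when
    // the concrete type is known; the interface versions box as usual.
    public Enumerator GetEnumerator() { return new Enumerator(this); }
    IEnumerator<int> IEnumerable<int>.GetEnumerator() { return GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public struct Enumerator : IEnumerator<int>
    {
        private readonly CompilerGeneratedArgsStruct _args;
        private int _index;
        internal Enumerator(CompilerGeneratedArgsStruct args) { _args = args; _index = -1; }

        public int Current
        {
            get { return _index == 0 ? _args._a : _index == 1 ? _args._b : _args._c; }
        }
        object IEnumerator.Current { get { return Current; } }
        public bool MoveNext() { return ++_index < 3; }
        public void Dispose() { }
        public void Reset() { _index = -1; }
    }
}
```

Note that a callee which only receives it as `IEnumerable<int>` would still box the struct and its enumerator.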
@HaloFour OK, so it's like I said in my previous post: the claimed performance improvement would come from an enumerator which tries to cast back to array (or `IList<T>`). All clear.

As I showed in a previous post, something like this also has to happen in the indexer. Let's not forget that methods like `String.Format` do not need to enumerate the params; they need to access them by index, and they may access an index more than one time.
@mikedn Well, the performance improvement is because I can write a better case-specific implementation than `SzArrayEnumerator`. :smile:

I do think that the indexer case is already covered; just continue to use the existing `params T[]`. You can then opt in to a non-`params` overload that accepts an `IEnumerable<T>` which would cast/convert to an array before passing it to the `params` overload.
Actually, from just a quick set of trials, Array seems really fast when passed to a method that takes an `IEnumerable<T>`. Even for a small number of arguments (I tried 2, 3 and 4) it won every time. I suppose it may not be the best from a memory perspective but it's hard to beat unless you change the signature from `params IEnumerable<T>`.
@mburbea
I was able to handily beat the array enumerator by using the following:
I ran that through the following test methods:
```csharp
public int Test1(CompilerGeneratedArgs3 args) {
    int tally = 0;
    foreach (int value in args) {
        tally += value;
    }
    return tally;
}
public int Test2<TParams>(TParams args) where TParams : struct, IEnumerable<int> {
    int tally = 0;
    foreach (int value in args) {
        tally += value;
    }
    return tally;
}
public int Test3(IEnumerable<int> args) {
    int tally = 0;
    foreach (int value in args) {
        tally += value;
    }
    return tally;
}
```
With 10,000,000 hard-loop iterations the performance using .NET 4.5.2 was as follows:
```
Test1: 00:00:08.3613572
Test2: 00:00:08.6540710
Test3: 00:00:09.2925556
Test3/Array: 00:00:11.3223353
```
Here's the source if you want to give it a spin:
You didn't beat the array enumerator, your code simply avoids an allocation, the array. If you hoist the array allocation out of the loop in Test3/Array then you'll get similar results:
```
Test1: 00:00:04.3801256
Test2: 00:00:04.2367054
Test3: 00:00:04.7327582
Test3/Array: 00:00:04.8988729
```
Do you mind using a gist? I did the following and I came to the conclusion that arrays seem to be a winner most of the time. I wrote these quickly so they may not be the best struct enumerators possible.
```
CallWithArray : 527ms
CallWithList : 1301ms
CallWithArg2 : 677ms
CallWithArg2Unsafe: 638ms
```

My times with this gist: https://gist.github.com/mburbea/683f74ff5cc589d512c5
This is doing 1<<24 iterations so quite a good bit. Every method calls the sum method taking 2 bytes and using an enumerator to sum them. Even an unsafe buffer was slower than an array.
@mikedn Sure, but that's not what `params` is going to be doing. The allocation is a big part of it, but even without that the custom struct does edge out an array.
Sheesh, this is getting complicated and I'm not sure why.
@HaloFour But your example isn't what the proposed `params` is going to be doing either. `Arguments<T>` will need to also be able to wrap an existing `IEnumerable<T>`, and as a result the code will be more complex and probably slightly slower. And your example fails to actually make use of the `struct` enumerator because it boxes it to `IEnumerator<T>`, and that means that `Arguments<T>` will actually be faster. So the only claim that you can make is that you beat the array enumerator, and it turns out that this doesn't have anything to do with the enumerator itself and has everything to do with allocation.
@mburbea Trying List
The differences between array, arg2 and arg2unsafe appear to be order dependent, if you run CallWithArg2 first then you'll see that it's faster than CallWithArray. That's probably a side-effect of GC and that means you need to take the differences with a grain of salt. Doing benchmarks that involve allocations is problematic. Take out GC.Collect and run the tests multiple times to get a more realistic picture.
@mikedn
I agree, this is getting complicated.
I meant that `params` wouldn't be caching the array off, so claiming that doing so evens out the benchmark is entirely pointless. I don't cache the initialization of the structs in my tests either, because that's not realistic use.
What stands as the "proposal" at this point? I've seen what @jaredpar wants to do (and I know that he's on the Roslyn team) but I've not seen anything "official" beyond what was on CodePlex, which was, simply, `params IEnumerable<T>` and the compiler emitting a normal array allocation like it does today.
I do agree that going straight struct is faster, but how does that play out with something like `Arguments<T>`? A single struct can't represent the wide range of possibilities that a compiler-emitted struct could represent, adapting to the appropriate count of arguments rather than carrying the extra baggage of representing all of those alternative cases.
And does going with `Arguments<T>` actually meet the needs of the request? If the point is to be able to have a single method accept an `IEnumerable<T>` as well as being `params`, then having yet another type doesn't satisfy that desire. You end up with two methods, one accepting `params Arguments<T>` and another accepting `IEnumerable<T>`. Sure, the developer could manually call the one accepting the `Arguments<T>`, but does it make sense to use a construct for variadic arguments when you're not supplying variadic arguments?
Also, wouldn't `Arguments<T>` necessitate the release of a new framework to even permit this capability? The `IEnumerable<T>`/array based approach at least would work on every version of .NET going back to 1.0 (assuming you could use the non-generic version).
Structs are faster; I just don't think enough so to matter.
@mikedn
> ... but I've not seen anything "official" beyond what was on CodePlex which was, simply, params IEnumerable and the compiler emitting a normal array allocation like it does today.
The `IEnumerable<T>` version emits an array and an enumerator allocation.
> You end up with two methods, one accepting params Arguments and another accepting IEnumerable.
Why? If `Arguments<T>` has an implicit conversion from `IEnumerable<T>` then there is no need for an overload.
> structs are faster, I don't think enough so to matter
I strongly disagree. It matters quite a bit.
Not saying this based on personal belief. Saying this based off of real world profiling in large systems. The allocations matter quite a bit when considered across the system.
@mikedn, I was trying with permutations on the orders of the Tests array and I was consistently getting that CallWithArray would be the fastest no matter what. I tried with and without the GC.Collects.
As for the `Arguments<T>` business, as I understood it, it would be a new struct that would probably have an implicit conversion operator from `IEnumerable<T>`, and the compiler would stick a small number of arguments into reified fields while for larger amounts it would just allocate an array. Since the arguments would be passed as an explicit argument, a struct enumerator would be beneficial. However, as I suggested on UserVoice a billion years ago, I'd be 100% happy with what was originally suggested for C# 6.
@jaredpar Well, if that's a separate option from `params IEnumerable<T>` then it at least hits all of this feature request.
How about also handling the following signature?

```csharp
public void Foo<TParams>(params TParams args) where TParams : struct, IEnumerable<int> { ... }
```
My primitive testing put it on par with a specialized `struct`, and if the compiler generated the implementation it could be tailored to the type and number of arguments, which should be a little more efficient.
@jaredpar You replied to the wrong person :smile:
@mburbea I tried your code again on an x64 machine with RyuJIT; in that case the array test is consistently faster. But the code in Arguments2 is bad: the call to the indexer from Current isn't inlined, and using switch is a bad idea. Fix it and it will be slightly faster than the array test. I'm not sure what's going on with the unsafe thing; that one is slow for no obvious reason.
@HaloFour Mixing generics with structs can be a problem. If you create CompilerGeneratedArgs1, CompilerGeneratedArgs2 and CompilerGeneratedArgs3 then you'll end up with 3 native versions of `Foo`.
@mikedn blarg. Need to drink more coffee :(
@HaloFour
For me the mainline performance case is avoiding creating arrays for calls that have 1-3 arguments, in particular avoiding allocating the memory for those elements on the heap.

The solution you are proposing won't help that because it will still end up allocating memory for the elements on the heap. Even if the collection type is generated by the compiler as a value type it still has this problem, because it must produce an `IEnumerator<T>` instance. This enumerator must necessarily refer to the original elements, which in turn forces them to be on the heap somewhere.
@mikedn That would be a one-time cost though and shouldn't really amount to much. The advantage is that the generated structs wouldn't need all of this additional general purpose code or additional fields to represent the other use cases. Having these multiple types doesn't seem to be a problem where anonymous types are concerned.
@jaredpar Wouldn't that be true of `Arguments<T>` as well if all you're doing is using `foreach` over it or using it with some LINQ extension methods?
@HaloFour Maybe it doesn't look like much but I don't think the compiler guys will easily agree to enforcing a pattern that can lead to native code bloat. And anonymous types have nothing to do with this, they're reference types so even when used with generics they don't force native code unsharing.
And to spare Jared Parsons the trouble: no, if you use foreach with a proper struct enumerator there's no allocation. The problem with your suggestion is that it forces the code in Foo to access the enumerator via `IEnumerable<T>.GetEnumerator` and that will cause boxing.
As for using LINQ extension methods, yes, that will cause allocations no matter what you do. In this regard `Arguments<T>` will probably be worse than an array.
@HaloFour
It wouldn't be true for `foreach`. Because `Arguments<T>` is a concrete type, the `foreach` loop can take advantage of the `struct` enumerator on the type. This avoids the heap allocation.
If you used LINQ methods then yes, that would cause the allocation. But that is true for every collection type due to the architecture of the enumerator pattern.
@mikedn Anonymous types themselves are reference types, but every different variation that happens to contain a value type would result in a new version generated by the runtime, no? Same as how `List<object>`, `List<string>`, `List<int>` and `List<double>` result in three different flavors of the `List<T>` class actually in use.
@jaredpar Ah, I see what you mean. The struct could implement `IEnumerable<T>` explicitly and then offer a public `GetEnumerator()` method that returns a concrete struct type that implements `IEnumerator<T>`, and the compiler will never box that struct. I actually didn't know that the compiler could do that. I guess that does preclude the ability to use a compiler-generated struct.
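This is the same pattern the BCL already uses for `List<T>`: its public `GetEnumerator()` returns the struct `List<T>.Enumerator`, while the interface implementation boxes that same struct. A quick illustration of the two foreach bindings:

```csharp
using System.Collections.Generic;

var list = new List<int> { 1, 2, 3 };

// foreach over the concrete type binds to List<int>.GetEnumerator(), which
// returns the struct List<int>.Enumerator: no heap allocation.
int sum1 = 0;
foreach (var x in list) sum1 += x;

// foreach over the interface goes through IEnumerable<int>.GetEnumerator(),
// which returns IEnumerator<int>: the struct enumerator gets boxed.
IEnumerable<int> seq = list;
int sum2 = 0;
foreach (var x in seq) sum2 += x;
```

Same results, but the second loop pays for a boxed enumerator and interface dispatch on every `MoveNext`/`Current`.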
I tossed that into my tests and the performance difference is pretty staggering:
```
struct enumerable with struct enumerator: 4.7830485 sec
struct enumerable with boxed struct enumerator: 8.4173345 sec
generic enumerable with struct constraint: 8.1790828 sec
boxed enumerable with boxed struct enumerator: 9.0478730 sec
array: 14.0110617 sec
```
You learn something new every day.
@HaloFour
So, because `TParams` is a `struct` you'll get one native version for every TParams type that you are using; that can easily get out of hand. You don't know how big your Foo method will be in practice; maybe someone will write a 1000-line method and want it to have `params`.
Anonymous types have little code. In particular, the property getters are trivial and candidates for inlining. Additionally, anonymous types fulfill a certain practical need but your proposition doesn't do that. It's an attempt at optimization that risks being a deoptimization if the code gets large.
Does the proposed struct `Arguments<T>` _always_ require the implementer to check for the number of arguments passed and treat them differently based on whether it's one, two or many arguments? Or does it allow you to enumerate over it directly?

```csharp
void Foo<T>(params Arguments<T> args) { foreach (var arg in args) { } }
```
I believe the idea is you can just foreach it and it will do the right thing (TM). The main optimization is apparently around 3 or fewer arguments, which is the most common use of params anyway, from what I understand of Jared's idea. If you are writing code that may want to specialize on a smaller number of arguments then the indexer will be exposed, but if you just want nicer calling conventions then it's fine to just foreach it.
@MgSam the `Arguments<T>` type implements `IEnumerable<T>`, hence it can be used with `foreach` exactly as you stated.
While the performance implications of each feature proposal are quite important, my understanding was that the params IEnumerable<T>
feature was put forth simply to allow API developers to more easily expose cleaner and simpler code to consumers. Optimized overloads can still be provided.
The original feature, as it was proposed for C# 6, has plenty of use cases. For example, I often find myself writing overloads of constructors or factory methods that take params T[]
in order to provide a more readable syntax for use when manually writing test data. I dislike having to have multiple overloads simply to provide syntactic convenience to calling code, but I find it necessary. This is a nice simple feature proposal and it provides value.
I think @HaloFour sums this up nicely in his/her initial three posts in this discussion.
Couldn't we just
void Func(params items)
and
void Func<T>(params<T> items)
@Thaina What would the type of items
be in those cases? You're not specifying the element type of items
, let alone how the multiple values should be represented.
@HaloFour IEnumerable and IEnumerable<T>. It's just dropping the full params IEnumerable items in favor of the short params. We can't do params List or params Dictionary anyway, so just params to represent IEnumerable:
params items — element type is object
params<T> items — element type is T
done
@Thaina
Per this proposal you would be able to pass either params T[] items
or params IEnumerable<T> items
. You may have had an argument before this proposal, since you could only pass arrays, but I would argue that specifying that the type is an array is still more appropriate, since the parameter type actually is an array (and you can pass a raw array); the params is just metadata to let compilers know to let their callers treat the method as vararg.
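That metadata is concrete: params compiles down to System.ParamArrayAttribute on the parameter, which caller-side compilers recognize and expand variable arguments for. A small illustration (method name hypothetical):

``` C#
using System;

static class Printer
{
    // 'params' here is emitted as [System.ParamArrayAttribute] on 'args'.
    // (C# won't let you apply the attribute explicitly; it forces the keyword.)
    static void WriteAll(params object[] args)
    {
        foreach (var arg in args) Console.WriteLine(arg);
    }

    static void Demo()
    {
        WriteAll("a", "b");                  // compiler supplies new object[] { "a", "b" }
        WriteAll(new object[] { "a", "b" }); // array passed through unchanged
    }
}
```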
@HaloFour That's the point. I want to leave params T[] items as it is, so it stays backward compatible, but introduce params items instead of params IEnumerable items, because we shouldn't need params AnyTypeInTheFuture items anyway.
And with ref returns, IEnumerable should also support ref yield so we could foreach with i++ instead of relying on an array.
And we could cast items as T[] if we really, really need an array.
Just curious, what's the status of this now we're in 2016?
The change of year hasn't had any effect on this.
lol. Oh I don't know @gafter. It'll be Spring before you know it.
@gafter Lol, good one.
Seriously, though. This is on the list of things we'd be happy to see get into C# 7 if time permits. A community member's contribution of an implementation along with tests would go a long way toward making that happen (hint hint).
@Thaina the syntax you suggest seems highly irregular as it is a keyword followed by a generic type parameter. Also why do you assume that there would never be params IDictionary<TKey, TValue> items
? Although it is not currently proposed, why rule it out?
@gafter I'm honestly a little scared of writing code for Roslyn (I'm not even familiar with the API), so I don't know if I'd be the right choice haha. Would be really awesome if someone else could start work on it though.
@aluanhaddad The syntax would be
void M(params IEnumerable<string> strings) ...
I know what the spec would look like for that (the compiler would continue to create an array for the arguments, as it does today). I have no idea what spec you have in mind for other types. I don't know that it has been ruled out so much as it has not yet been proposed or considered.
@gafter I don't have any spec in mind nor am I proposing anything. I was just saying that I dislike the idea of using the syntax params<string> items
instead of params IEnumerable<string> items
because it would rule out using other types.
@aluanhaddad params IDictionary<TKey, TValue> items
is params IEnumerable<KeyValuePair<TKey, TValue>> items
Has anyone considered what would/should happen if a developer passes a single IEnumerable<T>
as the params
parameter's value? Should the source IEnumerable<T>
be iterated (say, to create the array) when it is passed? What happens if the source IEnumerable<T>
is some kind of stream that has no end? Shouldn't the receiving method initiate the iteration and not the method call itself?
@nathan-alden Depending on the type it will create an array, just like existing params
on arrays.
F( "s" ); // creates an array
F(new string[] { "s" }); // pass
G(new string[] { "s" }); // creates another array
void F(params string[] a){}
void G(params string[][] a){}
@alrz I'm not sure I understand your answer. Allow me to demonstrate with an example. Currently, this is what we all have to write in our libraries and frameworks:
public void MyMethod(params object[] x)
{
MyMethod((IEnumerable<object>)x);
}
public void MyMethod(IEnumerable<object> x)
{
}
In cases where the developer calls the IEnumerable<object>
overload, the underlying enumerable won't be iterated until MyMethod
iterates it (if it ever does). With some of the above comments, it seems like a new array would first be populated, then passed to the params IEnumerable<object> x
parameter:
public void MyMethod(params IEnumerable<object> x)
{
}
IEnumerable<object> expensiveToIterate = GetExpensiveEnumerable();
MyMethod(expensiveToIterate); // The above proposals would call for this enumerable to be iterated and thrown into an array
See the issue?
@nathan-alden The compiler isn't going to materialize the enumerable into an array. It will only create a new array in the case of trying to call the method with multiple arguments:
IEnumerable<object> expensiveToIterate = GetExpensiveEnumerable();
MyMethod(expensiveToIterate); // passes the enumerable as-is
MyMethod("a", "b", "c"); // passes new object[] { "a", "b", "c" }
@nathan-alden Note that this proposal exists largely to eliminate the pattern that you're following, where you want to accept an IEnumerable<T>
but you have an overload that accepts params T[]
just to enable calling the method with variable arguments. Going forward you could just declare a single method accepting params IEnumerable<T>
and get the benefits of both.
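In other words, a hypothetical sketch of the pattern being eliminated (names invented, not the proposed implementation):

``` C#
using System;
using System.Collections.Generic;

static class Log
{
    // Today: two overloads are needed to support both calling conventions.
    public static void Write(params object[] items)
    {
        Write((IEnumerable<object>)items);
    }

    public static void Write(IEnumerable<object> items)
    {
        foreach (var item in items) Console.WriteLine(item);
    }

    // Under this proposal the pair collapses into a single method:
    //   public static void Write(params IEnumerable<object> items) { ... }
    // Write("a", "b");        // compiler supplies new object[] { "a", "b" }
    // Write(lazyEnumerable);  // passed as-is, never materialized
}
```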
Yep yep, cool. Your first reply answers my question. Thanks! :+1:
I would like to:
``` C#
public void Foo1(params T[] args) { ... }
public void Foo2(params IEnumerable<T> args) { ... }
public void Foo3(params IReadOnlyCollection<T> args) { ... }
public void Foo4(params IReadOnlyList<T> args) { ... }
```
Method calls:
``` C#
FooX(a, b, c);
```
are translated as:
``` C#
FooX(new T[] { a, b, c });
```
@SergeyZhurikhin
Might as well include IList<T> and ICollection<T> to complete the list of generic interfaces implemented by a single-dimension array, no?
@HaloFour
Absolutely not!
For example:
``` C#
public void Foo4(params IList<T> args) { args.Add(t); ... }
```
That would be a runtime error, just like it would be today if you passed an array manually.
@HaloFour
So I'd make it a compile-time error.
I agree with @SergeyZhurikhin. A method getting an IList<T>
is likely to call Add
, so it is not a good idea to throw exceptions with normal usage of a language feature.
@alrz
Any such method is already taking that risk if they don't first check the IsReadOnly
property. The IList
interface makes no claim that the implementation is writable. Arrays have always implemented these interfaces.
This would also provide an option to pass an interface of an indexable list with params
when targeting a framework older than 4.5.
Not that I care all that much either way.
@HaloFour, @alrz
``` C#
public void Foo4(params IList<int> args) { args[1] = 5; }
...
Foo4(1, 2, 3); // - Tolerated!
var d = new[] { 1, 2, 3 };
Foo4(d);
Print(d); // int[3] { 1, 5, 3 } - Nightmare!!
```
@SergeyZhurikhin
That's already perfectly legal with params
:
public void Foo4(params int[] args) {
args[1] = 5;
}
...
var d = new [] { 1, 2, 3 };
Foo4(d);
Debug.Assert(d[1] == 5);
@HaloFour
Legal, yes, but legality doesn't help people avoid mistakes; we shouldn't build new reefs for ourselves to run aground on.
@HaloFour
Any such method is already taking that risk if they don't first check the IsReadOnly property. The IList interface makes no claim that the implementation is writable. Arrays have always implemented these interfaces.
And this is one of my least-favorite features of the BCL. The standard collection hierarchy is so ugly, and it seems it will stay that way forever, because backward compatibility is the holy cow for Microsoft. It's bizarre that array implements IList, yet half of the methods throw NotSupportedException. Is it a statically typed language or not?
@Pzixel compatibility is actually a really important feature but that's another topic.
The good news is that we as developers have access to modern, typesafe and well specified interfaces such as IEnumerable<T>
and IReadOnlyList<T>
.
@aluanhaddad the problem with IEnumerable is that it can be a query, which implies multiple consequences.
IReadOnlyList<T>
is good enough, but sometimes I need to modify something within.
A proper collection hierarchy was proposed several years ago on SO:
- Just Enumeration IEnumerable
- Readonly but no indexer (.Count, .Contains,...)
- Resizable but no indexer, i.e. set like (Add, Remove,...) current ICollection
- Readonly with indexer (indexer, indexof,...)
- Constant size with indexer (indexer with a setter)
- Variable size with indexer (Insert,...) current IList
Because now if I want to create my own collection with a Count property, I have to provide multiple methods which I really do not need; I just want to tell the user: "hey, this is a materialized collection with N elements, don't worry about multiple query executions".
IMHO the BCL team should provide a new, well-elaborated hierarchy of interfaces and insert it into the current hierarchy. It won't break anything; we'll just have new interfaces (for example, as shown above).
For example, split ICollection<T>
on ICommonReadonlyCollection<T>
and ICommonCollection<T>
, and then ICollection<T> : ICommonReadonlyCollection<T>, ICommonCollection<T>
(names are just for example).
So we'd have compatibility with a much better hierarchy. But it seems that the BCL team doesn't think the current situation is bad in any sense. It obviously is, because of the IsReadOnly
and IsFixedSize
properties. Why should I check those before interacting with a collection? Where is the polymorphism? If a method could be invalid in some situation, then the interface shouldn't contain it.
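A sketch of the split described above (all names are the invented examples from the comment, purely illustrative):

``` C#
using System.Collections.Generic;

// Read-only half: a materialized collection with a known Count.
public interface ICommonReadonlyCollection<T> : IEnumerable<T>
{
    int Count { get; }
    bool Contains(T item);
}

// Mutable half: adds modification without promising an indexer.
public interface ICommonCollection<T> : ICommonReadonlyCollection<T>
{
    void Add(T item);
    bool Remove(T item);
}

// The existing ICollection<T> could then be retrofitted to extend these,
// so old code keeps compiling while new code targets the precise contract.
```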
Just asking, would this support generic params?
I mean something like this
``` C#
void Iter<E, T>(params E collections) where E : IEnumerable<T>
{
    // iterate T
}
```
@Thaina you can replace IEnumerable
@Pzixel so you just want an interface with a Count
property. It will need to extend IEnumerable<T>
to be useful in most situations, so it's really just IReadOnlyList<T>
without the indexer. It doesn't seem like much of a problem. All that would need to be done would be to extract that interface into the BCL.
@aluanhaddad hmm, we have such an interface, IReadOnlyCollection<T>
, but the problem is that its name does not reflect its purpose. If a collection has a Count
property, that doesn't mean it's read-only. But at least we have it...
@jaredpar I wanted to weigh in on your Arguments
Is there any reason why you only have _one_ Arguments
_(Supporting the last one is important, because for many of us, the whole point of this request is to avoid having two method signatures for params vs BCL collections.)_
This will allow a smaller footprint too (the struct only stores the exact data that it needs, and only performs the exact logic it needs to in the intelligent indexers, etc.).
You could make an interface for the structs -- something like IArguments
Personally, I would prefer that our method signatures be _allowed_ to use IEnumerable
@BrainSlugs83
The point of the single Arguments
struct is avoiding having to hide them behind an interface. Doing so requires boxing the struct and performing virtual dispatch which would eliminate the performance benefit of going with an Arguments
struct in the first place.
@HaloFour I believe I addressed the boxing issue in my comment.
@BrainSlugs83
(If IArguments won't work because of boxing or some such, consider using function pointers to hook up the "intelligent" logic in the struct to emulate OOP behavior)
Care to expand that into something that applies to the CLR? IIRC, the only way you could do this would be to force the method to be generic, where the params
argument is of a generic type parameter constrained to IArguments
and struct
, which would allow for the IL constrained.
prefix and permit calling the members without boxing. But that would change the contract of the method.
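Roughly, the generic shape being described would look like this (IArguments&lt;T&gt; and the method are hypothetical):

``` C#
using System;

// Hypothetical interface the Arguments struct would implement.
public interface IArguments<T>
{
    int Count { get; }
    T this[int index] { get; }
}

static class Writer
{
    // Because TArgs is constrained to struct, the JIT specializes this method
    // per value type, and the interface calls are emitted with the
    // 'constrained.' prefix: no boxing, but the signature is now generic.
    public static void WriteAll<T, TArgs>(TArgs args)
        where TArgs : struct, IArguments<T>
    {
        for (int i = 0; i < args.Count; i++)
            Console.WriteLine(args[i]);
    }
}
```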
@HaloFour sure; it's a pretty common thing though (in non-OOP languages -- the idea is you at least get to avoid the speed/complexity hit of determining which block of code to run at runtime).
Here's some pseudo code in the context of the original struct:
public struct Arguments<T> : IEnumerable<T>
{
    // All values assigned at creation time.
    private T _arg1;
    private T _arg2;
    private IEnumerable<T> _enumerable;
    private Func<int, T> _getter;
    private Func<int> _count;

    public int Count { get { return _count(); } }

    public T this[int idx]
    {
        get { return _getter(idx); }
    }

    public Arguments(T arg) : this()
    {
        _arg1 = arg;
        _getter = _getSingle; // method not shown for brevity
        _count = () => 1;
    }

    public Arguments(T arg1, T arg2) : this()
    {
        _arg1 = arg1;
        _arg2 = arg2;
        _getter = _getDouble; // method not shown for brevity
        _count = () => 2;
    }

    public Arguments(IEnumerable<T> args) : this()
    {
        var source = args; // captured as a local; struct lambdas can't capture 'this'
        _enumerable = source;
        _getter = _getEnumerable; // method not shown for brevity
        _count = () => source.Count(); // avoids iterating until necessary
    }
}
@BrainSlugs83
Adding delegates to the mix would only add to the overhead since delegate invocation is not cheap. That still doesn't solve the issue of what you'd be passing to this method. At best you'd pass the two delegates and incur the penalty of _n_+1 delegate invocations. But that would mean that you're no longer passing a single params
argument and the developer would have to learn how to write methods that accept these delegates. Or the method would accept an interface containing those two delegates, which brings us right back to the boxing issue.
Given that the entire point of Arguments<T>
is to avoid the overhead associated with boxing and dispatch, going with delegates is completely pointless. The cost of copying a struct that happens to be larger than it needs to be would be significantly less.
We are now taking language feature discussion on https://github.com/dotnet/csharplang for C# specific issues, https://github.com/dotnet/vblang for VB-specific features, and https://github.com/dotnet/csharplang for features that affect both languages.
This request corresponds to https://github.com/dotnet/csharplang/issues/179