Runtime: Reconsider code paths where ICollection<T> is only used to access count for IReadonlyCollection<T>

Created on 15 Sep 2020  路  18Comments  路  Source: dotnet/runtime

There are many places in the Linq / Collection code that leverage detecting if an IEnumerable<T> is an ICollection<T> to perform optimizations (e.g. presizing a new array, etc.)

List.cs

Because ICollection<T> implements IReadonlyCollection<T>, IReadonlyCollection<T> should be exclusively used in these scenarios to support custom IReadonlyCollection<T> implementations that don't necessary want to expose Add(T item)

Currently, collection authors have to implement ICollection to take advantage of the performance gains and leave Add throwing NotImplementedException to convey proper usage.

area-System.Linq tenet-performance

Most helpful comment

I found quite a few complains or rejected attempts to optimize LINQ for IReadOnlyCollection<T>:

https://github.com/dotnet/runtime/issues/28651 - LINQ results implicit support for IReadOnlyCollection
https://github.com/dotnet/runtime/issues/27517 - Performance: Make constructor List(IEnumerable collection) know about IReadOnlyCollection
https://github.com/dotnet/runtime/issues/26679 - Linq ToDictionary() should presize for IReadOnlyCollection
https://github.com/dotnet/runtime/issues/14366 - System.Linq performance improvement suggestions (mentions IROC<>)
https://github.com/dotnet/runtime/issues/24793 - Respect IReadOnlyList in the BCL
https://github.com/dotnet/runtime/issues/23910 - Add optimized path for IReadOnlyCollection/IReadOnlyList in System.Linq
https://github.com/dotnet/runtime/issues/18714 - Consider checking for IReadOnlyCollection in Enumerable.ToArray
https://github.com/dotnet/runtime/issues/27516 - Performance of LINQ .Any() - type check to leverage .Count property? (mentions IROC)
https://github.com/dotnet/runtime/issues/27517 - Performance: Make constructor List(IEnumerable collection) know about IReadOnlyCollection
https://github.com/dotnet/corefx/pull/28472 - Check for IReadOnlyCollection
https://github.com/dotnet/runtime/issues/43001 - LINQ IEnumerable extension methods should add special case IReadOnlyCollection<T>

All 18 comments

Tagging subscribers to this area: @eiriktsarpalis, @jeffhandley
See info in area-owners.md if you want to be subscribed.

Because ICollection<T> implements IReadonlyCollection<T>

It doesn't.

If I recall correctly such suggestions to add (not to replace ICollection) fast paths for IReadOnlyCollection here and there were rejected several times because such casts to covariant interfaces were super slow, however these performance issues were fixed as far as I know (cast caches, inlined checks) so maybe it worth checking if we can add them in some places?

@davidwrighton are these kinds of "optimistic checks for interfaces" indeed much cheaper now than in the past?

cc @VSadov

Because ICollection<T> implements IReadonlyCollection<T>

It doesn't.

As I was! Egg on my face for sure. Apologies, i thought this would be a drop-in request. Thanks for the quick replies

@danmosemsft a quick benchmark:

static IEnumerable<string> strings = new List<string>();

[Benchmark]
public bool IsCollection() =>   strings is ICollection<string>;

[Benchmark]
public bool IsReadOnlyList() => strings is IReadOnlyCollection<string>;

.NET Core 2.2:

|          Method |      Mean |     Error |    StdDev |
|---------------- |----------:|----------:|----------:|
|   IsCollection  |  2.637 ns | 0.0038 ns | 0.0035 ns |
| IsReadOnlyList  | 41.492 ns | 0.0911 ns | 0.0808 ns |

.NET Core 3.0:

|          Method |      Mean |     Error |    StdDev |
|---------------- |----------:|----------:|----------:|
|   IsCollection  |  1.069 ns | 0.0008 ns | 0.0007 ns |
| IsReadOnlyList  | 40.578 ns | 0.0316 ns | 0.0264 ns |

.NET 5.0:

|          Method |      Mean |     Error |    StdDev |
|---------------- |----------:|----------:|----------:|
|   IsCollection  |  1.121 ns | 0.0010 ns | 0.0009 ns |
| IsReadOnlyList  |  2.976 ns | 0.0094 ns | 0.0088 ns |

Related PR: https://github.com/dotnet/coreclr/pull/23548

Covariant interfaces are not super slow now.

Cost can vary for both regular interface casts and for fancy ones. Regular interface cast is a linear search, but typically does not need to search far. Cached cast may need to deal with hash collisions, but typically just gets a cached value.

As a veeery rough estimate a fancy cast can be counted as a 2X of a regular interface cast.

In the past the cost of complicated casts was technically unbounded. As you nest variant generics, the cost would go up and considerably. Thus they were avoided by library owners.

Thanks @EgorBo that is indeed much faster.

It is definitely faster than before, but there is still a non-zero cost to performance when making the suggested change.

I have a few possible concerns here.

  1. What about customers that only implement ICollection<T> and not IReadOnlyCollection<T>?
  2. If we mitigate concern #1 by having checks for both ICollection<T> and IReadOnlyCollection<T> how much of a penalty does making the LINQ functions larger have?
  3. Do we have any concerns around customers who may have implemented IReadOnlyCollection<T> in such a way that it does not match with the behavior of IEnumerable<T>? My guess is that we would not treat such scenarios specially, but it is a real possibility that customers with custom written collections may have incorrect implementations of code that hasn't been tested.
  4. As @VSadov notes, the performance impact is now much less severe, but its not nothing.

It is definitely faster than before, but there is still a non-zero cost to performance when making the suggested change.

I have a few possible concerns here.

  1. What about customers that only implement ICollection<T> and not IReadOnlyCollection<T>?
  2. If we mitigate concern #1 by having checks for both ICollection<T> and IReadOnlyCollection<T> how much of a penalty does making the LINQ functions larger have?
  3. Do we have any concerns around customers who may have implemented IReadOnlyCollection<T> in such a way that it does not match with the behavior of IEnumerable<T>? My guess is that we would not treat such scenarios specially, but it is a real possibility that customers with custom written collections may have incorrect implementations of code that hasn't been tested.
  4. As @VSadov notes, the performance impact is now much less severe, but its not nothing.

A good example is Linq's Count: https://github.com/dotnet/runtime/blob/master/src/libraries/System.Linq/src/System/Linq/Count.cs#L11-L46

IReadOnlyCollection<T> check used to apply a penalty for the O(N) foreach-based fallback (note other fast paths), but now that penalty ~15 times smaller.
But of course it depends on how often users have IROC there - I am not in charge to answer 馃檪
Currently Count for IReadOnlyCollection<T> input leads to O(N) loop which can be quite slow for large collections.

There were also perf issues with limited numbers of "fast" dictionary slots (see #11971) that should now be (largely) mitigated by the dynamic dictionary expansion added in 5.0.

I found quite a few complains or rejected attempts to optimize LINQ for IReadOnlyCollection<T>:

https://github.com/dotnet/runtime/issues/28651 - LINQ results implicit support for IReadOnlyCollection
https://github.com/dotnet/runtime/issues/27517 - Performance: Make constructor List(IEnumerable collection) know about IReadOnlyCollection
https://github.com/dotnet/runtime/issues/26679 - Linq ToDictionary() should presize for IReadOnlyCollection
https://github.com/dotnet/runtime/issues/14366 - System.Linq performance improvement suggestions (mentions IROC<>)
https://github.com/dotnet/runtime/issues/24793 - Respect IReadOnlyList in the BCL
https://github.com/dotnet/runtime/issues/23910 - Add optimized path for IReadOnlyCollection/IReadOnlyList in System.Linq
https://github.com/dotnet/runtime/issues/18714 - Consider checking for IReadOnlyCollection in Enumerable.ToArray
https://github.com/dotnet/runtime/issues/27516 - Performance of LINQ .Any() - type check to leverage .Count property? (mentions IROC)
https://github.com/dotnet/runtime/issues/27517 - Performance: Make constructor List(IEnumerable collection) know about IReadOnlyCollection
https://github.com/dotnet/corefx/pull/28472 - Check for IReadOnlyCollection
https://github.com/dotnet/runtime/issues/43001 - LINQ IEnumerable extension methods should add special case IReadOnlyCollection<T>

  1. What about customers that only implement ICollection<T> and not IReadOnlyCollection<T>?

This can be expanded to different scenarios:

  1. The collection is implemented by ICollection<T>, but exposed as IReadOnlyCollection<T> in public surface.
    This should be very common. Linq will not be impact here.
  2. The collection is immutable. It only implements IReadOnlyCollection<T>.
    This depends on the actual implementation type:
  3. ReadOnlyCollection<T> (including ReadOnlyObservableCollection<T>): while it's designed to be read-only, it still implement the non-readonly interfaces. Not the case.
  4. ImmutableArray<T>: it has it's own extension methods of linq to avoid boxing. Won't worry about the default linq implementation.
  5. Custom readonly collection: Though this would be definitely impact, it should be a relative uncommon scenario.
  6. The collection is implemented by ICollection<TDerived>, but exposed as IReadOnlyCollection<TBase>, and gets linq called with <TBase>.
    This should be the scenario that's most probably get performance impact. Covariant interface check are slower, but it powers this scenario.

There are a few other interfaces used to determine IEnumerable counts, also not in a subtype relationship with either ICollection<T> or IReadOnlyCollection<T>. Would it make sense to include those as well?

Next step should be to enumerate a list of methods that could benefit from specialization.

Tagging subscribers to this area: @eiriktsarpalis, @jeffhandley
See info in area-owners.md if you want to be subscribed.

Related: #23337.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

GitAntoinee picture GitAntoinee  路  3Comments

yahorsi picture yahorsi  路  3Comments

EgorBo picture EgorBo  路  3Comments

btecu picture btecu  路  3Comments

noahfalk picture noahfalk  路  3Comments