Runtime: Make a new interface ICountable

Created on 27 Aug 2017 · 24 Comments · Source: dotnet/runtime

Description

I often bump into scenarios where I have to evaluate an IEnumerable that might be an ICollection<T>, ICollection, IEnumerable<T> or IEnumerable, and get its size if available, so as to avoid iterating over it just to count it.

The thing with ICollection<T> is that it doesn't inherit from ICollection, and there are types that implement ICollection<T> but not ICollection (for instance HashSet<T>).
Additionally, ICollection<T> isn't covariant, meaning not every ICollection<T> is an ICollection<object>, the way every IEnumerable<T> of a reference type is an IEnumerable<object>.
I'm not arguing about this design, which is a good thing to avoid confusion and mis-adding of types.
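The variance and inheritance points above can be checked directly. The following is a minimal sketch; the `VarianceDemo` class and its `Is` helper are illustrative names, not part of the issue or of .NET.

```c#
using System;
using System.Collections;
using System.Collections.Generic;

static class VarianceDemo
{
    // True when the runtime type of value can be viewed as TInterface.
    public static bool Is<TInterface>(object value) => value is TInterface;

    static void Main()
    {
        var strings = new List<string> { "a", "b" };
        var ints = new List<int> { 1, 2 };
        var set = new HashSet<string> { "a" };

        // IEnumerable<out T> is covariant for reference types.
        Console.WriteLine(Is<IEnumerable<object>>(strings));         // True
        // ICollection<T> is invariant, so this conversion does not exist.
        Console.WriteLine(Is<ICollection<object>>(strings));         // False
        // IReadOnlyCollection<out T> is covariant, but only for reference types.
        Console.WriteLine(Is<IReadOnlyCollection<object>>(strings)); // True
        Console.WriteLine(Is<IReadOnlyCollection<object>>(ints));    // False
        // HashSet<T> implements ICollection<T> but not the non-generic ICollection.
        Console.WriteLine(Is<ICollection>(set));                     // False
        Console.WriteLine(Is<ICollection>(strings));                 // True
    }
}
```

The `List<int>` case is the one that matters for the rest of this discussion: covariance does not apply to value types, so a cast to `IReadOnlyCollection<object>` is not a general way to read `Count` when `T` is unknown.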

Motives

The thing I do find annoying here, and I bump into it a lot, is the Count property, which is common to ICollection, ICollection<T> and IReadOnlyCollection<T>, even though these interfaces share no common base that declares it.
Obtaining a value from a generic type when the type argument is unknown at compile time is only possible with reflection. So in order to get the known size of a collection you'd have to do the following (see discussion here). And with the branching of the TypeInfo class, this gets even worse:

```c#
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Gets the count if it is known without enumerating.
int? GetCountIfKnown(IEnumerable collection)
{
    if (collection is ICollection nonGeneric)
        return nonGeneric.Count;

    // Look for a closed ICollection<T> among the implemented interfaces.
    Type genericCol = collection.GetType().GetTypeInfo().ImplementedInterfaces
        .FirstOrDefault(i =>
            i.GetTypeInfo().IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>));
    if (genericCol != null)
        return (int)genericCol.GetTypeInfo().GetDeclaredProperty("Count").GetValue(collection);

    return null;
}
```

As pointed out later in the [comments](https://github.com/dotnet/corefx/issues/23578#issuecomment-325664702) by @NightOwl888, the `IReadOnlyCollection<T>` interface also exposes the `Count` property, but it doesn't address my issue for these reasons:

- It's generic, so when the type is unknown at compile time, we can only evaluate it using reflection.
- `ICollection<T>` doesn't inherit from it.
- Not all collections with a known size implement it.

I bump into many use cases for non-generic collections. The real reason we need non-generic support is not that the collections aren't generic, but rather that their generic argument may not be known at compile time.
I can't remember all of the scenarios where I've run into this need, as there are many, but I'll name a few.
- When developing general tools (i.e. XAML converters) that are supposed to eat a wide variety of value types and work differently when the value is a collection.
- We need to store a collection of `ICollection<T>`s of a variety of types. When enumerating the main collection, the count of the items can only be achieved by acquiring the collection type with reflection.
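The second scenario can be sketched as follows. This is only an illustration under my own assumptions: `MixedBagDemo` and `CountViaNonGeneric` are made-up names, and the helper deliberately uses only the non-generic `ICollection` check to show where it stops working.

```c#
using System;
using System.Collections;
using System.Collections.Generic;

static class MixedBagDemo
{
    // Returns the size without enumerating, or null when the
    // non-generic ICollection check does not apply.
    public static int? CountViaNonGeneric(IEnumerable source) =>
        source is ICollection c ? c.Count : (int?)null;

    static void Main()
    {
        var bag = new List<IEnumerable>
        {
            new List<int> { 1, 2, 3 },    // implements ICollection
            new[] { "x", "y" },          // arrays implement ICollection
            new HashSet<double> { 1.5 }, // implements ICollection<double> only
        };

        foreach (IEnumerable item in bag)
        {
            int? count = CountViaNonGeneric(item);
            string label = count?.ToString() ?? "unknown";
            Console.WriteLine($"{item.GetType().Name}: {label}");
        }
    }
}
```

The `HashSet<double>` entry reports an unknown size even though it has a `Count` property, which is the gap that currently has to be filled with reflection.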

Another reason is that `ICollection` has more methods to implement, and is more tedious.

### Solution

My suggestion, then, is to introduce a new interface `ICountable` (or a non-generic `IReadOnlyCollection`) whose contract states that a collection has a `Count` property, exposing the collection size regardless of its item type.

This interface should be implemented by (please comment if you're aware of others):
- `ICollection`
- `ICollection<T>`
- `IReadOnlyCollection<T>`

The new interface should have one sole task: expose a known size of a collection via a `Count` property, with disregard to the collection type (`T`), editability (`ICollection` vs `IReadOnlyCollection`), or other features.
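To make the proposal concrete, here is a minimal sketch of what such an interface and a type opting in to it might look like. `ICountable` is the hypothetical interface from this proposal and does not exist in .NET; `Bag<T>` and `CountHelper` are illustrative names as well.

```c#
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical interface from this proposal; it does not exist in .NET.
public interface ICountable
{
    int Count { get; }
}

// Hypothetical custom collection that opts in to ICountable.
public sealed class Bag<T> : IEnumerable<T>, ICountable
{
    private readonly List<T> _items = new List<T>();

    public int Count => _items.Count;
    public void Add(T item) => _items.Add(item);

    public IEnumerator<T> GetEnumerator() => _items.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountHelper
{
    // No reflection needed: the check does not depend on T.
    public static int? GetCountIfKnown(IEnumerable source) =>
        (source as ICountable)?.Count;

    static void Main()
    {
        var bag = new Bag<string>();
        bag.Add("a");
        bag.Add("b");
        Console.WriteLine(GetCountIfKnown(bag));                     // 2
        // Built-in types do not implement the hypothetical interface.
        Console.WriteLine(GetCountIfKnown(new List<int>()) == null); // True
    }
}
```

As the second call shows, a user-defined interface only helps for types that opt in, which is why the proposal asks for the framework interfaces themselves to inherit from it.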

```c#
int? GetCountIfKnown(IEnumerable enumerable) =>
  (enumerable as IReadOnlyCollection)?.Count;
```

Consequences

Implicit implementations of ICollection, ICollection<T> and IReadOnlyCollection<T> will have to change the pattern to ICountable (or if we call it IReadOnlyCollection or anything else).
Implementations that call the property Count on ICollection (or the others) as "declared", will have to switch to either calling the runtime property, or call it on ICountable etc.

api-needs-work area-System.Collections

Source

weitzhandler

👍4 ❤2

Most helpful comment

EDIT: I've pivoted towards standardizing the implementation of collections rather than introducing another interface. See my later comment.

I, too, wish there was a go-to means for recognizing a collection as opposed to an iterable, whether that be an interface (ICountable, IReadOnlyCollection, IHasCount etc.) or something else. The prevalence of IEnumerable makes it a perfect lowest common denominator for something you iterate over. There's currently no lowest common denominator for a collection-- instead, there are 3 different interfaces with their own Count property: ICollection, ICollection<T>, and IReadOnlyCollection<T>. I don't know how Span<T>/Memory<T> will end up impacting the framework-- perhaps they'll make things easier, or perhaps it'll be one more "collection-like" type requiring special handling.

I did want to address @NightOwl888's proposed abstract base classes: you wouldn't be able to use them for collections that are value types, such as ImmutableArray<T>.

zeldafreak on 29 Nov 2017

👍4

All 24 comments

Note that if we really want to go down this road, not all "countable" types have a property named Count. Files, arrays, and streams all use the property name Length rather than Count to determine their size. So by the same token, we would need an ILengthable, or probably better an ILength or ISize interface with a property Length.

I do see the advantage with covariance, but it doesn't seem right to force every IEnumerable<T> to implement a Count property that might not be practical to do. Perhaps there could be an ICountableEnumerable<T> (inheriting IEnumerable<T>) that exists between IEnumerable<T> and ICollection<T>, which would not break backward compatibility (except perhaps with LINQ).

NightOwl888 on 28 Aug 2017

@NightOwl888 Please allow me to disagree with you, because most of the types you mention implement ICollection<T> or ICollection, and return the Length.
It's just that some types implement ICollection<T> but not ICollection, in which case getting the Count property can only be achieved with reflection, hence my suggestion.
The need for ICountable is specifically to support easily determining the size of an ICollection<T> when the count/length is known but T is unknown at compile time, because the Count property of ICollection<T> really has nothing to do with T. If I get it correctly, the reason ICollection<T> isn't covariant (i.e. ICollection<out T>, like IEnumerable<T>) ~and doesn't inherit from ICollection (not really sure)~ is to avoid mixing types, but as a result we find the Count property bound to T when it shouldn't be.

ICountableEnumerable is precisely what I meant. Name it whatever you like but separate it out.

weitzhandler on 28 Aug 2017

👍2

What would be the real-world use of such interface? Interfaces don't come for free - they have runtime impact, size on disk impact and also maintenance cost associated.

karelz on 29 Aug 2017

@karelz
Currently, getting the Count property for a generic ICollection of unknown type is only possible with reflection:

```c#
int GetCountIfAvailable(IEnumerable obj)
{
    if (obj is ICollection collection)
        return collection.Count;

    // Find a closed ICollection<T> among the implemented interfaces.
    var gCollection = typeof(ICollection<>);
    var iCollection = obj.GetType().GetInterfaces().FirstOrDefault(i =>
        i.IsGenericType && i.GetGenericTypeDefinition() == gCollection);

    if (iCollection != null)
        return (int)iCollection.GetProperty("Count").GetValue(obj);

    return -1; // count unavailable
}
```

If we branch out the `Count` property (which really has nothing to do with the collection type in the first place), we'll be able to achieve the above by:

```c#
int GetCountIfAvailable(IEnumerable obj) =>
  obj is ICountable countable ? countable.Count : -1;
```

Additionally, ICollection<T> doesn't inherit from ICollection, while their Count property should have been shared. Having the intermediate interface shared between them will spare us from always having to check for ICollection and ICollection<T> separately when the collection size is available without iterating.
Besides, the ability to modify a collection and the ability to get its size are two different aspects and shouldn't be tied together in one interface.

weitzhandler on 29 Aug 2017

👍1

Note that if we really want to go down this road that not all "countable" types have a property named Count.

That's no problem. You'd just implement the interface explicitly:

public class SomeCollectionWithLengthProperty : ICountable
{
    public int Length { get; }
    // Explicit interface implementations cannot have an access modifier.
    int ICountable.Count => Length;
}

khellang on 29 Aug 2017

👍4

@khellang @NightOwl888 in fact, arrays and some other types actually do just this.
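A short sketch of the array case, for illustration (`ArrayCountDemo` and `CountOf` are made-up names): arrays expose `Length` publicly, while `Count` is only reachable through the collection interfaces.

```c#
using System;
using System.Collections;
using System.Collections.Generic;

static class ArrayCountDemo
{
    public static int CountOf(ICollection collection) => collection.Count;

    static void Main()
    {
        int[] numbers = { 10, 20, 30 };

        Console.WriteLine(numbers.Length);                            // 3
        // The property access numbers.Count does not compile: Array implements
        // ICollection.Count explicitly, so an interface reference is required.
        Console.WriteLine(CountOf(numbers));                          // 3
        Console.WriteLine(((IReadOnlyCollection<int>)numbers).Count); // 3
    }
}
```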

weitzhandler on 29 Aug 2017

👍2

I do see the advantage with covariance, but it doesn't seem right to force every IEnumerable<T> to implement a Count property that might not be practical to do. Perhaps there could be an ICountableEnumerable<T> (inheriting IEnumerable<T>) that exists between IEnumerable<T> and ICollection<T>, which would not break backward compatibility (except perhaps with LINQ).

I am going to have to retract this statement. There already is an IReadOnlyCollection<T> type that is effectively the same thing as the ICountableEnumerable<T> type I envisioned, being a covariant alternative to ICollection<T>.

As for the ICollection vs ICollection<T> vs IReadOnlyCollection vs IReadOnlyCollection<T> issue, I agree there is a gap here. It would be nice if there were a way to work with any collection without having to handle several different interface types.

NightOwl888 on 29 Aug 2017

@NightOwl888

There already is an IReadOnlyCollection<T> type that is effectively the same thing as the ICountableEnumerable<T> type I envisioned, being a covariant alternative to ICollection<T>.

Yes, but IReadOnlyCollection means something else in its essence, and as you said yourself, arrays and other common collection types don't implement it.

What I am after is a centralized interface that tells that any collection (IEnumerable) implementing it, has a known size without having to iterate it.

weitzhandler on 29 Aug 2017

@weitzhandler T[] does implement IReadOnlyCollection<T>. What common collection types do you think don't implement it?

svick on 29 Aug 2017

While I understand the technical value on the caller side, I am still not sure how common code is that needs to know the Count of an IEnumerable (if available). IMO it is usable mostly/only in Linq.
Adding a new interface just to make Linq code easier, and maybe a bit faster (would be nice to see the numbers), seems like overkill to me.

Or are there other real-world usages where this would provide significant value?

karelz on 29 Aug 2017

@NightOwl888 @svick The problem with IReadOnlyCollection<T> is, again, that it's generic, and I'm looking for a non-generic interface that exposes Count and is common to ICollection<T> and ICollection, so the size of any collection with a known size will be available regardless of its generic definition.

@karelz it could be a bit faster, and it could be n times faster depending on the collection size. I bump into this scenario every day: given an IEnumerable that can be either an ICollection<T> or an ICollection, and on top of that, T is unknown.

My code gets splattered with ICollection and ICollection<T> checks because I'm dealing with large collections.
If I don't know the collection type I can't rely on the Count property, even if I know it'll be an ICollection<T> at runtime.

Here's an example scenario, and that's when T is known. Add to this class plenty more checks for when T is unavailable. Here's another example (lines 81-95).

weitzhandler on 30 Aug 2017

👍1

As someone who has recent experience dealing with custom .NET collection types and the challenges surrounding them, let me see if I can offer up a solution that would work not just for the Count property, but for better APIs around collections in general.

The main issue here is not that we are lacking an interface, it is that we are lacking any kind of relationship between the interfaces to make them generally usable within APIs. One of the biggest issues is there is no way to create APIs that work with both generic and non-generic interfaces without resorting to lots of reflection and casting. This is not limited to getting the Count property from an IEnumerable, but is a challenge for any API logic that needs to work with both generic and non-generic collection types.

Another issue exists when trying to implement custom collection types. We typically can't just pick an interface and go; we typically need to implement several interfaces, which involves time and research. And none of the existing collection types are very helpful. Although they are not sealed, none of their members are virtual, so we are always starting from scratch to make custom collection types in .NET because nothing was made reusable.

Back on point, let's say for the sake of argument we want to make a method that works on any IList or IList<T> (that is, any generic or non-generic Array or List). There isn't one interface in the box that can do this for us. The best we can do is use IList, since that is the only interface we have that meets all of our criteria.

```c#
void DoSomething(IList list)
```

Now, a problem with doing this is that `IList<T>` doesn't inherit `IList`. So, we have to check which type we are dealing with in order to get to any functionality, even if that functionality is common between the two interfaces. And as @weitzhandler pointed out, we are stuck with Reflection to deal with the generic type case since we have no idea what type `T` actually is at runtime.

```c#
void DoSomething(IList list)
{
    if (list.GetType().ImplementsGenericInterface(typeof(IList<>)))
    {
        // handle generic
    }
    else
    {
        // handle non-generic
    }
}

// Extension methods must be declared in a static class.
static class TypeChecks
{
    public static bool ImplementsGenericInterface(this Type target, Type interfaceType)
    {
        // Compare each implemented interface's open generic definition with
        // interfaceType, so non-generic classes implementing IList<int> match too.
        return target.GetInterfaces().Any(
            x => x.GetTypeInfo().IsGenericType && x.GetGenericTypeDefinition() == interfaceType);
    }
}
```

To further complicate things, although all of the built-in types implement all of the interfaces to ensure our API works, there are no guarantees that custom collection types will implement all of the interfaces that we may need (depending on the functionality we are after). So despite our best efforts to make APIs that "just work", there are no guarantees that they always will.

What's missing from the .NET toolbox is a set of abstractions that allow us to design APIs that work seamlessly with both generic and non-generic types. So, I propose that we make abstract classes to fill that gap. This is not without precedent: this is exactly how they do it in Java.

Although in .NET, we need to explicitly handle both generic and non-generic types and we might also make abstractions for the covariant types.

AbstractReadOnlyCollection
AbstractReadOnlyCollection<T>
AbstractCollection
AbstractCollection<T>
AbstractReadOnlyList
AbstractReadOnlyList<T>
AbstractList
AbstractList<T>
AbstractReadOnlySet
AbstractReadOnlySet<T>
AbstractSet
AbstractSet<T>
AbstractReadOnlyDictionary
AbstractReadOnlyDictionary<T>
AbstractDictionary
AbstractDictionary<T>

Here is a basic example. The real advantage comes with the inheritance hierarchy which ensures our contract is shared between generic and non-generic implementations.

```c#
// All read-only non-generic arrays, lists, sets, and dictionaries subclass this
public abstract class AbstractReadOnlyCollection : IEnumerable
{
    public abstract int Count { get; }

    public IEnumerator GetEnumerator()
    {
        throw new NotImplementedException();
    }
}

// All read-only generic arrays, lists, sets, and dictionaries subclass this
public abstract class AbstractReadOnlyCollection<T> : AbstractReadOnlyCollection, IReadOnlyCollection<T>, IEnumerable<T>, IEnumerable
{
    IEnumerator<T> IEnumerable<T>.GetEnumerator()
    {
        throw new NotImplementedException();
    }
}

// All non-generic arrays, lists, sets, and dictionaries subclass this
public abstract class AbstractCollection : AbstractReadOnlyCollection, ICollection, IEnumerable
{
    public abstract bool IsSynchronized { get; }
    public abstract object SyncRoot { get; }

    public abstract void CopyTo(Array array, int index);
}

// All generic arrays, lists, sets, and dictionaries subclass this
public abstract class AbstractCollection<T> : AbstractReadOnlyCollection<T>, ICollection<T>, IReadOnlyCollection<T>, ICollection, IEnumerable<T>, IEnumerable
{
    public abstract bool IsReadOnly { get; }
    public abstract bool IsSynchronized { get; }
    public abstract object SyncRoot { get; }

    public abstract void Add(T item);
    public abstract void Clear();
    public abstract bool Contains(T item);
    public abstract void CopyTo(T[] array, int arrayIndex);
    public abstract void CopyTo(Array array, int index);
    public abstract bool Remove(T item);
}
```

Note that the above is intended as an example of the public API contract, not as a finished implementation. 

With that in place, it becomes *much* easier to implement custom collection types, since a lot of the boilerplate code can be moved into the abstract base types. For example, we could potentially move the `List<T>.Sort()` functionality into an abstract class so all `IList<T>` implementations don't need to create their own implementation. In addition, all of our interfaces are guaranteed to exist by all implementations.

Note also that none of this is breaking - it is completely backward compatible with existing APIs.

But more on point, @weitzhandler won't need to make a custom function to get the Count, he simply needs to accept the lowest level abstract collection and it will always work whether collection is generic or not (including Array types). In fact, all shared functionality is readily available without reflection or casting.

```c#
void DoSomething(AbstractReadOnlyCollection collection)
{
    var count = collection.Count;
}
```

@weitzhandler - do you think this proposal would address all of your concerns?

@karelz - As you can see, this proposal has a broader real-world impact, both around creating APIs that deal with collection types and with creating custom collection types. Both of these tasks are challenging to do with today's .NET APIs. LINQ is great for working with generic collections, but doesn't bridge the gap between generic and non-generic collections, and in some cases the performance cost is too high to be practical.

Collection Equality

One other thing I would like to chime in on is the fact that in .NET it is very difficult to compare 2 collections and their nested collections for equality. This is yet another common thing that is asked for that Enumerable.SequenceEqual only covers part of because dictionaries and sets require the comparison to be done regardless of order, and it also doesn't work recursively. This is a very common requirement during unit testing.

While I don't propose we change the default behavior, it would be nice if there were a constructor parameter that could be used to change the equality checking on built-in collection types from reference equality to deep value equality checking including all nested collection types. The actual default implementations of both of these strategies could be put into the abstract base classes, giving implementers a choice of which type of equality checking they prefer.

Perhaps there could also be a DeepEquals() and GetDeepHashCode() or something along those lines so APIs could have their choice of equality checking. Food for thought...more consideration required.
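The gaps described above can be seen with the existing APIs. The sketch below is only an illustration; `EqualityDemo` and `OneLevelDeepEquals` are hypothetical names, and the helper handles exactly one level of nesting rather than the general recursive case being asked for.

```c#
using System;
using System.Collections.Generic;
using System.Linq;

static class EqualityDemo
{
    // Hypothetical helper: compares a list of arrays element by element,
    // one level deep only.
    public static bool OneLevelDeepEquals(IList<int[]> x, IList<int[]> y) =>
        x.Count == y.Count && x.Zip(y, (l, r) => l.SequenceEqual(r)).All(eq => eq);

    static void Main()
    {
        var a = new List<int> { 1, 2, 3 };
        var b = new List<int> { 3, 2, 1 };

        // SequenceEqual is order-sensitive.
        Console.WriteLine(a.SequenceEqual(b));               // False
        // Sets can be compared regardless of order with SetEquals.
        Console.WriteLine(new HashSet<int>(a).SetEquals(b)); // True

        // Nested collections are compared by reference by default.
        var x = new List<int[]> { new[] { 1, 2 } };
        var y = new List<int[]> { new[] { 1, 2 } };
        Console.WriteLine(x.SequenceEqual(y));               // False
        Console.WriteLine(OneLevelDeepEquals(x, y));         // True
    }
}
```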

NightOwl888 on 30 Aug 2017

😕1 👍1

@NightOwl888

One of the biggest issues is there is no way to create APIs that work with both generic and non-generic interfaces without resorting to lots of reflection and casting.

Why do you need to work with non-generic interfaces in the first place? Is it because of some legacy API?

And none of the existing collection types are very helpful. Although they are not sealed, none of their members are virtual, so we are always starting from scratch to make custom collection types in .NET because nothing was made reusable.

There is Collection<T>, which exists exactly for this purpose and has virtual members.

I propose that we make abstract classes to fill that gap.

I think you should open a separate proposal for that, it's far out of scope of this issue. (Which is also why I won't comment on the proposal itself here.)

svick on 30 Aug 2017

👍3

Why do you need to work with non-generic interfaces in the first place? Is it because of some legacy API?

For one, to implement the aforementioned deep equality checking so it works on existing platform collection types. I have no control over what type of collections that a user might utilize in a generic type and since I opted to save time and only support nested generic collection types (without support for non-generic collections), there are gaps in my implementation.

But in a nutshell, I am running into the same challenges that @weitzhandler is.

There is Collection<T>, which exists exactly for this purpose and has virtual members.

Thanks for pointing that out. Although, that still sounds too low level to be very useful, since lists, sets, and dictionaries have common behaviors that are generally very different from one another (for example, it only makes sense to sort lists in general, and most sets and dictionaries have undefined sort order).

It is also not commonly used as a base by existing collection types (for example List<T> or Dictionary<TKey, TValue>), and therefore does not serve as a common way to work with all collections from APIs (which is the primary reason for this proposal).

I think you should open a separate proposal for that, it's far out of scope of this issue. (Which is also why I won't comment on the proposal itself here.)

Thanks again. I will do that when I get a chance and take into account the existing abstract Collection<T>.

NightOwl888 on 30 Aug 2017

👍2

I'm on mobile not able to text much.
Anyway I'm quite happy with how the collection system works.
I would appreciate if List<T> and Dictionary<TKey, TValue> can become virtual indeed, but for that I'll suggest opening a new issue.
I'm here only to request the count property to be extracted and implemented by all known size collections regardless of type.

weitzhandler on 30 Aug 2017

👍1

@svick

Why do you need to work with non-generic interfaces in the first place? Is it because of some legacy API?

There are many cases, and I often bump into them. One is when you need to store a collection of ICollection<T> where T is of various types. Every generic type, when it needs to be worked on in a batched manner, has to become non-generic, or else the work has to be done via reflection. Also, whenever T isn't supposed to be known, we'll want to use the common non-generic members before going down the ugly reflection path.

@svick

@NightOwl888 And none of the existing collection types are very helpful. Although they are not sealed, none of their members are virtual, so we are always starting from scratch to make custom collection types in .NET because nothing was made reusable.

There is Collection<T>, which exists exactly for this purpose and has virtual members.

What I was just gonna say. Tho it would be nice having List<T>, Dictionary<TKey, TValue> and some other common implementations more virtual. Collection<T> is far from being as powerful as List<T>. But for this comes what you said:

I think you should open a separate proposal for that, it's far out of scope of this issue. (Which is also why I won't comment on the proposal itself here.)

weitzhandler on 31 Aug 2017

Really this post belongs on the coreclr repo. Migrated.

weitzhandler on 31 Aug 2017

@weitzhandler

Really this post belongs on the coreclr repo.

It does not; all API proposals for .NET Core belong to this repo:

For all managed API addition proposals use the CoreFX Issues Page and follow the API Review Process.

That applies even for proposals whose implementation is going to be in the CoreCLR repo.

svick on 31 Aug 2017

👍1

I like the idea of this from a design perspective e.g. if we were starting anew.

However, my issue with this proposal is how many changes it would require devs to make i.e.

Implicit implementations of ICollection, ICollection<T> and IReadOnlyCollection<T> will have to change the pattern to ICountable (or if we call it IReadOnlyCollection or anything else).
Implementations that call the property Count on ICollection (or the others) as "declared", will have to switch to either calling the runtime property, or call it on ICountable etc.

I don't think that being able to have a shared interface with the Count property is enough of a convenience to justify those breaking changes.

Because of the required breaking changes, it would be very unlikely that we would be able to push a change this large through the API review process without a huge number of people that wanted it.

ianhays on 11 Sep 2017

👍3

Personally, I am not convinced the value is worth the effort & overhead of yet another interface. But I'll let @safern make the final call here as area owner.

karelz on 10 Oct 2018

Currently, getting the Count property for a generic ICollection of unknown type is only possible with reflection

We typically use ICollection to determine the count for non-generic enumerables. So perhaps this is an issue about having more collections implement this interface explicitly?

While I do acknowledge that ICollection is more complex than what is being proposed here, my concerns about introducing a new interface are summed up perfectly by this xkcd classic.

Assuming we did decide to add a non-generic IReadOnlyCollection interface, most likely it would not be inherited by IReadOnlyCollection<T> or ICollection<T>.

eiriktsarpalis on 16 Nov 2020

👍1

I'm going to pivot from my 2017 position and agree with @eiriktsarpalis. If we had the ability to start over from scratch, I'd advocate for a non-generic IReadOnlyCollection hard. But you're right, if you added that interface today, you couldn't have the generic interfaces inherit from it without huge breaking changes.

I think the only workable solution is to pick one of the interfaces, such as ICollection, and make sure every collection (looking at you, HashSet<T>) in the framework implements it. _Document_ the fact that for the sake of runtime performance, any custom collection type should implement that interface. _Document_ a recommended way collections should implement the aspects of ICollection that are not always applicable (e.g. Add and Remove for read-only collections). Perhaps make use of Roslyn analyzers to enforce these patterns (e.g. look for types that implement ICollection<T>, perhaps IReadOnlyCollection<T> as well, and warn if they don't implement ICollection).

If the framework cannot provide a way to unify these interfaces, the next best thing is for its types and documentation to be consistent and clear that implementing ICollection is the standard practice. Clear documentation is equally (if not more) important as the code changes. Then there will be a clear path forward, a "standard" way to obtain the Count, and only those forced to work with non-standard collection types would have to resort to ugly reflection hacks. The goal is no longer elimination, but mitigation.
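As a rough illustration of that recommendation, here is a sketch of a read-only collection that also implements the non-generic `ICollection`. The type names `ReadOnlyBag<T>` and `ReadOnlyBagDemo` are hypothetical, and returning `false`/`this` for `IsSynchronized`/`SyncRoot` is just one common choice, not a documented standard. Note that the non-generic `ICollection` only requires `Count`, `IsSynchronized`, `SyncRoot` and `CopyTo`, so a read-only type does not have to stub out any mutating members.

```c#
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical read-only collection that also implements the non-generic
// ICollection so callers can read Count without knowing T.
public sealed class ReadOnlyBag<T> : IReadOnlyCollection<T>, ICollection
{
    private readonly T[] _items;

    public ReadOnlyBag(IEnumerable<T> items) => _items = new List<T>(items).ToArray();

    // Satisfies both IReadOnlyCollection<T>.Count and ICollection.Count.
    public int Count => _items.Length;

    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)_items).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => _items.GetEnumerator();

    // Remaining ICollection members, implemented explicitly to keep
    // them off the public surface of the type.
    bool ICollection.IsSynchronized => false;
    object ICollection.SyncRoot => this;
    void ICollection.CopyTo(Array array, int index) => _items.CopyTo(array, index);
}

static class ReadOnlyBagDemo
{
    public static int CountOrMinusOne(IEnumerable source) =>
        source is ICollection c ? c.Count : -1;

    static void Main()
    {
        var bag = new ReadOnlyBag<string>(new[] { "a", "b", "c" });
        Console.WriteLine(CountOrMinusOne(bag)); // 3
    }
}
```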

zeldafreak on 16 Nov 2020

👍1

Related: #42254, #24793.

weitzhandler on 17 Nov 2020
