Highly used generic types (mostly Collections) pay high cost for every API (method) we add to it.
We should revisit the decisions of adding new APIs (esp. convenience APIs) directly on the type and consider moving them into extension methods instead (in future and potentially revisit those we added during 2.0 and did not publish yet).
Context: see @jkotas's comment and guidance here: https://github.com/dotnet/coreclr/pull/10203#issuecomment-287526223
@terrajobst @stephentoub @weshaggard @danmosemsft @KrzysztofCwalina we should discuss it on Tuesday the latest ...
@jkotas might want to join us if we expect difference in opinions during the discussion ;-)
cc @ianhays @safern @AlexGhiondea @joperezr for Collections and Runtime areas which are potentially impacted the most in CoreFX.
Remove -- https://github.com/dotnet/corefx/issues/15622
TryAdd https://github.com/dotnet/corefx/issues/1942 -- this one I think we were OK with the perf and code size
It is not a high cost for every API. It becomes high cost once the per-API cost gets multiplied by non-trivial number of APIs. Death by thousand paper cuts sort of problem.
Also, this is not a black and white. Some new additions are fine - in particular the ones that make a significant dent in the usability of the type (e.g. new APIS on ArraySegment), but there is balance to strike.
Here's a proposal (for common generic classes):
Dictionaty<,>.GetValueOrDefault(), Remove, etc.TryAdd (see https://github.com/dotnet/corefx/issues/17275#issuecomment-287564658)Examples: Dictionaty<,>.GetValueOrDefault(), Remove, TryAdd, etc.
TryAdd was convenience but also to avoid two lookups when one would suffice, in a commonly-used pattern. The latter part can't be done externally to the type.
Some apis get a benefit from being a member e.g. Dictionary<TKey,TValue>.TryAdd not doing a double check
Others just use the public interface so shouldn't loose much from being an extension
e.g.
Span<T>.Clear() -> .Clear(this Span<T>) (though struct copy overhead?)Span<T>.Fill(T value) -> .Fill(this Span<T>, T value)Array.Fill<T>(T[] array, T value) -> .Fill<T>(this T[] array, T value)Array.Fill<T>(T[] array, T value, int startIndex, int count) -> .Fill<T>(this T[] array, T value, int startIndex, int count)Slightly tangential, but related does ValueTuple need to implement 7 interfaces? (especially the ones that end up boxing)
Thanks @stephentoub @benaadams for pointing out that TryAdd belongs to category [3] and not [1]. I didn't go deeper on the examples - I probably should have. My proposal is now updated to avoid further confusion.
Another reason for using extensions on generic types is it allows for specialisation of the methods based on type (e.g. Span.IndexOf(this Span<T>, ...) that also has Span.IndexOf(this Span<byte>, ...))
We discussed this. Conclusion:
Remove with an out, GetValueOrDefault and decided what to do with it.CollectionExtensions) and we can design most of the convenience target the interfaces in which case extension methods are the most logical design point.@ianhays can you please list the APIs we added in 2.0 on any collection? @weshaggard can help if needed.
Thanks!
Is a principle established for API's that provide perf benefits, like TryAdd? I assume those can continue to go on the real types so long as they don't excessively bloat the code nor have a perf impact on existing API.
It will be case by case decisions, but high-level principles are:
Huh, I just noticed that TryAdd is used as perf method: https://github.com/dotnet/corefx/pull/17201 ... I think we will be easier on that one :(
Sure thing @karelz. Here are all the APIs that are exclusive to netcoreapp2.0 added in System.Collections:
```c#
namespace System.Collections
{
public sealed partial class BitArray : System.Collections.ICollection, System.Collections.IEnumerable, System.ICloneable
{
public System.Collections.BitArray RightShift(int count) { throw null; }
public System.Collections.BitArray LeftShift(int count) { throw null; }
}
}
namespace System.Collections.Generic
{
public static class CollectionExtensions
{
public static TValue GetValueOrDefault
public static TValue GetValueOrDefault
public static TValue GetValueOrDefault
public static TValue GetValueOrDefault
}
public partial class Dictionary<TKey, TValue> : System.Collections.Generic.ICollection<System.Collections.Generic.KeyValuePair<TKey, TValue>>, System.Collections.Generic.IDictionary<TKey, TValue>, System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>>, System.Collections.Generic.IReadOnlyCollection<System.Collections.Generic.KeyValuePair<TKey, TValue>>, System.Collections.Generic.IReadOnlyDictionary<TKey, TValue>, System.Collections.ICollection, System.Collections.IDictionary, System.Collections.IEnumerable, System.Runtime.Serialization.IDeserializationCallback, System.Runtime.Serialization.ISerializable
{
public Dictionary(System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>> collection) { }
public Dictionary(System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>> collection, System.Collections.Generic.IEqualityComparer<TKey> comparer) { }
public bool TryAdd(TKey key, TValue value) { throw null; }
public TValue GetValueOrDefault(TKey key) { throw null; }
public TValue GetValueOrDefault(TKey key, TValue defaultValue) { throw null; }
}
public partial class HashSet<T> : System.Collections.Generic.ICollection<T>, System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.Generic.ISet<T>, System.Collections.IEnumerable, System.Runtime.Serialization.IDeserializationCallback, System.Runtime.Serialization.ISerializable
{
public HashSet(int capacity) { }
public HashSet(int capacity, System.Collections.Generic.IEqualityComparer<T> comparer) { }
public bool TryGetValue(T equalValue, out T actualValue) { throw null; }
}
public partial class Queue<T> : System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.ICollection, System.Collections.IEnumerable
{
public bool TryDequeue(out T result) { throw null; }
public bool TryPeek(out T result) { throw null; }
}
public partial class SortedSet<T> : System.Collections.Generic.ICollection<T>, System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.Generic.ISet<T>, System.Collections.ICollection, System.Collections.IEnumerable, System.Runtime.Serialization.IDeserializationCallback, System.Runtime.Serialization.ISerializable
{
public bool TryGetValue(T equalValue, out T actualValue) { throw null; }
}
public partial class Stack<T> : System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.ICollection, System.Collections.IEnumerable
{
public bool TryPeek(out T result) { throw null; }
public bool TryPop(out T result) { throw null; }
}
}
```
To get this list i diffed our current System.Collections ref file with the oldest one on Github, then removed API's that were present in both or that were present in the recent ref file but a part of the last netfx release (i.e. netstandard2.0 APIs).
@karelz do you think this is going to make it into 2.0.0 or should we probably move it to 2.1.0/future?
@safern moving it doesn't make sense. Either we do it, or we close it.
I was not marked as api-ready-for-review (and we didn't have time anyway yesterday). Pushing to next week.
@safern moving it doesn't make sense. Either we do it, or we close it.
I was not marked as api-ready-for-review (and we didn't have time anyway yesterday). Pushing to next week.
So with the list that @ianhays provided is it ready to be marked as api-ready-for-review so that it can be reviewed next week?
Yes it is. I just did it (got distracted from doing that after my previous reply, sorry)
No problem. Just wanted to make sure that it was good to go so that it is reviewed next Tuesday. Thanks!
Yep, thanks for chasing it down, appreciate it!
We've finalized our set of principles to this list:
Span<byte> vs. Span<int>IDictionary<K,V>.GetFoo() over Dictionary<K, V>.GetFoo()For the APIs @safern posted earlier this means:
BitArray looks good as proposedCollectionExtensions looks good as proposedDictionary<K,V> look good as proposedDictionary<K,V>.GetValueOrDefault - dotnet/corefx#17917TryAdd should remainTryRemove (i.e. Remove) should follow TryAdd - https://github.com/dotnet/corefx/issues/15622 (PR in progress)TryAdd/TryRemove (i.e. Remove): we also add extension methods for the interfaces on CollectionExtensions - dotnet/corefx#17918TryGetValue should remain, impossible to implement reasonablyQueue/Stack look good as proposed as we don't have interfaces@safern please make sure we have appropriate issues for any work
@terrajobst could you consider an addition to the list?
DO prefer extension methods to instance methods on generic types if they gain no extra benefit from having access to non-public members.
This is because the extension methods allow additional specialisation by type whereas generic instance methods allow no further specialization (from https://github.com/dotnet/corefxlab/issues/1314#issuecomment-291645150)
As shown by
class Program
{
static void Main(string[] args)
{
var value = new MyType<byte>();
value.Output();
}
}
public class MyType<T>
{
public void Output()
{
Console.WriteLine("Generic instance method called");
}
}
public static class MyTypeExtensions
{
public static void Output(this MyType<byte> myType)
{
Console.WriteLine("MyType<byte> Extension method called");
}
}
Outputs
Generic instance method called
This is because the extension methods allow additional specialisation by type
For value types, instance methods can specialize implementation via guarded type checks that'll result in dead branches being trimmed by the JIT. But for reference types it's true. And extensions let you add methods that only apply to certain instantiations, e.g. TaskExtensions.Unwrap, which can't be done with instance methods... and that may have been your main point (though at that point, it's not really a choice, since there's no way to do it with instance methods, unless you want to throw from every other usage).
For value types, instance methods can specialize implementation via guarded type checks that'll result in dead branches being trimmed by the JIT
Only for ahead of time decided types. If a library author creates their own struct type that can't be added to the type check chain in the instance method and will always take the general object path.
Obviously if there is a gain from access to internals (like the dictionary TryAdd method) that's one thing. Or their is a specific reason (like Vector which are JIT/CPU recognised types)
However, if its a method that is just using the public api and being an instance method gains no measurable benefit over an extension method; then it should be implemented as an extension method over the generic type; which then allows future specialisations for other types and future types to be written.
Sure, in general, but this thread is about types in the BCL... types like Dictionary aren't going to provide methods that operate on BensCoolType, even as an extension method. :smile:
You say that but... you could do
public static GetValueOrDefault(this Dictionary<K,V> dict, K key)
{
if (dict.TryGet(out var value)
{
return value;
}
return default(V)
}
Then the CoolType library could add
public static GetValueOrDefault(this Dictionary<K,BensCoolType> dict, K key)
{
if (dict.TryGet(out var value)
{
return value;
}
return BensCoolType.DefaultValue;
}
馃槈
Or maybe you hate string.IsNullOrEmpty and nulls in general for strings
public static GetValueOrDefault(this Dictionary<K,string> dict, K key)
{
if (dict.TryGet(out var value)
{
return value;
}
return string.Empty;
}
Rather than writing var value = dict.GetValueOrDefault(key) ?? string.Empty;
Then the CoolType library could add
I think what you're really pointing out is that the C# compiler only considers extension methods if it can't find a matching instance method, and thus defining an instance method prevents others from providing their own replacement extensions with a more specialized signature.
Yes exactly
I see @benaadams's point, however, I am a little bit hesitant to optimize for method overloading in the compiler. IMO it creates quite a lot of complexity people need to be aware of ... (imagine you call GetValueOrDefault and the library you depend on overloaded it, but you don't know it and all you look at is public docs on MSDN and you are puzzled why it returns something else ...)
If a type wants specialized extension method overload, I would suggest to the author to create a brand new extension method ...
@terrajobst is it worth saving the rules in some API design doc in the repo?
@terrajobst is it worth saving the rules in some API design doc in the repo?
@karelz I think that is a good idea as for the future we can have that in a doc instead of looking for this specific issue/comment.
@safern please make sure we have appropriate issues for any work
Taking care of that.
imagine you call GetValueOrDefault and the library you depend on overloaded it...
My examples might have been poor; though the counter argument equally applies to virtual class methods; interfaces, operator overloading etc. The extension methods allow for a kind of inheritance of the generic type when it is an of specific type.
Perhaps a real example would have been better; Span<T> is a good example. So its shipping with specializations for byte and char maybe later it will have int; however they have not been written yet. So as a user I can add specilizations for int right now at a faster cadence than corefx will ship them; then contribute back to they will get rolled in at a later date.
If these were written as a single instance method with typeof(T) specialization; then as a user I'd have to add a different method to plug the gap; then search and replace when the corefx library provided them, rather than just deleting my extension methods.
Anything that's a "delete a single method" I see as a better option than search and replace over your code base.
Also extension method changes are easier to shim back onto desktop than instance method changes; as desktop moves at an even slower cadence than corefx (which is relatively quick)
@benaadams your new example falls under [2] in @terrajobst's list. I agree with that one.
I am just hesitant to go overboard with that guidance (see my previous concern). Do you have other examples which would not fall under [2]?
I've opened issues:
To track the work needed for this issue. So I will go ahead and close this issue.
Do you have other examples which would not fall under [2]?
No. 馃槃
@terrajobst I have updated your approval post https://github.com/dotnet/corefx/issues/17275#issuecomment-291636863 with links to tracking issues (easier to check all are covered) and I added explanation that by TryRemove, you actually meant Remove (it confused me for few minutes). Please let me know if that's incorrect.
Most helpful comment
We've finalized our set of principles to this list:
Span<byte>vs.Span<int>IDictionary<K,V>.GetFoo()overDictionary<K, V>.GetFoo()For the APIs @safern posted earlier this means:
BitArraylooks good as proposedCollectionExtensionslooks good as proposedDictionary<K,V>look good as proposedDictionary<K,V>.GetValueOrDefault- dotnet/corefx#17917TryAddshould remainTryRemove(i.e.Remove) should followTryAdd- https://github.com/dotnet/corefx/issues/15622 (PR in progress)TryAdd/TryRemove(i.e.Remove): we also add extension methods for the interfaces onCollectionExtensions- dotnet/corefx#17918TryGetValueshould remain, impossible to implement reasonablyQueue/Stacklook good as proposed as we don't have interfaces