Runtime: Proposal: Add Sort(...) extension methods for Span<T>

Created on 19 Jan 2017  ยท  53Comments  ยท  Source: dotnet/runtime

Currently, there is no way to sort native or fixed memory (e.g. coming from a pointer) in .NET, this proposal intends to fix that by adding sorting extension methods to Span<T>, but also proposes some different overloads than seen on Array to allow for inlined comparisons via the possibility to use value type comparers.

Proposed API

Add a set of Sort extension methods to Span<T> in SpanExtensions:

public static class SpanExtensions
{
     public static void Sort<T>(this Span<T> span);

     public static void Sort<T, TComparer>(this Span<T> span, TComparer comparer) 
        where TComparer : IComparer<T>;

     public static void Sort<T>(this Span<T> span, System.Comparison<T> comparison);

     public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items);

     public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, 
        Span<TValue> items, TComparer comparer) 
        where TComparer : IComparer<TKey>;

     public static void Sort<TKey, TValue>(this Span<TKey> keys, 
        Span<TValue> items, System.Comparison<TKey> comparison);
}

Rationale and Usage

Provide a safe yet fast way of sorting of any type of contiguous memory; managed or unmanaged.

Sorting Native Memory

var span = new Span<int>(ptr, length);
span.Sort(); // Sort elements in native memory

Sorting with Inlineable Comparer

struct ReverseComparer : IComparer<int>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(int a, int b)
    {
        if (a == b) { return 0; }
        if (a > b) { return -1; }
        return 1;
    }
}

Span<int> span = GetSomeSpan();
// Sort elements, in reverse order with inlined Compare,
// without heap allocation for comparer
span.Sort(new ReverseComparer()); 

Sorting based on Lambda

Span<int> span = GetSomeSpan();
// Sort elements, in reverse order with lambda/delegate
span.Sort((a, b) => a == b ? 0 : (a > b ? -1 : 1)); 

Sorting Compound Type

struct Compound
{
    public float FeatureValue;
    public int FeatureIndex;
    public object Payload;
}

struct InlineableFeatureValueComparer : IComparer<Compound>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(Compound a, Compound b)
    {
        if (a.FeatureValue == b.FeatureValue) { return 0; }
        if (a.FeatureValue < b.FeatureValue) { return -1; }
        return 1;
    }
}

var span = new Span<Compound>();

span.Sort();

// Inlineable struct comparer
var comparer = new InlineableFeatureValueComparer();
span.Sort(comparer);

// Lambda comparer
span.Sort((a, b) => a.FeatureValue == b.FeatureValue ? 0 : 
    (a.FeatureValue < b.FeatureValue ? -1 : 1));

The argumentation for adding this is:

  • To increase the efficiency of code doing sorting and prevent people from reinventing the wheel.
  • Allow performance optimizations depending on memory type and contents.
  • Allow sorting on contiguous memory of any kind.

Use Cases

In almost any domain where a high volume of data is processed with sorting being one operation and memory management (e.g. memory recycling, buffer pooling, native memory) is a must to achieve high performance with minimal latency, these sorts would be useful. Example domains are database engines (think indexing), computer vision, artificial intelligence etc.

A concrete example could be in the training of Random Forests some methods employ feature sorting (with indeces) to find decision boundaries on. This involves a lot of data and data that can originate from unmanaged memory.

Open Questions

An important question regarding this proposal is whether the pattern with generic parameter TComparer (e.g. constrained to where TComparer : IComparer<T>) is a pattern that can be approved. This pattern allows for inlineable comparers at the cost of increased code size, if no value type comparers are used, there should be no difference. This pattern is also used in the proposal for BinarySearch in https://github.com/dotnet/corefx/issues/15818

The API relies on being able to depend upon System.Collections.Generic, could this be an issue?

@karelz @jkotas @jamesqo

Updates

UPDATE 1: Change API to be defined as extension methods.
UPDATE 2: Add compounded type usage.
UPDATE 3: Add link to BinarySearch and point on the pattern used.

Existing Sort APIs

A non-exhaustive list of existing sorting APIs is given below for comparison.

Array.Sort Static Methods

Found in ref/System.Runtime.cs

public static void Sort(System.Array array) { }
public static void Sort(System.Array keys, System.Array items) { }
public static void Sort(System.Array keys, System.Array items, System.Collections.IComparer comparer) { }
public static void Sort(System.Array keys, System.Array items, int index, int length) { }
public static void Sort(System.Array keys, System.Array items, int index, int length, System.Collections.IComparer comparer) { }
public static void Sort(System.Array array, System.Collections.IComparer comparer) { }
public static void Sort(System.Array array, int index, int length) { }
public static void Sort(System.Array array, int index, int length, System.Collections.IComparer comparer) { }
public static void Sort<T>(T[] array) { }
public static void Sort<T>(T[] array, System.Collections.Generic.IComparer<T> comparer) { }
public static void Sort<T>(T[] array, System.Comparison<T> comparison) { }
public static void Sort<T>(T[] array, int index, int length) { }
public static void Sort<T>(T[] array, int index, int length, System.Collections.Generic.IComparer<T> comparer) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items, System.Collections.Generic.IComparer<TKey> comparer) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items, int index, int length) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items, int index, int length, System.Collections.Generic.IComparer<TKey> comparer) { }

List<T>.Sort Member Methods

Found in ref/System.Collections.cs

public partial class List<T> : System.Collections.Generic.ICollection<T>, System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IList<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.Generic.IReadOnlyList<T>, System.Collections.ICollection, System.Collections.IEnumerable, System.Collections.IList
{
    public void Sort() { }
    public void Sort(System.Collections.Generic.IComparer<T> comparer) { }
    public void Sort(System.Comparison<T> comparison) { }
    public void Sort(int index, int count, System.Collections.Generic.IComparer<T> comparer) { }
}

LINQ OrderBy Extension Methods

Found in ref/System.Linq.cs

public static System.Linq.IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector) { throw null; }
public static System.Linq.IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector, System.Collections.Generic.IComparer<TKey> comparer) { throw null; }
public static System.Linq.IOrderedEnumerable<TSource> OrderByDescending<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector) { throw null; }
public static System.Linq.IOrderedEnumerable<TSource> OrderByDescending<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector, System.Collections.Generic.IComparer<TKey> comparer) { throw null; }
api-approved area-System.Memory

Most helpful comment

All 53 comments

Or perhaps as extension methods.

These should be extension methods. They are in same category as the other similar methods being added: https://github.com/dotnet/corefx/pull/15080
https://github.com/dotnet/corefx/pull/15123

cc @KrzysztofCwalina

These should be extension methods.

Ok, would this not mean any optimizations specific to native/managed memory would not be possible as directly as when they are members? Not sure it has such big a relevance since sorting can be done on refs, but not everything is possible with refs as with pointers. Anyway, if you think this should not be an issue I will update the proposal.

Another concern, is generic type inference which in C# is not always working so well with many generic parameters... not that I can think of a specific case for the proposed API though, but in the face of implicit conversions or similar it is often lacking.

@nietras IMHO, I don't know if this is really a needed API. What scenarios are there in which someone would want to sort a Span? It's mostly going to be used to hold byte/char buffers, where sorting won't make much sense. Also, things like equality comparers are high-level and don't really fit in with low-level constructs like Span.

What scenarios are there in which someone would want to sort a Span?

@jamesqo plenty! ๐Ÿ˜ƒ Ranging from database engines, computer vision to artificial intelligence. In the latter cases, the types will also range fully from byte to double. And in these fields memory recycling, buffer pooling is paramount to archieving reliable performance. And performance in general important so inlining the compare is important too.

things like equality comparers are high-level

Not sure I get that, but is the issue with the "default" comparers needed for Sort() without parameters? I could live with removing the overloads that do not take a TComparer if that is the concern?

```c#
public void Sort(this Span keys, Span items, TComparer comparer) where TComparer : IComparer;
public void Sort(this Span keys, Span items, System.Comparison comparison); // Convenience overload

Nitpick: the `T`s in the two lines above should be `TKey`:

```c#
public void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, Span<TValue> items, TComparer comparer) where TComparer : IComparer<TKey>;
public void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items, System.Comparison<TKey> comparison); // Convenience overload

@svick thanks I also forgot static

@nietras

Ranging from database engines, computer vision to artificial intelligence. In the latter cases, the types will also range fully from byte to double. And in these fields memory recycling, buffer pooling is paramount to archieving reliable performance. And performance in general important so inlining the compare is important too.

Do you have a concrete example where a sort function would be needed, though? I think the 95% use case for this API would be passing around byte buffers/char buffers. In those scenarios, sorting isn't a function that would be needed, and would only be more API bloat.

Not sure I get that, but is the issue with the "default" comparers needed for Sort() without parameters? I could live with removing the overloads that do not take a TComparer if that is the concern?

The issue, actually, is with the overloads with the comparers. Making them generic on the type of the comparer won't improve perf because everyone passes around comparers as IEqualityComparer<T>. So you will still have virtual calls.

concrete example where a sort function would be needed, though?

@jamesqo I tried giving a more concrete in the proposal regarding machine learning.

95% use case for this API would be passing around byte buffers/char buffers

If .NET Core and Span<T> only target audience is web developers, where shuffling bytes to/from text/chars, then perhaps you are right. But IMHO it is a very limited target audience and misses the fact that in a cloud/mobile first world that is going to be infused with artificial intelligence everywhere, handling massive amounts of data/measurements is a must. This includes all data types, byte to double. Pure sales talk, uff ๐Ÿ™„

because everyone passes around comparers as IEqualityComparer

I don't and never have. ๐Ÿ˜‰ Passing comparers around is not something we do and it does not have any impact on the design of this API. It still works as a "non-generic" API and if it is only used with reference type comparers there will be no code bloat or anything. So I do not see the downside here, the overloads are there for convenience and flexibility just as with the existing APIs incl. Array.Sort.

The implementation can forward to the same single implementation, for example. However, by having the possibility for using a value type comparer we allow developers that need the perf and understand that this will cause code to be generated specifically for that case to do what they need. No virtual calls. In the domain of numerical processing that is what we need.

We could do Sort(TComparer comparer) where TComparer : IComparer. This would be devirtualized.

@KrzysztofCwalina It will not unless you pass in a custom struct comparer. For example, Sort(Comparer<T>.Default) will have TComparer == Comparer<T> and you will be making virtual calls to Comparer<T>.Compare.

There is another thing-- I don't think the overloads accepting TComparer will even compile. See here. I ran into this issue a few months ago when I was trying to use generics in a similar fashion.

I don't think the overloads accepting TComparer will even compile

Yes, it does ๐Ÿ˜„ Check out Sort Extensions examples with the following code:

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

namespace System
{
    public struct Span<T> { }

    public static class SpanExtensions
    {
        public static void Sort<T>(this Span<T> span)
        { }

        public static void Sort<T, TComparer>(this Span<T> span, TComparer comparer)
           where TComparer : IComparer<T>
        { }

        public static void Sort<T>(this Span<T> span, System.Comparison<T> comparison)
        { }

        public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items)
        { }

        public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys,
           Span<TValue> items, TComparer comparer)
           where TComparer : IComparer<TKey>
        { }

        public static void Sort<TKey, TValue>(this Span<TKey> keys,
           Span<TValue> items, System.Comparison<TKey> comparison)
        { }
    }

    public static class Usage
    {
        struct ReverseComparer : IComparer<int>
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public int Compare(int a, int b)
            {
                if (a == b) { return 0; }
                if (a > b) { return -1; }
                return 1;
            }
        }

        public static void SingleSpan()
        {
            var span = new Span<int>();

            span.Sort();

            // Inlineable struct comparer
            var reverseComparer = new ReverseComparer();
            span.Sort(reverseComparer);

            // Lambda comparer
            span.Sort((a, b) => a == b ? 0 : (a > b ? -1 : 1));
        }

        public static void TwoSpans()
        {
            var keys = new Span<int>();
            var items = new Span<double>();
            keys.Sort(items);

            // Inlineable struct comparer
            var reverseComparer = new ReverseComparer();
            keys.Sort(items, reverseComparer);

            keys.Sort(items, (a, b) => a == b ? 0 : (a > b ? -1 : 1));
        }
    }
}

This compiles fine, ReverseComparer will be inlined. If people use Comparer<T>.Default the behaviour is the same as for Array.Sort and I see no issue with that, i.e. it will use it as Comparer<T> and there will be a virtual call. If necessary we can do some of the same checks that I think happen in Array.Sort to check if it is default comparer e.g. for int then use optimized int sort. Just as the non-generic Array.Sortdoes.

UPDATE: @jamesqo in your example you have:

public void M<T, TEnumerable>(TEnumerable e) where TEnumerable : IEnumerable<T> { }

which means you expected it to infer T from TEnumerable alone, which C# doesn't do... well at least not as far as I know. Another issue is generic type inference in the face of implicit conversion, this also often chokes the compiler. That will might show up when I get to BinarySearch... on ReadOnlySpan<T>.

@KrzysztofCwalina is this ready for review?

@joperezr since @KrzysztofCwalina hasn't replied, I think it is ready for review.

which one is the final proposal? The APIs at the top of this issue, or there are updates described inline?

Yes, the extension methods.

On Feb 1, 2017 3:29 PM, "Krzysztof Cwalina" notifications@github.com
wrote:

which one is the final proposal? The APIs at the top of this issue?

โ€”
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/dotnet/corefx/issues/15329#issuecomment-276670744,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AKTG73siLLfb5BrZS48wAc146jtJU9oCks5rYJbIgaJpZM4LoiKj
.

The scenario and proposed overloads all make sense. Minor issue: the one that takes two spans (keys and items) should call the items values as that's what array did. So it should be:

C# public static class SpanExtensions { public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> values); public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, Span<TValue> values, TComparer comparer) where TComparer : IComparer<TKey>; public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> values, System.Comparison<TKey> comparison); }

We should discuss the extension method pattern for span holistically -- I don't think that this a good idea for our customers. It introduces a somewhat weird type (SpanHelpers) we no additional benefit. What is the reason these should be extension methods? (filed dotnet/corefx#15914 for it)

Approved and up-for-grabs, anyone wants to take it? @nietras?

should call the items values

@terrajobst if you look at the ref sources I included at the bottom of the proposal, these are in fact called items, this is where I took it from. E.g.:

public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items) { }

Are there conflicting names, or are the ref sources not correct?

We should discuss the extension method pattern for span holistically -- I don't think that this a good idea for our customers.

The original proposal was for member methods, but I guess there are some implementation issues related to fast/slow span. @jkotas should answer that. I would have preferred member methods too.

@nietras

You're correct. I looked at apisof.net but apparently I suck at reading because it's called items there as well. Never mind then; items it is :-)

Approved and up-for-grabs, anyone wants to take it? @nietras?

@karelz I would love to to do this. I even had this idea to base this on "Multi-Pivot Quicksort: Theory and Experiments" (or newer e.g. "comparison-optimal dual-pivot quicksort algorithms" see http://itu.dk/people/maau/additional/2016_vortrag_arco_QS.pdf) using a multi-pivot quick-sort algorithm (point being it is more important to have fewer cache misses than comparisons).

But given the scope of this, testing alone, perf etc., I do not think I will have time to do this within a reasonable time frame. No doubt, it would also be best to simply port the existing sort code and tests to SpanExtensions. If I get time and no one has started this, I will let you know.

Approved and up-for-grabs, anyone wants to take it? @nietras?

@karelz I would love to to do this. I even had this idea to base this on "Multi-Pivot Quicksort: Theory and Experiments" (or newer e.g. "comparison-optimal dual-pivot quicksort algorithms" see http://itu.dk/people/maau/additional/2016_vortrag_arco_QS.pdf) using a multi-pivot quick-sort algorithm (point being it is more important to have fewer cache misses than comparisons).

But given the scope of this, testing alone, perf etc., I do not think I will have time to do this within a reasonable time frame. No doubt, it would also be best to simply port the existing sort code and tests to SpanExtensions. If I get time and no one has started this, I will let you know.

Sounds good, let's see if anyone else wants to help with the work in the meantime ...

@karelz you can assign this to me if you like. I will start working on this tentatively, it will probably take some time, due to writing tests mostly I think. First goal is simply porting the existing Array.Sort code with some minor modifications.

And then we could just use Span<T>.Sort as implementation for Array and List<T>?

@nietras How are things going with this? We recently forked for 2.1-preview1, but there may still be time to get this in for the next preview. If you find yourself pressed for time, no worries, let us know and we can reassign. Thanks! :)

How are things going with this?

@GrabYourPitchforks things are progressing fine. I have ported all the code, basic tests pass, but I'm still trying to iron out some perf issues after I consolidated code to have both with and without items in one code base. Before that perf was on par or even better than coreclr even for the C++ optimized types.

The code can actually be found at https://github.com/DotNetCross/Sorting/blob/span-sort/src/DotNetCross.Sorting/SpanSortHelpers.cs I am working on it in that repo. Any feedback is welcomed.

I'll try to make a preminary PR for this soon, I need some feedback anyway. Especially regarding testing etc.

It would be great if this could be part of next preview.

And then we could just use Span.Sort as implementation for Array and List?

@LYP951018 that question is a bit tricky due to how the coreclr and corefx code is structured. Array.Sort is part of coreclr for example. Someone else might be better at answering whether in the future it might be possible to use this version for all sorts.

consolidated code to have both with and without items in one code base

I do not think you need to do that. It is fine to have two copies - one without and one with items.

I do not think you need to do that. It is fine to have two copies - one without and one with items.

@jkotas I thought it would be less maintenance and with some typeof == typeof tricks without perf issues. But yes I may have to revert those changes then. It just means a lot of duplicated code. It's almost 1000 lines, and it also impacts type specializations I have done.

Are there any workarounds to apply sorting on a Span without having to copy it to an array?

This is so ugly:
C# var ohNoAnArray = awesomeSpan.ToArray(); Array.Sort(ohNoAnArray, (p1, p2) => p1.CompareTo(p2))); ohNoAnArray.CopyTo(awesomeSpan);

@thorgeirk11 this proposal of course addresses this, unfortunately my PRs (https://github.com/dotnet/corefx/pull/26859 and https://github.com/dotnet/coreclr/pull/16986) for adding these got rejected/closed. Does not seem to have been added by anyone else after.

However, the implementation works and all tests pass of this, the main issue was around security I think. In https://github.com/dotnetcross/sorting the same implementation can be found. I have not had time to finalize this and make it available in a nuget package though, as a complement to the -- hopefully at some point -- implementation of this proposal in .NET Core proper.

Regarding use cases:
Maybe I am misunderstanding this, but I was trying to stackalloc a Span of simple structs, then sort it, but this seems impossible to do without this issue being solved. Is that correct?

Example (which doesn't compile):

struct BoneAttachment {
  int Bone;
  float Weight;

  private sealed class WeightRelationalComparer : IComparer<BoneAttachment>
  {
    public int Compare(BoneAttachment x, BoneAttachment y)
    {
      return y.Weight.CompareTo(x.Weight);
    }
  }

  public static IComparer<BoneAttachment> WeightComparer { get; } = new WeightRelationalComparer();
}

Span<BoneAttachment> attachments = stackalloc BoneAttachment[16];
// ... fill with data
attachments.Sort(BoneAttachment.WeightComparer); // <--- This doesn't exist

@shartte yes only when this proposal has been implemented will you be able to do that.

I did a PR for this in https://github.com/dotnet/corefx/pull/26859 but this was rejected.

In https://github.com/DotNetCross/Sorting I have this in separate source form but haven't had time to clean this up and package as a nuget package for general use.

If people are still waiting for things like this; I wrote a nuget package that's similar in spirit: Anoprsst. However, I didn't try to keep the API similar to Array.Sort since I was aiming to avoid perf-pitfals (i.e. no virtual calls, ever). In particular that means you can do someSpan.SortIComparableUsing().Sort() but that's constrained to sorting value-types only, otherwise you need to use someSpan.SortUsing(new MyOrdering()).Sort() with a struct ordering. The serial implementation appears to outperform Array.Sort in all cases, sometimes by a large margin; and for large arrays it uses parallel quick sort for further speedup.

I have a question regarding the MemoryExtensions.Sort<T, TComparer>(...) APIs, I've looked at both the source and this issue (and the related ones) but I'm confused about the current implementation. I might just be missing something here so in that case please pardon my ignorance ๐Ÿ˜…

I've seen both in this issue and the other one on the same subject, that the idea was to add overloads with a generic comparer so that when users passed that with a value type implementing IComparer<T>, the reified generics would come into play allow the JIT to just inline the whole comparer, which is great. I'm not actually seeing this in the current implementation though, is this just a placeholder and the actual API is still work in progress on this front? I mean:

https://github.com/dotnet/runtime/blob/6b5850fa7cf476b66be9d01a3ed05f4484b01d3a/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.cs#L1810-L1817

Here the whole comparer is just boxed and passed as an IComparer<T>, so isn't the whole point of having a generic comparer lost along the way? And if this is indeed just a placeholder that right now falls back to the normal "slow" implementation, then why was this added before the actual API was available - is it just to allow users to write code around this API, and just enabling this in the future with a runtime update giving a performance boost to those that have been using it already?

isn't the whole point of having a generic comparer lost along the way?

@Sergio0694 yes. That feature is not part of the current implementation, and was as you mention one of the main things about my proposal to allow inlineable value type comparers. I don't know the exact reasons not to include this but the downside is code size.

Shameless plug, I have this feature implemented in DotNetCross.Sorting. Just haven't finished it yet for nuget publication. One point with the nuget library was also to have the new sorting API available for "older" runtimes.

From my perspective, we should either remove the generic parameter or avoid the boxing. My preference is the former; I don't think having it be generic over the comparer type adds enough benefit to counteract the cost and complexity (even the array-sorting overload that takes an interface allocates an object).

@jkotas, do you have an opinion?

@stephentoub I agree with your assessment.

Oh I'm sorry to hear that, I thought the direction with a heavier usage of generics to squeeze out more performance was actually pretty great. Just as an idea, if the concern is the increased code size with the two implementations (the generic one, and the one using Comparison<T>), wouldn't it be possible to introduce a new (internal) type like this one, to wrap IComparer<T> values:

public readonly struct ValueComparer<T> : IComparer<T>
{
    private readonly Comparison<T> comparer;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public ValueComparer(IComparer<T> comparer)
    {
        this.comparer = comparer.Compare;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(T x, T y)
    {
        return comparer(x, y);
    }
}

And then make the ArraySortHelper<T> type always be generic in both the item type and the comparison type? This could use either the user-specific comparer if one is provided, or in the other paths for all the current APIs it'd just wrap the supplied Comparison<T> delegate into an instance of this ValueComparer<T> struct and just pass that around? This would allow to have a single implementation that would work in both cases. Shouldn't the performance improvement in the case that a generic comparer is supplied by the user be worth the time to make this change?

As another example, in the case where T : IComparable<T>, users could just pass this:

public readonly struct ImplicitValueComparer<T> : IComparer<T>
    where T : IComparable<T>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(T x, T y)
    {
        return x.CompareTo(y);
    }
}

And it would basically be free performance, as the sort algorithm could then always inline this and skip one indirect call that it's not doing for each single comparison, to that Comparison<T> instance.

As an example, we're now using this same approach in ImageSharp as well (here) (I actually got the original idea for this from bepuphysics2), and I've added similar APIs to the Microsoft.Toolkit.HighPerformance package as well (here). In my tests, there was a very decent performance improvement when the supplied "value delegate" was small enough to be inlined, which I guess would be desireable especially for users trying to sort a Span<T> in the first place?

Just a thought, I understand there might be other factors here I'm not properly considering ๐Ÿ˜Š

@Sergio0694 I tried to consolidate the sorting code based on the patterns you describe (going down the value type comparers rabbit hole futilely), but sadly the performance of this simply isn't there due to limitations in the runtime currently (inlining, CG, etc.). The canonical representation of generic types for reference type generic parameters is an issue too (the main issue, while the pattern works wonders for value types it doesn't for reference types, I have spent an exorbitant amount of time looking into this ๐Ÿคฆโ€โ™‚๏ธ ). Haven't completely given up consolidating some here with the advent of function pointers in C# 9, but the reality is I had to create 4 versions of sorting code to get decent perf. Or in some cases 1.66x better perf than what is in .NET 5 now. At least last I benchmarked.

I don't think having it be generic over the comparer type adds enough benefit to counteract the cost and complexity

@stephentoub @jkotas I, of course ๐Ÿ˜ƒ , disagree with this. I specifically designed the API around being able to define value type comparers for inlineable custom comparisons, which can have significant performance improvements for custom scenarios. Not all of users are sorting primitives only. If you remove it from the API, this could tie the API down for the future.

On the other hand, if I finish DotNetCross.Sorting this could be available outside of BCL and hence fulfil this requirement. But it is a heavy burden to bear ๐Ÿ˜… Fast and flexible sorting is in my view a fundamental part of a great platform.

If you remove it from the API, this could tie the API down for the future.

If we have:
```C#
public static void Sort(Span items, IComparer? comparer)

now, what prevents us from adding:
```C#
public static void Sort<T, TComparer>(Span<T> items, TComparer comparer) where TComparer : IComparer<T>?

as an overload later?

Right now the overload is a pit of failure: someone tries to use a struct TComparer thinking it'll avoid allocations and enable devirtualization and inlining, and not only does it not enable devirtualization and inlining, it actually results in two allocations.

what prevents us from adding

Nothing, except of course poor overload resolution ๐Ÿ˜… Especially, if implicit conversions are considered. But that may not be a big concern. I hope you will keep it though and that we could find a solution to the implementation issue. Either using source generator or something. Would it be a problem to have multiple versions of sorting code if they were generated? Only one version to maintain? Is the main objection to the API the runtime code generation cost e.g. instruction as bytes?

someone tries to use a struct TComparer thinking it'll avoid allocations and enable devirtualization and inlining, and not only does it not enable devirtualization and inlining, it actually results in two allocations

@stephentoub it's pretty bad now yes ๐Ÿ˜…. I am not a fan of the forced allocations going on for sure. Including for a reference type comparer since this gets converted to a delegate. This can be solved simply by adding appropriate code for it, though. It's not hard to add, it just another "variant".

@stephentoub I would also encourage you to look at float sorting performance in .NET 5 compared to the native in .NET Core 3.1. As far as I can tell it is not on par, due to problems with inlining CompareTo. That's why my version gets a decent perf boost compared to .NET 5. Example given below, not exactly 1.66x in this but definitely significantly faster. Note this was preview.2 though. ๐Ÿ˜ƒ

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.720 (1909/November2018Update/19H2)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.100-preview.2.20176.6
  [Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.16006, CoreFX 5.0.20.16006), X64 RyuJIT

Platform=X64  Runtime=.NET Core 5.0  Toolchain=InProcessNoEmitToolchain  
IterationCount=9  LaunchCount=1  RunStrategy=Monitoring  
UnrollFactor=1  WarmupCount=3  

| Method | Filler | Length | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------------------------- |------- |-------- |----------:|----------:|----------:|------:|--------:|-----------:|------:|------:|------------:|
| CLR_ | Random | 1000000 | 288.72 ms | 0.558 ms | 0.332 ms | 1.00 | 0.00 | - | - | - | - |
| DNX_ | Random | 1000000 | 185.84 ms | 4.286 ms | 2.550 ms | 0.64 | 0.01 | - | - | - | - |

Of course this perf boost is nothing compared to @damageboy VxSort. Nothing compares to that ๐Ÿš€

Nothing, except of course poor overload resolution

In what way?

I hope you will keep it though and that we could a solution to the implementation issue.

We're close to being done with .NET 5. What solution are you proposing to implement? The current API is the worst of all relevant worlds: it has a more complicated and intimidating signature, it doesn't provide the intended benefits, and it leads one to use a coding style (using a struct and not caching an instance) that leads to worse performance than if a simple lambda were just used. You can also get the same advanced benefits by sorting structs with custom IComparable<T> implementations (though I understand that represents only a portion of the scenarios trying to be optimized.)

It's not hard to add, it just another "variant"

Yes, a lot of duplicated code, when we're already facing more duplication than we'd like.

If you really believe this is critical, show us what the complete solution looks like.

I would also encourage you to look at float sorting performance in .NET 5 compared to the native in .NET Core 3.1

That is unrelated to this API; you didn't show your code, but the perf should be comparable between 3.1 and 5 without a supplied comparer and just using the type's IComparable<T> implementation. So if it's not, please open a separate issue.

show us what the complete solution looks like.

The code is here https://github.com/DotNetCross/Sorting/tree/master/src/DotNetCross.Sorting/Implementations and is a mayhem of 2 x 4 variant == 8 copies. I have pointed to this before. Note this code was explicitly split up to show this in detail and to show the parts of the sorting. Probably not to every bodies liking. No reason this couldn't be generated, although I see perf issues for special cases. Note also these implementations are based on a "pure" refs and Unsafe code since this has to work well on old runtimes.

I still think there is a "solution" to the canonical representation issue, but the JIT has too many issues around inlining (doesn't inline methods with delegates for example ๐Ÿคทโ€โ™‚๏ธ) which I have commented on in other places. So I have given up on that honestly.

I'm fine with keeping this in a separate repo/nuget package. If no value can be seen in having this in the BCL.

"Right now the overload is a pit of failure: someone tries to use a struct TComparer thinking it'll avoid allocations and enable devirtualization and inlining, and not only does it not enable devirtualization and inlining, it actually results in two allocations."

Yes, I definitely can't disagree with that. My first thought when seeing that API was like "Ah, great! It's nice they went full-on generic for best performance", then just decided to check the source out of curiosity and discovered that that was actually a lie ๐Ÿคฃ

"You can also get the same advanced benefits by sorting structs with custom IComparable implementations (though I understand that represents only a portion of the scenarios trying to be optimized.)"

Ah, I just noticed there's a separate generic array sort helper for that case, missed that earlier.
I wonder if in cases where you just do a check for CompareTo(y) >= 0, the JIT is able to just inline the completely and just literally emit a single comparison for primitive types. I mean, since CompareTo would normally have a couple conditional branches otherwise ๐Ÿค”

I'd test this out but sharplab.io is down right now ๐Ÿ˜†

CompareTo(y) >= 0, the JIT is able to just inline the completely and just literally emit a single comparison for primitive types

For int it does. For float it doesn't. Hence, why I have a TDirectComparer version. Depends highly on the code around and the optimizations being done. You are basically praying to the JIT gods that this code quality will be good. Again these observations are points in time. The runtime changes/improves here. But if you are on an older runtime and can't switch you can't rely on these improvements.

IComparer<T> is a bit of a problem at least for sorting. For sorting it would be better to just have an interface like:

public interface ILessThan<T>
{
    bool LessThan(T x, T y);
}

same for IComparable<T>. But that's history. :)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Timovzl picture Timovzl  ยท  3Comments

jzabroski picture jzabroski  ยท  3Comments

sahithreddyk picture sahithreddyk  ยท  3Comments

chunseoklee picture chunseoklee  ยท  3Comments

jchannon picture jchannon  ยท  3Comments