Fsharp: The performance of Comparing and Ordering things

Created on 31 May 2020 · 13Comments · Source: dotnet/fsharp

This story began _half a decade_ ago with an adventurous implementation, and received a more reasonable, albeit more limited, implementation a mere _two years ago_ yet as the tides of time go neither of these is (trivially) merge-able into current master. So rather than continuing to hit myself in the head with a hammer making an updated version, I'd rather _put this out to the people who actually pull the strings of Pull Requests in order to determine if this effort will yield any fruit_ (or just a completely mashed head).

So, why's this a problem anyway? Here's a little use of structural comparison within a dictionary setting with NodaTime's Instant as the key...

module Program

(*
<PackageReference Include="BenchmarkDotNet" Version="0.12.1" />
<PackageReference Include="NodaTime" Version="3.0.0" />
*)

open System
open System.Collections.Generic
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open NodaTime

[<MemoryDiagnoser>]
type ComparisonComparison () =
    let mutable instants : array<Instant> = Unchecked.defaultof<_> // some bunch of instants

    let checkContains (dict:IDictionary<Instant,_>) =
        for i = 0 to instants.Length-1 do
            if i % 2 = 0 <> (dict.ContainsKey instants.[i]) then
                failwith "unexpected"

    let mutable dictOperators                 = Unchecked.defaultof<IDictionary<Instant, obj>>
    let mutable dictEqualityComparerDefault   = Unchecked.defaultof<IDictionary<Instant, obj>>
    let mutable dictHashIdentityStructural    = Unchecked.defaultof<IDictionary<Instant, obj>>
    let mutable dictHashIdentityNonStructural = Unchecked.defaultof<IDictionary<Instant, obj>>

    [<Params (1, 100, 10000)>] 
    member val public DictSize = 0 with get, set

    [<GlobalSetup>]
    member this.SetupData() =
        instants <-
            let rng = Random 42
            let range lower upper = rng.Next (lower, upper+1)
            Array.init (this.DictSize*2) (fun _ ->
                Instant.FromUtc (range 1900 2100, range 1 12, range 1 28, range 0 23, range 0 59, range 0 59))

        let tmpDictHashIdentityStructural    = Dictionary HashIdentity.Structural
        let tmpDictHashIdentityNonStructural = Dictionary HashIdentity.NonStructural
        let tmpDictEqualityComparerDefault   = Dictionary EqualityComparer.Default
        dictOperators <-
            instants
            |> Seq.mapi (fun idx instant -> (idx, instant))
            |> Seq.filter (fun (idx,_) -> idx % 2 = 0)
            |> Seq.map (fun (_,instant) ->
                tmpDictEqualityComparerDefault.[instant] <- null
                tmpDictHashIdentityStructural.[instant] <- null
                tmpDictHashIdentityNonStructural.[instant] <- null
                instant, null)
            |> dict
        dictEqualityComparerDefault   <- tmpDictEqualityComparerDefault
        dictHashIdentityStructural    <- tmpDictHashIdentityStructural   
        dictHashIdentityNonStructural <- tmpDictHashIdentityNonStructural

    [<Benchmark (Baseline=true)>]
    member _.Operators () = checkContains dictOperators

    [<Benchmark>]
    member _.EqualityComparerDefault () = checkContains dictEqualityComparerDefault

    [<Benchmark>]
    member _.HashIdentityStructural () = checkContains dictHashIdentityStructural

    [<Benchmark>]
    member _.HashIdentityNonStructural () = checkContains dictHashIdentityNonStructural

[<EntryPoint>]
let Main args =
    let switch = BenchmarkSwitcher [| typeof<ComparisonComparison>  |]

    switch.Run args |> ignore

    0

| Method | DictSize | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------- |--------- |-----------------:|--------------:|--------------:|------:|---------:|------:|------:|----------:|
| Operators | 1 | 1,412.23 ns | 5.008 ns | 4.439 ns | 1.00 | 0.0420 | - | - | 176 B |
| EqualityComparerDefault | 1 | 37.43 ns | 0.271 ns | 0.240 ns | 0.03 | - | - | - | - |
| HashIdentityStructural | 1 | 1,362.13 ns | 8.848 ns | 8.276 ns | 0.96 | 0.0305 | - | - | 128 B |
| HashIdentityNonStructural | 1 | 45.30 ns | 0.159 ns | 0.141 ns | 0.03 | - | - | - | - |
| | | | | | | | | | |
| Operators | 100 | 139,269.09 ns | 477.875 ns | 423.623 ns | 1.00 | 4.1504 | - | - | 17600 B |
| EqualityComparerDefault | 100 | 4,995.58 ns | 2.543 ns | 1.985 ns | 0.04 | - | - | - | - |
| HashIdentityStructural | 100 | 134,801.94 ns | 411.938 ns | 343.987 ns | 0.97 | 2.9297 | - | - | 12800 B |
| HashIdentityNonStructural | 100 | 5,993.03 ns | 4.839 ns | 4.290 ns | 0.04 | - | - | - | - |
| | | | | | | | | | |
| Operators | 10000 | 14,260,777.88 ns | 33,679.950 ns | 28,124.284 ns | 1.00 | 406.2500 | - | - | 1760021 B |
| EqualityComparerDefault | 10000 | 596,535.22 ns | 429.688 ns | 401.930 ns | 0.04 | - | - | - | 1 B |
| HashIdentityStructural | 10000 | 13,830,610.78 ns | 67,093.408 ns | 62,759.215 ns | 0.97 | 296.8750 | - | - | 1280232 B |
| HashIdentityNonStructural | 10000 | 708,332.98 ns | 643.601 ns | 502.482 ns | 0.05 | - | - | - | 1 B |

So I'm not sure if you notice there the slight difference in performance...

_* ?? Huh what ?? *_

That not a slight difference, that's a difference of ~20-25x slower using Structural comparison.

So why is this a problem and what can we do about it (well besides going back in a time machine and slurp in those PRs when they were ready...)

Well first of all we need to know _why_ the decision was made to do this, why it doesn't cause pain with built in types, and then decide what we can do about it.

In original .net land, by default, reference types were equal only if ReferenceEquals is true and value types were only equal if, in a bitwise comparison, they were true (which meant if they had embedded reference types they were only considered equal if those ReferenceEquals were equal), but this didn't provide the type of experience that fsharp wanted to provide. Containers (array, ResizeArray, Dictionary, etc) were just seen as ordinary reference types.

In f-sharp land, by default, records and discriminated unions (of both the reference and value type flavours) implement a series of methods/interfaces so that their compare ever element, and if the elements are all equal, then the object is declared equal. Containers (array, list, map, etc.) also implemented these rules so they too could do comparisons.

To facilitate the f-sharp rules, new interfaces IStructuralComparable and IStructuralEquatable were introduced. The first thing here to recognise is that these interface were not generic, which meant that in order to use the comparisons for value types boxing must occur. But they _had_ to be non-generic, because the one comparer was passed through chains of objects of different types to do the comparison. Rock and hard place land.

There was one more rule that f-sharp comparisons had that wasn't part of "normal" .net equality, and this was for floating point numbers NaN <> NaN.

So yeah that is basically the state of play. So what did I do about it? In my first attempt I created who comparison objects using lots of runtime magic to try to create efficient comparison objects that implemented all of the fsharp rules and was then dynamically created using Type.MakeGenericType. At the time this came across two obstacles. The first was that whilst the old 64-bit JIT did lots of runtime optimizations to make this very efficient, the new Ryu JIT didn't (fair enough, I really was bending, albeit not breaking, the rules). The second was that, at the time, AOT compilation was gaining traction and MakeGenericType was an obstacle to that path.

My second attempt recognised that the same magic that I was performing is basically performed by (Equality)?Comparer.Default (and AOT catered for that) that in most cases the fsharp compilers implementation of IStructural(Comparable|Equatable) was the same as it's implementation of I(IComparable|Equatable)<>. And so with a bit of RTTI I could filter out problematic cases (i.e. those that used floating point numbers due to the NaN case) and then just return the default comparer in most cases. _In fact this is basically what the standard f-sharp library is doing, as it just contains a list of known types (i.e intrinsic types) where it returns an efficient, non-boxing comparer. But they have no mechanism to make is available for non-built in types._

So the crux of my implementation relies on the following recursive type check:

let canUseDefaultEqualityComparer er (rootType:Type) =
    let processed = System.Collections.Generic.HashSet ()

    let rec checkType idx (types:Type[]) =
        if idx = types.Length then true
        else
            let ty = get types idx
            if not (processed.Add ty) then
                checkType (idx+1) types
            else
                let isValidGenericType ifNotType fullname =
                    if not (ty.IsGenericType && ty.GetGenericTypeDefinition().FullName.Equals fullname)
                    then ifNotType
                    else checkType 0 (ty.GetGenericArguments ())
                let isTypeAndGenericArgumentsOK fullname            = isValidGenericType false fullname
                let isNotTypeOrIsTypeAndGenericArgumentsOK fullname = isValidGenericType true  fullname

                // avoid any types that need special handling in GenericEqualityObj
                // GenericEqualityObj handles string as a special cases, but internally routes to same equality

                ty.IsSealed  // covers enum and value types
                             // ref types need to be sealed as derived class might implement  IStructuralEquatable
                && not (isArray ty)
                && (er || (not (ty.Equals typeof<float>)))
                && (er || (not (ty.Equals typeof<float32>)))
                && isNotTypeOrIsTypeAndGenericArgumentsOK "System.Nullable`1"
                && not (isStructuralEquatable ty
                        // we accept ValueTuple even though it supports IStructuralEquatable 
                        // if all generic arguements pass check
                        && not (   isTypeAndGenericArgumentsOK "System.ValueTuple`1"
                                || isTypeAndGenericArgumentsOK "System.ValueTuple`2"
                                || isTypeAndGenericArgumentsOK "System.ValueTuple`3"
                                || isTypeAndGenericArgumentsOK "System.ValueTuple`4"
                                || isTypeAndGenericArgumentsOK "System.ValueTuple`5"
                                || isTypeAndGenericArgumentsOK "System.ValueTuple`6"
                                || isTypeAndGenericArgumentsOK "System.ValueTuple`7"
                                || isTypeAndGenericArgumentsOK "System.ValueTuple`8"
                                || isTypeAndGenericArgumentsOK "Microsoft.FSharp.Collections.FSharpList`1"
                                || isTypeAndGenericArgumentsOK "Microsoft.FSharp.Core.FSharpOption`1"
                                || isTypeAndGenericArgumentsOK "Microsoft.FSharp.Core.FSharpValueOption`1"
                                || isTypeAndGenericArgumentsOK "Microsoft.FSharp.Core.FSharpResult`2"
                               )
                       )
                && checkType (idx+1) types

    checkType 0 [|rootType|]

So yeah, that's about it really. The rest of the PR was just tying this in so that it was efficiently used (i.e. only called once for any type) and attaching the various operators etc.

Anyway, this has come way, way, way too late to save my work code-base (where I have just ended up never using Structural Equality basically, not using default implementations of data structures, etc. sigh... maybe I'm the only one who cares about efficiency... why bother eh when you can just spin up another x number of cloud machines, he says, shaking his head...) but _if _ people are going to take this as a PR seriously then I'll update it and submit a new request...

But I ain't getting any younger!

(So yeah, if the answer is no, just close this Issue...)

Feature Request

Source

manofstick

❤3 👍2

All 13 comments

@KevinRansom @abelbraaksma well there you go, I have thrown it out to the Emperor for the thumbs up or thumbs down!

manofstick on 31 May 2020

@manofstick, so your algorithm, basically consists of figuring out which types can use the default comparer? What are the circumstances where the behavior of existing code may change, can it be detected?

I would say that we can't leave a 20-25 * speed up on the table, we just added a new library level special case for DateTime, and @TIHan is eager to improve the performance of structural equality. So if you will consider re-submitting the work, I will commit to fighting for it to be included, I can't guarantee that we will win the argument, but we can durned well try.

How does that sound?

Kevin

KevinRansom on 31 May 2020

@KevinRansom

OK; I'll see if I can make some time over the next week or so... But I would like confirmation that calling a bit of refection code (i.e. the above snippet contains IsGenericType and GetGenericTypeDefinition) is OK. It is only called once per execution (per Type), so for me that's fine, but that's because I'm usually writing long running processes... In the schema of things, even for short running ones, I really don't think it's an issue, but I would hate it to be a stumbling block for final acceptance, given that we know it's there at the start...

manofstick on 31 May 2020

This looks surprisingly much like a piece of code I have in my XPath implementation, where I've tried to get native type performance, and needed to cater for float semantics with NaN and INF. However, not completely comparable with my choice, nor reusable, as it allows mixing types in operators (float can be compared with int etc), which required some extra magic to get right.

@manofstick If I understand this correctly, the biggest slowdown is with struct types that are not native types, or that are not recognized specially (like now DateTime). And mixed struct (part is known, part isn't known).

I don't yet totally know what you do to get a comparer,and how that comparer looks, but I was thinking, if the type isn't special, would it be sufficient to do bitwise comparison? Isn't that what the default is in .NET land?

And another way forward: can we detect, at compile time, if there's a genetic compare/equate interface present, and use that?

Would it be helpful to set up a little brainstorm session on Slack or a call? It'll help to get our heads around this and in the same direction :)

abelbraaksma on 31 May 2020

&& checkType (idx+1) types

Little side note: your rec function is not tail-recursive. That may not matter much if struct types are expected to be small and contain few members. But if we choose this approach, and since this is a runtime check, we might as well optimize it to allow TCO.

abelbraaksma on 31 May 2020

@manofstick , if the measurements support the approach, then it wins. Otherwise not ... of course there may be issues of reliability, reproducibility that impact the outcome. But at the end of the day, numbers are the most convincing rhetoric I know.

Thanks

Kevin

KevinRansom on 3 Jun 2020

Very illuminating post by @manofstick . F# equality has a lot of technical debt:

Non-generic StructuralComparable and IStructuralEquatable. We need to get to an architecture without these.
Going out of it's way to have NaN <> NaN. There is no need to force NaN <> NaN more than .Net does. Just use whatever .Net gives and that will be more than enough deference to IEEE. x <> x is fundamentally illogical, however many engineers say otherwise.
Very complex equality rules.

I believe we need to take a breaking change to remove all the special cases. If it's possible:

Use a generic compare/equate interface when available as @abelbraaksma suggets.
Otherwise use the non-generic one.

charlesroddie on 4 Jun 2020

The route we would likely take for quality is this one by @TIHan - https://github.com/dotnet/fsharp/pull/6175

Likely a similar approach for comparison. Although the reflection-based approach can speed things up, we should avoid reflection if we can.

cartermp on 4 Jun 2020

👍1

Just use whatever .Net gives and that will be more than enough deference to IEEE. x <> x is fundamentally illogical, however many engineers say otherwise.

The most basic comparison instruction dealing with equality is the ceq opcode. Quote from the docs:

For floating-point number, ceq will return 0 if the numbers are unordered (either or both are NaN). The infinite values are equal to themselves.

I know from our discussion on Slack that you disagree with this point. And I've also come to understand that in C# == translates to ceq, while Double.Equals specifically returns true if both are NaN. Just saying that "whatever .NET does" doesn't have a simple answer, as it does both in surprisingly subtle ways (apart from the sentiment to adhere to standards like IEEE-754).

(unfortunately I don't have a "one size fits all" solution either)

abelbraaksma on 5 Jun 2020

@charlesroddie

In regards to IStructuralComparable and IStructuralEquatable you can't have a generic version of these (by usage, not just current design). The key is in the name Structural... i.e. the comparer is intended to be passed through a range of objects not just one top level one. As in if you're comparing a tuple, then you need to know all the underlying types - not just the top level type, which is all a generic version would help you with! So really you need some magical factory - which is effectively what System.Collections.Generic.Comparer.Default (and hence it's IEquatable<> implementation uses) is. I.e. it generates a new Type on the fly - the comparer - where there is some underlying code generation going on. This was really the path I took with my first attempt, where I originally used the Emit library (via Linq.Expressions) but managed to whittle that away to purely used MakeGenericType with lots of generic magic. This was frowned up.

But yeah, if it was to exist it would have to have something like:

type IComparerFactory =
    abstract Get<'a> : unit -> IStructuralComparable<'a>

and IStructuralComparable<'a> =
    abstract CompareTo : item:'a -> IComparerFactory -> int

Given that, if anything, there has been a push to reduce generics reliance in the BCL, I highly doubt that such a interface, requiring such magic, will ever exist.

In regards to NaN. Meh, it is what it is.

...and I highly, highly, doubt any breaking change would ever happen...

manofstick on 8 Jun 2020

@abelbraaksma

I don't yet totally know what you do to get a comparer,and how that comparer looks, but I was thinking, if the type isn't special, would it be sufficient to do bitwise comparison? Isn't that what the default is in .NET land?

All I do is use (Equality)?Comparer<'a>.Default where I can - i.e. following the rules based on type analysis. (i.e. I can use it on a struct tuple as the struct tuples implementation uses the default comparer for it's comparison, so if all it's elements satisfy use, then it's good)

Now really my code should only be used when the type isn't known at runtime (i.e. in libraries, such as the use of dict) The change that @TIHan did in Devirtualize Equality should be used (with the same set of rules that I use) at compile time. I haven't tried to merge in his change with this one. There may be some issues, as I do a little bit of work with the optimizer (to narrow call paths basically - as although they should just get inlined by the JIT it is better because a) it doesn't need the extra work and b) F# shoves tail instructions in there which cause the JIT to balk at this so you can get terrible performance for > 64 bit structures...)

manofstick on 8 Jun 2020

@cartermp

The route we would likely take for quality is this one by @TIHan - #6175

Hahah, you're not inspiring me with confidence given that that's still open after ~ a year and a half!

manofstick on 8 Jun 2020

👀1

@KevinRansom over to you. Feel free to close this issue; the code is all there in #9404 - let's see if can make it past the gatekeepers this time!!

manofstick on 9 Jun 2020

🚀1

Was this page helpful?

0 / 5 - 0 ratings