Wpf: Struct System.Windows.Size has non-optimal GetHashCode implementation

Created on 11 Sep 2020  路  3Comments  路  Source: dotnet/wpf

  • .NET Core Version: all until at least 3.1
  • Windows version: n/a
  • Does the bug reproduce also in WPF for .NET Framework 4.8?: Yes
  • Is this bug related specifically to tooling in Visual Studio (e.g. XAML Designer, Code editing, etc...)? no

    Problem description:
    struct System.Windows.Size uses non-effective GetHashCode function - it uses xor without shifting, recommended at the official guidelines.

    Actual behavior:
    Hash code of all square sizes are equal 0. E.g. Size(X, X).GetHashCode() == 0 for each X.
    Code: https://github.com/dotnet/wpf/blob/ae1790531c3b993b56eba8b1f0dd395a3ed7de75/src/Microsoft.DotNet.Wpf/src/WindowsBase/System/Windows/Generated/Size.cs#L148

    Expected behavior:
    Hash Code should be computed with shifting/multiplication, for example:
    ```C#
    public override int GetHashCode()
    {
    if (IsEmpty)
    {
    return 0;
    }
    else
    {
    return (Width.GetHashCode() * 397) ^ Height.GetHashCode();
    }
    }

 **Minimal repro:**
The following code has ```O(N^2)``` complexity, because of collisions:

```C#
var set = new HashSet<Size>();
foreach(var x in Enumerable.Range(0, 100_000) {
    set.Add(new Size(x, x)); // O(N) average complexity
}

Most helpful comment

I'd say a better alternative would be using HashCode.Combine instead of direct XORing.

All 3 comments

I'd say a better alternative would be using HashCode.Combine instead of direct XORing.

Rect and Point have similar issues: all fields are xor'ed together for hash value.
Actually, every type in this directory have this issue:
https://github.com/dotnet/wpf/tree/ae1790531c3b993b56eba8b1f0dd395a3ed7de75/src/Microsoft.DotNet.Wpf/src/WindowsBase/System/Windows/Generated

With this scale of issue. other types should be checked too.

  • The same issue is actual and for System.Drawing.Size, System.Drawing.Point types too.
  • Please also note that Size(x, y).GetHashCode() == Size(y, x).GetHashCode() for arbitrary x and y in the Microsoft's implementation. So it's obviously not an even distribution.

I'd propose this approach:

public override int GetHashCode()
{
    int x = unchecked(X * 397);
    // Swap low and high 16-bits of x and then XOR with Y
    return ((x << 16) | ((x >> 16) & ushort.MaxValue)) ^ Y;
}

The main idea is to not allow collisions for X and Y that are close to each other, making a distribution more even.
There is no collisions in the range -4096 <= X <= 4096; -4096 <= Y <= 4096 for this approach.

Was this page helpful?
0 / 5 - 0 ratings