Core: long instead of int - all over the place

Created on 17 Oct 2018 · 4 comments · Source: dotnet/core

With .NET Core being a fairly new development, have you considered switching from int to long all over the place?

So, for example, .Count/.Length, GetHashCode, and many, many other places?
Is the increase in memory consumption still considered a problem?

Before you say something like
"But this would break existing code.": that's the whole point of .NET Core, that you can break things more easily than you could with .NET Framework, since .NET Core is deployed per app instead of per machine, and we as developers get to decide whether or not we want to upgrade to those changes.

Just curious what the opinion is on that matter, and whether you have considered it at all.

As a real problem I faced the other day: I noticed that my GetHashCode implementation didn't produce unique ints for my objects, since I had a lot of them; the problem could easily be solved if GetHashCode were allowed to return a long.

I then had a look at the new System.HashCode, which worked better but still wasn't 100% unique when a lot of data is produced. That's far more data than I actually need, but I was just curious.

On my machine, it generates 203,088,009 unique ints out of 207,360,000.

```c#
using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var part1 = Enumerable.Range(1, 120).ToArray();
        var part2 = Enumerable.Range(1, 120).ToArray();
        var part3 = Enumerable.Range(1, 120).ToArray();
        var part4 = Enumerable.Range(1, 120).ToArray();

        int counter   = 0;
        var hashCodes = new int[part1.Length * part2.Length * part3.Length * part4.Length];

        for (int a = 0; a < part1.Length; a++)
        for (int b = 0; b < part2.Length; b++)
        for (int c = 0; c < part3.Length; c++)
        for (int d = 0; d < part4.Length; d++)
        {
            hashCodes[counter] = GetMyHashCode(part1[a], part2[b], part3[c], part4[d]);
            counter++;
        }

        var distinct = hashCodes.Distinct().Count();
        if (hashCodes.Length != distinct)
        {
            Console.WriteLine($"{distinct:N0} unique out of {hashCodes.Length:N0}");
        }
    }

    private static int GetMyHashCode(int item1, int item2, int item3, int item4)
    {
        //return HashCode.Combine("A" + item1, "B" + item2, "C" + item3, "D" + item4); //<- produces 202,423,619 unique ints
        return HashCode.Combine(item1, item2, item3, item4); //<- produces 203,088,009 unique ints
    }
}
```
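For context (a sketch, not part of the original thread): the collision count above is roughly what an ideal random 32-bit hash would produce anyway. A quick birthday-bound estimate in C#:

```c#
using System;

class BirthdayBound
{
    static void Main()
    {
        double n     = 120d * 120 * 120 * 120;  // 207,360,000 hashed values
        double space = 4294967296d;             // 2^32 possible int hash codes

        // Expected number of distinct values if the hash were ideally random:
        // space * (1 - e^(-n / space))
        double expectedDistinct = space * (1 - Math.Exp(-n / space));
        Console.WriteLine($"{expectedDistinct:N0}"); // roughly 202.4 million
    }
}
```

The observed 203,088,009 unique values are in the same ballpark as the ~202.4 million a uniform random 32-bit hash would give, so HashCode.Combine is behaving about as well as any 32-bit hash can: at this input count, collisions are mathematically unavoidable.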

Label: question


All 4 comments

> Before you say something like
> "But this would break existing code.": that's the whole point of .NET Core, that you can break things more easily than you could with .NET Framework, since .NET Core is deployed per app instead of per machine, and we as developers get to decide whether or not we want to upgrade to those changes.

While that was somewhat true in .NET Core v1, it's not in v2+. Compatibility was recognized as very, very necessary, and a major shift in course happened. For example, this type of change would mean netstandard libraries no longer worked on .NET Core. That'd be an unacceptable regression to everything so far.

In short: this isn't doable; it'd break the world. Especially for items as fundamental as .GetHashCode(): every lib with any class that overrides it would break... it's just not worth it.

> As a real problem I faced the other day: I noticed that my GetHashCode implementation didn't produce unique ints for my objects, since I had a lot of them; the problem could easily be solved if GetHashCode were allowed to return a long.

GetHashCode was never intended to produce unique values. It's just an optimization (though a very important one), and so it doesn't require unique values. And I'm not convinced a 64-bit hash code would guarantee uniqueness either. If that's what you need, you should use something like SHA, which is 160 bits or longer.
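To illustrate the suggestion above, here is a minimal sketch of hashing the four int components with SHA-256 instead of relying on a 32-bit hash code for identity. `IdentityOf` is a hypothetical helper, not an API from the thread:

```c#
using System;
using System.Security.Cryptography;

class UniqueIdentity
{
    // Hypothetical helper: derive a practically collision-free 256-bit
    // identity from four int components. Unlike GetHashCode, which only
    // needs to bucket values well, this is suitable as a unique key.
    public static byte[] IdentityOf(int a, int b, int c, int d)
    {
        var data = new byte[16];
        BitConverter.GetBytes(a).CopyTo(data, 0);
        BitConverter.GetBytes(b).CopyTo(data, 4);
        BitConverter.GetBytes(c).CopyTo(data, 8);
        BitConverter.GetBytes(d).CopyTo(data, 12);
        using (var sha = SHA256.Create())
        {
            return sha.ComputeHash(data); // 32 bytes; collisions are vanishingly unlikely
        }
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(IdentityOf(1, 2, 3, 4)));
    }
}
```

The design point is that hash-table hashing and identity are different problems: a 32-bit (or even 64-bit) hash code trades uniqueness for speed, while a cryptographic digest trades speed for a collision probability that is negligible at any realistic data volume.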

About the suggestion of using long for .Length/.Count: I'm sure this can be re-discussed when multiple TBs of memory are the norm and arrays, dictionaries, etc. consuming hundreds of GBs are a concern. And even then, I think the discussion would start with suggesting additional .LongLength etc. APIs, or even additional "big collections".
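The additive-API route mentioned above already has precedent: System.Array exposes a 64-bit LongLength property alongside the 32-bit Length. A minimal illustration:

```c#
using System;

class LongLengthDemo
{
    static void Main()
    {
        // Array already offers a 64-bit length next to the int Length,
        // which is exactly the kind of additive API the comment describes.
        var arr = new byte[1000];
        long longLen = arr.LongLength;
        int  len     = arr.Length;
        Console.WriteLine(longLen == len); // true for any array that fits in an int
    }
}
```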

Seems to be answered, closing.
Please let me know if I missed anything.
