Runtime: Can more non-cryptographic hash algorithms be added into .NET BCL?

Created on 7 Oct 2020 · 12 Comments · Source: dotnet/runtime

The BCL currently includes hash algorithms such as MD5, SHA1, and SHA256. However, other well-known hash algorithms such as murmur3, fnv, and blake2 can have advantages in specific scenarios.
I suggest the BCL include these algorithms so that more developers can benefit from them and help optimize them. The community has provided some implementations that could serve as references, but some of them are still somewhat confusing and unreliable.
For example, Scala's standard library provides a _MurmurHash3_ API. For C#, these algorithms could live in a stand-alone library shipped as a single NuGet package.

area-System.Runtime untriaged

All 12 comments

Tagging subscribers to this area: @bartonjs, @vcsjones, @krwq, @jeffhandley
See info in area-owners.md if you want to be subscribed.

AFAIK, the BCL currently relies on the underlying OS to provide hash algorithms so that they are "secure enough". SHA1Managed is no longer really managed; it just delegates to the native implementation.

There may be a centrally maintained managed hash package for NuGet.

Because BLAKE2 is a cryptographic hash, I guess it would be covered by the policy referred to in https://github.com/dotnet/runtime/issues/16010#issuecomment-697550274:

We have a policy of not implementing cryptographic primitives, but deferring to OS libraries.

MurmurHash and FNV hash are not cryptographic, so they could be easier to add than BLAKE2. https://github.com/dotnet/runtime/issues/24328 concerns API design for non-cryptographic hashing.
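To illustrate how small a non-cryptographic hash can be, here is a minimal sketch of 32-bit FNV-1a in C#. The `Fnv1a` class and its `Hash32` method are hypothetical names chosen for this example, not an existing or proposed BCL API; the offset basis and prime are the published FNV-1a constants.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of the BCL.
public static class Fnv1a
{
    private const uint OffsetBasis = 2166136261u; // 0x811C9DC5
    private const uint Prime = 16777619u;         // 0x01000193

    // 32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
    public static uint Hash32(ReadOnlySpan<byte> data)
    {
        uint hash = OffsetBasis;
        foreach (byte b in data)
        {
            hash ^= b;
            hash = unchecked(hash * Prime); // wrap-around on overflow is intended
        }
        return hash;
    }
}
```

For example, `Fnv1a.Hash32(Encoding.UTF8.GetBytes("foobar"))` yields `0xBF9CF968`, which matches the published FNV-1a test vector. A hash like this is suitable for hash tables and checksums of trusted data, but offers no resistance to deliberately crafted collisions.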

Note that the framework should only add things that are useful for everyone, not everything that might be useful to someone; otherwise it would grow very large (and it is already relatively large). I think it might be better to create an external library with such algorithms; if it gets lots of downloads and proves "useful to many", you can then suggest adding it to the framework. Another consideration is that writing your own implementation of crypto primitives requires following a complex, legally mandated process, so the preferred option is to rely on an external implementation that has already gone through that process.

@krwq Understood. But the framework has now been split into separate parts, so I think it's feasible to add a stand-alone namespace/library shipped as a single NuGet package for optional use (like Microsoft.Bcl.XXX?).

By the way, well-known hash algorithms such as _MurmurHash_ have been around for many years and are used in many important data structures and systems, such as Bloom filters, so a reliable .NET API for them would really benefit production code in our community.
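As a rough sketch of why a Bloom filter needs a fast non-cryptographic hash, the example below derives several bit positions per item from two base hashes (double hashing). Everything here is an assumption made for illustration: `TinyBloomFilter` is a hypothetical type, and FNV-1a with two different starting values stands in for MurmurHash3, which a real implementation would more likely use.

```csharp
using System;
using System.Collections;
using System.Text;

// Hypothetical illustration type; not a BCL API.
public sealed class TinyBloomFilter
{
    private readonly BitArray _bits;
    private readonly int _hashCount;

    public TinyBloomFilter(int bitCount, int hashCount)
    {
        _bits = new BitArray(bitCount);
        _hashCount = hashCount;
    }

    public void Add(string item)
    {
        var (h1, h2) = BaseHashes(item);
        for (int i = 0; i < _hashCount; i++)
            _bits[IndexFor(h1, h2, i)] = true;
    }

    // false means "definitely never added"; true means "possibly added".
    public bool MightContain(string item)
    {
        var (h1, h2) = BaseHashes(item);
        for (int i = 0; i < _hashCount; i++)
            if (!_bits[IndexFor(h1, h2, i)]) return false;
        return true;
    }

    // Double hashing: the i-th position is (h1 + i * h2) mod bitCount.
    private int IndexFor(uint h1, uint h2, int i)
    {
        uint combined = unchecked(h1 + (uint)i * h2);
        return (int)(combined % (uint)_bits.Length);
    }

    private static (uint, uint) BaseHashes(string item)
    {
        byte[] data = Encoding.UTF8.GetBytes(item);
        uint h1 = Fnv1a(data, 2166136261u);      // standard FNV-1a offset basis
        uint h2 = Fnv1a(data, 0x9E3779B9u) | 1u; // arbitrary second seed (assumption), forced odd
        return (h1, h2);
    }

    private static uint Fnv1a(byte[] data, uint seed)
    {
        uint hash = seed;
        foreach (byte b in data)
        {
            hash ^= b;
            hash = unchecked(hash * 16777619u);
        }
        return hash;
    }
}
```

The filter calls the hash on every `Add` and `MightContain`, so the speed of the hash matters far more here than cryptographic strength.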

so preferred option is to rely on external implementation which already went through such process

I have no objection to that opinion. I believe the problem can be solved by experts in this area (obviously I'm not one...).

I've had to use cryptographic hashes in the past for purposes that didn't require crypto-strong hash. Since .NET didn't offer non-crypto hashes, I used SHA-1. A couple years later I got harassed by the security people at the company who were flagging everything using SHA-1 as a security problem. After a long argument I had to move the code to SHA-256 for no reason (and everyone involved knew it's pointless). I fully expect to be harassed in a couple years again if SHA-256 becomes a security concern.

It would be great if .NET provided a non-crypto hash.

An OOB package with non-cryptographic hashes is an option to consider. We'd need to figure out which hashes specifically we want, and why we would want them rather than using, e.g., the built-in Marvin hashing: do we want larger hash sizes? Something faster? Or is Marvin perhaps sufficient, and we could consider making it public?

I would have two things on my wishlist:

  • A compact hash that is reasonably fast (I like Marvin's 64-bit and 32-bit variants)
  • A hash where collisions are very unlikely (where I can just use the hash as a unique ID of input I can trust, same as e.g. git uses SHA-1 to identify a commit).

This would likely mean two different algorithms.

This would likely mean two different algorithms.

💭 Do you imagine these also to be stable by default, or is that less of a concern?

I retitled + repathed the issue to reflect that we're discussing _non-cryptographic_ hash algorithms.

TBH I wouldn't add a Marvin-specific public API to any of our shipped packages. It suits our own needs nicely but never really gained traction outside of Microsoft. If we're going to ship implementations of non-crypto hash algorithms, we need to build up a list of the algorithms that would have the greatest benefit to the ecosystem. @LeaFrock's original list provides a good starting point.
