rustc uses FxHash{Set,Map} everywhere rather than Hash{Set,Map}, because the DefaultHasher used by Hash{Set,Map} is slow.
But once #69152 lands, DefaultHasher will be a lot faster when hashing integers, which is a common case; in one microbenchmark I saw a ~2.5x speed-up. Combine that with the fact that FxHasher is a lower-quality hasher and so tends to result in more collisions, and the default hash tables might be faster. (On a different microbenchmark I saw that HashSet<u32> was a little bit faster than FxHashSet<u32>.)
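For reference, a microbenchmark along these lines can be sketched as follows. The FxHasher here is a minimal reimplementation of the published rustc-hash algorithm (an assumption, not the actual crate), and the timings are purely illustrative:

```rust
use std::collections::HashSet;
use std::hash::{BuildHasherDefault, Hasher};
use std::time::Instant;

// Minimal reimplementation of the Fx hashing step, based on the published
// rustc-hash algorithm (an assumption; not the rustc-hash crate itself).
#[derive(Default)]
struct FxHasher {
    hash: u64,
}

const SEED: u64 = 0x517cc1b727220a95;

impl Hasher for FxHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.hash = (self.hash.rotate_left(5) ^ b as u64).wrapping_mul(SEED);
        }
    }
    fn write_u32(&mut self, i: u32) {
        self.hash = (self.hash.rotate_left(5) ^ i as u64).wrapping_mul(SEED);
    }
    fn finish(&self) -> u64 {
        self.hash
    }
}

type FxHashSet<T> = HashSet<T, BuildHasherDefault<FxHasher>>;

fn main() {
    const N: u32 = 1_000_000;

    let t = Instant::now();
    let default_set: HashSet<u32> = (0..N).collect();
    println!("HashSet<u32> (default hasher): {:?}", t.elapsed());

    let t = Instant::now();
    let fx_set: FxHashSet<u32> = (0..N).collect();
    println!("FxHashSet<u32>:                {:?}", t.elapsed());

    assert_eq!(default_set.len(), fx_set.len());
}
```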
We should evaluate this, probably by replacing every FxHash{Set,Map} with Hash{Set,Map}. (It keeps things simpler if we exclusively use one or the other, rather than a mix.)
I briefly tried to do this, but we have a lint that produces this message if you try to use Hash{Set,Map}: "error: Prefer FxHashSet over HashSet, it has better performance". I couldn't work out how to disable it.
cc @rust-lang/wg-compiler-performance
cc @cbreeden
cc @Amanieu
You could fork the rustc-hash crate, make it use DefaultHasher and then use [replace] in the workspace Cargo.toml to try it out.
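Concretely, that would look something like this in the workspace Cargo.toml (the fork path and version number here are hypothetical):

```toml
# Hypothetical: point rustc-hash at a local fork whose FxHasher is a thin
# wrapper around std's DefaultHasher. Path and version are made up.
[replace]
"rustc-hash:1.0.0" = { path = "../rustc-hash-default-hasher" }
```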
It may also make sense to evaluate ahash, the hash function that hashbrown switched to due to fxhash's poor hash quality.
Didn't we also use a custom hasher to bypass randomness?
If we rely on hashbrown we must make sure its compile-time randomness is disabled.
Note to self: x.py build --warnings=warn makes the warning ignorable. Thanks to @Mark-Simulacrum for the tip.
I measured this change locally, on top of the DefaultHasher improvements in #69152. (In the end I only had to change three lines in librustc_data_structures/fx.rs.) The results were terrible. Here are just the check results, which are representative.
```
avg: 63.6%   min: 24.8%   max: 84.7%    wf-projection-stress-65510-che...
avg: 47.8%   min: 30.0%   max: 57.8%    clap-rs-check
avg: 43.7%   min: 30.5%   max: 55.7%    wg-grammar-check
avg: 46.0%   min: 30.8%   max: 54.3%    packed-simd-check
avg: 50.2%   min: 46.9%   max: 52.2%    serde-check
avg: 44.1%   min: 32.4%   max: 51.3%    regression-31157-check
avg: 40.9%   min: 31.4%   max: 49.5%    futures-check
avg: 38.1%   min: 30.1%   max: 47.3%    unify-linearly-check
avg: 34.0%   min: 25.1%   max: 43.6%    syn-check
avg: 38.7%   min: 36.4%   max: 42.8%    piston-image-check
avg: 34.5%   min: 29.2%   max: 42.0%    tokio-webpush-simple-check
avg: 35.3%   min: 24.6%   max: 41.0%    regex-check
avg: 34.0%   min: 30.4%   max: 40.9%    ctfe-stress-4-check
avg: 36.7%?  min: 29.2%?  max: 40.8%?   ripgrep-check
avg: 33.3%   min: 27.9%   max: 40.0%    unused-warnings-check
avg: 33.7%   min: 29.3%   max: 39.7%    webrender-check
avg: 33.6%   min: 28.9%   max: 38.8%    hyper-2-check
avg: 32.8%   min: 24.2%   max: 38.6%    webrender-wrench-check
avg: 30.8%   min: 24.2%   max: 37.9%    cargo-check
avg: 27.6%   min: 18.3%   max: 37.4%    issue-46449-check
avg: 34.6%   min: 29.0%   max: 37.2%    deep-vector-check
avg: 30.7%   min: 11.6%   max: 37.0%    coercions-check
avg: 35.7%?  min: 33.7%?  max: 36.9%?   encoding-check
avg: 28.8%   min: 23.5%   max: 36.2%    cranelift-codegen-check
avg: 31.2%   min: 26.5%   max: 36.0%    trait-stress-check
avg: 32.3%   min: 27.8%   max: 34.8%    serde-serde_derive-check
avg: 28.5%   min: 23.0%   max: 34.4%    html5ever-check
avg: 27.5%   min: 21.4%   max: 34.1%    ucd-check
avg: 28.1%   min: 20.8%   max: 33.5%    keccak-check
avg: 19.9%   min: 11.8%   max: 31.8%    tuple-stress-check
avg: 28.4%   min: 21.1%   max: 31.6%    inflate-check
avg: 17.9%   min:  9.7%   max: 30.7%    await-call-tree-check
avg: 26.6%   min: 23.2%   max: 30.2%    helloworld-check
avg: 23.3%   min: 16.9%   max: 26.0%    unicode_normalization-check
avg: 18.9%   min: 16.9%   max: 20.1%    token-stream-stress-check
avg:  5.9%   min:  3.9%   max:  6.9%
```
Every single run of every single benchmark regressed, by 3.9% to 84.7%!
I had a microbenchmark where HashMap (with the improvements from #69152) was slightly better than FxHashMap, presumably because the benefit of fewer collisions outweighed the cost of the slower hasher. But that microbenchmark was using integers as keys, which is the best case for DefaultHasher. In the compiler I imagine a lot of the hash tables have more complicated keys containing multiple integers and/or strings.
Oh well, at least it confirms that FxHasher is helping a lot.
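The three-line change mentioned above was presumably something like the following sketch of librustc_data_structures/fx.rs (the exact original contents are assumed here): re-pointing the Fx aliases at std's default, SipHash-backed hasher switches every FxHashMap/FxHashSet user in the compiler without touching call sites.

```rust
use std::collections::hash_map::RandomState;
use std::collections::{HashMap, HashSet};

// Hypothetical version of the fx.rs aliases: instead of
// BuildHasherDefault<FxHasher>, use std's default RandomState.
pub type FxHashMap<K, V> = HashMap<K, V, RandomState>;
pub type FxHashSet<V> = HashSet<V, RandomState>;

fn main() {
    // All existing users keep working; only the hasher changes.
    let mut m: FxHashMap<&str, u32> = FxHashMap::default();
    m.insert("key", 1);
    assert_eq!(m["key"], 1);
}
```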
@nnethercote Just to confirm, when you're talking about DefaultHasher, it's still SipHash, right?
I had seen https://github.com/rust-lang/rust/issues/69153#issuecomment-586180880 and got myself confused into thinking your results were for aHash.
I'm not surprised SipHash is still a slowdown, given its intended use case (DoS protection). I don't think we can get away with a hash function "that good" unless we can rely heavily on hardware acceleration (which, ironically, might make actual cryptographic hashes faster; I guess the hardware acceleration might be reusable for something weaker than full-blown AES, but I have no idea).
I can confirm similar findings between DefaultHasher and fx in my own codebases (although belatedly). Hardware-accelerated hashes sound nice until you consider the bigger-than-usual performance cliff when acceleration is not available. Using something like aHash is fine when you know you only care about x86; rustc cares about way more.
AHash actually has a fallback hash which is almost as fast as FxHash but avoids the issues with poor distribution. This is the main reason why I chose it as the default hash function for hashbrown.
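For context, the distribution weakness being referred to can be made concrete: because the final operation in Fx-style hashing is a multiply, bit k of the output depends only on bits 0..=k of the input, so keys that differ only in their high bits share the low bits that hashbrown-style tables use for bucket selection. A sketch, using a one-step reimplementation of the published algorithm (an assumption, not the rustc-hash crate itself):

```rust
// One Fx hashing step for a u64 key (reimplemented here from the
// published rustc-hash algorithm; an assumption, not the actual crate).
const SEED: u64 = 0x517cc1b727220a95;

fn fx_hash_u64(i: u64) -> u64 {
    // State starts at 0, so a single step reduces to i * SEED.
    (0u64.rotate_left(5) ^ i).wrapping_mul(SEED)
}

fn main() {
    // These keys differ only above bit 15, so the multiply leaves their
    // low output bits identical: in a 256-slot table they always collide.
    let a = fx_hash_u64(0x0000_0000_0000_0042);
    let b = fx_hash_u64(0xdead_beef_0000_0042);
    assert_eq!(a & 0xff, b & 0xff); // same bucket index
    assert_ne!(a, b); // yet the full hashes differ
    println!("low byte collides: {:02x} == {:02x}", a & 0xff, b & 0xff);
}
```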
> @nnethercote Just to confirm, when you're talking about DefaultHasher, it's still SipHash, right?
It's SipHasher13 (as normal) with the improvements from #69152 applied. I haven't looked at AHash at all.
> It may also make sense to evaluate ahash, the hash function that hashbrown switched to due to fxhash's poor hash quality.
I just tried ahash. For some reason I got the fallback hash rather than the AES hardware one, even though I'm on a reasonably new x86-64 box that has AES capability (according to the info from /proc/cpuinfo).
Fallback ahash is slightly worse than fxhash in both instruction counts (mostly 1-2%) and cycles (mostly 1-4%).
ahash is also not deterministic across different builds, because it uses const_random! when initializing hasher state. This could cause extra noise in perf runs, which would be bad.
> ahash is also not deterministic across different builds, because it uses const_random! when initializing hasher state. This could cause extra noise in perf runs, which would be bad.
I brought that up earlier, and it's worse than that because it breaks deterministic builds. But you can turn it off (look at the feature list in aHash's Cargo.toml).
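Roughly, turning it off means depending on ahash with its default features (which include the compile-time randomness) disabled; something like this, with the version number assumed:

```toml
# Hypothetical: ahash without its default features, so no
# compile-time randomness. The version number is made up.
[dependencies]
ahash = { version = "0.3", default-features = false }
```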
However, being slightly worse than fxhash probably means it's not worth spending more time on it (unless there is some inefficiency in the fallback path).
> But you can turn it off (look at the feature list in aHash's Cargo.toml).
I tried that, but it caused compile errors because you don't get a Default impl for AHasher without the compile-time-rng feature.
aHash not having reproducible output between machines makes it an unsuitable choice for rustc anyway, if we want to have any semblance of support for distributed builds.