Is Git-LFS using https://en.wikipedia.org/wiki/Intel_SHA_extensions where possible for hashing files?
Since Git-LFS is by nature used with files of hundreds or thousands of megabytes, having the fastest possible hashing implementation would significantly improve Git operations and the general developer experience.
Hi @m1h4, thanks for pointing this out and opening this issue. It's a great idea; we don't currently use any hardware acceleration, since we rely on the crypto/sha256 package to do this type of work.
I found https://github.com/minio/sha256-simd, which seems to do what we want, i.e., "be a drop-in replacement that uses hardware acceleration where possible, and falls back to crypto/sha256 where it's not."
I pushed a branch for this in: https://github.com/git-lfs/git-lfs/pull/3021.
After doing some tests it seems like Git LFS is already using acceleration out-of-the-box after all:
~ mkfile -n 500m test
~ time openssl sha256 test
SHA256(test)= a08a92258f621b55d08ad1e84c90c2ea6286fc6b6c9a4dfa7156afb16c190170
13.25 real 13.05 user 0.17 sys
~ time /usr/local/Cellar/openssl/1.0.2o_1/bin/openssl sha256 test
SHA256(test)= a08a92258f621b55d08ad1e84c90c2ea6286fc6b6c9a4dfa7156afb16c190170
1.19 real 1.12 user 0.06 sys
~ time cat test | git lfs clean
1.88 real 0.03 user 0.18 sys
version https://git-lfs.github.com/spec/v1
oid sha256:a08a92258f621b55d08ad1e84c90c2ea6286fc6b6c9a4dfa7156afb16c190170
size 524288000
(Here I am assuming that the older version of openssl that comes standard on macOS is without acceleration, while the one installed via Homebrew has it.)
But it would definitely be good to test out https://github.com/minio/sha256-simd and see if any gains are to be had.
@m1h4 interesting; I wasn't aware that Go had support for SIMD optimizations of this sort, but admit that I am only superficially aware of the crypto/* packages, not any of their specific details. I think that more measurement is worthwhile to see if there are speedups in a practical setting within Git LFS between sha256-simd and crypto/sha256.
Closing per #3021.