Go: x/crypto/chacha20poly1305: linux/arm64 Go 1.9 performance is 3X slower than OpenSSL

Created on 19 Nov 2017 · 17 Comments · Source: golang/go

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.9.2 linux/arm64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

GOARCH="arm64"
GOBIN=""
GOEXE=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/lib/go-1.6"
GOTOOLDIR="/usr/lib/go-1.6/pkg/tool/linux_arm64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"

What did you do?

go test vendor/golang_org/x/crypto/chacha20poly1305 -bench .

What did you expect to see?

Performance on par with OpenSSL (https://blog.cloudflare.com/content/images/2017/11/sym_key_1_core.png)

What did you see instead?

3X slower than OpenSSL (https://blog.cloudflare.com/content/images/2017/11/go_sym_key_1_core.png)

NeedsFix Performance help wanted

All 17 comments

Change https://golang.org/cl/105895 mentions this issue: crypto/poly1305: arm64 implementation using multiword arithmetic

Change https://golang.org/cl/105896 mentions this issue: crypto/poly1305: arm64 implementation using multiword arithmetic

Change https://golang.org/cl/107628 mentions this issue: internal/chacha20: add arm64 SIMD implementation

Substantial perf improvements on Cavium ThunderX going from go1.10.2 to go1.11beta1, but not 3x faster.

ed@ed-2a-bcc-llvm:~$ go version
go version go1.10.2 linux/arm64
ed@ed-2a-bcc-llvm:~$ go test vendor/golang_org/x/crypto/chacha20poly1305 -bench .
goos: linux
goarch: arm64
pkg: vendor/golang_org/x/crypto/chacha20poly1305
BenchmarkChacha20Poly1305Open_64-96               500000              3047 ns/op          21.00 MB/s
BenchmarkChacha20Poly1305Seal_64-96               500000              2920 ns/op          21.91 MB/s
BenchmarkChacha20Poly1305Open_1350-96              50000             30990 ns/op          43.56 MB/s
BenchmarkChacha20Poly1305Seal_1350-96              50000             30890 ns/op          43.70 MB/s
BenchmarkChacha20Poly1305Open_8K-96                10000            173794 ns/op          47.14 MB/s
BenchmarkChacha20Poly1305Seal_8K-96                10000            173907 ns/op          47.11 MB/s
PASS
ok      vendor/golang_org/x/crypto/chacha20poly1305     10.538s
ed@ed-2a-bcc-llvm:~$ 
ed@ed-2a-bcc-llvm:~$ ~/go/bin/go1.11beta1 test vendor/golang_org/x/crypto/chacha20poly1305 -bench .
goos: linux
goarch: arm64
pkg: vendor/golang_org/x/crypto/chacha20poly1305
BenchmarkChacha20Poly1305Open_64-96              1000000              2249 ns/op          28.45 MB/s
BenchmarkChacha20Poly1305Seal_64-96              1000000              2245 ns/op          28.50 MB/s
BenchmarkChacha20Poly1305Open_1350-96             100000             19541 ns/op          69.08 MB/s
BenchmarkChacha20Poly1305Seal_1350-96             100000             19439 ns/op          69.45 MB/s
BenchmarkChacha20Poly1305Open_8K-96                10000            105547 ns/op          77.61 MB/s
BenchmarkChacha20Poly1305Seal_8K-96                10000            105938 ns/op          77.33 MB/s
PASS
ok      vendor/golang_org/x/crypto/chacha20poly1305     11.173s
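
To put the two runs above side by side, here is a minimal sketch that computes the per-benchmark speedup from the reported ns/op figures and re-derives the MB/s column. The numbers are copied from the output above; the names `result`, `speedup`, and `throughputMBps` are hypothetical helpers introduced only for this illustration, and the sketch assumes 1 MB = 10^6 bytes and 8K = 8192 bytes.

```go
package main

import "fmt"

// result holds one benchmark row from the two runs quoted above.
// (Hypothetical type, used only for this illustration.)
type result struct {
	name  string
	bytes int     // payload size per operation
	nsOld float64 // ns/op reported by go1.10.2
	nsNew float64 // ns/op reported by go1.11beta1
}

// speedup returns how many times faster the new timing is than the old one.
func speedup(nsOld, nsNew float64) float64 {
	return nsOld / nsNew
}

// throughputMBps converts a payload size and an ns/op figure into MB/s,
// assuming 1 MB = 1e6 bytes. bytes/ns is GB/s, so multiply by 1e3.
func throughputMBps(bytes int, nsPerOp float64) float64 {
	return float64(bytes) / nsPerOp * 1e3
}

func main() {
	results := []result{
		{"Open_64", 64, 3047, 2249},
		{"Seal_64", 64, 2920, 2245},
		{"Open_1350", 1350, 30990, 19541},
		{"Seal_1350", 1350, 30890, 19439},
		{"Open_8K", 8192, 173794, 105547},
		{"Seal_8K", 8192, 173907, 105938},
	}
	for _, r := range results {
		fmt.Printf("%-10s %6.2f MB/s -> %6.2f MB/s  (%.2fx)\n",
			r.name,
			throughputMBps(r.bytes, r.nsOld),
			throughputMBps(r.bytes, r.nsNew),
			speedup(r.nsOld, r.nsNew))
	}
}
```

On these figures the speedup ranges from roughly 1.3x for 64-byte messages to roughly 1.65x for 8K messages, which is consistent with "substantial, but not 3x".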

@vielmetti internal/chacha20 won't be merged into go1.11 since it's frozen.

Thanks @mengzhuo . Can we get this onto the go1.12 roster then? It's currently marked as "Unplanned".

I changed the milestone, but note that that doesn't cause the work to be done. This is an open source project so the best way to get something done is to volunteer to do it. Thanks.

Noted at https://go-review.googlesource.com/c/crypto/+/107628

"If you prioritize arm64 chacha and arm64 poly, it will see production use super soon after."

It appears from comments on that patch that the coding work has largely been done but there are constraints on the availability of reviewers for arm64 assembly.

Who is reviewing arm64 assembly these days, @ianlancetaylor , and can they use qualified help? I'm happy to help recruit qualified reviewers from the arm64 Go community if I know the qualifications.

At present arm64 assembly is typically reviewed by the tireless @cherrymui . I'm sure @benshi001 would also have good input.

@FiloSottile This was marked for 1.12. Things still on target for that?

This issue is marked currently as "help wanted". What is the nature of the help desired?

@vielmetti Figuring out how to make the code run faster.

One of the meanings of the "help wanted" label is "we would like this to happen but nobody is working on it."

"we would like this to happen but nobody is working on it."

Pretty sure somebody was working on it, but then the CL just didn't get much of a review. Until today, that is.

@zx2c4 The "help wanted" label was added before any of the CLs were sent. But I should have been clearer in my response; my apologies.

Could someone close this issue? Apparently it's done.
