Go: runtime: performance regression about 2~3ms since Go 1.15 on macOS, due to linking two more shared libraries

Created on 12 Aug 2020  Â·  10Comments  Â·  Source: golang/go

What version of Go are you using (go version)?

$ go version
go version go1.15 darwin/amd64

1.15, especially on macOS

Does this issue reproduce with the latest release?

Yes, this is a performance comparison issue, compared 1.15 against 1.14.7.

What operating system and processor architecture are you using (go env)?

go env Output

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/itchyny/Library/Caches/go-build"
GOENV="/Users/itchyny/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/itchyny/.local/share/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/itchyny/.local/share/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/itchyny/Downloads/go1.15.darwin-amd64/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/itchyny/Downloads/go1.15.darwin-amd64/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/1c/r313z3nn77b3qyzyx6cnwcv80000gn/T/go-build806662573=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Compile a hello world program using Go 1.15 and 1.14.7 and compare the performance using hyperfine.

 % cat main.go
package main

import "fmt"

func main() {
        fmt.Println("Hello, world!")
}

 % ./go1.15.darwin-amd64/go/bin/go build -o hello-go115 main.go

 % ./go1.14.7.darwin-amd64/go/bin/go build -o hello-go114 main.go

 % hyperfine --warmup 10 ./hello-go115 ./hello-go114 # first run
Benchmark #1: ./hello-go115
  Time (mean ± σ):       7.3 ms ±   0.9 ms    [User: 2.6 ms, System: 3.2 ms]
  Range (min … max):     6.1 ms …  12.3 ms    228 runs

Benchmark #2: ./hello-go114
  Time (mean ± σ):       5.0 ms ±   1.5 ms    [User: 1.3 ms, System: 2.2 ms]
  Range (min … max):     3.5 ms …  12.8 ms    261 runs

Summary
  './hello-go114' ran
    1.46 ± 0.49 times faster than './hello-go115'

  Warning: Command took less than 5 ms to complete. Results might be inaccurate.

 % hyperfine --warmup 10 ./hello-go115 ./hello-go114 # second run
Benchmark #1: ./hello-go115
  Time (mean ± σ):       7.3 ms ±   0.7 ms    [User: 2.6 ms, System: 3.0 ms]
  Range (min … max):     6.3 ms …  11.6 ms    211 runs

Benchmark #2: ./hello-go114
  Time (mean ± σ):       4.8 ms ±   0.6 ms    [User: 1.2 ms, System: 2.1 ms]
  Range (min … max):     3.8 ms …   7.9 ms    287 runs

Summary
  './hello-go114' ran
    1.54 ± 0.23 times faster than './hello-go115'

  Warning: Command took less than 5 ms to complete. Results might be inaccurate.

 % # This hello world program is too small to compare the performance so let's do a loop.

 % cat main.go
package main

import "fmt"

func main() {
        for i := 0; i < 50000; i++ {
                fmt.Println("Hello, world!")
        }
}

 % ./go1.15.darwin-amd64/go/bin/go build -o hello-go115 main.go

 % ./go1.14.7.darwin-amd64/go/bin/go build -o hello-go114 main.go

 % hyperfine --warmup 10 ./hello-go115 ./hello-go114 # first run
Benchmark #1: ./hello-go115
  Time (mean ± σ):      44.1 ms ±   1.9 ms    [User: 26.7 ms, System: 15.9 ms]
  Range (min … max):    39.9 ms …  49.3 ms    58 runs

Benchmark #2: ./hello-go114
  Time (mean ± σ):      41.1 ms ±   2.5 ms    [User: 23.9 ms, System: 15.6 ms]
  Range (min … max):    36.5 ms …  48.6 ms    67 runs

Summary
  './hello-go114' ran
    1.07 ± 0.08 times faster than './hello-go115'

 % hyperfine --warmup 10 ./hello-go115 ./hello-go114 # second run
Benchmark #1: ./hello-go115
  Time (mean ± σ):      44.4 ms ±   2.7 ms    [User: 26.5 ms, System: 16.1 ms]
  Range (min … max):    39.1 ms …  50.5 ms    58 runs

Benchmark #2: ./hello-go114
  Time (mean ± σ):      41.2 ms ±   2.2 ms    [User: 23.8 ms, System: 15.8 ms]
  Range (min … max):    37.1 ms …  46.7 ms    62 runs

Summary
  './hello-go114' ran
    1.08 ± 0.09 times faster than './hello-go115'

What did you expect to see?

I upgraded the version of Go so I see no performance regression or even faster.

What did you see instead?

  • Single fmt.Println program took 4.9 ms in Go 1.14.7 but 7.3 ms in Go 1.15, so the regression is around 2.4 ms.
  • Repeating fmt.Println 50000 times took 41.2 ms in Go 1.14.7 but 44.4 ms in Go 1.15 so the regression is around 3.2 ms.

Why is this happening?

This regression is caused by https://go-review.googlesource.com/c/go/+/227037/ (crypto/x509: use Security.framework without cgo for roots on macOS). As you can see with otool command (ldd equivalent on macOS), two more shared library are linked in Go 1.15 compared to 1.14.7. This will lead (almost) constant overhead to any program built for macOS.

 % otool -L hello-go114
hello-go114:
        /usr/lib/libSystem.B.dylib (compatibility version 0.0.0, current version 0.0.0)

 % otool -L hello-go115
hello-go115:
        /usr/lib/libSystem.B.dylib (compatibility version 0.0.0, current version 0.0.0)
        /System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 0.0.0, current version 0.0.0)
        /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 0.0.0, current version 0.0.0)

Actually, when I removed the two lines and build the Go command, it produces the executable without linking to the two libraries and the performance is now comparable.

The Go 1.15 Release Notes says On macOS, binaries are now always linked against Security.framework to extract the system trust roots, regardless of whether cgo is available. The resulting behavior should be more consistent with the OS verifier. Thus I think this is an intended overhead we have to accept since Go 1.15. But I want to mention here in this issue tracker because this is where I searched for some regression issues after seeing the weird profiling results on upgrading Go. 2~3 ms addition for hello world program was actually surprising. Maybe I have to accept the fact that the change is applied to any program, even if it is not related to networking.

Go devs, I always appreciate your tough works and please close this issue if the overhead is expected. Thanks @koron for the big clue for this problem.

Regards

NeedsInvestigation OS-Darwin

Most helpful comment

@gopherbot please open a backport issue for Go 1.15.

I am not sure this qualifies to be backported, but it might as it's a regression with no workaround.

The team will evaluate it in the backport issue.

All 10 comments

@FiloSottile

Is there any way to avoid the reference to the new frameworks if none of the security code that uses it is linked in?

I was looking at that the other day, and have a fix locally. Will send it out soon.

It's reasonable to wire it up so it's only linked when crypto/x509 is imported, happy to review the CL @cherrymui.

Change https://golang.org/cl/248333 mentions this issue: cmd/link: link dynamic library automatically

I was wondering if this will be ported to 1.15.1 or not.

@gopherbot please open a backport issue for Go 1.15.

I am not sure this qualifies to be backported, but it might as it's a regression with no workaround.

The team will evaluate it in the backport issue.

Backport issue(s) opened: #40764 (for 1.15).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot add OS-Darwin

Change https://golang.org/cl/248719 mentions this issue: [release-branch.go1.15] cmd/link: link dynamic library automatically

I'm not sure it qualifies backporting either. But I made a CL anyway. It backports cleanly and does work.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

michaelsafyan picture michaelsafyan  Â·  3Comments

gopherbot picture gopherbot  Â·  3Comments

bradfitz picture bradfitz  Â·  3Comments

enoodle picture enoodle  Â·  3Comments

ashb picture ashb  Â·  3Comments