Following a debug session with @fatih
go version)?go version go1.11.2 darwin/amd64 # and go version go1.11.2 linux/amd64
Yes
go env)?go env Output
GOARCH="amd64"
GOBIN="/Users/fatih/go/bin"
GOCACHE="/Users/fatih/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/fatih/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/Cellar/go/1.11.2/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.11.2/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/k_/87syx3r50m93m72hvqj2qqlw0000gn/T/go-build130848285=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build416231211=/tmp/go-build -gno-record-gcc-switches"
The symptom that @fatih and I were investigating was godef performing ~2x more slowly on his fast machine than on my slower machine.
@fatih's machine specs for reference:
Model Name: iMac Pro
Model Identifier: iMacPro1,1
Processor Name: Intel Xeon W
Processor Speed: 3 GHz
Number of Processors: 1
Total Number of Cores: 10
L2 Cache (per Core): 1 MB
L3 Cache: 13.8 MB
Memory: 64 GB
OS X 10.14.1 (18B75)
We think we have a smaller reproduction that demonstrates what appears to be an OS X 10.14.n issue that may or may not be related to Go. But go list appears to be a good way to demonstrate the problem and hopefully therefore a good place to start further investigation.
The following was run in a Terminal on @fatih's setup, and then within a Docker container on the same machine. Run-times of the go list commands were compared:
/bin/bash
(
set -eux
export GOPATH=$(mktemp -d)
command cd $(mktemp -d)
git clone https://github.com/digitalocean/csi-digitalocean
command cd csi-digitalocean/
git checkout b9ed6c2ea9ad85c5e4956bdfe418940bb5aa883a
rm go.sum
go mod tidy
command cd driver
time go list -export -deps -json -e . > /dev/null 2>&1
time go list -export -deps -json -e . > /dev/null 2>&1
time go list -export -deps -json -e . > /dev/null 2>&1
time go list -export -deps -json -e . > /dev/null 2>&1
) 2>&1 | tee output.txt
The "native" terminal run shows "average" speeds of ~300ms:
+ go list -export -deps -json -e .
real 0m1.524s
user 0m7.224s
sys 0m2.391s
+ go list -export -deps -json -e .
real 0m0.297s
user 0m0.420s
sys 0m0.898s
+ go list -export -deps -json -e .
real 0m0.300s
user 0m0.403s
sys 0m0.919s
+ go list -export -deps -json -e .
real 0m0.308s
user 0m0.402s
sys 0m0.968s
Whereas the Docker run shows "average" speeds of ~180ms:
+ go list -export -deps -json -e .
real 0m4.837s
user 0m12.070s
sys 0m5.030s
+ go list -export -deps -json -e .
real 0m0.186s
user 0m0.180s
sys 0m0.110s
+ go list -export -deps -json -e .
real 0m0.165s
user 0m0.160s
sys 0m0.120s
+ go list -export -deps -json -e .
real 0m0.187s
user 0m0.190s
sys 0m0.100s
Similar run-times in each, possibly even faster times in the "native" OS X environment.
The "native" run-times are almost 70% longer.
cc @bcmills (and FYI @ianthehat for any go/packages side-effects)
Here's what I got from running the above script on a 13' macOS (seirra 10.12.6)
Without Docker
+ godef -o 19358 -f controller.go
/var/folders/n4/35bgdnkd6vlgqrdzbyl0x1ycmhcndx/T/tmp.DVAXjerX/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/v0/csi.pb.go:1233:1
real 0m10.234s
user 0m8.298s
sys 0m1.270s
+ godef -o 19358 -f controller.go
/var/folders/n4/35bgdnkd6vlgqrdzbyl0x1ycmhcndx/T/tmp.DVAXjerX/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/v0/csi.pb.go:1233:1
real 0m1.276s
user 0m0.381s
sys 0m0.234s
+ godef -o 19358 -f controller.go
/var/folders/n4/35bgdnkd6vlgqrdzbyl0x1ycmhcndx/T/tmp.DVAXjerX/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/v0/csi.pb.go:1233:1
real 0m1.295s
user 0m0.399s
sys 0m0.240s
With Docker
+ godef -o 19358 -f controller.go
/tmp/tmp.U5kJ73H8kO/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/v0/csi.pb.go:1233:1
real 0m7.215s
user 0m10.540s
sys 0m2.270s
+ godef -o 19358 -f controller.go
/tmp/tmp.U5kJ73H8kO/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/v0/csi.pb.go:1233:1
real 0m0.304s
user 0m0.250s
sys 0m0.110s
+ godef -o 19358 -f controller.go
/tmp/tmp.U5kJ73H8kO/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/v0/csi.pb.go:1233:1
real 0m0.282s
user 0m0.230s
sys 0m0.110s
Sorry, forgot to cc @rsc in the original description
Will a dtruss/strace log help here ? Just wondering if there is any OS level variable here.
cc @jayconrod
The same problem on my windows 10:
What version of Go are you using (go version)?
go version go1.11 windows/amd64
What operating system and processor architecture are you using (go env)?
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\shuye\AppData\Local\go-build
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=G:\GO
set GOPROXY=
set GORACE=
set GOROOT=D:\Go
set GOTMPDIR=
set GOTOOLDIR=D:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=G:\GIT.HUB\marketstore\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\shuye\AppData\Local\Temp\go-build930085358=/tmp/go-build -gno-record-gcc-switches
I don't have an answer to why macOS 10.12 and 10.14 performance would be different, but I think I know why Docker and macOS are different.
As far as I understand, Docker on macOS actually runs a Linux kernel and userspace under a hypervisor. That means overhead for I/O and system calls is more similar to native Linux overhead than it is to macOS overhead. "go list" does a lot of stat and read system calls on small files, so the system call overhead matters a lot.
Here are the results from a quick benchmark I ran. The Read and Write tests are reading and writing slices of various sizes to a file (the file is almost certainly in the kernel file cache, so disk speed doesn't matter). The Import test is measuring the time to call go/build.Context.Import on "fmt".
macOS native:
BenchmarkStat-8 300000 4470 ns/op
BenchmarkRead/1K-8 50000 27366 ns/op
BenchmarkRead/50K-8 50000 30980 ns/op
BenchmarkRead/1024K-8 10000 104866 ns/op
BenchmarkWrite/1K-8 10000 151379 ns/op
BenchmarkWrite/50K-8 5000 311861 ns/op
BenchmarkWrite/1024K-8 1000 2020063 ns/op
BenchmarkImport-8 2000 919822 ns/op
macOS Docker
BenchmarkStat-4 1000000 1222 ns/op
BenchmarkRead/1K-4 300000 4101 ns/op
BenchmarkRead/50K-4 200000 7405 ns/op
BenchmarkRead/1024K-4 20000 92996 ns/op
BenchmarkWrite/1K-4 10000 181860 ns/op
BenchmarkWrite/50K-4 10000 341137 ns/op
BenchmarkWrite/1024K-4 1000 2404169 ns/op
BenchmarkImport-4 2000 580594 ns/op
Notably, the Stat and Read/1K tests are much, much slower on native macOS. Windows is quite a bit slower than this.
We have a project with a mono repo with 40k LOC, 700k LOC dependencies and 30 binaries, code generation before build and so forth. Running time make without any changes and a warm cache takes 3.3 sec on a modern MacBook Pro and only 2 sec on a Fedora VM running on the same machine using Parallels. It seems other people have similar experiences with VMware/Ubuntu and macOS.
all:
go build -o build/ ./gen/... # build code generators
go generate ./...
go test ./...
go build -o build/$(GOOS)_$(GOARCH)/ ./cmd/...
The build cache contains only this one project. When making a simple change to the project build times change to 5.3 sec vs. 2.8 sec. Building a single binary the numbers are 1.35 sec vs 0.85 sec.
We are using modules, everything is downloaded, no CGO and we are not cross-compiling.
Using a RAM disk on macOS for GOCACHE and the output directory did not improve the build speed.
I'm testing this on a quite beefy machine but this is more noticeable on smaller laptops of my colleagues. I'll double check but the build times were more in the range of 5-10 seconds. This can be quite noticeable and I'm curious how we can debug this a bit further.
Update 1:
The project and dependencies are about 1700 Go files at this point.
Could this possibly be due to #26550 (thus a dup of #30497 or #27415) ?
(Note that change occurred for 10.12, so it can't explain the OP's problem, but it may explain what some other posters are seeing.)
The speed difference was already noticeable with go1.13 since I've started playing with a Fedora laptop a while ago. Not sure about go1.12.
@randall77 you mean that i/o syscalls are slower on Darwin than on Linux?
@magiconair, yes, syscalls will be somewhat slower because they go through libc now. But I was particularly thinking of fsync which is much more expensive on Darwin. The regular fsync is kinda broken, so we have to use ioctl with F_FULLFSYNC.
BTW, that #26650, not #26550, sorry.
Are we sure that what we are seeing here is not just Docker for Mac virtualization doing some very aggressive in-memory caching?
We are not just comparing Darwin to Linux; we are comparing Darwin to Hyperkit+Linux.
In the numbers shared by @myitcv in the issue description, native macOS is actually faster on the first cold run:
+ go list -export -deps -json -e .
real 0m1.524s
user 0m7.224s
sys 0m2.391s
versus this for Docker:
+ go list -export -deps -json -e .
real 0m4.837s
user 0m12.070s
sys 0m5.030s
Update: I assumed everybody was using Docker for Mac, but maybe you use another solution to run Docker on your Mac?
@randall77 is there some indication that this situation will improve in the future? Is Apple addressing the fsync issue? I'm now at 15sec build times on Darwin vs 4.5 sec in a Fedora VM running on the same super-beefy hardware.
@randall77 is there some indication that this situation will improve in the future?
I don't know of anyone working on it, or anyone who has plans in this area. So, no.
Is Apple addressing the fsync issue?
I have no idea.
I'm now at 15sec build times on Darwin vs 4.5 sec in a Fedora VM running on the same super-beefy hardware.
I think if we want to make progress here we need to understand why it is so much slower. For build times, I don't think a slow fsync would be the cause. Building Go code is mostly reading; it's probably stat or read call. But that's a guess; data would help.
Building Go code is mostly reading; it's probably stat or read call. But that's a guess; data would help.
Do you have a quick tip on how to collect this data? Otherwise, I'll try and ask the internet.
Run with strace?
Run under a profiler?
Both might be kinda tricky, as the build system runs multiple processes.
Most helpful comment
I don't have an answer to why macOS 10.12 and 10.14 performance would be different, but I think I know why Docker and macOS are different.
As far as I understand, Docker on macOS actually runs a Linux kernel and userspace under a hypervisor. That means overhead for I/O and system calls is more similar to native Linux overhead than it is to macOS overhead. "go list" does a lot of
statandreadsystem calls on small files, so the system call overhead matters a lot.Here are the results from a quick benchmark I ran. The
ReadandWritetests are reading and writing slices of various sizes to a file (the file is almost certainly in the kernel file cache, so disk speed doesn't matter). TheImporttest is measuring the time to callgo/build.Context.Importon"fmt".macOS native:
macOS Docker
Notably, the
StatandRead/1Ktests are much, much slower on native macOS. Windows is quite a bit slower than this.