What version of Go are you using (go version)?
$ go version
go version go1.14 darwin/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output:
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="/Users/s.rabot/go/bin"
GOCACHE="/Users/s.rabot/Library/Caches/go-build"
GOENV="/Users/s.rabot/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/s.rabot/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/z0/rcywz4nn3jg6g96h67c_4xzdw31lvl/T/go-build757609694=/tmp/go-build -gno-record-gcc-switches -fno-common"
package main

import (
	"testing"
)

func BenchmarkNoCap(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var slice []string
		slice = append(slice, []string{"a", "q", "w"}...)
		slice = append(slice, []string{"z", "s", "x"}...)
	}
}

func BenchmarkCap(b *testing.B) {
	for i := 0; i < b.N; i++ {
		slice := make([]string, 0, 6)
		slice = append(slice, []string{"a", "q", "w"}...)
		slice = append(slice, []string{"z", "s", "x"}...)
	}
}

func BenchmarkCapNoMagicNumber(b *testing.B) {
	for i := 0; i < b.N; i++ {
		base := []string{"a", "q", "w"}
		fixed := []string{"z", "s", "x"}
		slice := make([]string, 0, len(base)+len(fixed))
		slice = append(slice, base...)
		slice = append(slice, fixed...)
	}
}

func BenchmarkCapNoMagicNumberFixedSlices(b *testing.B) {
	for i := 0; i < b.N; i++ {
		base := [...]string{"a", "q", "w"}
		fixed := [...]string{"z", "s", "x"}
		slice := make([]string, 0, len(base)+len(fixed))
		slice = append(slice, base[:]...)
		slice = append(slice, fixed[:]...)
	}
}
I would have expected BenchmarkCapNoMagicNumber and BenchmarkCapNoMagicNumberFixedSlices to have the same throughput.
$ go test -bench . -count=3
goos: darwin
goarch: amd64
BenchmarkNoCap-12 7771320 142 ns/op
BenchmarkNoCap-12 7877163 142 ns/op
BenchmarkNoCap-12 7857156 143 ns/op
BenchmarkCap-12 59972187 20.2 ns/op
BenchmarkCap-12 58082768 23.4 ns/op
BenchmarkCap-12 44653009 25.6 ns/op
BenchmarkCapNoMagicNumber-12 15412473 74.8 ns/op
BenchmarkCapNoMagicNumber-12 14540749 75.1 ns/op
BenchmarkCapNoMagicNumber-12 15240964 74.7 ns/op
BenchmarkCapNoMagicNumberFixedSlices-12 64831053 21.0 ns/op
BenchmarkCapNoMagicNumberFixedSlices-12 64266398 21.4 ns/op
BenchmarkCapNoMagicNumberFixedSlices-12 60569865 22.7 ns/op
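For what it's worth (not part of the original benchmarks), the allocation difference can be confirmed directly by calling b.ReportAllocs() in a benchmark, or by running go test -bench . -benchmem. A minimal sketch reusing the slice-literal variant; the benchmark name here is made up:

func BenchmarkCapNoMagicNumberAllocs(b *testing.B) {
	b.ReportAllocs() // reports B/op and allocs/op for this benchmark
	for i := 0; i < b.N; i++ {
		base := []string{"a", "q", "w"}
		fixed := []string{"z", "s", "x"}
		slice := make([]string, 0, len(base)+len(fixed))
		slice = append(slice, base...)
		slice = append(slice, fixed...)
	}
}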
@randall77, I think you've got the title backwards. It looks like the sliced array is outperforming the slice-literal version for some reason.
Sorry, yes.
The slow routine allocates because, for slices, len(base)+len(fixed) isn't determined at compile time, whereas with arrays the expression is constant.
Or to be more precise, during the escape analysis phase, len(base)+len(fixed) isn't known to be a constant. (Later on, SSA figures it out.)
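To illustrate the constness point (my own example, not from the thread): len applied to an array is a compile-time constant expression, while len applied to a slice is not, and that is all escape analysis gets to see before SSA.

package main

// Legal: len of an array (here [3]string) is a constant expression, value 3.
const fromArray = len([...]string{"a", "q", "w"})

// Does not compile: len of a slice is not a constant expression.
// const fromSlice = len([]string{"a", "q", "w"})

func main() {
	println(fromArray)
}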
So are both already optimized as far as they can be, or is there room for improvement in the slice case?
It's definitely possible for them to be improved. The question, which I can't answer offhand, is whether it is possible to make a small, local improvement that would cover this case, or whether it would require an overhaul. (There are good independent reasons for an overhaul, but what is needed, eliminating one of the compiler's intermediate representations, is a huge undertaking.)
Is this a duplicate of #20533?