I expected to see documentation on how to correctly write benchmarks that avoid compiler optimizations and examples that reflect best practices.
If the techniques described in Dave's blog post are no longer necessary, I expected to see explicit documentation to that effect.
I found neither of those things.
@davecheney,
Is that kind of optimization avoidance still recommended?
If so, do you think that info should be put in https://golang.org/pkg/testing/#hdr-Benchmarks? I don't see an official benchmarking wiki, so it seems useful to give a short explanation in the testing docs.
Couldn't reproduce benchmark code elision in currently-supported Go versions.
@as I believe it can happen now with inlined calls. There is discussion of purity analysis, which might impact non-inlined pure calls later.
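To illustrate the shape that is at risk (a minimal hypothetical sketch, not code from this thread): a trivially inlinable function whose result is discarded can become dead code once it is inlined into the benchmark loop.

package main

import "testing"

// add is trivially inlinable.
func add(a, b int) int { return a + b }

func BenchmarkAdd(b *testing.B) {
	for i := 0; i < b.N; i++ {
		add(i, i) // result discarded; after inlining, this is dead code the compiler may remove
	}
}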
Why has no action been taken here? This does seem documentation-worthy. Does it warrant a doc update?
@gbbr I think adding some documentation around this would be fine. No one has gotten to it yet. Want to send a CL?
Based on the conversation in this thread it is still unclear to me what the documentation should say. Does Dave's 2013 blog post reflect today's best practices?
I am not able to come up with an example that illustrates the problem, and I'm not sure this really _is_ a problem today. The example here is wrong, as is #14813, because it uses the loop's index as an argument to the function call. The other example in Dave's post here also does not show any noticeable difference between the two approaches.
Here's an example that demonstrates the problem. You have to be careful to ensure that the function is inlined, but also that the result cannot be computed at compile time.
package main

import (
	"runtime"
	"testing"
	"time"
)

// The result of f is discarded, so once f is inlined the loop body
// is dead code and may be eliminated.
func BenchmarkX(b *testing.B) {
	for i := 0; i < b.N; i++ {
		f()
	}
}

var sink int

// Store the result in an unexported package-level sink.
func BenchmarkXSink(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = f()
	}
}

var Sink int

// Store the result in an exported package-level sink.
func BenchmarkXSinkExported(b *testing.B) {
	for i := 0; i < b.N; i++ {
		Sink = f()
	}
}

// Mark the result as used with runtime.KeepAlive.
func BenchmarkXKeepAlive(b *testing.B) {
	for i := 0; i < b.N; i++ {
		runtime.KeepAlive(f())
	}
}

var input float64

// Set input at run time so the compiler cannot compute
// f's result at compile time.
func init() {
	if time.Now().Year() > 1900 {
		input = 123412341234123
	}
}

func f() int {
	x := input
	x /= 7.3
	x /= 7.3
	x /= 7.3
	x /= 7.3
	return int(x)
}
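To confirm the inlining precondition, gc's -m diagnostic is useful (a real compiler flag, though the exact message format varies by version):

$ go test -gcflags=-m bench_test.go

Among the printed diagnostics, look for a line reporting that the compiler "can inline f"; if f is not inlined, the call in BenchmarkX will not be eliminated.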
Here's what I get using gc (tip) and gccgo (8.3.0). Note that gc eliminates the call entirely in BenchmarkX (sub-nanosecond per iteration) while gccgo does not, and that under gccgo runtime.KeepAlive itself adds measurable overhead:
$ go test -bench . bench_test.go
goos: linux
goarch: amd64
BenchmarkX-4 1000000000 0.281 ns/op
BenchmarkXSink-4 43315279 24.5 ns/op
BenchmarkXSinkExported-4 49105191 24.5 ns/op
BenchmarkXKeepAlive-4 48882840 24.3 ns/op
PASS
ok command-line-arguments 5.823s
$ go test -compiler gccgo -bench . bench_test.go
goos: linux
goarch: amd64
BenchmarkX-4 50000000 24.4 ns/op
BenchmarkXSink-4 50000000 24.4 ns/op
BenchmarkXSinkExported-4 50000000 24.5 ns/op
BenchmarkXKeepAlive-4 20000000 68.6 ns/op
PASS
ok command-line-arguments 5.286s
I started one of the other discussions about this a few years ago on golang-dev. I think the situation is still quite unfortunate. To the best of my knowledge, the situation is:

1. Elimination of benchmarked code can already happen today when the call is inlined and its result is unused.
2. No one can say precisely which optimizations future compilers will perform, so there is no guarantee that any given defensive pattern will keep working.
Given (1) and (2), I think a lot of the current "sink" approaches are not good since they make the code uglier and they are hard to justify (they protect the benchmark against some, but not all, hypothetical future optimizations).
More recently some people have suggested that using runtime.KeepAlive to mark values that you want to always be computed in the benchmark is the best approach (such as @randall77's comment here). That seems better, but is still not completely ideal in two ways:

1. runtime.KeepAlive is documented in terms of finalizers, not benchmarks, so its use here is non-obvious.
2. It still requires the benchmark author to know about the problem and remember to apply the workaround.
Some comments in these discussions have seemed to suggest that writing microbenchmarks is an advanced task for experienced programmers which always requires some knowledge about the compiler and system underneath and therefore there's nothing to be done here. I don't really buy that, though: I think that even beginners can reasonably want to write benchmarks which measure how long their function f takes to run and compare different approaches to writing f, and we should make it as easy as possible to avoid any future pitfalls where the obvious benchmark code becomes broken.
I advocate that we decide on a single best approach here (which seems to be using runtime.KeepAlive or else a new helper in the testing package), document it thoroughly, and adopt it as widely as possible.
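For concreteness, one hypothetical shape such a testing-package helper could take (the name Keep and this signature are illustrative assumptions, not a real or proposed API):

// Hypothetical sketch only: testing.Keep does not exist.
// Keep marks v as used, so the computation that produced it
// cannot be eliminated as dead code. Marked noinline so the
// call stays opaque to the optimizer.
//
//go:noinline
func Keep(v interface{}) {}

A benchmark would then read naturally as:

func BenchmarkX(b *testing.B) {
	for i := 0; i < b.N; i++ {
		testing.Keep(f())
	}
}

A naive version like this could add overhead of its own (interface boxing of v), so a real design would need care on that point.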
I just ran into this in some mostly-for-fun unsafe code: the optimized version of the code is a non-allocating pure function, so it got optimized away in the benchmark.
I ended up working around it using runtime.KeepAlive, but ideally the compiler should be smart enough to know not to dead-code-eliminate (DCE) calls within a Benchmark function itself.
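A minimal sketch of that workaround (decode and buf are illustrative stand-ins, not the actual code): keep the pure function's result live so the call cannot be eliminated.

package main

import (
	"encoding/binary"
	"runtime"
	"testing"
)

var buf = make([]byte, 8)

// decode stands in for the pure, non-allocating function described above.
func decode(p []byte) uint64 {
	return binary.LittleEndian.Uint64(p)
}

func BenchmarkDecode(b *testing.B) {
	for i := 0; i < b.N; i++ {
		// KeepAlive forces the result to be treated as used,
		// so the call to decode is not dead-code-eliminated.
		runtime.KeepAlive(decode(buf))
	}
}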
I don't think the compiler should apply special treatment to benchmarks. That will in the long run mean that benchmarks aren't doing what we think they are doing. Which is already true, but it will be even worse.