$ go version
go version go1.15.2 linux/amd64
$ uname -a
Linux gauss 5.7.0-2-amd64 #1 SMP Debian 5.7.10-1 (2020-07-26) x86_64 GNU/Linux
I have a program that shall not heap-allocate in its main loop, with a test checking that. When I re-compiled the program with Go1.15, the test started failing. This is the offending line according to mem profile data:
os.Stdout.Write([]byte("\033[2J\033[1;1H"))
Ignore the string literal cast to []byte
, it's a red herring. Here's a reproducer:
package p
import "os"
func f() {
os.Stdout.Write([]byte{'h', 'e', 'l', 'l', 'o'})
}
Go1.14:
$ go tool compile -m test.go
test.go:5:6: can inline f
test.go:6:24: []byte literal does not escape
Go1.15:
$ go tool compile -m test.go
test.go:5:6: can inline f
test.go:6:24: []byte literal escapes to heap
git bisect
points to 8c1db77a92b1d17d3fe07999c5f20602a2080be9, but I don't know if that makes sense.
cc @dr2chase @randall77 @ianlancetaylor
I think that commit is indeed the cause. Looking at os/file.go, these two:
112 func (f *File) Read(b []byte) (n int, err error) {
169 func (f *File) Write(b []byte) (n int, err error) {
md5-47c3926bfe7f6a6d3c56c60b52238e52
os/file.go:112:21: leaking param: b
os/file.go:169:22: leaking param: b
md5-3ddb9c3cb2cb30563cd6e7ab015baa97
os/file.go:112:21: b does not escape
os/file.go:169:22: b does not escape
Probably we need to mark the ignoringEINTR
function as //go:noescape
.
(But that's a bit unfortunate, because that function is actually _parametrically_ //go:noescape
.)
Well, that's annoying.
//go:noescape
won't help because it only applies to function declarations, not function definitions.
I'll let the compiler team decide whether they want to improve the escape analysis or whether we should replace ignoringEINTR
and ignoringEINTRIO
with explicit loops.
CC @dr2chase
The escape analysis may be failing in this case because the functions are not being inlined.
The functions cannot be inlined because they contain loops (that's #14768).
Clearly we should replace the loops with goto
statements.
I guess by using ignoringEINTR
it changed direct calls to function pointer calls, which makes the escape analysis conservative. Inlining might help, or not (I'm not sure the compiler constant-folds function pointers).
Looks like we don't get around to that constant-fold:
v166 00053 (574) MOVQ syscall.Write路f(SB), AX
v77 00054 (574) LEAQ syscall.Write路f(SB), DX
v167 00055 (574) PCDATA $1, $3
v167 00056 (574) CALL AX
This is sad.
Looks like we do have a few opportunities to do that.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
# runtime
src/runtime/mpagealloc_64bit.go:148:55: closure-to-static 2 "".(*pageAlloc).sysGrow.func1路f
src/runtime/print.go:286:6: closure-to-static 1 "".hexdumpWords.func1路f
src/runtime/print.go:298:5: closure-to-static 1 "".hexdumpWords.func1路f
src/runtime/proc.go:2574:12: closure-to-static 1 "".injectglist.func1路f
src/runtime/proc.go:2589:12: closure-to-static 1 "".injectglist.func1路f
# internal/poll
src/internal/poll/fd_unix.go:155:28: closure-to-static 2 syscall.Read路f
src/internal/poll/fd_unix.go:267:28: closure-to-static 2 syscall.Write路f
src/internal/poll/fd_unix.go:426:28: closure-to-static 2 syscall.ReadDirent路f
src/internal/poll/fd_unix.go:530:24: closure-to-static 2 syscall.Write路f
# reflect
src/reflect/deepequal.go:45:9: closure-to-static 2 "".deepValueEqual.func1路f
# debug/macho
src/debug/macho/file.go:632:24: closure-to-static 1 "".(*File).DWARF.func2路f
src/debug/macho/file.go:651:24: closure-to-static 1 "".(*File).DWARF.func2路f
# debug/pe
src/debug/pe/file.go:258:24: closure-to-static 1 "".(*File).DWARF.func2路f
src/debug/pe/file.go:277:24: closure-to-static 1 "".(*File).DWARF.func2路f
# cmd/internal/obj/mips
src/cmd/internal/obj/mips/a.out.go:252:3: closure-to-static 3 "".init.0.func1路f
src/cmd/internal/obj/mips/a.out.go:253:3: closure-to-static 3 "".init.0.func1路f
src/cmd/internal/obj/mips/a.out.go:257:3: closure-to-static 3 "".init.0.func1路f
# internal/xcoff
src/internal/xcoff/ar.go:133:31: closure-to-static 1 "".NewArchive.func1路f
src/internal/xcoff/ar.go:143:35: closure-to-static 1 "".NewArchive.func1路f
src/internal/xcoff/ar.go:166:33: closure-to-static 1 "".NewArchive.func1路f
src/internal/xcoff/ar.go:173:35: closure-to-static 1 "".NewArchive.func1路f
src/internal/xcoff/ar.go:206:31: closure-to-static 1 "".NewArchive.func1路f
# cmd/internal/obj/ppc64
src/cmd/internal/obj/ppc64/a.out.go:269:3: closure-to-static 3 "".init.0.func1路f
src/cmd/internal/obj/ppc64/a.out.go:270:3: closure-to-static 3 "".init.0.func1路f
src/cmd/internal/obj/ppc64/a.out.go:271:3: closure-to-static 3 "".init.0.func1路f
src/cmd/internal/obj/ppc64/a.out.go:272:3: closure-to-static 3 "".init.0.func1路f
src/cmd/internal/obj/ppc64/a.out.go:274:3: closure-to-static 3 "".init.0.func1路f
src/cmd/internal/obj/ppc64/a.out.go:275:3: closure-to-static 3 "".init.0.func1路f
# cmd/cgo
src/cmd/cgo/gcc.go:1707:29: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1720:32: closure-to-static 1 "".(*Package).gccDebug.func4路f
src/cmd/cgo/gcc.go:1786:29: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1799:32: closure-to-static 1 "".(*Package).gccDebug.func4路f
src/cmd/cgo/gcc.go:1860:28: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1872:31: closure-to-static 1 "".(*Package).gccDebug.func4路f
src/cmd/cgo/gcc.go:1932:28: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1944:31: closure-to-static 1 "".(*Package).gccDebug.func4路f
# cmd/compile/internal/ssa
src/cmd/compile/internal/ssa/branchelim.go:206:29: closure-to-static 1 "".elimIf.func1路f
src/cmd/compile/internal/ssa/branchelim.go:213:27: closure-to-static 1 "".elimIf.func1路f
src/cmd/compile/internal/ssa/branchelim.go:235:17: closure-to-static 1 "".elimIf.func2路f
src/cmd/compile/internal/ssa/branchelim.go:241:26: closure-to-static 1 "".elimIf.func2路f
src/cmd/compile/internal/ssa/rewrite.go:845:27: closure-to-static 1 "".disjoint.func1路f
src/cmd/compile/internal/ssa/rewrite.go:846:27: closure-to-static 1 "".disjoint.func1路f
# cmd/compile/internal/gc
src/cmd/compile/internal/gc/plive.go:1509:16: closure-to-static 1 "".(*Liveness).emit.func1路f
src/cmd/compile/internal/gc/plive.go:1509:38: closure-to-static 1 "".(*Liveness).emit.func1路f
src/cmd/compile/internal/gc/ssa.go:3504:16: closure-to-static 3 "".init.1.func28路f
src/cmd/compile/internal/gc/ssa.go:3507:16: closure-to-static 3 "".init.1.func28路f
src/cmd/compile/internal/gc/ssa.go:3706:17: closure-to-static 1 "".init.1.func45路f
src/cmd/compile/internal/gc/ssa.go:3709:17: closure-to-static 1 "".init.1.func45路f
src/cmd/compile/internal/gc/ssa.go:3712:17: closure-to-static 1 "".init.1.func45路f
src/cmd/compile/internal/gc/ssa.go:3715:17: closure-to-static 1 "".init.1.func45路f
src/cmd/compile/internal/gc/ssa.go:3916:21: closure-to-static 2 "".init.1.func71路f
src/cmd/compile/internal/gc/ssa.go:3924:21: closure-to-static 2 "".init.1.func71路f
src/cmd/compile/internal/gc/ssa.go:3932:21: closure-to-static 2 "".init.1.func71路f
src/cmd/compile/internal/gc/ssa.go:3945:21: closure-to-static 2 "".init.1.func71路f
Building packages and commands for darwin/amd64.
# go/build
src/go/build/build.go:560:64: closure-to-static 1 "".(*Context).Import.func2路f
src/go/build/build.go:571:67: closure-to-static 1 "".(*Context).Import.func2路f
# go/types
src/go/types/builtins.go:246:12: closure-to-static 1 "".(*Checker).builtin.func4路f
src/go/types/builtins.go:247:12: closure-to-static 1 "".(*Checker).builtin.func4路f
# cmd/cgo
src/cmd/cgo/gcc.go:1707:29: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1720:32: closure-to-static 1 "".(*Package).gccDebug.func4路f
src/cmd/cgo/gcc.go:1786:29: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1799:32: closure-to-static 1 "".(*Package).gccDebug.func4路f
src/cmd/cgo/gcc.go:1860:28: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1872:31: closure-to-static 1 "".(*Package).gccDebug.func4路f
src/cmd/cgo/gcc.go:1932:28: closure-to-static 1 "".(*Package).gccDebug.func3路f
src/cmd/cgo/gcc.go:1944:31: closure-to-static 1 "".(*Package).gccDebug.func4路f
# cmd/vendor/github.com/google/pprof/internal/elfexec
src/cmd/vendor/github.com/google/pprof/internal/elfexec/elfexec.go:149:27: closure-to-static 1 "".GetBuildID.func1路f
src/cmd/vendor/github.com/google/pprof/internal/elfexec/elfexec.go:161:27: closure-to-static 1 "".GetBuildID.func1路f
# cmd/vendor/github.com/ianlancetaylor/demangle
src/cmd/vendor/github.com/ianlancetaylor/demangle/demangle.go:560:38: closure-to-static 1 "".(*state).prefix.func1路f
src/cmd/vendor/github.com/ianlancetaylor/demangle/demangle.go:572:38: closure-to-static 1 "".(*state).prefix.func1路f
# cmd/vendor/golang.org/x/tools/go/types/typeutil
src/cmd/vendor/golang.org/x/tools/go/types/typeutil/ui.go:34:48: closure-to-static 1 "".IntuitiveMethodSet.func1路f
# crypto/x509
src/crypto/x509/root_darwin_amd64.go:201:19: closure-to-static 1 "".sslTrustSettingsResult.func1路f
# cmd/vendor/github.com/google/pprof/internal/driver
src/cmd/vendor/github.com/google/pprof/internal/driver/commands.go:262:38: closure-to-static 2 "".usage.func1路f
src/cmd/vendor/github.com/google/pprof/internal/driver/commands.go:271:38: closure-to-static 2 "".usage.func1路f
src/cmd/vendor/github.com/google/pprof/internal/driver/commands.go:272:38: closure-to-static 2 "".usage.func1路f
src/cmd/vendor/github.com/google/pprof/internal/driver/commands.go:287:40: closure-to-static 2 "".usage.func1路f
src/cmd/vendor/github.com/google/pprof/internal/driver/commands.go:297:24: closure-to-static 2 "".usage.func1路f
src/cmd/vendor/github.com/google/pprof/internal/driver/commands.go:299:30: closure-to-static 2 "".usage.func1路f
# cmd/go/internal/modload
src/cmd/go/internal/modload/load.go:1164:15: closure-to-static 1 "".scanDir.func1路f
src/cmd/go/internal/modload/load.go:1164:33: closure-to-static 1 "".scanDir.func1路f
# cmd/go/internal/work
src/cmd/go/internal/work/exec.go:604:29: closure-to-static 3 "".(*Builder).build.func3路f
Change https://golang.org/cl/256077 mentions this issue: cmd/compile: add rewrite rule to change callClosure(constant) to callStatic
Changing inlining to allow for
loops seems reasonable to me. If we already allow inlining if
and goto
(as Ian points out), then it seems silly to continue disallowing for
loops.
Inlining for
loops tickles some bug in the front-end, I forget exactly what but it shows up in all.bash.
We should find and fix that bug. Inlining goto
but not for
creates a bit of a perverse incentive...
I think I found the issue.
Okay, I found a sequence of optimizations that makes the original code no longer heap escape:
Make for
loops inlineable. I think the frontend issue that @dr2chase was referring to is that named break/continue statements aren't correctly rewritten by inlining. For the purpose of this issue though, it suffices to optimize for
loops without named break/continues.
Change inlining logic so that parameter assignments match the desired pattern for the frontend's simple single-assignment logic. This is necessary because otherwise:
ignoringEINTR(syscall.Write, fd, p)
gets inlined to
var inl_fn, inl_fd, inl_p ...
inl_fn, inl_fd, inl_p = syscall.Write, fd, p
for {
n, err := inl_fn(inl_fd, inl_p)
...
}
and we still pessimistically assume inl_fn
could be a call to anything rather than recognizing it's only ever syscall.Write
.
Extend inlining to consider optimizing f := someFunc; f(...)
calls just like it already handles f := func(...) { ...}; f(...)
calls.
Extend escape analysis to use the same logic as inlining for identifying static calls through local variables.
It's worth noting that the last 3 steps are only necessary because ignoringEINTR accepts a function and the arguments to pass to it. If it's changed so that call sites look like:
ignoringEINTR(func() (int, error) { return syscall.Write(fd, p) })
then optimization 1 alone is sufficient. This is because it's possible for escape analysis to tell that p
is only passed to syscall.Write
without depending on inlining.
t's worth noting that the last 3 steps are only necessary because ignoringEINTR accepts a function and the arguments to pass to it.
On tip this is ignoringEINTRIO
and ignoringEINTR
takes a simple closure. It would be fine with me to replace ignoringEINTRIO
with ignoringEINTR
on tip and to make the equivalent transformation in 1.15.
Oh, I missed that this is a regression with Go 1.15, not tip, so fixing it requires backporting changes. I'd rather not have to backport any new inlining / escape analysis optimizations.
Also, correction: if we change the function to take a simple closure, then I think none of the optimizations are strictly necessary. I think that change would be best/safest to backport to Go 1.15.
I'll still upload CLs for the optimizations though, because they look like they'll help optimize a fair bit of cmd code.
@gopherbot Please open a backport issue for 1.15.
This is an unexpected and unavoidable performance regression in 1.15: using Write
to write a []byte
forces the []byte
to escape into the heap. That did not happen in 1.14.
Backport issue(s) opened: #41542 (for 1.14), #41543 (for 1.15).
Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
Change https://golang.org/cl/256418 mentions this issue: [release-branch.go1.15] internal/poll: adjust ignoringEINTR to avoid slice escape
Change https://golang.org/cl/256460 mentions this issue: cmd/compile: use staticValue for inlining logic
Change https://golang.org/cl/256458 mentions this issue: cmd/compile: improve escape analysis of known calls
Change https://golang.org/cl/256457 mentions this issue: cmd/compile: set n.Name.Defn for inlined parameters
Change https://golang.org/cl/256459 mentions this issue: cmd/compile: allow inlining of "for" loops
Given that we've backported a fix for this issue to Go 1.15, I believe we should not release Go 1.16 without fixing this (or re-visiting this in some way), so marking as a release-blocker.
The plan here is to try to fix this in the compiler, but if we get to the freeze without that landing, we'll just forward-port the 1.15 fix.
Change https://golang.org/cl/262678 mentions this issue: test: add regression test from #41474
I've landed all the individual compiler optimizations to fix this, but it's proving tedious to write a test that shows the original test case involving os.Stdout.Write has actually been fixed. E.g., js/wasm and apparently Windows also still escape, but at least js/wasm also escaped in Go 1.15. I haven't looked into Windows.
It should be a straightforward task for someone to take CL 262678 and amend it. Probably it just needs to check runtime.GOOS == "windows"
.
os.Stdout.Write
used to escape memory on js/wasm and Windows, and everywhere with the race detector enabled.
Change https://golang.org/cl/263097 mentions this issue: test: add regression test from #41474
Thanks @ALTree for helping with the test case!
Change https://golang.org/cl/263537 mentions this issue: Revert "test: add regression test from #41474"
Most helpful comment
Clearly we should replace the loops with
goto
statements.