CL 39471 moves a large cache from the heap to a global. (A single Ctxt object is allocated at the beginning of compilation, so there's exactly one of these Prog caches, both before and after the CL.) The CL is just a simplified demo, but it mimics something real I want to do.
The CL causes GC to use lots more CPU when compiling package archive/tar. (It affects other packages as well, but archive/tar shows it most prominently.)
name  old time/op  new time/op  delta
Tar   119ms ± 5%   120ms ± 3%   ~        (p=0.061 n=45+46)

name  old user-ns/op  new user-ns/op  delta
Tar   138M ± 5%       158M ± 7%       +14.30%  (p=0.000 n=44+48)
Note that the real time is about the same--the compiler itself is doing the same amount of work--but the CPU consumed goes up considerably. I'd expect it to be unchanged.
@aclements @RLH
Maybe our global scanning code is suboptimal somehow?
Found an (utterly obvious in retrospect) workaround. Instead of
var cache [10000]obj.Prog
do
var cache *[10000]obj.Prog
func init() {
	cache = new([10000]obj.Prog)
}
With that change, performance returns to previous levels. We can revert that workaround once this issue is fixed.
> Maybe our global scanning code is suboptimal somehow?
Or because globals are roots, they are always (expensively) rescanned? I'll leave this for people who know the GC.
> Or because globals are roots, they are always (expensively) rescanned?
Scanning globals is probably slightly more efficient than scanning the heap (the process is basically identical, but globals use a 1 bit bitmap instead of a 2 bit bitmap). But I don't think that's what's going on here.
GC is triggered by the size of the heap, so if you move a large block of memory from the heap to a global, GC is going to be triggered more often, but it still has just as much work to do, so you spend more aggregate time in GC.
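To put rough numbers on that explanation, here is a back-of-the-envelope model in Go. It is a sketch under simplifying assumptions, not the runtime's pacer: the live heap is taken to be constant, a cycle is assumed to start after every `liveHeap*GOGC/100` bytes of allocation, and each cycle is assumed to scan the live heap plus the globals. The helpers `gcCycles` and `scanWork` and all sizes are hypothetical.

```go
package main

import "fmt"

const mib = 1 << 20

// gcCycles estimates how many collections run while totalAlloc bytes are
// allocated, assuming a new cycle starts every liveHeap*gogc/100 bytes.
func gcCycles(totalAlloc, liveHeap, gogc uint64) uint64 {
	headroom := liveHeap * gogc / 100
	if headroom == 0 {
		return 0 // the model does not cover an empty heap or GOGC=0
	}
	return totalAlloc / headroom
}

// scanWork estimates the total bytes scanned: every cycle scans the live
// heap and the globals, whether or not the globals count toward the trigger.
func scanWork(cycles, liveHeap, globals uint64) uint64 {
	return cycles * (liveHeap + globals)
}

func main() {
	const (
		totalAlloc = 256 * mib // made-up allocation volume for one compile
		otherLive  = 8 * mib   // made-up live heap excluding the cache
		cache      = 4 * mib   // made-up cache size
		gogc       = 100
	)

	onHeap := gcCycles(totalAlloc, otherLive+cache, gogc)
	asGlobal := gcCycles(totalAlloc, otherLive, gogc)

	fmt.Println("cache on heap:  ", onHeap, "cycles,",
		scanWork(onHeap, otherLive+cache, 0)/mib, "MiB scanned")
	fmt.Println("cache as global:", asGlobal, "cycles,",
		scanWork(asGlobal, otherLive, cache)/mib, "MiB scanned")
}
```

With these made-up sizes the model gives 21 cycles with the cache on the heap and 32 with the cache as a global, while each cycle scans the same 12 MiB either way.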
Add heap ballast to the compiler? :P
Sigh. This is a pretty silly situation.
Silliness aside, this also seems like the sort of thing that encourages abuse. Should globals perhaps be included in the trigger calculation?
> Should globals perhaps be included in the trigger calculation?
That's not a bad idea. In fact, part of the point of GOGC is to amortize the cost of scanning and if we don't count all of the scannable stuff, we're not getting the full amortization. Some stuff is just hard to count (e.g., stacks), but scannable globals would be easy.
@RLH, thoughts?
Counting bss, data, and stacks as part of GOGC is probably dictated by the doctrine of least surprise. I am concerned about a situation where a program has tens of gigabytes in a scannable global while using the heap for transient data.
A partial solution would be to only count the globals that need scanning and round up stacks to span granularity.
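The trade-off in this exchange can be sketched with a little arithmetic. The function below is one possible way to fold scannable globals into the trigger, written purely for illustration; `heapGoal`, its formula, and all sizes are assumptions, not what the runtime does as of this thread.

```go
package main

import "fmt"

const mib = 1 << 20

// heapGoal returns the heap size at which the next cycle would start under
// a hypothetical accounting where scannable globals count toward GOGC.
// Passing scannableGlobals = 0 models ignoring globals entirely.
func heapGoal(liveHeap, scannableGlobals, gogc uint64) uint64 {
	return liveHeap + (liveHeap+scannableGlobals)*gogc/100
}

func main() {
	// Cache moved to a global, globals ignored: 8 MiB of headroom.
	fmt.Println(heapGoal(8*mib, 0, 100)/mib, "MiB")

	// Same program with the 4 MiB global counted: 12 MiB of headroom,
	// matching what the cache earned while it lived on the heap.
	fmt.Println(heapGoal(8*mib, 4*mib, 100)/mib, "MiB")

	// The worrying case: a 20 GiB scannable global and a small transient
	// heap. The heap may grow by roughly the size of the global before
	// the next cycle starts.
	fmt.Println(heapGoal(64*mib, 20*1024*mib, 100)/mib, "MiB")
}
```

The last case is the one to think hard about: counting the global restores the amortization, but it also lets a small transient heap balloon to about the size of the global between collections.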
CL https://golang.org/cl/39713 mentions this issue.