Go: cmd/compile: anonymous structs consume space in binary

Created on 7 Apr 2020 · 8 Comments · Source: golang/go

Anonymous structs can take up a significant amount of space in the compiled binary. Because an anonymous struct has no proper name, the full struct description is used as its name.
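To make the naming behavior concrete, here is a minimal sketch that prints the type string Go reports for an anonymous struct next to an equivalent named type. The `named` type and the `typeString` helper are hypothetical and exist only for this illustration; the runtime struct from the issue is far larger, so its description is correspondingly longer.

```go
package main

import (
	"fmt"
	"reflect"
)

// named is a hypothetical named type with the same fields as the
// anonymous struct used in main.
type named struct {
	A int
	B string
}

// typeString returns the string form of v's dynamic type.
func typeString(v interface{}) string {
	return reflect.TypeOf(v).String()
}

func main() {
	anon := struct {
		A int
		B string
	}{}

	// The anonymous struct is identified by its full field list.
	fmt.Println(typeString(anon))
	// The named type is identified by its package-qualified name.
	fmt.Println(typeString(named{}))
}
```

The first line printed is the whole field-by-field description, while the second is just the short qualified name, which is why very large anonymous structs can lead to very long names.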

As a quick experiment, replacing some anonymous structs in the Go compiler with named types shrank the compiler binary by 14KB and gofmt by 10KB.

As an extreme example, changing the anonymous struct at https://github.com/golang/go/blob/553a8626ba04981d362ee5937583d2592b305eae/src/runtime/mgc.go#L945 to a named type shrank the compiler binary by ~5KB.

diff --git a/src/runtime/mgc.go b/src/runtime/mgc.go
index 7a8ab5314f..94693e5e87 100644
--- a/src/runtime/mgc.go
+++ b/src/runtime/mgc.go
@@ -945 +945,4 @@ const gcOverAssistWork = 64 << 10
-var work struct {
+var work _work
+
+//go:notinheap
+type _work struct {

Related issues #6853, #36313

### Go Version

$ go version
go version devel +74d6de03fd Mon Apr 6 18:06:41 2020 +0000 windows/amd64
NeedsInvestigation binary-size

All 8 comments

@aclements

This could maybe help with smaller binaries.

/cc @josharian @randall77

Instead of adding all the fields as part of the name, could we take the variable's package path and the variable's name and add a prefix or suffix that can only be generated by the compiler?

This is probably interesting to @bradfitz too, particularly the runtime example saving 5KiB.

@martisch just to clarify, I wasn't proposing fixing all these manually :D. It's something that the compiler/linker could do. And yeah, my thoughts were along the same lines: A) use the variable name instead of the string description, or B) generate the name dynamically when needed and store some hash in the binary.

It turns out we just significantly improved this! Commit 44d22869a8df6419f894317b10c9f8329706467a drops type descriptors that aren't used, which means these long symbol names and many other things no longer appear in the binary. This shrinks cmd/compile by 140 KiB.

There's still more that could be done. If you put a large anonymous type in an interface, the linker will still pull in the type descriptor and all of these large symbol names. On our list of things to explore in the linker (not for 1.15) is to use structural hashes to name and dedup type symbols. The downside, of course, is that the type symbol names in the binary become much less informative.

Moving to unplanned since we fixed a lot of this problem for 1.15, and exploring further improvements is on the list for the linker, but involves trade-offs that need to be explored.

I can confirm that on tip the _work example reduces compiler size by ~500 bytes.
