Go: cmd/compile: user symbols can be in reserved namespace

Created on 9 Mar 2020  ยท  28Comments  ยท  Source: golang/go

What version of Go are you using (go version)?

$ go version
go version go1.14 darwin/amd64

What did you do?

package main

import "debug/gosym"

func main()  {
    s1 := gosym.Sym{Name: "go.uber.org/zap/zapcore.(*CheckedEntry).Write"}

    println(s1.PackageName())
}

What did you expect to see?

go.uber.org/zap/zapcore

What did you see instead?

empty string

I see problem is right here but since I'm not aware how linker uses these prefixes I can't figure out how to properly fix this condition
https://github.com/golang/go/blob/43c6ada84c6ef47e3b61646d2f2e7f6b7264929d/src/debug/gosym/symtab.go#L40-L44

NeedsFix

Most helpful comment

Thanks. My interpretation of @mdempsky 's comment is that it is safe to use :. It's not going to conflict with struct tags (famous last words when discussing symbol mangling :P), and the fact that we've used : in struct symbol names for a long time is evidence that external linkers don't have an issue with it.

All 28 comments

The debug/gosym package is intended to be used to read object files generated by the compiler. It's not intended to be used by creating your own instances of gosym.Sym.

Can you show us a complete program that demonstrates this problem when opening an object generated by the compiler? Please also include the sources for the object that you want to open. Thanks.

Yeah, sure (go.uber.org/zap is a public available library):

package main

import (
    "debug/gosym"
    "debug/macho"
    "fmt"
    "os"
    "strings"

    "go.uber.org/zap"
)

func main() {
    logger, _ := zap.NewDevelopment()
    logger.Debug("using logger just a little bit")

    executable, _ := os.Executable()
    exe, _ := macho.Open(executable)

    pclndat, _ := exe.Section("__gopclntab").Data()
    symTabRaw, _ := exe.Section("__gosymtab").Data()

    pcln := gosym.NewLineTable(pclndat, exe.Section("__text").Addr)
    symTab, _ := gosym.NewTable(symTabRaw, pcln)
    for _, funk := range symTab.Funcs {
        if strings.Contains(funk.Name, "go.uber.org") {
            fmt.Printf("%s -> %s\n", funk.Name, funk.PackageName())
        }
    }
}

And output (just a few lines):

โฏ go run test.go
go: finding module for package go.uber.org/zap
go: downloading go.uber.org/zap v1.14.0
go: found go.uber.org/zap in go.uber.org/zap v1.14.0
2020-03-09T22:16:31.280+0300    DEBUG   test_package_name/test.go:15    using logger just a little bit
go.uber.org/zap/buffer.(*Buffer).AppendString ->
go.uber.org/zap/buffer.(*Buffer).AppendBool ->
...

Thanks. If user packages can be in the compiler-reserved "go." namespace, that seems like a potential problem.

CC @griesemer @mdempsky @josharian @randall77

Bitten again by our ad hoc symbol mangling rules. :(

I think a short-term workaround could be debug/gosym only checks for "go." and "type." when it can't find a '/'. I don't think the compiler ever generates go.* or type.* symbols that contain a /.

This would still cause problems for top-level user package paths like "go.mystuff" (as opposed to "go.mystuff/dir"). Not sure how to best address that off hand.

Edit: Nevermind, I'm mistaken:

$ nm $(go tool -n compile) | grep 'go\..*/' | head -1
0000000000d9aca0 R go.itab.*cmd/compile/internal/gc.blockHeap,container/heap.Interface

Can we change go. to go.. instead for compiler generated symbols?

Can we change go. to go.. instead for compiler generated symbols?

Seems doable, I make a quick change to go.itab -> go..itab and it's compiled ok and test passed.

Some 2 years ago, there was a somewhat similar issue https://github.com/golang/go/issues/25113 reported and fixed by @aarzilli.
We had clashes with userland variables that collided with compiler generated names such as statictmp_0, and we had to rename from statictmp -> .stmp, from @josharian's suggestion in CL https://go-review.googlesource.com/c/go/+/142497. Seems like a similar mangling scheme could help here, as @cuonglm suggested in https://github.com/golang/go/issues/37762#issuecomment-597585691, thank you!

Change https://golang.org/cl/226282 mentions this issue: cmd/compile,link: use "go.." for compiler genereated types

Change https://golang.org/cl/226283 mentions this issue: debug/gosym: correct checking condition for compiler generated symbols

Howdy @jeremyfaller @randall77 @thanm @cuonglm? Kindly pinging you since @cuonglm's CLs were abandoned in lieu of the new linker landing, which made tests fail, however no confirmation on whether this issue has been addressed, and it is marked for Go1.15. Thank you.

@odeke-em I think it's better to leave it for go1.16 instead. This will make a large change in both compiler and linker.

Awesome, thank you for the advice @cuonglm!

Rather than doubling the period, it would probably be more robust if we included a character that can't appear in user package paths at all. There's a whole bunch of reserved characters at https://golang.org/ref/spec#Import_declarations for us to choose.

For example, I think we can change go. and type. to go: and type:.

One thing that affects the choice or the character is that the symbol names may go to the external linker or dynamic linker in some cases, and some linkers don't like certain characters in symbol names. We could let the Go linker rewrite them, but I'd rather not to.

@cherrymui Do you happen to know/remember any examples of external linkers not liking certain characters?

I remembered Go switching from center dot and division to real dot and slash in linker symbols a long time back, and I thought I remembered it was related to linker (or maybe debugger) issues. But looking now, I can't find any evidence of that. It appears to have been motivated by consistency (a6736fa4ff).

Issue #19529 is one, where the external linker doesn't like the @ character.

@cherrymui Interesting, thanks. As far as I can tell skimming https://www.akkadia.org/drepper/symbol-versioning and looking at strings -a /lib/x86_64-linux-gnu/libc.so.6, the @ isn't actually part of the ELF symbol. Instead it seems just like a UI convention (e.g., recognized by the assembler, and used by readelf for printing, but not nm -D). So it at least seems like @ should be safe. I'm curious what exactly the underlying issue there is.

If : was an issue though, I think we'd know about it because of struct tag conventions; for example:

$ nm `which go` | grep :
00000000007bfae0 T type..eq.struct { Logentry struct { Revision int64 "xml:\"revision,attr\""; Date string "xml:\"date\"" } "xml:\"logentry\"" }
00000000007bfa60 T type..eq.struct { Revision int64 "xml:\"revision,attr\""; Date string "xml:\"date\"" }

Looking at other reserved symbols, it seems like these get used for compiling the runtime, so presumably they're all safe: ()[]{},;*. So we could use one of those instead if we're concerned.

I wonder if it's worth putting special characters in a runtime struct tag or something to help smoke out linker issues.

You won't see an @ in a symbol name in a shared library or an executable, because they store the version information in .gnu.version sections, but you will see them in object files. That is how an object file indicates which version to use for a symbol. See https://www.airs.com/blog/archives/220 .

Ah, I see. I thought the .gnu.version stuff was also used in object files. That makes sense then.

The probably should have used the section in object files too. I don't know why they didn't.

Ping. Should we bump this to 1.17? Changing these prefixes to use a : sounds good to me, and is probably a reasonably easy fix, but doesn't seem worth the risk at this point in the freeze. Does someone want to queue up a wait-release CL for the change (maybe after the Russquake)?

Ping. Should we bump this to 1.17? Changing these prefixes to use a : sounds good to me, and is probably a reasonably easy fix, but doesn't seem worth the risk at this point in the freeze. Does someone want to queue up a wait-release CL for the change (maybe after the Russquake)?

I can make a CL, just need to decide which character to use. As @mdempsky's above comment, : can be problem because of struct tag

Thanks. My interpretation of @mdempsky 's comment is that it is safe to use :. It's not going to conflict with struct tags (famous last words when discussing symbol mangling :P), and the fact that we've used : in struct symbol names for a long time is evidence that external linkers don't have an issue with it.

Thanks. My interpretation of @mdempsky 's comment is that it is safe to use :. It's not going to conflict with struct tags (famous last words when discussing symbol mangling :P), and the fact that we've used : in struct symbol names for a long time is evidence that external linkers don't have an issue with it.

Ah right, sorry, I mis-read the explanation.

Russquake?

Yeah, I think switching to : should be safe.

I'm not sure about any downstream impacts from this though. E.g., if debuggers or any other tools will need updating.

Bumping to 1.17 because this seems like too much potential risk for relatively little reward. But feel free to send a CL and we can land it as soon as the tree opens.

Was this page helpful?
0 / 5 - 0 ratings