It would be good if go was able to generate bit-for-bit identical binaries, even if the build environment changes in unimportant ways. (For example, so users can reproduce binaries without requiring root.)
Up until recently, we have been able to reproduce go binaries whilst modifying many parts of the build environment, except that we kept the build-path constant. However recently, we started to also vary the build-path. (Why "recently", is irrelevant to this report, but I can go into it if you ask.)
Anyway, now we can see that .gopclntab
embeds the build path into the resulting binary. This patch will get rid of this environment-specific information, but I know it's not perfect:
--- src/cmd/link/internal/ld/pcln.go
+++ src/cmd/link/internal/ld/pcln.go
@@ -41,6 +41,8 @@
// iteration over encoded pcdata tables.
+var cwd, _ = os.Getwd()
+
func getvarint(pp *[]byte) uint32 {
v := uint32(0)
p := *pp
@@ -152,7 +154,8 @@
f.Value = int64(ctxt.Nhistfile)
f.Type = obj.SFILEPATH
f.Next = ctxt.Filesyms
- f.Name = expandGoroot(f.Name)
+ //f.Name = expandGoroot(f.Name)
+ f.Name, _ = filepath.Rel(cwd, f.Name)
ctxt.Filesyms = f
}
}
In particular, I'm not sure how this will interfere with readers of this information. I know that src/debug/pclntab.go
carries an API for this, but I'm not sure what sorts of contracts you have published for that, that people expect.
Also, the part where I comment out expandGoroot
is not strictly necessary for reproducibility, but would allow, e.g. a user to try to reproduce a binary from his _own_ go toolchain - which is some extra assurance that the copies behave the same way.
The call to Getwd
during static initialisation is also a bit dirty; I could delay this for later but that may or may not require locks, or pass it in from an parent/ancestor caller but that would require adding extra arguments to some functions.
However if you give me some guidelines on how to make this patch acceptable, I'll be happy to do this work and submit a PR.
When you say that you are varying the build path, what do you mean precisely?
This seems like a restatement of #9206.
Yeah, let's merge conversation in #9206.
@infinity0, instructions for sending a change are at https://golang.org/doc/contribute.html#Code_review which looks like a wall of text, but it's not many actual steps. Be sure to check errors, not have commented-out code, and include tests if possible.
Reopening, since this is slightly different from #9206.
Thanks for the explanation (over on #9206). It's not clear to me that this should be a goal. I agree that reproducible builds are essential. However, it's not clear to me that reproducible builds when the sources are in different directories are essential.
For example, exactly the same thing happens when using GCC with the -g option. The source directory is included in the debug info, in the DW_AT_comp_dir
attribute.
Not including the source directory will make debugging more difficult.
For GCC, we are setting -fdebug-prefix-map=$SRC_ROOT=.
dynamically for most builds, which removes it from DW_AT_comp_dir
. Recently we landed a patch in GCC to make it not embed debug-prefix-map into DW_AT_producer
, so this now works to achieve buildpath-independent reproducibility.
Yes, we're being quite strict with ourselves in trying to make things independent of the build path, but we think it's a good goal that would allow more people in practice to perform build verification.
Ideally, we'd want as many people to rebuild the _same_ hash as possible, so that the rest of the world (who we assume don't want to do these rebuilds) can see that 20 people built the same hash, rather than 5 people built 4 different hashes. It is indeed unclear at the moment what the "best tradeoff" is - we're just scratching the surface of this topic ourselves - but this particular issue seemed fairly easy to me to fix.
(edit: more accurate to say SRC_ROOT
instead of PWD
; we set it once at the start of the build when they happen to be equal)
Yes, "making debugging more difficult" was also my concern. FWIW, our experiments with GCC did not make things harder to debug (this was by chance, we had to try it out to see it). I can appreciate that Go is different, but if you can explain the details I could also try to think of solutions.
I think this is important. Debugging is also very important. From what I understand this is the issue of two users with different goroots set. Renaming of a module/forking it to a new repo name would not be supported. So, is the question how to share debug information? How to relate debug information created in one root to another root? If you are in the same root as the debug info was created it would be nice if we could strip out all the stuff outside the root.
Couldnt the paths stored in the binary be somehow prefixed with $GOPATH
so that when i'm debugging a binary it uses code in my gopath correctly?
The compiler has a -trimpath
flag, which might be close to what you need. When it's set to e.g. $GOPATH/src
, the paths included in the binary's debug info will be relative paths with that prefix (and a /
) removed. This is handy for use with go tool pprof
, which will search for the relevant code in your GOPATH. Other tools might have similar behaviors when the debug info contains relative paths.
Since the -trimpath
flag is on the compiler (and assembler), it works by modifying the contents of the .a files (stored in $GOPATH/pkg
).
The $GOROOT_FINAL
environment variable (currently) controls the path prefix used for debug info of packages within $GOROOT
. With these two features combined, a command like env GOROOT_FINAL=/usr/local/go go install -a -gcflags="-trimpath=$GOPATH/src" .
(untested) should normalize or remove all file paths in the debug info, assuming that $GOPATH
refers to a single directory.
There are at least two other ways that directory names end up in binaries, even with cgo disabled. First, DW_AT_comp_dir
, which when I last tested on OSX was set to the current working directory used for the linker command. This may be normalizable by starting with cd /
and using fully-qualified paths for the build. Second, the runtime.GOROOT
function is backed by a file that's generated by ./src/make.bash
. That file contains the value of $GOROOT
when make.bash
was executed (when the Go toolchain was built). If bit-for-bit reproducible builds are more important to you than the complexity of managing custom $GOROOT
settings, that might help too.
These settings combined can get very close to bit-for-bit reproducible builds. The Go toolchain's build id also needs to be constrained, maybe via -ldflags=-buildid=xxx
, but I haven't figured out how to make its value be consistent across machines automatically.
To be clear, I _don't recommend using these knobs_ ... unless their value is worth their complexity cost for your application, and you're prepared to own the pieces when it stops working or changes behavior on some future release.
Thanks for the detailed reply, @rhysh.
We started specifying the -trimpath
flag with version 1.22 of our build helper dh-golang (see https://anonscm.debian.org/cgit/pkg-go/packages/dh-golang.git/log/?h=debian/1.22), which just entered Debian unstable. In my tests, the flag worked as expected, but let’s see how it goes… :)
I also prepared a patch which makes cmd/link
honor the BUILD_PATH_PREFIX_MAP
environment variable (see https://reproducible-builds.org/specs/build-path-prefix-map/ for the spec), eliminating build paths from the DW_AT_comp_dir
field and DWARF file tables. I uploaded it to our reproducible builds infrastructure and successfully verified it works on a number of packages.
I’ll send it for review once the merge window opens after the Go 1.9 release. This will at least give us some code to further inform the discussion.
Change https://golang.org/cl/73291 mentions this issue: cmd/link: apply BUILD_PATH_PREFIX_MAP
I think we should look at this for Go 1.11. Updated milestone.
@rsc Thanks for clarifying over on #22491 and here.
Just to be clear, when you say “we should look at this”, are you saying the Go team will have a look, or would you rely on external contributions?
I mean I think the Go team should try to make this happen. If external contributors do the work, that's fine. But if not then the Go team should dedicate time to it. If you're interested, that's great. Please discuss the design here before sending a CL. Thanks.
I think it’d be better if the Go team spent some time on this, I’m already quite over-subscribed. Just wanted to make sure this doesn’t fall through the cracks. Thanks!
Didn't get to this for Go 1.11. For Go 1.12 then. It's important.
/cc @hugelgupf who was just asking about this. To see what we do inside Google, see http://go/gobuildpwdhack which sets the PWD environment to /proc/self/cwd/../google3
(which works on Linux, but isn't a portable solution)
@bradfitz I assume that link is only clickable by googlers?
Yup. That's why I also described it.
There may be a bug in Go 1.11: when building with Modules and using -trimpath
, a new, temporary path to _gomod_.go
is embedded into the binary, I've reported this as https://github.com/golang/go/issues/28008
@stapelberg This may also be a problem in Debian sooner or later...
As noted in https://github.com/golang/go/issues/24976#issuecomment-439189276, this may require a change to the trimpath
flag definition in cmd/compile
.
Go 1.12 got away from us due to the late start and other unanticipated time theft. I'll try to work on this during the freeze to be ready for early-in-cycle for Go 1.13.
I've submitted a patch for this some months ago: https://go-review.googlesource.com/c/go/+/65233/
This is needed to fix https://github.com/golang/go/issues/26759
With Go 1.13 binary-only packages will be deprecated leaving only Go Plugins as the only way to distribute Go code where the code shouldn't be disclosed. Currently, Go Plugins cannot be used for this.
Note that a fix for this might also handle #29430.
@rsc @bcmills Are there any updates here regarding this work? If nobody on the Go team itself has time to get this done, I could take a stab at implementing it as long as I can get some guidance to make sure I'm doing things correctly. Even if there are sub-tasks here that need completion to enable this (such as supporting multiple -trimpath
arguments), I'm happy to take a look if you want help.
Last I checked, @rsc was still planning to get to it during the 1.13 cycle.
This almost works but then doesn't because of go-build12345/b123/... in the resultant binary.
Change https://golang.org/cl/173345 mentions this issue: cmd/go: add -trimpath build flag
Change https://golang.org/cl/173344 mentions this issue: cmd/internal/objabi: expand -trimpath syntax
For anyone who wants to try this out, you might need to first do go clean -cache
due to #31896.
An external issue is relying on this issue:
Most helpful comment
Didn't get to this for Go 1.11. For Go 1.12 then. It's important.