Go: x/build/cmd/release: fails for -target=linux-amd64 due to non-default GOROOT_FINAL value causing TestScript/test_race_install_cgo test to fail

Created on 3 Jun 2020  ·  13Comments  ·  Source: golang/go

Edit: This is an issue for a specific test. See #39386 for the high level tracking issue.

This issue requires investigation. The problem is either in how the release command sets up the environment and executes commands on the builder, or in the builder definition, or in the test, or some combination. Depending on what we learn, we may end up deciding this is okay to resolve after beta1 (but before the final release) and skip this test failure in order to make the Go 1.15 beta 1 available for testing and finding unknown issues sooner.

This can be reproduced (reliably) if your account has permissions needed to run releasebot (they are documented at https://github.com/golang/build/tree/master/cmd/releasebot#permissions) with:

$ release -target=linux-amd64 -version=go1.15beta1 -watch -rev=09e791feb1dfbfc05578a17cd9c035ef82af4033
[...]
ok      cmd/doc 0.154s
ok      cmd/fix 4.935s
go test proxy running at GOPROXY=http://127.0.0.1:38583/mod
go proxy: no archive rsc.io v1.5.2: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.1.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.5.1: file does not exist
go proxy: no archive example.com/newcycle v1.0.0: file does not exist
go proxy: no archive test.go v1.0.0: file does not exist
go proxy: no archive rsc.io v1.5.2: file does not exist
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/test_race_install_cgo (0.39s)
        script_test.go:209: 
            # Tests Issue #10500 (0.392s)
            > [!race] skip
            > [!darwin] ! stale cmd/cgo  # The darwin builders are spuriously stale; see #33598.
            FAIL: testdata/script/test_race_install_cgo.txt:5: cmd/cgo is unexpectedly stale


FAIL
FAIL    cmd/go  81.725s
ok      cmd/go/internal/auth    0.033s
ok      cmd/go/internal/cache   0.140s
[...]
ok      cmd/pack    2.172s
ok      cmd/trace   0.090s
ok      cmd/vet 12.448s
FAIL
go tool dist: Failed: exit status 1

Complete Output

release $ go run . -target linux-amd64 -watch -version go1.15beta1 -rev 9326d71874730901c159643703a16dc34646e5bd
[... duplicate output from -watch flag]
2020/06/03 10:48:30 linux-amd64: Error: Build failed: exit status 1
Output:
Building Go cmd/dist using /go1.4. (go1.4 linux/amd64)
Building Go toolchain1 using /go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for linux/amd64.

##### Testing packages.
ok      archive/tar 0.042s
ok      archive/zip 0.424s
ok      bufio   0.122s
ok      bytes   2.197s
ok      compress/bzip2  0.082s
ok      compress/flate  0.920s
ok      compress/gzip   0.117s
ok      compress/lzw    0.128s
ok      compress/zlib   0.791s
ok      container/heap  0.038s
ok      container/list  0.040s
ok      container/ring  0.004s
ok      context 0.032s
ok      crypto  0.004s
ok      crypto/aes  0.027s
ok      crypto/cipher   0.013s
ok      crypto/des  0.030s
ok      crypto/dsa  0.010s
ok      crypto/ecdsa    0.234s
ok      crypto/ed25519  0.217s
ok      crypto/elliptic 0.066s
ok      crypto/hmac 0.005s
ok      crypto/internal/subtle  0.006s
ok      crypto/md5  0.026s
ok      crypto/rand 0.016s
ok      crypto/rc4  0.063s
ok      crypto/rsa  0.145s
ok      crypto/sha1 0.041s
ok      crypto/sha256   0.004s
ok      crypto/sha512   0.013s
ok      crypto/subtle   0.007s
ok      crypto/tls  1.332s
ok      crypto/x509 0.773s
ok      database/sql    0.667s
ok      database/sql/driver 0.005s
ok      debug/dwarf 0.039s
ok      debug/elf   0.021s
ok      debug/gosym 0.013s
ok      debug/macho 0.006s
ok      debug/pe    0.016s
ok      debug/plan9obj  0.038s
ok      encoding/ascii85    0.045s
ok      encoding/asn1   0.009s
ok      encoding/base32 0.048s
ok      encoding/base64 0.007s
ok      encoding/binary 0.016s
ok      encoding/csv    0.040s
ok      encoding/gob    0.025s
ok      encoding/hex    0.029s
ok      encoding/json   0.298s
ok      encoding/pem    0.026s
ok      encoding/xml    0.039s
ok      errors  0.012s
ok      expvar  0.013s
ok      flag    0.134s
ok      fmt 0.152s
ok      go/ast  0.005s
ok      go/build    0.188s
ok      go/constant 0.019s
ok      go/doc  0.070s
ok      go/format   0.020s
ok      go/importer 0.136s
ok      go/internal/gccgoimporter   0.022s
ok      go/internal/gcimporter  3.142s
ok      go/internal/srcimporter 7.145s
ok      go/parser   0.033s
ok      go/printer  0.339s
ok      go/scanner  0.012s
ok      go/token    0.027s
ok      go/types    1.912s
ok      hash    0.008s
ok      hash/adler32    0.020s
ok      hash/crc32  0.016s
ok      hash/crc64  0.003s
ok      hash/fnv    0.012s
ok      hash/maphash    0.116s
ok      html    0.008s
ok      html/template   0.098s
ok      image   0.140s
ok      image/color 0.052s
ok      image/draw  0.070s
ok      image/gif   0.499s
ok      image/jpeg  0.396s
ok      image/png   0.056s
ok      index/suffixarray   0.316s
ok      internal/cpu    0.053s
ok      internal/fmtsort    0.033s
ok      internal/poll   0.179s
ok      internal/profile    0.015s
ok      internal/reflectlite    0.162s
ok      internal/singleflight   0.032s
ok      internal/trace  0.088s
ok      internal/unsafeheader   0.023s
ok      internal/xcoff  0.019s
ok      io  0.090s
ok      io/ioutil   0.007s
ok      log 0.012s
ok      log/syslog  1.215s
ok      math    0.057s
ok      math/big    2.163s
ok      math/bits   0.010s
ok      math/cmplx  0.018s
ok      math/rand   0.309s
ok      mime    0.012s
ok      mime/multipart  0.310s
ok      mime/quotedprintable    0.024s
ok      net 9.831s
ok      net/http    11.624s
ok      net/http/cgi    0.726s
ok      net/http/cookiejar  0.064s
ok      net/http/fcgi   0.039s
ok      net/http/httptest   0.040s
ok      net/http/httptrace  0.014s
ok      net/http/httputil   0.062s
ok      net/http/internal   0.011s
ok      net/http/pprof  5.119s
ok      net/internal/socktest   0.003s
ok      net/mail    0.006s
ok      net/rpc 0.026s
ok      net/rpc/jsonrpc 0.008s
ok      net/smtp    0.021s
ok      net/textproto   0.059s
ok      net/url 0.020s
ok      os  0.840s
ok      os/exec 1.290s
ok      os/signal   2.335s
ok      os/user 0.004s
ok      path    0.006s
ok      path/filepath   0.014s
ok      plugin  0.020s
ok      reflect 0.325s
ok      regexp  0.084s
ok      regexp/syntax   0.886s
ok      runtime 86.307s
ok      runtime/debug   0.102s
ok      runtime/internal/atomic 0.062s
ok      runtime/internal/math   0.009s
ok      runtime/internal/sys    0.036s
ok      runtime/pprof   6.414s
ok      runtime/trace   0.863s
ok      sort    0.197s
ok      strconv 0.667s
ok      strings 1.892s
ok      sync    0.428s
ok      sync/atomic 0.061s
ok      syscall 0.075s
ok      testing 0.330s
ok      testing/iotest  0.033s
ok      testing/quick   0.084s
ok      text/scanner    0.008s
ok      text/tabwriter  0.031s
ok      text/template   0.044s
ok      text/template/parse 0.008s
ok      time    1.943s
ok      unicode 0.021s
ok      unicode/utf16   0.023s
ok      unicode/utf8    0.005s
ok      cmd/addr2line   3.328s
ok      cmd/api 9.369s
ok      cmd/asm/internal/asm    2.424s
ok      cmd/asm/internal/lex    0.033s
ok      cmd/compile 13.557s
ok      cmd/compile/internal/gc 26.960s
ok      cmd/compile/internal/logopt 0.858s
ok      cmd/compile/internal/ssa    0.824s
ok      cmd/compile/internal/syntax 0.014s
ok      cmd/compile/internal/test   0.019s [no tests to run]
ok      cmd/compile/internal/types  0.037s
ok      cmd/cover   2.595s
ok      cmd/doc 0.154s
ok      cmd/fix 4.935s
go test proxy running at GOPROXY=http://127.0.0.1:38583/mod
go proxy: no archive rsc.io v1.5.2: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.1.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.5.1: file does not exist
go proxy: no archive example.com/newcycle v1.0.0: file does not exist
go proxy: no archive test.go v1.0.0: file does not exist
go proxy: no archive rsc.io v1.5.2: file does not exist
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/test_race_install_cgo (0.39s)
        script_test.go:209: 
            # Tests Issue #10500 (0.392s)
            > [!race] skip
            > [!darwin] ! stale cmd/cgo  # The darwin builders are spuriously stale; see #33598.
            FAIL: testdata/script/test_race_install_cgo.txt:5: cmd/cgo is unexpectedly stale


FAIL
FAIL    cmd/go  81.725s
ok      cmd/go/internal/auth    0.033s
ok      cmd/go/internal/cache   0.140s
ok      cmd/go/internal/generate    0.050s
ok      cmd/go/internal/get 0.047s
ok      cmd/go/internal/imports 0.020s
ok      cmd/go/internal/load    0.025s
ok      cmd/go/internal/lockedfile  0.142s
ok      cmd/go/internal/lockedfile/internal/filelock    0.063s
ok      cmd/go/internal/modconv 0.035s
ok      cmd/go/internal/modfetch    0.033s
ok      cmd/go/internal/modfetch/codehost   0.023s
ok      cmd/go/internal/modfetch/zip_sum_test   0.034s
ok      cmd/go/internal/modload 0.022s
ok      cmd/go/internal/mvs 0.038s
ok      cmd/go/internal/par 0.047s
ok      cmd/go/internal/renameio    0.082s
ok      cmd/go/internal/search  0.022s
ok      cmd/go/internal/test    0.060s
ok      cmd/go/internal/txtar   0.013s
ok      cmd/go/internal/web 0.035s
ok      cmd/go/internal/work    0.068s
ok      cmd/gofmt   0.096s
ok      cmd/internal/buildid    0.504s
ok      cmd/internal/dwarf  0.012s
ok      cmd/internal/edit   0.044s
ok      cmd/internal/goobj  1.511s
ok      cmd/internal/goobj2 0.056s
ok      cmd/internal/moddeps    3.387s
ok      cmd/internal/obj    0.033s
ok      cmd/internal/obj/arm64  0.194s
ok      cmd/internal/obj/ppc64  0.145s
ok      cmd/internal/obj/riscv  0.162s
ok      cmd/internal/obj/x86    0.776s
ok      cmd/internal/objabi 0.033s
ok      cmd/internal/src    0.029s
ok      cmd/internal/test2json  0.405s
ok      cmd/link    25.148s
ok      cmd/link/internal/benchmark 0.025s
ok      cmd/link/internal/ld    7.021s
ok      cmd/link/internal/loader    0.058s
ok      cmd/link/internal/sym   0.028s
ok      cmd/nm  6.175s
ok      cmd/objdump 3.473s
ok      cmd/oldlink/internal/sym    0.031s
ok      cmd/pack    2.172s
ok      cmd/trace   0.090s
ok      cmd/vet 12.448s
FAIL
go tool dist: Failed: exit status 1

(The release command creates a tarball and places it in a staging directory; it doesn't publish anything, so it is safe to run with -version=go1.15beta1 flag.)

/cc @golang/osp-team @bcmills @jayconrod @matloob

Builders NeedsFix release-blocker

Most helpful comment

A proper fix turned out not to be terribly invasive.

The comment in cmd/dist/test.go was also stale: it claimed that runtime/debug required GOROOT_FINAL to be unset (it empirically does not), but failed to mention cmd/addr2line and cmd/objdump (both of which needed minor test-only adjustments).

Fortunately, the fixes for those are straightforward: either explicitly unset GOROOT_FINAL when building the binary, or detect when the embedded GOROOT_FINAL filename is invalid and retarget it to GOROOT explicitly.

All 13 comments

I don't have any particular insight, but I am curious to see whether this ends up being related to #33598.

This seems to affect only the 64-bit targets, not 32-bit ones.

Something that I can't explain yet is that this is affecting both linux/amd64 and linux/arm64. The former is configured with the race detector on, the latter without.

Compare:

2020/06/03 18:03:21 linux-amd64: Building (make.bash only).
2020/06/03 18:05:27 linux-amd64: Building race detector.
2020/06/03 18:05:45 linux-amd64: Cleaning goroot (post-build).
[...]
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/test_race_install_cgo (0.33s)
        script_test.go:209: 
            # Tests Issue #10500 (0.330s)
            > [!race] skip
            > [!darwin] ! stale cmd/cgo  # The darwin builders are spuriously stale; see #33598.
            FAIL: testdata/script/test_race_install_cgo.txt:5: cmd/cgo is unexpectedly stale


FAIL
FAIL    cmd/go  82.030s

And:

2020/06/03 18:02:27 linux-arm64: Building (make.bash only).
2020/06/03 18:05:21 linux-arm64: Cleaning goroot (post-build).
[...]
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/test_race_install_cgo (4.22s)
        script_test.go:209: 
            # Tests Issue #10500 (4.199s)
            > [!race] skip
            > [!darwin] ! stale cmd/cgo  # The darwin builders are spuriously stale; see #33598.
            FAIL: testdata/script/test_race_install_cgo.txt:5: cmd/cgo is unexpectedly stale


FAIL
FAIL    cmd/go  125.049s

TestScript documentation says "[race] is whether the race detector can be used". I'm not sure how that's determined exactly yet.

As part of the sequence of commands in release, we remove go/pkg/${GOOS}_${GOARCH}/cmd (see here), and all.bash is run later.

Removing pkg/.../cmd is likely to make cmd/cgo stale, since the packages it imports (cmd/internal/edit, cmd/internal/objabi) won't be available anymore.

The "! stale cmd/go" check was added in CL 223746 on March 18th, so this is the first time it's going through the unique way that release runs all.bash testing, and it seems to not be compatible with it.

I can reproduce locally (after removing the [!darwin] precondition) with:

src $ ./make.bash
[...]
src $ go test -run='TestScript/test_race_install_cgo' -count=1 cmd/go
ok      cmd/go  2.139s
src $ rm -rf ../pkg/darwin_amd64/cmd                                         
src $ go test -run='TestScript/test_race_install_cgo' -count=1 cmd/go
go test proxy running at GOPROXY=http://127.0.0.1:61444/mod
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/test_race_install_cgo (0.13s)
        script_test.go:209: 
            # Tests Issue #10500 (0.119s)
            > [!race] skip
            > ! stale cmd/cgo  # The darwin builders are spuriously stale; see #33598.
            FAIL: testdata/script/test_race_install_cgo.txt:5: cmd/cgo is unexpectedly stale


FAIL
FAIL    cmd/go  1.459s
FAIL
src $ 

The above is likely relevant, but not a complete explanation. all.bash runs make.bash, so it _should_ be reinstalling cmd/cgo and making it non-stale. Need to understand why it's not.

I've asked @bcmills to help look into this, and we're investigating this now. We suspect it's some sort of difference between the way the release command executes all.bash on the builder, and we'll try to zero in on what that is.

@bcmills, I've been using release with this diff to get more debugging info (including the environment variables) from the client.Exec call that starts src/all.bash:

--- a/cmd/release/release.go
+++ b/cmd/release/release.go
@@ -587,6 +587,7 @@ func (b *Build) make() error {
                        Output:   execOut,
                        ExtraEnv: env,
                        Args:     bc.AllScriptArgs(),
+                       Debug:    true,
                })
                if err != nil {
                        return err

This looks like a bug in cmd/go/script_test.go when GOROOT_FINAL is set to a non-default value.

Minimal steps to reproduce:

~/go/src$ export GOROOT_FINAL=/usr/local/go

~/go/src$ ./make.bash
Building Go cmd/dist using /usr/local/google/home/bcmills/go1.4. (go1.4-bootstrap-20170531 linux/amd64)
Building Go toolchain1 using /usr/local/google/home/bcmills/go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for linux/amd64.
---
Installed Go for linux/amd64 in /usr/local/google/home/bcmills/go
Installed commands in /usr/local/google/home/bcmills/go/bin

The binaries expect /usr/local/google/home/bcmills/go to be copied or moved to /usr/local/go

~/go/src$ ../bin/go test cmd/go

go test proxy running at GOPROXY=http://127.0.0.1:46675/mod
go proxy: no archive rsc.io v1.5.2: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.1.0: file does not exist
go proxy: no archive example.com v0.1.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.5.1: file does not exist
go proxy: no archive example.com/newcycle v1.0.0: file does not exist
go proxy: no archive test.go v1.0.0: file does not exist
go proxy: no archive rsc.io v1.0.0: file does not exist
go proxy: no archive rsc.io v1.4.0: file does not exist
go proxy: no archive golang.org v0.3.0: file does not exist
go proxy: no archive golang.org/x v0.3.0: file does not exist
go proxy: no archive golang.org v0.1.0: file does not exist
go proxy: no archive golang.org/x v0.1.0: file does not exist
go proxy: no archive rsc.io v1.5.2: file does not exist
--- FAIL: TestScript (0.00s)
    --- FAIL: TestScript/test_race_install_cgo (0.27s)
        script_test.go:209:
            # Tests Issue #10500 (0.272s)
            > [!race] skip
            > [!darwin] ! stale cmd/cgo  # The darwin builders are spuriously stale; see #33598.
            FAIL: testdata/script/test_race_install_cgo.txt:5: cmd/cgo is unexpectedly stale


--- FAIL: TestListTemplateContextFunction (0.00s)
    --- FAIL: TestListTemplateContextFunction/GOROOT (0.16s)
        go_test.go:1588: running testgo [list -f {{context.GOROOT}}]
        go_test.go:1588: standard output:
        go_test.go:1588: /usr/local/google/home/bcmills/go

        go_test.go:1593: go list -f "{{context.GOROOT}}": got "/usr/local/google/home/bcmills/go"; want "/usr/local/go"
FAIL
FAIL    cmd/go  132.288s
FAIL

Great find @bcmills, thank you! I'll unassign myself, leaving just you, and update this to NeedsFix, since we now know where the problem is and how to reproduce it without x/build/cmd/release.

If the full fix needs more time, we have the option of splitting that into an okay-after-beta1 release blocker issue in the cmd/go component and resolve this beta-1-blocking x/build/cmd/release issue with a temporary skip of a known test failure.

Well, it's not just a cmd/go bug. cmd/dist intentionally lies about GOROOT_FINAL in order to work around what is arguably a bug in the runtime/debug test:
https://github.com/golang/go/blob/d282b0f1121c475c3b435843a722a47172c20e7e/src/cmd/dist/test.go#L181-L188

That means it's not straightforward to make cmd/go do the right thing w.r.t. GOROOT_FINAL when it is set.

More work needed. 😞

Thanks for the analysis Bryan.

Given the timing, should we factor out that problem into a separate issue, and resolve this one by reverting the "! stale cmd/cgo" check added to TestScript/test_race_install_cgo in CL 223746?

As I understand, this is now a known problem that we can fix after releasing Go 1.15 Beta 1, which can serve the purpose of helping us uncover unknown problems even without this fix.

Change https://golang.org/cl/236818 mentions this issue: cmd/go: remove a bogus assertion in mod_convert_dep

Change https://golang.org/cl/236819 mentions this issue: cmd/dist: do not unset GOROOT_FINAL prior to running tests

A proper fix turned out not to be terribly invasive.

The comment in cmd/dist/test.go was also stale: it claimed that runtime/debug required GOROOT_FINAL to be unset (it empirically does not), but failed to mention cmd/addr2line and cmd/objdump (both of which needed minor test-only adjustments).

Fortunately, the fixes for those are straightforward: either explicitly unset GOROOT_FINAL when building the binary, or detect when the embedded GOROOT_FINAL filename is invalid and retarget it to GOROOT explicitly.

Change https://golang.org/cl/237359 mentions this issue: run.bat: do not unset GOROOT_FINAL before running tests

Was this page helpful?
0 / 5 - 0 ratings