Go: runtime: missing deferreturn on linux/ppc64le

Created on 13 May 2020 · 27 Comments · Source: golang/go

What version of Go are you using (go version)?

$ go version
go version go1.14.2 linux/ppc64le

Does this issue reproduce with the latest release?

Yes, the same issue exists on tip.

This is a regression between Go 1.13 and Go 1.14, presumably either due to the introduction of open coded defers, or due to a bug that is now being triggered.

What operating system and processor architecture are you using (go env)?

go env Output

$ go env
GO111MODULE=""
GOARCH="ppc64le"
GOBIN=""
GOCACHE="/home/jsing/.cache/go-build"
GOENV="/home/jsing/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="ppc64le"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/jsing/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/jsing/src/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/jsing/src/go/pkg/tool/linux_ppc64le"
GCCGO="gccgo"
GOPPC64="power8"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build042050806=/tmp/go-build -gno-record-gcc-switches"

What did you do?

This was initially observed when running tests in a large code base on linux/ppc64le. To reproduce the issue, a panic() and a defer need to run at a high PC. I've written a Go program (https://play.golang.org/p/CmKwSyteWhX) that generates a Go program that triggers this issue.

Save https://play.golang.org/p/CmKwSyteWhX as gen.go then run:

$ go run gen.go && go run crash/main.go

What did you expect to see?

$ go run gen.go && go run crash/main.go
panic: blah

goroutine 1 [running]:
main.f2()
        /home/jsing/tmp/crash/crash/main.go:20 +0x7c
main.f1()
        /home/jsing/tmp/crash/crash/main.go:16 +0x3c
main.main()
        /home/jsing/tmp/crash/crash/main.go:27 +0x24
exit status 2

What did you see instead?

$ go run gen.go && go run crash/main.go
fatal error: missing deferreturn

runtime stack:           
runtime.throw(0x3ddde3d, 0x13)                                                                                                                                               
        /home/jsing/src/go/src/runtime/panic.go:1116 +0x5c   
runtime.addOneOpenDeferFrame.func1.1(0x3fffd6114a10, 0x0, 0x4260c00)
        /home/jsing/src/go/src/runtime/panic.go:753 +0x258
runtime.gentraceback(0x3dbf7ec, 0xc000084ed0, 0x0, 0xc000000180, 0x0, 0x0, 0x7fffffff, 0x3fffd6114ae0, 0x0, 0x0, ...)
        /home/jsing/src/go/src/runtime/traceback.go:334 +0xea0                                                                                                               
runtime.addOneOpenDeferFrame.func1()                                                  
        /home/jsing/src/go/src/runtime/panic.go:721 +0x8c
runtime.systemstack(0x0)
        /home/jsing/src/go/src/runtime/asm_ppc64x.s:269 +0x94
runtime.mstart()
        /home/jsing/src/go/src/runtime/proc.go:1041

goroutine 1 [running]:                                                                
runtime.systemstack_switch()                                                          
        /home/jsing/src/go/src/runtime/asm_ppc64x.s:216 +0x10 fp=0xc000084db0 sp=0xc000084d90 pc=0x625b0
runtime.addOneOpenDeferFrame(0xc000000180, 0x3dbf7ec, 0xc000084ed0)
        /home/jsing/src/go/src/runtime/panic.go:720 +0x7c fp=0xc000084e00 sp=0xc000084db0 pc=0x3886c
panic(0x3dc9680, 0x3e0d2c0)                                                           
        /home/jsing/src/go/src/runtime/panic.go:929 +0xdc fp=0xc000084ed0 sp=0xc000084e00 pc=0x38eac
main.f2()
        /home/jsing/tmp/crash/crash/main.go:20 +0x7c fp=0xc000084f00 sp=0xc000084ed0 pc=0x3dbf7ec
main.f1()
        /home/jsing/tmp/crash/crash/main.go:16 +0x3c fp=0xc000084f30 sp=0xc000084f00 pc=0x3dbf72c
main.main()
        /home/jsing/tmp/crash/crash/main.go:27 +0x24 fp=0xc000084f50 sp=0xc000084f30 pc=0x3dbf834
runtime.main()   
        /home/jsing/src/go/src/runtime/proc.go:203 +0x248 fp=0xc000084fc0 sp=0xc000084f50 pc=0x3bcd8
runtime.goexit()
        /home/jsing/src/go/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xc000084fc0 sp=0xc000084fc0 pc=0x64b64
exit status 2            

Changing n from 4149 to 4148 in gen.go reduces the number of instructions prior to the defer, and the test then succeeds.

NeedsFix

All 27 comments

@danscales

@4a6f656c thanks for the repro case!

In trying to reproduce this on a linux-ppc64le-buildlet gomote (after pushing the go source tree, building it with make.sh, and sshing in via 'gomote ssh'), I got this error:

~# /workdir/go/bin/go run gen.go
~# /workdir/go/bin/go build crash/main.go

# _/root/crash/huge1

crash/huge1/a.s:1: expected '(', found C2
crash/huge1/a.s:1004: expected '(', found C2
crash/huge1/a.s:2007: expected '(', found C2
crash/huge1/a.s:3010: expected '(', found C2
crash/huge1/a.s:4013: expected '(', found C2
crash/huge1/a.s:5016: expected '(', found C2
crash/huge1/a.s:6019: expected '(', found C2
crash/huge1/a.s:7022: expected '(', found C2
crash/huge1/a.s:8025: expected '(', found C2
crash/huge1/a.s:9028: expected '(', found C2
crash/huge1/a.s:10031: expected '(', found C2
asm: too many errors

# _/root/crash/huge2

crash/huge2/a.s:1: expected '(', found C2
crash/huge2/a.s:1004: expected '(', found C2
crash/huge2/a.s:2007: expected '(', found C2
crash/huge2/a.s:3010: expected '(', found C2
crash/huge2/a.s:4013: expected '(', found C2
crash/huge2/a.s:5016: expected '(', found C2
crash/huge2/a.s:6019: expected '(', found C2
crash/huge2/a.s:7022: expected '(', found C2
crash/huge2/a.s:8025: expected '(', found C2
crash/huge2/a.s:9028: expected '(', found C2
crash/huge2/a.s:10031: expected '(', found C2
asm: too many errors

Any suggestions on why the .s files are not assembling properly? Do you think I'll be able to repro on a gomote buildlet? Thanks!

@danscales - ugh, when I copied and pasted into play.golang.org, the Unicode middle dot (·) got replaced with <C2><B7>:

fmt.Fprintf(buf, "TEXT <C2><B7>f%d(SB),0,$0-0\n", i)

Correcting that should fix the problem. You should be able to repro on a gomote buildlet as long as it's got enough resources.

(I tried attaching gen.go directly to this issue, but GitHub complained about invalid file types :S)

Edit: I've just updated the play.golang.org links to a version that should have this fixed.

@4a6f656c Thanks for the fixed gen.go file!

I wasn't able to reproduce the problem on Gomote type linux-ppc64le-buildlet. I get the correct output 'panic: blah' and no 'missing deferreturn' message.

Do you think it would repro better on some other buildlet (maybe linux-ppc64le-power9osu -- I'll try that next)? Is there anything unusual about your ppc64le configuration?

Oh, also, I tried both on current tip and on the Go 1.14.2 release, but didn't repro on either.

@danscales - I don't think there is anything particularly unusual about these machines, but I'll take a closer look later today. You may want to bump n up to a larger value (say 8000), as it may depend on the system stack allocation.

@danscales - I just tested on a clean machine and noticed that I'd left gen.go with an n of 4148. Setting it to 4149 was insufficient on this host (so presumably memory pressure or OS stack allocation is playing into it), but setting it to 8000 did trigger the issue.

OK, thanks to the repro case from Joel, I was able to figure out that this is due to the use of trampolines on PPC64 for calling deferreturn in programs with very large text sizes. The current method for finding/marking the deferreturn stub in a function doesn't work with these trampolines. As far as I can tell, trampolines are currently used only for arm and ppc64. I will look into the best way to identify these trampolines for deferreturn.
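For context, here is a rough sketch of why text size matters. A PPC64 direct branch-and-link (`bl`) encodes a signed 24-bit word offset, so it reaches roughly ±32 MiB from the call site. A call whose target lies further away cannot be encoded directly and has to go through a trampoline. The helper `needsTrampoline` is hypothetical, and the PC of panic from the trace is used only as a stand-in for the address of a runtime function; the linker's real logic is more involved.

```go
package main

import "fmt"

// maxBranch is the reach of a PPC64 direct branch (bl): a signed 24-bit
// word offset, i.e. a signed 26-bit byte offset, or +/-32 MiB.
const maxBranch = 1 << 25

// needsTrampoline reports whether a direct call from pc to target would be
// out of range for a single bl instruction. Hypothetical helper, for
// illustration only.
func needsTrampoline(pc, target uint64) bool {
	d := int64(target) - int64(pc)
	return d < -maxBranch || d >= maxBranch
}

func main() {
	const (
		callerPC  = 0x3dbf7ec // main.f2 in the trace above
		runtimePC = 0x38eac   // panic in the trace, standing in for a runtime function
		nearPC    = 0x3bcd8   // runtime.main in the trace, close to other runtime code
	)

	// About 61.5 MiB apart: out of range for a direct call.
	fmt.Println(needsTrampoline(callerPC, runtimePC))

	// Both addresses are in low runtime text: a direct call reaches.
	fmt.Println(needsTrampoline(nearPC, runtimePC))
}
```

This would also be consistent with the repro depending on n: the padding has to push main.f2 far enough away from the runtime text before any call needs a trampoline.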

Trampolines are very simple functions and don't have deferreturn in them. I wonder why they are special. I'll take a look. Let me know if there is anything I can help with.

To clarify, the problem is that a call to deferreturn in a normal function (which has defers) is being turned into a trampoline, and therefore the code to recognize the deferreturn stub (for open-coded defers) based on a call to deferreturn is not working. The code for recognizing the deferreturn call is in pcln.go:computeDeferReturn(). We just need to decide if some extra pattern matching (for trampoline calls) is OK, or if we should do some more complicated passing of the relative position of the deferreturn call within the function from the compiler.
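The failure mode described here can be illustrated with a toy model. This is not the code in pcln.go; the `sym` and `call` types, the trampoline name, and the `followTramps` switch are all hypothetical and exist only to show why matching on the call target alone misses a call that was redirected through a trampoline.

```go
package main

import "fmt"

// sym is a hypothetical stand-in for a linker symbol.
type sym struct {
	name          string
	trampolineFor *sym // non-nil if this symbol is a trampoline to another symbol
}

// call is a hypothetical stand-in for a direct-call relocation.
type call struct {
	offset int // offset of the call instruction within the function
	target *sym
}

// findDeferReturn returns the offset of the first call that reaches
// deferreturn. With followTramps false it matches only calls whose target
// is deferreturn itself, which models the behaviour described above.
func findDeferReturn(calls []call, deferreturn *sym, followTramps bool) (int, bool) {
	for _, c := range calls {
		t := c.target
		if followTramps && t.trampolineFor != nil {
			t = t.trampolineFor
		}
		if t == deferreturn {
			return c.offset, true
		}
	}
	return 0, false
}

func main() {
	deferreturn := &sym{name: "runtime.deferreturn"}
	tramp := &sym{name: "deferreturn-trampoline", trampolineFor: deferreturn}
	other := &sym{name: "main.g"}

	// In a very large binary the call to deferreturn goes via the trampoline.
	calls := []call{
		{offset: 0x20, target: other},
		{offset: 0x7c, target: tramp},
	}

	fmt.Println(findDeferReturn(calls, deferreturn, false)) // direct match only
	fmt.Println(findDeferReturn(calls, deferreturn, true))  // also looks through trampolines
}
```

In this model the first lookup finds nothing, which corresponds to the "missing deferreturn" condition, while the second recovers the call offset by looking through the trampoline.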

Thanks, @danscales. I think I understand the issue now. I can try to write a CL, if that's helpful.

Change https://golang.org/cl/234105 mentions this issue: cmd/link: detect trampoline of deferreturn call

@cherrymui Oh, that was quick -- thanks for writing a CL. I'll take a look!

Is there any chance of this being backported to 1.14? Thanks!

This is a critical bug for the Kubernetes project. Can someone help us cherry-pick this to the 1.14 release? @danscales @aclements @4a6f656c @thanm @jeremyfaller

@gopherbot please open a backport to 1.14

Backport issue(s) opened: #39991 (for 1.14).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot please open a backport to 1.14

@ianlancetaylor Thanks for the quick response. May I know when the 1.14.5 bits will be available?

We normally do minor releases around the start of each month, which in this case due to the U.S. Independence Day holiday means the beginning of next week. But someone will need to backport the change.

I'll do the backport. I'll note that this won't be a clean cherry-pick, as the fix here is applied to the new linker, while Go 1.14 still uses the old linker. The logic is easy to backport, though.

Thanks, @cherrymui. Let me know if I can help in any way.

Change https://golang.org/cl/240917 mentions this issue: [release-branch.go1.14] cmd/link: detect trampoline of deferreturn call

Change https://golang.org/cl/241087 mentions this issue: cmd/oldlink: port bug fixes to old linker

I also ran into a similar error with the Kubernetes kubelet tooling, which I reported upstream at https://github.com/kubernetes/kubelet/issues/13

In my case the app runs correctly with 1.13.x and regressed with 1.14.x. Unfortunately, 1.14.5 does not address this issue. Any guidance is greatly appreciated.

cc @cherrymui

@runlevel5, Go 1.14.5 was a security release, so it did not include the fix for this. This should be fixed in the next non-security release (which is very likely to be 1.14.6 and I think should be out soon).

@aclements thanks for the clarification

Finally 1.14.6 has resolved the issue :)
