Go: cmd/compile: switch to a register-based calling convention for Go functions

Created on 12 Aug 2020  路  41Comments  路  Source: golang/go

I propose that we switch the Go internal ABI (used between Go functions) from stack-based to register-based argument and result passing for Go 1.16.

I lay out the details of our proposal and the work required in this document.

This was previously proposed in #18597, where @dr2chase did some excellent prototyping work that our new proposal builds on. With multiple ABI support, we鈥檙e now in a much better position to execute on this without breaking compatibility with the existing body of Go assembly code. I鈥檝e opened this new issue to focus on discussion of our new proposal.

/cc @dr2chase @danscales @thanm @cherrymui @mknyszek @prattmic @randall77

An incomplete and evolving list of tasks:

  • [ ] Define ABI specification (in progress, @aclements)
  • [X] Add GOEXPERIMENT=regabi and asm macro (CL)
  • [ ] cmd/compile: late call lowering (CLs, mostly landed)
  • [ ] cmd/compile: implement register arguments (in progress, @dr2chase)

    • [ ] spill around morestack call (WIP CL)

    • [ ] add temporary ABI pragma for testing (CL)

    • [ ] add abstract argument registers (WIP CL)

    • [x] implement ABI rules (CL)

    • [ ] use abstract argument registers

  • [ ] cmd/compile: implement register returns

    • [ ] late return lowering (in progress, @dr2chase)

    • [x] implement ABI rules (CL)

  • [ ] cmd/compile: ABI wrappers (CL)
  • [X] cmd/asm: add syntax for runtime packages to reference ABIInternal symbols (CL)

    • [x] Mark runtime.{goexit,asyncPreempt}, reflect.{makeFuncStub,methodValueCall} definitions ABIInternal (CL)

    • [ ] Mark reference to runtime.callbackWrap in sys_windows_amd64.s as ABIInternal

  • [ ] Tracebacks

    • [ ] Define extended argument metadata format (in progress)

    • [ ] cmd/compile: produce extended argument metadata

    • [ ] runtime: consume extended argument metadata

  • [ ] runtime: fix assembly closure calls (mcall, systemstack)
  • [ ] reflect: support calling ABIInternal (CL)
  • [ ] go and defer statements

    • [ ] Design generic frame layout

    • [ ] cmd/compile: construct generic register frames

    • [ ] runtime: use generic frames for go and defer

  • [X] cmd/cgo: decouple cgo callbacks from ABI (CL)
  • [ ] runtime: finalizer support (CL)
  • [ ] runtime: Windows callback support (in progress, @mknyszek)

    • [ ] Translate from stdcall/fastcall to Go ABI

  • [ ] DWARF

    • [ ] cmd/compile: ensure arguments have appropriate DWARF locations

    • [ ] Ensure GDB and Delve can find register arguments

Post-MVP

  • [ ] Port to other platforms
  • [ ] Eliminate spill slots

Optional cleanup

  • [ ] runtime: port memmove, memclrNoHeapPointers to ABIInternal for performance
  • [ ] runtime: port all assembly -> Go calls to ABIInternal
  • [ ] Eliminate ABIAlias support from cmd/compile, cmd/link, and object format
Proposal Proposal-Accepted Proposal-FinalCommentPeriod

Most helpful comment

Just an update on this: we were hoping to land this for 1.16, but that was always going to be a bit of a stretch, and at this point we won't be able to make 1.16. Since we're making good progress and have a lot of work in flight, the current plan is to branch when the freeze comes and finish up the work there. Landing this at the beginning of 1.17 isn't the worst thing since that will give it lots of soak time. There's some concern about the risk of landing both this and generics in 1.17, but I believe they will be fairly orthogonal and the nature of the register ABI work is such that we should be able to test it pretty thoroughly and things are likely to fail loudly if there are bugs.

All 41 comments

Once upon a time, I was one of several people whose primary work activity for I think two days was tracking down a weird kernel crash, which was ultimately caused by callee-saved registers. Well, the crash was caused by a [FOO_MAX] array being overrun. But the thing where the result of that overrun was to overwrite a single automatic variable, which never had its address taken, five call frames away? _That_ was caused by callee-saved registers.

Which is to say:

Platform ABIs typically define callee-save registers, which place substantial additional requirements on a garbage collector. There are alternatives to callee-save registers that share many of their benefits, while being much better suited to Go.

+1 +1 +1 +1 +1 +1 [...]

If I'm understanding the proposal correctly, assembler code will continue to be written with the current stack based calling conventions.

I understand the excellent reasoning behind doing this, I'll just note that this is surely going to frustrate assembly code writers not being able to pass and receive args in registers as often assembly code fragments are tiny and the overhead of calling them is massive.

@ncw This restriction to ABI0 for assembly writers isn't necessarily permanent. See https://go.googlesource.com/proposal/+/master/design/27539-internal-abi.md#proposal.

Right now writeBarrier is a special builtin that allows arguments to be passed in registers. Couldn't that same concept be extended to others like memmove and memcpy to avoid call overhead even though they are asm?

@laboger , yes, we're considering special-casing just a few assembly functions. The hottest ones by far are memmove, memclrNoHeapPointers, and possibly syscall.Syscall. The write barrier calling convention is very specialized because its context is so unusual and performance-sensitive. We'll probably just switch the others to use the register ABI rather than anything more specialized (and teach the toolchain that those are special even though they're assembly).

A comment on the "stack growth" section of the doc, specifically about spilling register state for morestack into the guard space for a goroutine (copied from Gerrit):

I worry a little bit about cases where functions marked nosplit eventually call into a non-nosplit function. If the nosplit frames are large (but don't exceed the guard space) we might find out that we don't have any space left. What should we do in this case? Throwing doesn't seem quite right since this works just fine today (though, you have to be careful). Should we have "guard space" and a subset of that as "no, really, this is very guarded" space? That would require extending the guard space a bit further which isn't great...

One alternative is to manage per-goroutine space for this, but that's not great either, since its more memory used per goroutine.

Good point. This is especially true for architectures that have a lot registers. Maybe we could decide that some registers are never live across a call?

That's an excellent point. A few possible solutions come to mind:

  1. It would be really sad if we had to make more per-goroutine space for this since you almost always have space on the stack. One possibility is to spill to the stack if there's room, and otherwise we keep a pool of allocated "spill space" objects. For that, we could keep just one pre-allocated per P and if the goroutine needs it, it can pull it off the P, spill into it, and then once it's entered the runtime it can allocate a new one (probably getting it from a global allocation pool) and attach it to the P. On return, it can return it to the allocation pool.

  2. A similar option would be to have just one spill space per P. Again, use the stack if you can, otherwise spill into the P, enter the runtime, grow the stack even if it's just a preemption, and then dump the spilled registers back onto the stack in the right place.

  3. We could revisit the nosplit rules. It's often (though not always) a mistake to call from a nosplit function to a non-nosplit function. I'm not sure exactly what this would look like. Maybe a non-nosplit function called from a nosplit function has to have a special prologue? Related: https://github.com/golang/go/issues/21314#issuecomment-346480985. I think the only cases I enumerated in that comment that intentionally call from a nosplit to a non-nosplit function aren't running on a user G stack and thus can't grow it anyway.

Change https://golang.org/cl/252258 mentions this issue: cmd/asm: define a macro for GOEXPERIMENT=regabi

Change https://golang.org/cl/252257 mentions this issue: cmd/internal/objabi: add regabi GOEXPERIMENT

Change https://golang.org/cl/258938 mentions this issue: runtime,cmd/cgo: simplify C -> Go call path

@alexbrainman , I just ran into a bit of a snag on Windows that maybe you can better inform me on. It looks like syscall.NewCallback is going to also need an implementation of the Go internal ABI rules and I'm trying to figure out exactly what that needs to look like. (I pinged you because it looks like you last substantially touched that code, but feel free to pass the ping.)

The documentation says "The argument is expected to be a function with one uintptr-sized result. The function must not have arguments with size larger than the size of uintptr." Does this actually mean the arguments (and result) must be an integral, floating-point, or pointer type? Or does it mean, say, struct { x, y uint16 } or struct { x [4]byte } is allowed, since those are technically no larger than the size of a uintptr? The current implementation permits the latter, though I don't know if it actually works.

If they must be int/float/pointer, then mapping from the Windows ABI to the Go register ABI is relatively straightforward. If compounds are allowed, then this gets much harder.

cc @jstarks re the ABI question.

@aclements I reckon there are very few instances of syscall.NewCallback usage. But they are used a lot.

If you want to build Windows GUI, you have to use syscall.NewCallback, because Windows GUI API requires syscall.NewCallback.

Similarly, if you want to build Windows service, you have to use syscall.NewCallback.

These two pretty much cover syscall.NewCallback usage.

For Windows service, you can look in golang.org/x/sys/windows/svc. It uses this RegisterServiceCtrlHandlerEx WIndows API

https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-registerservicectrlhandlerexw

which requires LPHANDLER_FUNCTION_EX

https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nc-winsvc-lphandler_function_ex

so LPHANDLER_FUNCTION_EX returns DWORD (uint32).

For GUI grep in golang.org/x/exp/shiny - it uses RegisterClass

https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-registerclassw

which uses WNDCLASSW.lpfnWndProc

https://docs.microsoft.com/en-us/windows/win32/api/winuser/ns-winuser-wndclassw

which is WNDPROC

https://docs.microsoft.com/en-us/previous-versions/windows/desktop/legacy/ms633573(v=vs.85)

which returns LRESULT which is LONG_PTR or uintptr (in Go)

https://docs.microsoft.com/en-us/windows/win32/winprog/windows-data-types

Also I recommend you look at github.com/lxn/walk for their syscall.NewCallback usage. This is an ultimate package to create GUI apps. But, I doubt, you will find different usage than golang.org/x/exp/shine. (I did not look myself)

I reckon, that is about it. For Microsoft APIs (maybe google code for syscall.NewCallback or windows.NewCallback).

But some people might use it in their custom / proprietary interface. I am not sure how these are supported by Go. You will be the judge.

Alex

IIUC this example in the microsoft docs shows that compounds are allowed:

struct Struct2 {
   int j, k;    // Struct2 fits in 64 bits, and meets requirements for return by value.
};
Struct2 func4(int a, double b, int c, float d);
// Caller passes a in RCX, b in XMM1, c in R8, and d in XMM3;
// callee returns Struct2 result by value in RAX.

Change https://golang.org/cl/262197 mentions this issue: go/analysis/passes/asmdecl: add support for ABI selector clauses

Change https://golang.org/cl/262319 mentions this issue: reflect,runtime: use internal ABI for selected ASM routines

Change https://golang.org/cl/262317 mentions this issue: cmd/dist,cmd/go: broaden use of asm macro GOEXPERIMENT_REGABI

Change https://golang.org/cl/259445 mentions this issue: cmd/compile,cmd/link: initial support for ABI wrappers

Change https://golang.org/cl/260477 mentions this issue: cmd/asm: allow def/ref of func ABI when compiling runtime

Change https://golang.org/cl/262318 mentions this issue: cmd: go get golang.org/x/tools@d1624618 && go mod vendor

Change https://golang.org/cl/262117 mentions this issue: cmd/compile: delay expansion of OpArg until expand_calls

Change https://golang.org/cl/249458 mentions this issue: cmd/compile: avoid generating CSEs; do all aggregates; maintain debug names

Change https://golang.org/cl/263271 mentions this issue: runtime: fix sub-uintptr-sized Windows callback arguments

Just an update on this: we were hoping to land this for 1.16, but that was always going to be a bit of a stretch, and at this point we won't be able to make 1.16. Since we're making good progress and have a lot of work in flight, the current plan is to branch when the freeze comes and finish up the work there. Landing this at the beginning of 1.17 isn't the worst thing since that will give it lots of soak time. There's some concern about the risk of landing both this and generics in 1.17, but I believe they will be fairly orthogonal and the nature of the register ABI work is such that we should be able to test it pretty thoroughly and things are likely to fail loudly if there are bugs.

There's some concern about the risk of landing both this and generics in 1.17

1.17 -> 2.0?

It sounds like everyone involved is on board for doing this in Go 1.17.
Adding to proposal minutes for wider visiblity, but seems headed for likely accept.

An incomplete and evolving list of tasks:

@aclements Consider adding a bullet point tracking the implementation of DWARF locations for arguments and results (contained in registers). Having good support for Go calling conventions in GDB and Delve is well worth a tracking bullet point :)

This work is already outlined in https://go.googlesource.com/proposal/+/master/design/40724-register-calling.md

DWARF locations: The compiler will need to generate DWARF location lists for arguments and results. It already has this ability for local variables, and we should reuse that as much as possible. We will need to ensure Delve and GDB are compatible with this. Both already support location lists in general, so this is unlikely to require much (if any) work in these debuggers.

@mewmew , thanks, I've added some bullets to the list for this. Hopefully there's nothing to actually do here, but you're right it's good to track that we need to make sure this works.

Change https://golang.org/cl/258137 mentions this issue: reflect,runtime: support amd64 register ABI for reflect.Call

Change https://golang.org/cl/264437 mentions this issue: internal/abi: add new internal/abi package for ABI constants

Change https://golang.org/cl/264438 mentions this issue: runtime: support register ABI for finalizers

Based on the discussion above, this seems like a likely accept.

Change https://golang.org/cl/266638 mentions this issue: reflect,runtime: use internal ABI for selected ASM routines, attempt 2

No change in consensus, so accepted.

Change https://golang.org/cl/270077 mentions this issue: runtime: simplify C -> Go callbacks on windows/arm

Change https://golang.org/cl/270863 mentions this issue: [dev.regabi] cmd/compile,cmd/link: initial support for ABI wrappers

Change https://golang.org/cl/271178 mentions this issue: runtime: support new callbackasm1 calling convention on windows/arm

Change https://golang.org/cl/272570 mentions this issue: runtime: support register ABI for finalizers

Change https://golang.org/cl/272567 mentions this issue: internal/abi: add new internal/abi package for ABI constants

Change https://golang.org/cl/272568 mentions this issue: reflect,runtime: support amd64 register ABI for reflect.Call

Was this page helpful?
0 / 5 - 0 ratings