Go: proposal: unsafe: clarify unsafe.Pointer rules for package syscall

Created on 3 Oct 2019  ·  28Comments  ·  Source: golang/go

unsafe.Pointer's rule 4 says:

Conversion of a Pointer to a uintptr when calling syscall.Syscall.

The Syscall functions in package syscall pass their uintptr arguments directly to the operating system, which then may, depending on the details of the call, reinterpret some of them as pointers. That is, the system call implementation is implicitly converting certain arguments back from uintptr to pointer.

It talks about "the Syscall functions", but on Windows, there's also Proc.Call and LazyProc.Call.

Q1: Are these two functions "Syscall functions"? Strictly speaking, I would interpret "Syscall functions" to mean Syscall{,6,9,12,15,18}. Perhaps the docs should be clarified.

Q2: Do functions have to be called directly, or are indirect calls allowed? The rule explicitly warns that conversions have to be performed directly in the argument list, but it doesn't say anything about indirect calls.

Q3: For Proc.Call and LazyProc.Call, are direct calls via method expressions allowed? E.g., if x.Call(uintptr(p), 0, 0) is valid, then is (*Proc).Call(x, uintptr(p), 0, 0) also valid?

Clarifying this is relevant to determining how far we need to go in fixing #34474.

Incidentally, Q2 and Q3 also apply to rule 5. The status quo there is that cmd/compile safely handles unsafe.Pointer(f(...)) for all f(...), so we do allow indirect and method expression calls to reflect.Value.Pointer and reflect.Value.UnsafeAddr.

/cc @rsc @ianlancetaylor

Proposal Proposal-Hold

All 28 comments

The part you quoted is the rationale. The real rule is in the part you didn't quote:

The compiler handles a Pointer converted to a uintptr in the argument list of a call to a function implemented in assembly by arranging that the referenced allocated object, if any, is retained and not moved until the call completes, even though from the types alone it would appear that the object is no longer needed during the call.

For the compiler to recognize this pattern, the conversion must appear in the argument list:

My answers, such as they are, to your questions.

Q1: Are these two functions "Syscall functions"? Strictly speaking, I would interpret "Syscall functions" to mean Syscall{,6,9,12,15,18}. Perhaps the docs should be clarified.

Per above, the rule applies to any function implemented in assembly. syscall.Proc.Call and syscall.LazyProc.Call are not implemented in assembly, and, as such, this rule does not apply to them.

Q2: Do functions have to be called directly, or are indirect calls allowed? The rule explicitly warns that conversions have to be performed directly in the argument list, but it doesn't say anything about indirect calls.

I think the functions have to be called directly, and we should update the unsafe package docs to say that.

Q3: For Proc.Call and LazyProc.Call, are direct calls via method expressions allowed? E.g., if x.Call(uintptr(p), 0, 0) is valid, then is (*Proc).Call(x, uintptr(p), 0, 0) also valid?

I suppose that first we have to come up with a rule for Proc.Call and LazyProc.Call.

When I introduced go:uintptrescapes for #16035, my thinking was basically that there was no general rule that could make that code work, but since it already existed and people were using it we should at least try to keep those programs working. It wasn't intended to introduce a new general unsafe.Pointer exception, just one for those two methods.

The part you quoted is the rationale. The real rule is in the part you didn't quote:

Hm, I've always interpreted the first line as the rule, and everything below as the rationale, restrictions, and details. For example, that rule 4 is you can convert unsafe.Pointer-to-uintptr when calling syscall.Syscall, and the part you describe as the real rule is just an explanation of the implementation details about how the compilers implement this today (i.e., that syscall.Syscall are implemented as assembly functions, and the compilers generically handle calls to assembly functions this way).

If rule 4 is in fact that all assembly functions should be handled that way (with syscall.Syscall as just a special case thereof), I think we should reword the first sentence to better emphasize that.

Yes, my interpretation is that this applies to all functions written in assembly, since the rule clearly applies to more than the single function syscall.Syscall. But perhaps I am mistaken.

I think making //go:uintptrescapes work correctly for these functions is probably the best way to go, since there really isn't a general rule, it depends on semantics of the OS call.

@beoran The issue here is we've never defined exactly what it means for //go:uintptrescapes to "work correctly". As far as I can tell, it's not documented anywhere (e.g., neither cmd/compile's godocs, nor src/runtime/HACKING.md).

I've generally been under the assumption that (1) rule 4 is meant to apply specifically to package syscall's functions like Syscall, RawSyscall, Proc.Call, and LazyProc.Call, and (2) //go:uintptrescapes and cmd/compile's semantics for calling assembly functions are implementation details about how we guarantee rule 4 today.

For example, I'm under the impression that while today users can write their own assembly functions that pass pointers as uintptr or //go:uintptrescapes-annotated Go functions that do the same, we do not guarantee this to work in the future and reserve the right at any point to break this (as long as package syscall's Syscall functions continue to work correctly).

However, it seems at least @ianlancetaylor had a different interpretation.

My thoughts, not definitive:

Q1: It seems like the answer must be yes or else those functions are impossible to use safely at all. That's the same reason we answer yes to syscall.Syscall.

Q2: It would be nice if indirect calls worked, but it seems like they probably can't be made to work without significant overhead, and it also seems like they don't work today. If both those are true, then probably we should say that indirect calls don't apply.

Q3: In general there's almost zero difference between x.M() and T.M(x), so I don't see why we'd introduce a difference here.

So I guess I'm suggesting yes/no/yes.

Thanks for sharing your thoughts, @rsc.

Q2: It would be nice if indirect calls worked, but it seems like they probably can't be made to work without significant overhead, and it also seems like they don't work today. If both those are true, then probably we should say that indirect calls don't apply.

It should only introduce overhead for indirect calls involving unsafe.Pointer->uintptr conversions as arguments, and I don't think those are very common. So I don't expect the overhead would be significant, but I'll measure this.

Q3: In general there's almost zero difference between x.M() and T.M(x), so I don't see why we'd introduce a difference here.

That's my tentative position as well. The two counter arguments here though are (1) historically it hasn't worked; and (2) package unsafe imposes its own set of rules, and maybe we think T.M(x) is uncommon enough to not allow it.

Across the standard library and Kubernetes, the only indirect call that involves a unsafe.Pointer->uintptr conversion is here:

https://github.com/golang/go/blob/32b6eb80fc727b852965b17a63c83b4c5dab2973/src/syscall/syscall_unix.go#L95-L98

(And the same code appears in x/sys/unix.)

For this particular call, the overhead would amount to an extra stack slot and an instruction to populate it.

Change https://golang.org/cl/200137 mentions this issue: cmd/compile: warn about indirect calls with unsafe.Pointer->uintptr conversions

I think //go:uintptrescapes would be nice to have to make indirect calls as in Q2 possible. It makes it easier to keep Go code that interests with the operating system well factored and easier to read.

How do you plan to fix the indirect calls?

How do you plan to fix the indirect calls?

Conservatively assume any indirect function calls might be to an assembly function or a function annotated with //go:uintptrescapes, and handle unsafe.Pointer->uintptr conversions passed to their uintptr-typed arguments appropriately. See CL 200137 for details.

I think //go:uintptrescapes would be nice to have to make indirect calls as in Q2 possible. It makes it easier to keep Go code that interests with the operating system well factored and easier to read.

Sorry, I don't understand what you're suggesting here.

In general, users shouldn't have to worry about //go:uintptrescapes. It's an implementation detail about how we make sure package syscall works.

Go doesn't support users writing their own functions that accept or return pointers using uintptr-typed parameters.

I am currently writing a game library in Go that avoids cgo and calls the
OS directly, certainly on Linux thanks to it's stable kernel API. (
https://gitlab.com/beoran/galago
https://gitlab.com/beoran/galago/blob/master/os/linux/input/input_linux.go
)

So, I have functions like these:

// Icotl performs an ioctl on the given device
func (d * Device) Ioctl(code uint32, pointer unsafe.Pointer) error {
fmt.Printf("ioctl: %d %d %d\n", uintptr(d.Fd()), uintptr(code),
uintptr(pointer))
_, _, errno := syscall.Syscall(
syscall.SYS_IOCTL,
uintptr(d.Fd()),
uintptr(code),
uintptr(pointer))
if (errno != 0) {
return errno
}
d.KeepAlive()
return nil
}

Notice the KeepAlive? It would be great if it wasn't needed and I could
instruct the compiler that everything escapes, maybe with the mentioned
pragma.

Please don't underestimate us "normal" Go users, we too need to perform low
level system calls for certain uses. To boldly Go where no one has used Go
before. :-)

Notice the KeepAlive? It would be great if it wasn't needed and I could instruct the compiler that everything escapes, maybe with the mentioned pragma.

I agree it would be nice if users didn't have to manually insert runtime.KeepAlive calls when using os.File.Fd. Or if there was at least vet/lint tooling to suggest when they probably need it.

However, file descriptors are not pointers, so the unsafe.Pointer safety rules do not apply to os.File.Fd. You're suggesting entirely new functionality, whereas this issue is about clarifying corner cases of existing functionality.

I recommend filing a new feature request / proposal issue if you'd like to pursue that idea.

@mdempsky, it sounds like you are suggesting that the
"Conversion of a Pointer to a uintptr when calling syscall.Syscall."
section effectively be retitled to
"Conversion of a Pointer to a uintptr when calling a function."
and that after such a function call (any call at all) there would be a keepalive of the pointer inserted immediately after the return from the call. (And whatever pinning is needed during the call.)

Do I have that right?

(You didn't specifically say "any call", but it seems like if you are going to do a fixed set of specific calls as well as _all_ indirect calls, you might as well just complete the set and do all calls. Otherwise direct calls are somehow disadvantaged compared to indirect calls.)

@rsc I think that's close, yes.

I would say my suggestion was specifically that the conversion is safe when "calling (directly or indirectly) to a fixed set of specific functions".

The particular implementation detail today would be that we handle all indirect calls as though they might be to one of those functions (which has very few false positives), but compiler optimizations might later let us rule some of those out. E.g., calls to non-exported interface methods (outside of package syscall) can never be one of those specific functions; moving escape analysis to SSA might allow us to see that the set of possible called functions at a call-site is disjoint from the set of specific functions; or a Go JIT that does dynamic monomorphic call optimization might know the target function isn't one of those specific functions.

I think further simplifying to "when calling a function" is reasonable and has merits, but that's not specifically my suggestion. My only concern would be this might introduce more unnecessary KeepAlive calls in current code; I'll measure this.

I am definitely in favour of simplifying this to "when calling a function", as this makes low level programming that much easier. As for unnecessary KeepAlive calls, in this case, I thought KeepAlive is always needed, only now it sometimes accidentally seems to work without the call.

I am definitely in favour of simplifying this to "when calling a function", as this makes low level programming that much easier.

Note that even if we generalize to "when calling a function", to allow users to write their own functions that accept pointers as uintptr-typed parameters we'll still need to specify how those uintptr parameters can actually be used by the function.

Currently we side step that because only the standard library contains this code, and we can ensure it's kept in sync with the compiler's own conventions (e.g., using undocumented compiler directives).

Change https://golang.org/cl/198043 mentions this issue: cmd/compile: fix //go:uintptrescapes for basic method calls

Change https://golang.org/cl/205244 mentions this issue: [release-branch.go1.13] cmd/compile: fix //go:uintptrescapes for basic method calls

It sounds like the "any function" rules may be fine but we are waiting on @mdempsky to confirm that there's not unexpected overhead. I'm going to tag this Proposal since it is a real (if small) change to the effective language.

Putting this proposal on hold until @mdempsky has had a chance to check the overheads.
Matthew, please feel free to remove the label when you are ready. No hurry. Thanks.

unsafe.Pointer's rule 4 says:

Conversion of a Pointer to a uintptr when calling syscall.Syscall.
The Syscall functions in package syscall pass their uintptr arguments directly to the operating system, which then may, depending on the details of the call, reinterpret some of them as pointers. That is, the system call implementation is implicitly converting certain arguments back from uintptr to pointer.

It talks about "the Syscall functions", but on Windows, there's also Proc.Call and LazyProc.Call.

This stung me too.

I had code wrapping (*syscall.LazyProc).Call(a ...uintptr) that logged the arguments called, and the values returned. I hadn't paid close attention to unsafe.Pointer rule 4 since I was not calling syscall.Syscall but rather LazyProc.Call.

I think if the documentation had been more explicit that other functions/methods also had special treatment, I might have spotted the issue sooner.

Although this was a genuine bug in my code, I was fortunate that the former escape analysis coincidentally worked in my favour and kept the values live. It appeared as an intermittent issue when upgrading to go 1.13. From bisecting, I discovered the failure started occurring in 996a687ebbf81b26f81b41b8e62ef21d8b0826af when the escape analysis was cleaned up.

What would be the correct / supported approach for logging values in/out of (*syscall.LazyProc).Call(a ...uintptr)?

With the existing wrapper, I'm aware I could add the //go:uintptrescapes pragma to the wrapped (*syscall.LazyProc).Call(a ...uintptr) method. However, I'm also aware due to https://github.com/golang/go/issues/23045#issuecomment-350275695 that this is an unsupported / undocumented feature, so I'm open to suggestion. I'm not sure if runtime.KeepAlive can help here, since the wrapper method received uintptr values. To get around this, one option might be to adapt it to take interface{} values instead, and change the call sites to not convert to uintptr first. However, some inputs may require uintptr(unsafe.Pointer()) conversions, whereas others (such as uint32 values) may just require e.g. uintptr(), so implementing the conversion directly in the wrapper function seems nontrivial. Perhaps a solution exists using the reflect package, but it starts getting pretty complicated...

It feels like logging inputs/outputs of syscalls should be something doable in the langauge without requiring the use of an undocumented feature, so I wanted to check if I'm missing something obvious, or what the recommended approach should be. Many thanks!

Offhand I don't think there is a safe way to interpose (*syscall.LazyProc).Call. That's unfortunate, but it's not so bad: the restriction only applies to a couple of functions.

Using runtime.KeepAlive doesn't work in general because that only keeps the value alive. What is needed here is something more: the value must not move. And if the value is on the stack, runtime.KeepAlive doesn't guarantee that.

Offhand I don't think there is a safe way to interpose (*syscall.LazyProc).Call. That's unfortunate, but it's not so bad: the restriction only applies to a couple of functions.

Many thanks for the prompt reply.

Would there be value in making //go:uintptrescapes a supported/documented feature?

One guideline for adding yet another feature is whether it is possible to write simple succinct documentation for that feature that any programmer can understand. I don't think //go:uintptrescapes meets that bar.

I respectfully disagree. I believe not all Go programmers need to know all Go language features. The common high level features of the language need to be widely understood, yes, and need therefore need to be easy to understand and easy to explain.

However most "advanced", or low level features, like unsafe.Pointer, syscalls, //go:uintpointerescapes, etc, are not for general use by Go programmers, but for use by those of us who want to integrate Go with OS kernels, existing C libraries, work on compilers or kernels in Go language, etc. But, the problem is that C libraries, kernels, etc, are not easily explained in any succinct way, and it takes time and effort to learn about these advanced topics. The best the Go team can do here is to provide a few pointers to the information available on the internet or in books, and explain the advanced features of Go in detail so they can be learned smoothly by those of us with the special interests mentioned.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

natefinch picture natefinch  ·  3Comments

stub42 picture stub42  ·  3Comments

rakyll picture rakyll  ·  3Comments

mingrammer picture mingrammer  ·  3Comments

myitcv picture myitcv  ·  3Comments