Go: proposal: runtime: allow termination of blocked syscalls

Created on 27 Aug 2020  路  23Comments  路  Source: golang/go

Many syscalls may block indefinitely, for example file ops trying paths on a network filesystem or peripheral device that's unavailable. Such syscalls can often be terminated, but Go is unable to do so at present. So today, Go apps...

  • can't retry a dir, file, etc while it's unavailable without leaking an OS thread on every attempt.
  • can't let you abandon a stalled task.
  • may return immediately to a stalled state after abort and restart.

CIFS (#39237, #38836) and FUSE (#40846) support this by returning EINTR instead of restarting after a signal. NFS on Linux used to do the same, but dropped it a while ago. So not all hung syscalls can be terminated gracefully.

Rust used to retry syscalls on EINTR, but dropped the practice outside of some high-level APIs.

The runtime should provide a way to terminate blocked syscalls. On unix, this entails sending the blocked thread a signal. Windows has an analogous mechanism, CancelSynchronousIo(). See also
https://docs.microsoft.com/en-us/windows/win32/fileio/canceling-pending-i-o-operations.

The solution must not add context.Context variants of all stdlib APIs that make blocking syscalls. Mandating Context arguments for termination would force them into third party package APIs, and code importing those packages would then be broken. If a package author did not amend its API, the callers would have to manage stalled ops some other way, and likely leak resources on every op retry.

I think the simplest way to allow this (other ideas welcome) is to asynchronously post a threadId & metadata to the app (edit: or an internal table) immediately before and after trying a blocking syscall.

The following API could be introduced as experimental (edit: or internal). If we also want cancellable variants of stdlib APIs, Add/DropSyscallPost() could take an argument limiting its scope to the current goroutine, for use by the variants.

A file-oriented variation of this appears in https://github.com/golang/go/issues/41054#issuecomment-692267226. A runtime-internal variation is suggested in https://github.com/golang/go/issues/41054#issuecomment-683155474.

package runtime

// A callback typically implemented to maintain a table of ThreadIds and syscall metadata.
// The name is OS-specific; if "-", thread has completed its syscall.
type PostSyscall func(thread ThreadId, name string, args ...interface{})

// Notify the app of blocking syscalls, by invoking post asynchronously. Each function added is invoked.
// An error results if post is already known.
func AddSyscallPost(post PostSyscall) error

// Stop notifications via post.
// An error results if post is unknown.
func DropSyscallPost(post PostSyscall) error

// Try to terminate a syscall blocked in thread.
// An error results if thread is invalid.
func TerminateSyscall(thread ThreadId) error

// The error returned by syscall after termination.
// The caller should retry the syscall if TerminateSyscall() wasn't called for this thread.
type InterruptError struct { thread ThreadId }

Discussion of this began (more or less) with https://github.com/golang/go/issues/40846#issuecomment-676777624

__Changelog__

_27-Aug:_ add InterruptError and improve PostSyscall docs

Proposal

Most helpful comment

I am strongly opposed to any form of calling back to user code while deep in invoking a syscall. Even if we carefully write down rules as to what is permitted, people will inevitably write code that breaks those rules, and we will be tied to what people write. We went through years of pain with cgo vs. the garbage collector, and that was in an area where we had explicitly said there was no compatibility guarantee.

That said, perhaps this API could use a channel.

Similarly, the notion of marking the API is experimental is a non-starter. Once people start using an API, it is fixed. We can not break people once they have started using some feature. That's not how Go works.

The app doesn't use context, because it doesn't need to cancel certain requests while others continue.

Nevertheless, I believe that context.Context is the right approach for canceling operations in Go. I don't think it's necessary to introduce another approach.

All 23 comments

cc @bcmills @aclements @prattmic
@tmm1 @dsoprea @rfjakob

I don't see how it could work to call the function asynchronously. It would be possible to see the DropSyscallPost before the AddSyscallPost.

Also, not all system calls involve the file system.

For any given ThreadId, the sequence is synchronous:

a) syscall.Stat(filename)
b) runtime start/resume thread
b1) post(threadId, "stat", filename) ...
b2) enter syscall
b3) post(threadId, "-") ...
c) syscall.Stat returns

I used _asynchronously_ to indicate that the post() happens after (a) and before (b2), and again before (c). Also post() from different threads can be concurrent. Perhaps it's the wrong term.

We're concerned with all blocking syscalls, filesystem and otherwise. Some can't be terminated; that's OK.

I don't think we can synchronously call a user function as we are calling a syscall. At the moment of calling a syscall we are in a fairly restricted execution envrionment. Calling user code at that point seems like a footgun.

(I thought that by "asynchronously" you meant that the user code would run in a separate goroutine. I think that could be done safely. I don't think it would be safe to call a user function and wait for it to return.)

I don't really understand why this API would be necessary in the runtime proper. Isn't this already possible to implement as a third-party library today, using runtime.{Lock,Unlock}OSThread(), EINTR, and either golang.org/x/sys/unix.Tgkill or C.pthread_kill?

Oh, I see. TerminateSyscall could be implemented as a library for syscalls that opt in to termination. This proposal would allow the program to (attempt to) terminate syscalls that have _not_ opted in.

I think that would be a mistake: terminating a syscall that has not explicitly opted in to termination might just send it back through a retry loop, resulting in live-lock if the PostSyscall callback keeps trying to terminate it.

Ian, since this is a low-level API, let's define what is safe in a function called from the restricted syscall environment. The point of post() is to examine the syscall's name and arguments, and possibly add its threadId & metadata to a table. It's not meant to be an all-purpose tool.

Let's also provide a PostSyscall callback implementation that works out of the box to track active syscalls, and serves as a template for custom functions.

@bcmills the callback can't terminate anything; it simply provides data to the app for a later TerminateSyscall() attempt. The stdlib EINTR retry loops are disabled by AddSyscallPost() because the syscall's caller must see an InterruptError and retry unless the app tried TerminateSyscall() for that threadId. Other retry loops should be unaffected.

What do you mean by _syscalls that opt in to termination_?

EDIT: We'd presumably install signal handlers without SA_RESTART, to make more syscalls interruptible. That might require retry loops around non-blocking syscalls, or a sigmask to protect those threads.

After catching up on https://github.com/golang/go/issues/40846, it's not clear to me where this API comes from? It seems like that thread was slowly converging around having versions of os.Stat, etc which take a Context that interrupts the system call when cancelled. Lots of details to be worked out, but a pretty clear API.

On the other hand, I'm worried that this API is both very low-level, and difficult to use. In particular:

  1. What do we do if multiple packages want to register different handlers for their own filesystem operations?
  2. How does PostSyscall differentiate different callers of the same syscall? e.g., two different goroutines may call stat("foo") that need completely different cancellation semantics.

@ianlancetaylor I think calling user code could be generally safe so long as this is limited to standard library calls like os.Stat/Readdir, etc, since those aren't used in any fragile states within the runtime itself. If this is intended to wrap _every_ syscall, then we're definitely going to have many cases where it is nearly impossible to call user code (e.g., mmap call trying to allocate more stack space in morestack).

@networkimprov

What do you mean by _syscalls that opt in to termination_?

I mean syscalls invoked from a setting where the caller is prepared to do something reasonable (other than just retrying) in case of EINTR (or InterruptError).

Experience with Java's InterruptedException has taught me that a mechanism for signaling interruption (such as EINTR) is not particularly useful without some accompanying explanation of the reason for the interruption and/or the expected response to the interruption. In Go's case, the mechanism for that accompanying explanation is context.Context.

@prattmic thanks for digging in! There were two use cases raised in the previous thread:
a) terminating a specific syscall associated with a context, and
b) terminating all syscalls trying files within a tree that has become inaccessible.

This API can serve both cases. Re context, __replicating most stdlib APIs__ that make a blocking syscall is a terribly tall order and not clearly necessary. This is a simple API to fix an oversight in the Go runtime; and I suggested that it debut as "experimental".

Re multiple packages, I considered that! Multiple PostSyscall callbacks can be installed. That's why it's Add.../Drop... instead of Enable.../Disable...

Re different callers, concurrent syscalls have different ThreadId values. The callback returns both ThreadId and syscall details.

The callbacks needn't wrap every syscall, only blocking syscalls that can be terminated without undermining the runtime.

@bcmills InterruptError does not occur by default. AddSyscallPost() opts in to InterruptError and allows discerning interrupts due to TerminateSyscall() vs async preemption. I've considered context and accommodated it, but a large new API is not essential, whereas allowing the runtime to terminate stalled syscalls is.

For background, I'm building a desktop app for Windows, MacOS, and Linux, which makes heavy use of the filesystem. I have no control over the kind of storage used. The app (via timeout or user intervention) may need to terminate all ops within a filesystem tree. The app doesn't use context, because it doesn't need to cancel certain requests while others continue.

I am strongly opposed to any form of calling back to user code while deep in invoking a syscall. Even if we carefully write down rules as to what is permitted, people will inevitably write code that breaks those rules, and we will be tied to what people write. We went through years of pain with cgo vs. the garbage collector, and that was in an area where we had explicitly said there was no compatibility guarantee.

That said, perhaps this API could use a channel.

Similarly, the notion of marking the API is experimental is a non-starter. Once people start using an API, it is fixed. We can not break people once they have started using some feature. That's not how Go works.

The app doesn't use context, because it doesn't need to cancel certain requests while others continue.

Nevertheless, I believe that context.Context is the right approach for canceling operations in Go. I don't think it's necessary to introduce another approach.

Ian, on a related note, it seems that internal/poll.fcntl didn't get an EINTR loop:

...

Also, it might be worth a comment with this link on poll.CloseFunc, to note that errors don't mean failure, so retry is incorrect.

EDIT: Filed #41115

I didn't add an EINTR loop to fcntl because those are kernel operations that I would not expect to reach out to the file system. If I'm wrong about that, please do file a new issue. Thanks.

I guess we could add a comment to (*FD).destroy if we really wanted to. It seems kind of fussy, though. I wouldn't expect random changes in this area.

So right now, Go apps...

  • can't retry a dir, file, etc while it's unavailable without leaking an OS thread on every attempt.
  • can't let you abandon a stalled task.
  • may return immediately to a stalled state after abort and restart.

Can't we fix this problem? And without a major stdlib expansion?

Can we prototype an _internal_ runtime solution? It would underpin a future public API, and in the interim, projects that desperately need this bugfix can obtain it via //go:linkname. Perhaps:

trackPendingSyscalls()              // stdlib APIs may return OS interrupt errors
getPendingSyscalls() syscallTable   // copy the tracking data
terminatePendingSyscall(t threadId) // not guaranteed to cause interrupt error

Thanks for the channel suggestion; but isn't 2 send + 2 recv operations per syscall rather expensive?

You should of course feel free to prototype whatever you like, but I would argue against any implicit approach. I think that a cancelable syscall should be clearly cancelable.

For the reference: on Linux with io_uring practically any system call request can be canceled:

https://kernel.dk/io_uring-whatsnew.pdf -> IORING_OP_ASYNC_CANCEL

This is incredibly invasive and breaks essentially all the abstractions that the Go runtime works hard to establish. I have a hard time seeing why we would do that.

There may be a problem to solve here, but the answer can't be tearing down all the abstractions we have. Are you going to let the user TerminateSyscall in the middle of the runtime asking the operating system for more memory? It's just going to lead to an incredible number of subtle bugs.

Golly, such strident critique; it seems I touched a nerve :-)

This needn't be invasive. The app only needs to know about (and interrupt) its own syscalls -- not the runtime's! Re "subtle bugs", a terminated syscall returns an error indicating interruption; how is that subtle?

The runtime is missing an important abstraction, the "pending system operation". Let's hear more ideas to halt stalled or runaway pending system ops!

To date, the only other suggestion is to replicate dozens of stdlib APIs to take a Context argument. That's a huge stdlib expansion, and would require all packages that call any such API to add Context arguments as well. It's just not viable.

Added to issue text:
The solution must not add context.Context variants of all stdlib APIs that make blocking syscalls. Mandating Context arguments for termination would force them into third party package APIs, and code importing those packages would then be broken. If a package author did not amend its API, the callers would have to manage stalled ops some other way, and likely leak resources on every op retry.

Here is a high-level API, adapted from #41249. Tracked ops could be limited to file syscalls.

package os

TrackPending(on bool)                     // turn on/off tracking; callable by multiple packages

ListPending(kind uint) []PendingOp        // return recently pending ops
                                          // takes a set of |'d values, e.g. OpKindTop | OpKindFile
func (f *File) ListPending() []PendingOp  // return recently pending ops on the file

Interrupt(op ...PendingOp) error          // interrupt one or more ops, if still pending

type PendingOp interface {
   Kind() uint
   Fn() string                            // "Open" etc
   Params() []interface{}                 // all arguments
   Pathname() string                      // empty if not applicable
   String() string                        // summary suitable for logging
}

type InterruptError struct {...}          // returned by interrupted ops

const (
   OpKindFile uint = 1 << iota
   ...
)

Typical usage:

os.TrackPending(true)
...
myfile, err := os.Open("myfile")
...
if problems {
   os.Interrupt(myfile.ListPending()...)
}

This proposal remains a non-starter for the reasons given above.
This is a likely decline.

@rsc, there is a problem here, Go apps...

  • can't retry a dir, file, etc while it's unavailable without leaking an OS thread on every attempt.
  • can't let you abandon a stalled task.
  • may return immediately to a stalled state after abort and restart.

And #41414 restates the problem as goroutine leaks due to stalled I/O.

Is the Go team willing to explore solutions to this problem?

There may be a problem here but it doesn't seem like a new problem or an urgent one.

As discussed above, I don't think that this specific proposal is a good fix for that problem.

No change in consensus, so declined.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

bradfitz picture bradfitz  路  3Comments

natefinch picture natefinch  路  3Comments

rsc picture rsc  路  3Comments

enoodle picture enoodle  路  3Comments

bbodenmiller picture bbodenmiller  路  3Comments