Related to #18507, it would be nice if the io
interface methods all took in Context
as their first parameter. This allows a cancellation signal to be attached directly to the I/O calls that are using them, instead of requiring an out-of-band method, as in net.Conn
.
This is obviously a backward-incompatible change and would have broad impact on the ecosystem. I wanted to open this as a proposal for consideration in Go 2.
EDIT: See below for the more backward-compatible proposal
EDIT 2: I've written an experience report/blog post detailing the background for this proposal: Canceling I/O in Go Cap’n Proto. Feedback and alternatives welcome.
In addition to all of the Reader-like and Writer-like interfaces, we should also do this to io.Closer
as close operations frequently involve one last set of write operations before closing the handle.
It's less obvious whether this should be done to io.Seeker
(but perhaps we even remove that in Go2 #17920)? Oftentimes the implementation of io.Seeker is as simple as changing some file offset in the internal data structure, and shouldn't block.
Sorry to hijack, but if we get to this point, then Context
should probably be rethought to be a feature of the language. The current io
interfaces are very nice because they're very easy to work with yet very powerful and composable. Adding the complexity of correctly handling a context for all use cases seems overkill. If we want all goroutines to be correctly cancelable no matter what, then this is possibly better handled by the runtime.
@rasky, let's file a seperate issue for Context being part of the language.
Other than needing to do extra typing to add the argument for the context, there is not that much complexity. Compute-only Readers like compression doesn't even need to inspect the context. All it needs to do is plumb it through to the underlying io.Reader. This does not seem particularly complicated.
It's real simple to make an io.Reader
(or io.Writter
) that wraps an existing io.Reader
plus a context (including checking if the the context has a deadline and the reader implements SetReadDeadline
). If anything, just add such wrappers to io/ioutil
rather than mucking with the existing io.Reader
and io.Writer
interfaces.
I don't bother be rude: the author is network monkey.
File IO on Linux is blocking. Imagine running another OS thread just to comply the functionality.
And I believe the problem of goroutine cancellation must be solved via the runtime tricks, because it only happened as a consequence of mismatch between synchronous representation of asynchronous environment, so let they introduce another tool to access the functionality that is out of scope now due to the mismatch.
@dsnet and I came up with a way this could be introduced gradually. Instead of modifying the interfaces directly, taking the database/sql approach:
package ctxio // package "io/ctxio"
import (
"context"
"io"
)
type Reader interface {
ReadCtx(context.Context, []byte) (int, error)
}
// BindReader returns an io.Reader that reads from r using the ctx Context.
// (Name is bikeshed-able.)
func BindReader(ctx context.Context, r Reader) io.Reader
// PromoteReader returns a Reader that reads from r.
// It will ignore any Context passed to the Read calls.
// (Name is bikeshed-able.)
func PromoteReader(r io.Reader) Reader
// and so on...
By having this be a different interface, existing concrete types (like *os.File
) could implement both interfaces and this could be a backward-compatible change.
@dchapes: Yes, I have done this before. Where SetReadDeadline
fails is if the Context
has a manual cancel signal (perhaps triggered by SIGINT or some such), which at least in most code that I write, comes up more frequently. Readers and writers also end up being the hardest place to correctly propagate Contexts (especially values) for readers and writers that are shared amongst multiple contexts.
@sirkon raises a good point which would be the logical follow-on: it would be wonderful if *os.File
and *net.TCPConn
et al supported cancel signals at the scheduler level. Hand-waving, the only parts of Context that the scheduler would need to receive would be the Done()
channel and the deadline, which could keep the runtime decoupled from the context
package.
@zombiezen
Try to read about blocking and non-blocking IO at least at the API level. TCPConn's socket is already non-blocking in Go and it is only needed to "just" add another coupled file descriptor (timer) into the event loop to control exit. "Just" because it is yet another syscall and syscalls are expensive.
On files: asynchronous file IO don't work well in Linux and turn to be blocking for heavy loads. So, you need to launch thread per file to guarantee asynchronous file reads/writes (+ scheduling thread, although it is already there).
These were just technical remarks.
And it, finally, the idea stinks. It just looks ugly, it is the shame you don't realize this yourself.
As I told, the issue itself is just a consequence of mismatch between asynchronous nature and its synchronous representation in Go. The best possible solution is on runtime level, e.g. goroutines with optional parameters.
@sirkon, thank you for your concerns about syscall efficiency. However, saying "the idea stinks. It just looks ugly, it is the shame you don't realize this yourself" doesn't help the discussion. I think you may be failing to recognize the problem that this issue is trying to solve. These issues are intended to be a place to discuss ideas in a civil manner.
The issue at hand is not just "asynchronous vs synchronous", but one with regards to plumbing context for cancelation signal, which (although often canceled asynchronously) and also value propagation are still different issues than "asynchronous vs synchronous" calling of Read
. Plumbing Context
through to all Readers
and Writers
is one such approach to addressing this issue, but carries with it obvious downsides of plumbing the context to many Reader
and Writer
implementations that do not care about it. There are probably many other approaches, and that's why we should discuss this.
If you have the "best possible solution" in mind, perhaps you would like to elaborate your ideas?
I don't have any idea on the issue other than understanding this proposal is stupid.
By points:
Back to the proposal:
@sirkon, I'm going to echo @dsnet's remarks here. Your responses are uncivil (I would like to point you to our Code of Conduct) and are commenting on the basis of "ugliness" rather than giving concrete technical objections.
I understand that an Experience Report will likely clear up some of the background on what problems this is trying to solve. I will try to write up something soon.
As for the objection to "introduce concept into domain when not all items of it actually support it", for the implementers of an io interface that don't require the Context, they can simply ignore the parameter, just as any function that takes in a Context that does not need it does. However, consider Readers and Writers that need a Context: there isn't as easy of a workaround. @dchapes's solution works in a number of cases involving deadlines, but it does not work in cases involving Context values.
For the record, this isn't a full-fledged proposal. I opened this issue since I feel this is something that should be considered for Go 2. It may be accepted; it may be rejected. The Go ecosystem is still finding its way on how and when to use Context. Networking and I/O are places where Context comes up frequently, so it's something that we may want to consider in the future. Once we get farther along and the problem space is more agreed upon, there may be better ways of achieving the same goals.
I've written an experience report/blog post detailing the background for this proposal: Canceling I/O in Go Cap’n Proto. Alternatives welcome.
I don't bother be rude: the author is network monkey.
File IO on Linux is blocking. Imagine running another OS thread just to comply the functionality.
Real network monkeys know that Go also runs on modern systems.
I think this proposal is asking for something deeper than changing io.Reader
to use a Read
method that takes a Context
argument. I think this is asking for a way to interrupt a call to Read
by canceling a Context
.
As @dchapes says above, it is possible to build an io.Reader
that incorporates a Context
value and checks it before calling the underlying Read
. That demonstrates that there is no clear need for adding a Context
as an argument to every Read
call. But that is (presumably) unsatisfactory because, with the wrapped Reader
, if the Context
is canceled, the Read
will not be interrupted.
So I think we can set aside the idea of passing a Context
to every Read
method, and focus on the idea of whether it is possible to arrange for a Read
to be interrupted, with an error, when some Context
is canceled. We can implement that for descriptors that end up in the network poller; descriptors that do not end up there probably do not hang in Read
in any case.
Can we devise a mechanism for associating a Context
with a *os.File
or net.Conn
?
On POSIX systems, it would be great if Context
could expose a file descriptor to signal its state changes. That way, we could use it with I/O event notification functions such as select()
, poll()
, etc. It removes one part of the problem: interfacing Context
with file-operation functions.
The other part, unblocking a call, is very specific to the file descriptor and the function being used... So far I only had to unblock read()
calls on both net.Conn
and *os.File
, and the same self-pipe technique implementation works.
Associating a Context
with a *os.File
or net.Conn
could be problematic because it applies to all ongoing operations. It is not uncommon for a read to be in progress on a network socket when a write comes along. I wouldn't want to cancel the write just because the read is being canceled (or vice versa)
I created this (incomplete) hack for testing my own purposes that may or may not be useful for reference.
Ignoring the implementation, the API at least served my purposes. https://gitlab.com/streamy/concon
@Julio-Guerra Context
already exposes a channel for use in a select
statement. We don't expect Go programmers to call the system calls select
or poll
themselves, since the runtime does that automatically anyhow. So I don't see a good reason for Context
to expose a file descriptor. While there are cases where that could be useful, those cases seem very rare.
@dchapes The proposal you made only works to some extent. For example, if we call func Reader
with the same io.Reader
several times, the read deadline can be infinitely extended, when the reader is slow.
Although this example is simple and can be avoided, I have seen more intricate cases with similar ideas which run into the same trouble.
I see why this is something that makes sense to do especially when the file storage is remote and can be dealt with the same way as other network resources, i.e. with cancellation hooks. But I have to agree that it is not immediate obvious for a beginner to realize what context
is used for, or to understand that IO would like to use context
for cancellation, instead of something else, or that context
is a primary mechanism of cancellation at all...
It's simply not self-explanatory enough, and I don't think its purposes is consistent for a developer who also deals with concepts with the same name in a different language. e.g. writing C++ at the same time, I would think Context has server-lifetime instead of request-lifetime.
At Google, we require people to pass context
as the first argument to everything on the call hierarchy. So context
spreads so widely and it's everywhere. I think that makes it worthwhile for us to carefully explain what it is to help engineers / new learners understand, and make the package itself as clear as possible. Or maybe it is really context
that should be changed, instead of whatever that needs the cancellation behavior context
currently supports.
Another concept in this vein: #36402
Can we devise a mechanism for associating a
Context
with a*os.File
ornet.Conn
?
I've yet to see any prettier API than func (f *File) IO(ctx context.Context) ReadWriter
func (f *os.File) SetStopper(c <-chan struct{}) error
which enables
aFile.SetStopper(aCtx.Done())
@ianlancetaylor pointed out on golang-nuts that you can interrupt a read by setting its deadline. This is working for me:
type ReadDeadliner interface {
SetReadDeadline(t time.Time) error
}
func SetReadDeadlineOnCancel(ctx context.Context, d ReadDeadliner) {
go func() {
<-ctx.Done()
d.SetReadDeadline(time.Now())
}()
}
To use it safely, you need to create a new context which you always cancel when the file or socket is closed, to avoid a goroutine leak.
func handleConnection(ctx context.Context, conn net.Conn) {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
SetReadDeadlineOnCancel(ctx, conn)
// rest of code goes here, skeleton only shown
for {
n, err := conn.Read(buf) // blocking here waiting for a message
if err != nil {
... see below
}
...
conn.Write(...)
}
}
If either the outer or inner context is cancelled, the Read() call is unblocked with a timeout error. If you're in the middle of processing a message, then you'll get this the next time round the loop.
net.Listener doesn't have a Set(Read)Deadline
, so if you're in an Accept loop then you have to close the socket instead.
func CloseOnCancel(ctx context.Context, sock io.Closer) {
go func() {
<-ctx.Done()
sock.Close()
}()
}
To suppress spurious logs, it's useful to detect if a error is a real I/O error or just a normal disconnection:
func IsNormalDisconnection(err error) bool {
if err == io.EOF {
return true
}
if os.IsTimeout(err) {
return true
}
// https://github.com/golang/go/issues/4373
if strings.Contains(err.Error(), "use of closed network connection") {
return true
}
return false
}
Finally, to use this approach with os.Stdin, you need to re-open it in non-blocking mode. Example using SIGTERM to do clean shutdown:
package main
import (
"context"
"encoding/json"
"fmt"
"io"
"os"
"os/signal"
"strings"
"syscall"
"time"
)
// type ReadDeadliner from above
// func SetReadDeadlineOnCancel from above
// func IsNormalDisconnection from above
func main() {
ctx, shutdown := context.WithCancel(context.Background())
defer shutdown()
// https://github.com/golang/go/issues/24842
if err := syscall.SetNonblock(0, true); err != nil {
panic(err)
}
stdin := os.NewFile(0, "stdin")
SetReadDeadlineOnCancel(ctx, stdin)
rx := json.NewDecoder(stdin)
chanTERM := make(chan os.Signal)
signal.Notify(chanTERM, syscall.SIGTERM)
go func() {
<-chanTERM
shutdown()
}()
for {
var msg interface{}
err := rx.Decode(&msg)
if err != nil {
if IsNormalDisconnection(err) {
break
}
fmt.Printf("Error: %v\n", err)
os.Exit(1)
}
fmt.Printf("Message: %v\n", msg)
}
fmt.Println("Goodbye")
}
Finally, to use this approach with os.Stdin, you need to re-open it in non-blocking mode.
Setting O_NONBLOCK on file descriptors shared with unsuspecting programs is a bad idea, don't do it / please don't advocate for Go to ever do it.
A proposal to allow termination of blocked syscalls, which probably underlies this issue: #41054
Most helpful comment
Sorry to hijack, but if we get to this point, then
Context
should probably be rethought to be a feature of the language. The currentio
interfaces are very nice because they're very easy to work with yet very powerful and composable. Adding the complexity of correctly handling a context for all use cases seems overkill. If we want all goroutines to be correctly cancelable no matter what, then this is possibly better handled by the runtime.