Go: proposal: Go 2: sync: remove the Cond type

Created on 25 Jul 2017  Â·  46Comments  Â·  Source: golang/go

In the discussion on https://github.com/golang/go/issues/16620, I've noticed that the majority of use-cases folks describe for sync.Cond turn out to work fine with a channel instead: (*sync.Cond).Broadcast corresponds to calling close on the channel, and (*sync.Cond).Signal corresponds to sending on the channel.

The existence of sync.Cond suggests that people might want to use it, but it is currently under-documented (https://github.com/golang/go/issues/20491), incompatible with other Go synchronization patterns (e.g. select statements; see https://github.com/golang/go/issues/16620), and unlike most other types in the sync package, does not have a valid zero-value (its Wait method requires a non-nil L field). It has an additional "no copy" invariant enforced through both a run-time dynamic check and special-case code in the vet tool.

On top of that, condition variables are fiendishly difficult to use: they are prone to either missed or spurious signals [citation needed — experience reports welcome].

An audit of the Go standard library shows only a handful of uses:

  • io/pipe.go: The use of sync.Cond was added in https://golang.org/cl/4252057. The previous implementation used channels and had no known bugs.
  • syscall/net_nacl.go: The comment there says "We do not use channels because we need to be able to handle writes after and during close, and because a chan byte would require too many send and receive operations in real use." It does not explain why these considerations preclude the use of channels (e.g. a separate 'chan struct{}' to signal closing, and channels of tokens or slices rather than channels of tokens or bytes).
  • net/http/h2_bundle.go: There are two sync.Conds in this file. One is is in a struct "like io.Pipe". The other is only used with Broadcast (never Signal), and can thus be replaced with a channel that is closed to broadcast readiness.
  • net/http/server.go: This sync.Cond is again only used with Broadcast, and thus easy to replace with a channel.
  • crypto/tls/conn.go: Again only used with Broadcast.

Of the above uses, only the one in syscall/net_nacl.go does not have an obvious channel equivalent. However, it appears to be used to limit the size of a buffer, and I know of at least one similar "limiter" API (x/sync/semaphore) that is implemented in terms of channels in order to support interoperation with the standard context package. (I did the Cond-to-channel conversion myself on a prototype of that package before it was open-sourced.)

In light of the above observations, I propose that we remove the Cond type from the sync package in Go 2.

Go2 NeedsDecision Proposal

Most helpful comment

Channels cannot be reopened to broadcast twice.

All 46 comments

The main thing that sync.Cond provides that (as far as I can see) channels do not is the ability to use both Signal and Broadcast on the same synchronization point. But if nobody does that then I agree that sync.Cond is not especially useful.

cc @bradfitz since he decided to use sync.Cond in x/net/http2 in https://golang.org/cl/16310 .

I don't use sync.Cond often, but when I have, I've been happy it exists. I agree that it's hard to use in other languages where you only have, say, locks and condition variables as concurrency primitives. But in Go, where we have channels, one can reserve use of sync.Cond for those specific cases where it's a good fit.

Besides the signal+broadcast case mentioned, another case where it's not obvious to me how to replace a Cond with a channel is where the mutex linked to the Cond protects some state. I typically want to lock, modify some state, and then cond.Wait -- which releases the lock. If I try to replace the Cond with a channel but keep the mutex, now I have two independent synchronization primitives, and that seems trickier to reason about.

@cespare

another case where it's not obvious to me how to replace a Cond with a channel is where the mutex linked to the Cond protects some state.

Indeed, that's the trickier type of usage I've seen in practice. (It's also, in my experience, the type more likely to have deadlocks, or spurious or missed signals.) That kind of usage is illustrated in syscall/net_nacl.go, where the Mutex guards two monotonic variables (r and m) and triggers based on the difference between them.

I know of at least two techniques that can apply for that type of usage.

  1. If the state is always associated with a single condition, we can apply the technique illustrated in https://github.com/golang/go/issues/16620#issuecomment-317532224 and store the data in the channel payload itself.

  2. Otherwise, we can replace the Cond with a list of channels instead of a single channel, and ensure that edits to that list only occur with the Mutex held. To cancel a wait, we acquire the Mutex, check whether the channel was signaled, and (if it wasn't) remove the entry from the list. This is the technique used in (*semaphore.Weighted).Acquire.

(For the latter technique, there may be something we could factor out into a library that would harmonize with the rest of Go better than sync.Cond does today, but I'm not sure that it's a common enough use case to warrant being in the standard library.)

sync.Cond makes emulating pthreads simple. It seems to me semantics of sync.Cond is actually heavily inspired by pthreads.

@cznic

sync.Cond makes emulating pthreads simple.

Agreed, but is pthread really a good example for Go's concurrency model to emulate?

I think Go2 could improve Cond but I think removing Cond is kinda crazy. Using channels is currently a super heavy replacement.

Note that a single Cond lets multiple waiters each wait on a slightly different condition, such as an incrementing counter reaching various thresholds. I mentioned a real-life example of this in https://github.com/golang/go/issues/16620#issuecomment-317911107. Channels seem to be a poor fit for this.

Channels cannot be reopened to broadcast twice.

Since x/net/http2 was mentioned: I recently wrote a CL for x/net/http2 that was complicated by the use of Cond, because I wanted to select on set of channels or a Cond. I had to implement this by spinning a goroutine to wait on the channels, then broadcast to the Cond if one of those channels fired.
https://go-review.googlesource.com/c/53250/8/http2/transport.go#955

And FWIW, x/net/http2 uses Broadcast exclusively, never Signal.

I don't have a position on removing Cond, per se, but more than once I have been annoyed by not being able to select on a Cond and a channel simultaneously. If _that_ could be fixed I would be happy.

I find channels easier to use than sync.Cond, but channels don't provide for efficient, repeated broadcasting of more than one bit from 1:N receivers. I suggest adding a new type of channel, a broadcast channel. It would be very similar to a channel with buffer size 1, with some extra logic:

a) if the channel has a value, then receiving on a broadcast channel doesn't consume the value in it. This allows an unlimited number of receivers to receive the same value.

b) delete on the broadcast channel removes any value in it. This allows the sender to stop broadcasting any value at all.

Notes

1) an empty broadcast channel would still cause receivers to block, as usual.

2) Senders then broadcast by sending into the channel, as usual. But each new send replaces the old value, and any receive after a replacement sees the new value. Just like a condition variable, receivers may miss values if a new value replaces the old before they awake. And just like a condition variable, a receiver may not receive a value at all if somebody deletes it from the broadcast channel before receipt.

Alternatives:

A2) a more general approach would be to allow the sending of values tagged as "sticky", and let sticky values be consumed an infinite number of times. This could be nice if you want the buffer of the channel to contain both sticky and non-sticky values. You would need a new operation to consume and eliminate a sticky value.

A3) allow the size of the buffer attached to a channel to be resized without being re-allocated. Then, assuming you know how many subscribers you have, the sender can send exactly that many values on a regular buffered channel. Unfortunately this requires that the sender have "register a new client" logic so it knows when to expand the queue. Corresponding de-registration logic required as well. Hence I prefer the broadcast type channels suggested first.

edit: A2 may have the advantage of being the most backwards compatible, and the advantage of having obviously novel syntax for sending sticky values. Suppose for instance that the operator <~ was chosen for sticky send. Example ch := make(chan int); ch <~ 1; would send the value 1 as a sticky value, that could be received many times. The new syntax would make the use of the sticky value very visually distinct.

Over in https://github.com/golang/go/issues/16620#issuecomment-327303229 I described taking some of my sync.Cond-using code and altering it to use a channel instead. I found it to be fairly tricky to get right (though of course it's possible that I overlooked some simpler way).

On top of that, condition variables are fiendishly difficult to use: they are prone to either missed or spurious signals

This couldn't be more true, I would gladly take performance loss in favor of debugging someone else's condition variable usage. People see it in stdlib, copy it, get it wrong, and then spend late nights debugging it.

sync.Cond is a keyword in my bug comb

Change https://golang.org/cl/94138 mentions this issue: shiny/driver/internal/event: use a channel in Deque instead of a sync.Cond

First I'll note that Go 2 should be largely if not entirely backward compatible with Go 1. sync.Cond is not actively buggy, so I don't see sufficient justification to remove it from "sync".

We could remove it from a new version of the sync package, whether that is called "sync/v2" or something else. But I don't think that removing condition variables is a sufficient reason to create a new version of the sync package.

I have used sync.Cond in a number of situations. Channels are really heavy weight, and its significant that a number of constructs that would be easy to implement as a relatively straight-forward condition variable followed by checks of several conditions can be implemented in terms of select {}, but at relatively large performance costs.

sync.Cond is harder to use than channels, and possibly more error prone, but for highly performance sensitive code there are so many constructs that are vastly more efficient with sync.Cond than select {} multiple conditions that removing sync.Cond would be devastating for projects that I work on.

Removing this construct would be devastating to me, and would be a reason to abandon to the language entirely for certain of my projects.

Let me be clear here. If an implementation using channels, and multiple channels, can be show to be roughly equivalent in terms of performance with a using a single condition variable followed by multiple if {} statements, then I'd be willing to accept that channels are an equivalent replacement. My own experience is that this is vehemently not the case, and I believe that this proposal stems from a naive understanding of the documented semantics of channels and condition variables, without sufficient experience in using either in performance critical code to make assertions about the lack of utility of one vs. the other.

Channels are really heavy weight

It is true that a channel is currently larger than a sync.Cond, and requires more allocations. However, I don't believe that that is inherent to the use of channels. (https://github.com/golang/go/issues/28366 describes one option to streamline both the API and implementation.)

for highly performance sensitive code there are so many constructs that are vastly more efficient with sync.Cond

Can you provide some benchmarks to demonstrate that? I'd be curious to see how well the channel alternatives can be optimized.

In my experience, even today the performance cost of untargeted wakeups can often swamp out the other performance advantages of condition variables.

I don't think that removing condition variables is a sufficient reason to create a new version of the sync package.

That's fair enough. If the benefit of removing sync.Cond isn't worth the code churn, can we at least mark it Deprecated?

We had a pretty clear education gap between use of sync.Cond and equivalent use of channels, but hopefully that's getting better: in particular, Rethinking Classical Concurrency Patterns illustrates a lot of alternatives starting around slide 37.

@bcmills What is your beef here? Why do you so want to take away a tool that has proven itself quite useful many many times over? I know that sync.Cond() has resulted in cleaner and simpler code over attempts to make this work with channels, and faster code too! in many of my projects. Yes, there are sharp edges here, but these are hardly the only such. They are no worse than sync.Mutex in terms of dangerous tooling.

Since this is still open for some reason, let me also contribute that, I love using sync.Cond(). What would have been nice, when I first started using it, is a better documented approach, usage and example instead of figuring things out myself. That was exactly at 7 Mar 2018.

Therefore, I think since there are more than enough people currently relying on this facility, it would be prudent to remove it and I don't understand the need for it either like @gdamore already asked.

Conclusion: If possible, improve documentation for future generation of condition users. Please do not remove. It's a precise tool for a specific problem, but for that it's spot on and nothing else is better or easier or cleaner to use.

Just to weight in is that sync.Cond is substantially different in meaning to channels because it can describe a system that "forgets" information sent on it before a waiter has appeared. Channels just cannot do this. So since it is not possible to always replace a sync.Cond with a channel (never mind the efficiency argument) I don't agree with removing or replacing sync.Cond

Maybe I'm misunderstanding something, but isn't that the same thing as sending with a select?

select {
case c <- data:
default:
}

Now it only sends if someone is listening, so it forgets any data sent if there's no one on the other end.

If you want to forget the data than yes that is equivalent. But if you want to remember only the last item that was sent you currently cannot. 

How about this?

c := make(chan Type, 1)
select {
case <-c:
  c <- data
case c <- data:
}

Is that not susceptible to a race between one goroutine pulling data out and another one pushing data in? Or is there none there because go is cooperative multitasking?

That above case @DeedleFake created demonstrates to me that while it may be possible to implement every use case of Condvar using channels, it isn't necessarily wise to do so. In many of my use cases Condvars result in simpler and easier to read (and implement!) code. I contend for a number of situations, they are also a lot more performant than channels.

(In fact, I expect that the logical implementation of channels makes use of condition variables under the hood. I haven't checked to be sure, but I honestly cannot think of any better way to implement them.)

It very much bothers me that this proposal is still open.

Not everyone writes code precisely the same way, and not every solution is necessarily equivalent. Taking away sync.Cond would be an enormous step backwards.

It should also be noted that channels cannot be used for broadcasting

That broadcasting thing is one of the biggest complexities. It is possible to hack something around this using closed channels, but its really really ugly, and to correctly do this without race conditions you have to write some rather complex code using mutexes. With condition variables this is much more natural.

In the vein of parody, let's just take away variables too. Bugs are highly correlated with having variables in your program.

Come on people. Cond variables are an advanced optimization, mostly used by professional network programmers (i.e. not for beginners) for those who need the performance benefits (and these are significant). The Go ssh packages use them; you would cripple alot of software without them. Please close this short sighted discussion.

I have suggested additions to channels to bring their broadcast capabilities up to par with Cond variables, but was ignored. The same fate should befall this thread. There's no backwards compatible path here. The slowdowns would force people back to C++, which would be tragic.

@gdamore Channels are not implemented using condition variables. I don't quite see how that makes sense, but maybe that's because I'm too familiar with the current implementation. The trick with channels is, of course, the select statement. Each channel can participate in an arbitrary number of select statements, and each select statement can be looking at an arbitrary number of channels.

The other trick of the current channel implementation is a fast hand-off from a sending goroutine to a waiting receiving goroutine, or a receiving goroutine from a waiting sending goroutine. This minimizes value copying.

@DeedleFake It may be possible to get the same functionality as sync.Cond using channels, but that looks ugly as hell. If you understand how condition variables work, code that uses them is clear. Using channels to achieve the same thing is not at all clear.

That broadcasting thing is one of the biggest complexities. It is possible to hack something around this using closed channels, but its really really ugly

I think this argument can be simplified as this: Use chan to implement sync.Cond is a hack. And hack is bad.

As I person who just finished a feature using sync.Cond, I must say I love it. I don't think remove it and replace it with that chan hack is a good idea unless the chan one is faster (which is unlikely the case).

@firelizzard18

If you understand how condition variables work, code that uses them is clear.

I would very much dispute that. Broadcast looks like it should be O(N) with the number of waiters, but programs written using Broadcast have an unfortunate tendency to end up “accidentally quadratic”.

Moreover, with condition variables, it can be very difficult to spot the invariants around exactly when a signal must be sent and why. In contrast, if you send _the actual data_ on a channel (rather than merely using the channel to simulate a condition variable), the invariants are usually much clearer: you cannot send data that you do not have, and if you compute some result but don't send or store it at all, that's usually much more obvious than a store with a forgotten call to Broadcast.

@niruix

I think this argument can be simplified as this: Use chan to implement sync.Cond is a hack. And hack is bad.

Agreed. You should not use a channel to _implement_ a sync.Cond. You should use a _channel-based pattern_ instead of a _condition-variable-based pattern_. The resulting code may be quite different.

That is: channels are not a _drop-in replacement_ for condition variables. However, they can be used to address the same _underlying use-cases_.

@bcmills If you're using a cond.Broadcast to wake up hundreds or thousands of goroutines, you're not using it correctly, and I would say that you don't really understand condition variables at all. These shouldn't be used for example to wake all possible waiters for work. There are wakeup versions that wake just one waiter, if you're wanting to do that, and if that's your sole use case, then channels are almost certainly better.

The use cases I have for broadcasting typically involve a small number of waiters (most often 2, or in rare circumstances 3). Usually there won't even be any waiters. I may not know if they are waiting at the time. What the broadcast does is allow me to structure that code simply and clearly.

If "n" is small, I don't care if it's quadratic. For tiny values of "n", what dominates is not the complexity, but the constant factor.

Just because your use cases tend to match better to channels, does not necessarily mean that there are not other, equally valid use cases where the condition variable is superior to constructing something with channels.

Btw, those uses cases for condition variables usually involve more than just a simple store on the condition. More often it's a bit more complex logic. Sometimes we even have multiple different conditions which get considered under a single condition variable. It would often be difficult to express these using a purely message driven paradigm.

Condition variables are one of the hardest to use correctly, relatively commonly used, synchronization patterns. I feel very strongly that just because they are hard to use does not mean they should be removed (if you want to suggest they should somehow migrate to x/sync to add a bit more of a discovery burden, I'd be fine with that).

However, I would support these two actions:

  1. Adding a warning in sync.Cond documentation that it really is hard to use correctly. (This deviates from Go's "just the facts" reference-style document style, but I feel it might be worth it.)
  2. Writing more, better & more discoverable examples of the simpler channel-based alternatives, especially for the cases where the full power of sync.Cond is not needed.

Removing sync.Cond will just have people using a copy-pasted implementation from a random project. Guiding people to better designs, where possible, actually improves code.

sync.Cond still has valid uses. Don't prevent, but guide.

Just a data point here -- this isn't about sync.Cond so much as the fact that channels are not the silver bullet for all synchronization primitives. The data point is sync.WaitGroup. You'll notice that this is implemented in hand tuned code using atomics.

Obviously this could have been done just as easily with a channel and a lock. It wasn't. For performance reasons. This is also why I generally advocate using waitgroups when you need to wait for a goroutine to complete instead of creating a channel and closing it at the end of the goroutine (via defer).

Channels are awesome and solve a lot of problems. They don't solve all the problems equally well.

You can fasten things with a hammer and nails. But for certain applications there are better choices, like glue or wood screws. Understanding the options, and how to use them correctly, is what defines a craftsman -- condition variables, channels, and waitgroups are all useful things in our tool box and we'd be poorer without them.

@gdamore, this proposal _is_ specifically about sync.Cond.

Other synchronization primitives, such as atomic variables, are often just as clear as (if not clearer than) channels, and sometimes more performant as well. I don't think anyone disputes that.

On the other hand, _sync.Cond in particular_ is — in my experience — _empirically never_ clearer than an alternative based on channels, WaitGroups, and/or atomics.

So my experience is quite different than yours in that regard. There are a number of times where I felt trying to use channels to express what I needed was doing strange and unnatural acts, were a condition variable was simpler and clearer. Admittedly we have different experience, but at this point I think I can say I've worked on a non-trivial body of work both in Go and other languages.

It's possible that my experience working in C in places like the Solaris kernel means that I'm coming at problems with a built-in understanding of condition variables that others may lack. I don't think that necessarily invalidates my point of view though.

@bcmills What is your suggested alternative for a scenario where I am calling Broadcast() on a single condition multiple times? If I were only broadcasting once, I could use a channel and close it to send the broadcast, but that strategy does not work if I need to broadcast multiple times.

@firelizzard18, that depends on the use-case. I covered (at least) two alternatives in my GopherCon '18 talk.

So I've looked at that talk, and while there is some good stuff there, I think the assumption here may be a rather simplistic view of how condition variables are used.

The cases I where I find them most useful are cases like this;

  • two (or maybe more) workers need to watch for different, but linked events. (Frequently these are complex conditions, not something trivial.) For example, goroutine A cares about the state of variables x1, x2, x3. goroutine B cares about the state of variables x3, x4. Changes to any of these are made under a lock, because there is some relationship. In this case the mutex with condition variable leads to a simpler construct.

If I were to build that same construct with channels, I wind up needing to have multiple channel operations and the lock, because I can't communicate the "new" channel over a closed broadcast. Channels haven't bought me anything here -- I still have to have the potential for spurious wakeups, I still have a lock. Worse, I actually have heavier weight operations as I'm selecting and locking. I can also miss events btw, as I might have missed an intermediate channel close and replace if I was not timely enough in waking up.

  • I have many things I'd be selecting on. In this case, yes, I can use a big select. It's a bit more complex if the decision about whether an item is of interest to me or not varies at runtime, particularly if it might be very large. If I can use a channel of channels, that might be easier, but this construct doesn't always work out in real settings. One thing that is definitely true is that there is a quite noticeable (measurable) performance impact of adding extra cases to a channel -- even an unselectable (not ready) channel. I believe this comes down to the fact that each case is going to require extra atomic operations to be performed, at a minimum. In real world practice I've been able to measurably reduce latency by replacing multi-channel selects with a condition variable and either a bit mask for conditions or just separate booleans. (In my use case this was for very low latency processing, where even an addition microsecond of latency was very noticeable. Of course, in my use cases I was dealing with very light contention, but in those cases latency is still king for the application use cases I have in mind.)

Do I think that new Go programmers should try follow the share by communicating paradigm? Yes! Do I think that channels are the solution for all of our concurrency needs? No.

Condition variables might be somewhat tricky to use correctly, and I believe it's right to steer folks towards channels first -- especially if they are using one of the patterns you described (e.g. resource pools) or simple broadcasting situations. But condition variables very definitely have real uses. Yes, you could probably implement all of the concurrency operations using channel, but ... and I can't stress this enough ... that doesn't mean you should.

Removing condition variables would be a huge regression in the language, and a big enough dissatisfier to drive me from it. I recommend closing this issue, unless the Go team has decided to ignore these considerations and charge ahead anyway (or rather charge backwards....)

To highlight the difference between channels, and condvars(and that difference is an argument in favor of maintaining the current level of diversity in the std lib), it might help to think about what each can be "optimized" for:

  1. As noted above, channels can be optimized for fast "hand-offs" between waiting goroutines.
  2. A "fast" condvar is rather one that is optimized for notifying when no-one is waiting(similar to a fast lock optimized for un-contented operations), see for example https://webkit.org/blog/6161/locking-in-webkit/

So a channel is used a bit like email, when you send someone one you kind of expect the other person to eventually read it, preferably soon. Signalling on a condvar would be more like showing an ephemeral notification to people that happen to be glued at their screen waiting for one, and then you'd organize your team in such a way that you hope there rarely will be anyone waiting when you signal(which would mean everyone is actually busy working on something else in parallel).

From reading this thread(no pun intended), it appears to me gophers have occasionally been using a condvar as a poor man's broadcast channel, which then creates frustration since it "doesn't compose well with select" and so on. The solution for Go 2 might be to actually introduce a proper broadcast channel, and make clear in the docs that that is not what a condvar is.

The gold standard on the subject online: https://www.chromium.org/developers/lock-and-condition-variable

In particular, the paragraphs "Why is cv.Wait() always in a while-loop?" and "Why must the condition used with cv.Wait() be a function of state protected by the mutex?" in my opinion highlight why a condvar really is a different animal from a channel.

sync.Cond is the only thing that I still cannot wrap my head around in Go. I have one working example...or it seems to be working... and I am even affraid to try and test it because I might break it. When I read the documentation and look at the code examples...none of it makes any sense whatsoever. So I think that maybe the implementation itself is not to be removed but maybe the documentation or method names needs to be changed?

@ivanjaros Condition variables are a standard synchronization mechanism in languages other than Go. The Go method names Wait, Signal and Broadcast correspond to the names used in C: pthread_cond_wait, pthread_cond_signal, and pthread_cond_broadcast. You can also see these same names at https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2. In any case, we can't change the names now.

Of course, better documentation is always OK. But that is separate from this issue and should ideally be discussed elsewhere. Or just send a CL. Thanks.

Was this page helpful?
0 / 5 - 0 ratings