In HTTP/2, I pause a fiber when we can't send more data on a stream; we must wait for the receiver to tell us that we can send more data; when it does I will resume the fiber so it can send data, but until then the fiber is paused.
The problem is I ended up with a deadlock: I would stop receiving new frames, and found out that fiber.resume won't put the current fiber into the scheduler queue, thus will never resume the current fiber. Adding a Scheduler.enqueue(Fiber.current) before the fiber.resume call fixed the issue, but this behavior was unexpected, at least to me. Also, I believe Scheduler should be internal detail that we shouldn't know about.
I'd like to propose a Fiber.pause method and a breaking change in Fiber#resume, that would allow to not have to know or use Scheduler:
Fiber.pause 鈥攑auses the current fiber (internally calls Scheduler.reschedule) assuming the developer memorized the fiber / created an event to resume the fiber;Fiber#resume 鈥攅nqueue the current fiber; resume execution of the specified fiber, so the current fiber won't be put to infinite sleep.Note: changing Fiber#resume may have a bigger impact than expected, since we may expect it to not queue the current fiber, so maybe we need this functionality?
The way I implemented this before was with Channel.
Channel(Nil) for each pausable fiber.receive on it.send nil on it.This doesn't look like the most efficient way to do it, neither the cleaner way. But both Scheduler and Fiber are internal non-documented APIs, so Channel is the only thing left.
The question actually is, should Fiber become a public API? If so, the proposed change makes sense.
Both Fiber#resume and the Scheduler are not public API. Only spawn and Channel are. In fact, Fiber is not public API at all either.
I don't think using channels here would be mucher harder than using Fiber directly, and it'd be much cleaner. So I don't think we need to make Fiber public.
I knew Scheduler was internal, but I didn't know Fiber was, and expected Fiber.yield and Fiber#resume to be public, as simple methods to control a fiber state 鈥攆or which there are valid scenarios.
For example in HTTP/2 I control the current fiber in two places: in CircularBuffer and Stream#send_data to block execution and control when it's resumed later, when we receive data that we can read, or when we can send data.
Having to use channels to control a fiber state is... unexpected. I don't want to pass data, not even nil or a single byte, I want to pause/resume a fiber, and would like it to be efficient.
Is it something set in stone, or shall we consider opening/reworking such an API? That is:
yield or pass to pass execution to another fiber, while resuming the current fiber later; we can use sleep(0) but... that sounds like a hack;pause pass execution, developer takes responsibility for resuming the current fiber later;resume to pass execution to given fiber (or just requeueing it) while still resuming the current fiber later (i.e. enqueues both fibers).I don't think that would hinder the coroutines or internal implementations, just a basic public API to control execution of some fibers; something that doesn't sound like a hack (sleep) or requires external indirections (channels) or use of internal undocumented private API to achieve. Especially since stdlib requires those primitives, and it doesn't use sleep or channel to achieve this :-)
I think the only one that can answer that is @waj, so we'll probably have to wait until he's available.
Just for your information, in the beginning we had Fiber#yield and Fiber#resume, like in Ruby, but eventually decided to do it like in Go. Go has runtime.Gosched() that basically does our Fiber.yield. We just didn't know where to put that method yet so we decided to put it in Fiber. But Fiber is absolutely not public API. Only spawn and Channel are.
Most helpful comment
I knew Scheduler was internal, but I didn't know Fiber was, and expected
Fiber.yieldandFiber#resumeto be public, as simple methods to control a fiber state 鈥攆or which there are valid scenarios.For example in HTTP/2 I control the current fiber in two places: in CircularBuffer and Stream#send_data to block execution and control when it's resumed later, when we receive data that we can read, or when we can send data.
Having to use channels to control a fiber state is... unexpected. I don't want to pass data, not even nil or a single byte, I want to pause/resume a fiber, and would like it to be efficient.
Is it something set in stone, or shall we consider opening/reworking such an API? That is:
yieldorpassto pass execution to another fiber, while resuming the current fiber later; we can use sleep(0) but... that sounds like a hack;pausepass execution, developer takes responsibility for resuming the current fiber later;resumeto pass execution to given fiber (or just requeueing it) while still resuming the current fiber later (i.e. enqueues both fibers).I don't think that would hinder the coroutines or internal implementations, just a basic public API to control execution of some fibers; something that doesn't sound like a hack (sleep) or requires external indirections (channels) or use of internal undocumented private API to achieve. Especially since stdlib requires those primitives, and it doesn't use sleep or channel to achieve this :-)