Winit: Consolidate poll_events() and run_forever() into get_event(Timeout)

Created on 17 Jul 2017 · 32Comments · Source: rust-windowing/winit

EDIT: THIS ISSUE HAS BEEN SUPERSEDED by #459.

In #219, I examined some unfortunate event handling complexities on MacOS X. That examination led me to view and re-implement both poll_events() and run_forever() as special cases of a more general API, get_event(Timeout) -> Option<Event>.

I propose:

Refactoring each platform to expose this get_event(Timeout) interface,
Refactoring poll_events() and run_forever() to be implemented in terms of get_event(Timeout), and
Deprecating poll_events() and run_forever() in favor of get_event(Timeout).

This is a user-facing API change, so I'm accompanying this proposal with a larger than usual explanation.

Receive-with-time-constraint is the fundamental operation

poll_events() allows the user to communicate "I want all the events that are ready, but I don't want you to wait". Let's call that get_event(Timeout::Now), and make the loop explicit:

// current
    events_loop.poll_events(|event| {
        match event {
            Event::WindowEvent { event: WindowEvent::Resized(w, h), .. } => {
                println!("The window was resized to {}x{}", w, h);
            },
            _ => ()
        }
    });

// proposed
    while let Some(event) = events_loop.get_event(Timeout::Now) {
        match event {
            Event::WindowEvent { event: WindowEvent::Resized(w, h), .. }) => {
                println!("The window was resized to {}x{}", w, h);
            },
            _ => (),
        }
    }

run_forever() allows the user to communicate "I want all the events until I tell you to stop", which… is kind of weird, isn't it? At any rate, it's the user's intention to block indefinitely and eventually break out of the loop. Let's call that get_event(Timeout::Forever), and again push down the loop:

// current
    events_loop.run_forever(|event| {
        match event {
            winit::Event::WindowEvent { event: winit::WindowEvent::Closed, .. } => {
                winit::ControlFlow::Break
            },
            _ => winit::ControlFlow::Continue,
        }
    });

// proposed
    loop {
        match events_loop.get_event(Timeout::Forever) {
            winit::Event::WindowEvent { event: winit::WindowEvent::Closed, .. } => {
                break;
            },
            _ => ()
        }
    }

Some users have additional needs, like "I want all the events between now and when I need to update my state/draw the next frame". pistoncore-glutin_window does this today by unconditionally spawning a thread that wakes up the event loop after a fixed period of time. We should be able to support it efficiently on every platform without needing extra threads, and with a timeout-centric API, we can:

// current
    let events_loop_proxy = self.events_loop.create_proxy();
    thread::spawn(move || {
        thread::sleep(timeout);
        events_loop_proxy.wakeup().ok(); // wakeup can fail only if the event loop went away
    });
    {
        let ref mut events = self.events;
        self.events_loop.run_forever(|ev| {
            events.push_back(ev);
            glutin::ControlFlow::Break
        });
    }
    let event = self.events.pop_front();

// proposed
    let event = events_loop.get_event(Timeout::Deadline(timeout));

Timeouts are sufficient to justify a new API

Consider a game that wants to update and render at 60 fps. If the game is running faster than 60 fps, it should sleep and wait until it's time to start the next frame; if it's running slower, it should busy-wait.

This use case presently requires a calling run_forever(callback) to allow the thread to sleep, plus a second thread keeping track of time and calling Proxy::wakeup() as needed. That's wasteful and unnecessarily complex.

winit should support timeouts pattern directly. Additionally, this is a common requirement; every event loop has some way to block until a specified time, or some idiomaitc way to synthesize a timeout. winit should use this where appropriate.

Simplifies the API

winit puts user code at the bottom of the stack, and does not force the user to organize their code in terms of callbacks… except for event handling. Changing the API so that the user calls get_event(Timeout) in a loop adds consistency, and it lets the user define and manage their event loop with normal Rust statements instead of a callback that returns a ControlFlow enum.

This also makes EventsLoop look more like a futures::stream, or a generator, or many other things that users might already know. "I want to get a message from you" requires less explanation than having two different functions that take two different callbacks.

Reduces duplication

poll_events() and run_forever() have considerable overlap, and in the case of the MacOS platform, considerable code duplication. Not only can get_event(Timeout) provide more functionality, in the case of the MacOS platform, refactoring into this caused a net reduction in code.

Low risk

It's very easy to implement poll_events() and run_forever() in terms of get_event(Timeout), which means there's very little risk in deciding to move that direction.

winit can continue to provide the same poll_events() + run_forever() API to the user while refactoring platforms into get_event(Timeout) behind the scenes. Once they're all ready, winit can update the examples, deprecate poll_events() and run_forever(), and ultimately remove at some later point.

Next steps

I think this covers everything:

pub enum Timeout {
    Now,
    Forever,
    Deadline(std::time::Duration),
}

impl EventsLoop {
    pub fn get_event(Timeout) -> Option<Event> {
        // …
    }
}

A change like this requires buy-in, and this isn't my project, so I don't know what all would need to happen. I'm happy to start by soliciting feedback, so: what does everyone think?

hard high waiting api

Source

willglynn

👍12

Most helpful comment

It's amazing that the maintainers don't seem to care.

They care very much so - I believe most of the points in this issue are being addressed in #638 in @Osspial's heroic effort to rewrite the event loop, though I'm not familiar enough with the PR to know for sure if everything in this issue has been addressed. That PR is just about ready to go - feel free to have a look, test it and share any constructive input you might have. The maintainers are a group of collaborators working on this in their limited, free time and for many, winit is just one of many open source crates to which they contribute, each with their own slew of "showstopper" issues to address.

mitchmindtree on 31 Jan 2019

❤6

All 32 comments

What you describe is more or less winit's old API. We transitioned to callbacks because it solves the fact that MacOS uses a resize callback, and also in the longer term because of Android and Emscripten.

This also makes EventsLoop look more like a futures::stream

It's the opposite, the futures library is based on callbacks.

tomaka on 17 Jul 2017

👍2

It's the opposite, the futures library is based on callbacks.

Huh? futures is based on structs that have a poll method which you call to see if they're ready or not, and get a value from if they are.

We transitioned to callbacks because it solves the fact that MacOS uses a resize callback

Couldn't this (and any similar system) be handled by putting the message in a queue in the event loop struct and waking it up?

Ralith on 17 Jul 2017

Huh? futures is based on structs that have a poll method which you call to see if they're ready or not, and get a value from if they are.

This is missing half of the documentation. If the future is not ready when you call poll() then it will later automatically call Task::unpark without the user's intervention.

tomaka on 17 Jul 2017

Couldn't this (and any similar system) be handled by putting the message in a queue in the event loop struct and waking it up?

@Ralith see #219 - The main problem is that in the macOS backend, nextEventMatchingMask blocks from the moment the user presses their mouse to begin a resize until the end when the user releases the mouse. This means we can't return from run_forever, poll_events or the proposed get_event method until the user ends the resize, at which point all the Resized events would be dumped at once. @willglynn I just tested the window.rs example with your coroutines branch on my macOS machine and it has re-introduced this problem now that you have re-implemented run_forever in terms of get_event.

I would recommend breaking this proposal into two distinct proposals:

1. Offer a way to timeout on waiting for events.

I agree that this makes sense (as long as all platforms offer a native API for this) to avoid the need for users to spawn a thread.

2. Change the event emitting API by removing the user callback approach in favour of returning events from a function.

From my understanding this is a much harder problem due to the frustrating behaviour of macOS that I mentioned above. As it stands, I can't really get behind this proposal without first solving this problem. As tomaka mentioned, this problem is one of the reasons we changed the API from returning Event iterators to user callbacks in the first place. That said, if we can work out how to solve this issue, I would love for a more control-flow friendly approach like we used to have! I just don't yet see a solution that will allow this with consistent behaviour across all platforms.

mitchmindtree on 17 Jul 2017

@Ralith:

Couldn't this (and any similar system) be handled by putting the message in a queue in the event loop struct and waking it up?

Yes, with the caveat that the thread receiving the callback _is_ the caller's thread, and there's an internal event loop that we don't control sitting between poll_events()/run_forever() and the callback. I moved the part of the code that can trigger a second internal event loop into a coroutine, and now it works exactly the way you describe: callbacks post a message to a queue, which wakes the event loop, which hops back to get_event() and lets it return.

@mitchmindtree:

From my understanding this is a much harder problem due to the frustrating behaviour of macOS that I mentioned above. As it stands, I can't really get behind this proposal without first solving this problem.

The MacOS problem is sticky because the normal event handling scheme can wander into an internal event loop. Sure, you can get callbacks at various points -- NSWindowDelegate, runloop observers, timers -- but you can't actually return to the caller.

The current approach plumbs the user's callback all the way down so that the MacOS callbacks can trigger them. That delivers resize events as they happen, yes, but it breaks the contract in other ways. poll_events() can hang for the entire duration of a window drag instead of returning immediately, which blocks the example game loop. run_forever() seems like it would be unaffected, but Event::Awakened never gets delivered during a resize, and in practice, callers rely on making it return, which it also can't do during a resize.

My changes in #219 address the core of _that_ issue, which is that some parts of the MacOS event loop can block for a long time. get_event() is responsible for driving those parts forward, but now that happens in a coroutine that periodically returns control to get_event() so that it can check its timeout and return if needed. This makes poll_events() always return without blocking and makes run_forever() respond promptly to ControlFlow::Break and wakeup().

@tomaka:

What you describe is more or less winit's old API. We transitioned to callbacks because it solves the fact that MacOS uses a resize callback, and also in the longer term because of Android and Emscripten.

Callbacks are insufficient to solve the MacOS problem; if poll_events() happens to see a resize start, it'll block for seconds. The only solution that fits the constraints is stack switching, as I describe and implemented in #219. With that in place, get_event(Timeout) seems like the underlying primitive we should expose.

Emscripten bring me to another question: is this the wrong model entirely? Should winit code be running at the bottom of the call stack, instead of running on top of user code? That would let winit hide the difference between a loop {} on most platforms and emscripten::set_main_loop_callback() on JS. That would also fix the MacOS problem without needing coroutines: if _everything_ were a callback, it wouldn't matter if the callbacks were delivered directly from the normal runloop or the middle-of-a-resize runloop. It's only difficult because EventsLoop promises to return to the caller, instead of the caller always running inside EventsLoop.

How does Android fit in?

willglynn on 17 Jul 2017

but it breaks the contract in other ways. poll_events() can hang for the entire duration of a window drag instead of returning immediately

Ahh you're totally right, this is a pretty major issue with the current implementation.

For anyone else reading along here: As I just mentioned in #219, I'd forgotten to enable --features "context" when testing @willglynn's coroutines branch. It does indeed return during the live resize, and seems like a promising solution for the bug where macOS becomes blocked on an inner loop.

mitchmindtree on 17 Jul 2017

Well yes, but as it is right now, the EventsLoop runs for literally every event, causing the whole app to lag, because there are way too many events and the closure-based design blocks on every event. Try resizing a window with the new winit API - it lags like hell.

The old wait_events is missing and there is no replacement in sight, which is very bad. It maps 1:1 to the OS-native event loop, which is good for applications. Macs could use the current closure design, but just abandoning the OS-native loop is a very bad idea.

fschutt on 1 Sep 2017

I wonder why these issues aren't getting solved. Winit is basically useless as it is now. You can choose between using 100% CPU (poll_events) or not rendering anything at all (loop_forever). If this isn't going to be solved, please update Winit with a warning telling people not to use it.

portstrom on 22 Jan 2018

😕1 👍1

I'm using it without issues (high CPU usage) like this:

    for (now, delta_t) in FpsLoop::new(60) {
        events_loop.poll_events(|event| match event {
        ...
        });
    }

My FpsLoop does the necessary sleeping.

It's an "inversion of control" but in a good way, winit gives you the control of how fast you want to loop.

Boscop on 22 Jan 2018

@Boscop It seems you alternate between waiting for the timer and polling events, but you can't wait for the timer and listen to events at the same time. When rendering 60 frames per second, it may not be noticeable, but when drawing only occasionally, it doesn't work.

I don't see why there can't be a function like poll_events that takes a timeout argument as proposed in this issue.

portstrom on 22 Jan 2018

@portstrom But the issue above says:

Consider a game that wants to update and render at 60 fps.

Ok, that's my use case, too.

If the game is running faster than 60 fps, it should sleep and wait until it's time to start the next frame; if it's running slower, it should busy-wait.

Yes, that's what my FpsLoop iterator does.

This use case presently requires a calling run_forever(callback) to allow the thread to sleep, plus a second thread keeping track of time and calling Proxy::wakeup() as needed. That's wasteful and unnecessarily complex.

Why does it require that? It seems to work without that for me..

@portstrom I'm not trying to be a dick. What is your use case and why can't it be done by looping at high FPS and only drawing sometimes?

Boscop on 22 Jan 2018

@Boscop I'm trying to make a program that draws only after events (user interaction, fs/network input, timers above 1 second). I expect to make a loop that waits until there is an event, then processes all pending events, then draws.

The first thing I try is poll_events. It handles all pending events and then yields to rendering, but I still need something to wait for the next event before I repeat the loop. The next thing to try is loop_forever. It waits for events and handles them, but it doesn't return when there are no more events. It's hard to understand how someone could have decided that there shall be one function that processes events without waiting and then returns, and one function that waits and processes events but doesn't return, but no function that waits for events, processes them and then returns. I can't draw from within the infinite loop, because that causes only one event to be processed for each frame, making the program unresponsive. There is no 'idle' event I can use to detect when it's appropriate to draw.

portstrom on 22 Jan 2018

👍1

@portstrom In case you are looking for a workaround, I ran into the same problem. I forked and backported important patches to both glium (https://github.com/fschutt/glium-backport/tree/glium_backport - use the branch "glium_backport") and vulkano (https://github.com/fschutt/vulkano-win-backport). They are the last versions to use the old API. I am currently using those because of this issue.

So if you have old code depending on the old glium API, you can depend on those. If you have issues, I'll help to backport fixes. It's not a perfect solution, but it's something until this situation is resolved. I can't accept performance problems just because of a macOS weirdness (which I don't even use).

fschutt on 22 Jan 2018

@portstrom So you want to avoid unnecessary redrawing? Why not set a flag from within your poll_events callback when events happen that require a redraw? Then check the flag after poll_events to see if you should redraw.

Boscop on 22 Jan 2018

@fschutt: the worst part is, the poll_events() + run_forever() model doesn't even fix MacOS.

I describe the problem in #219, but it boils down to a fundamental design mismatch: winit is designed such that the user's code primarily controls the main thread and calls into the platform, whereas MacOS is designed such that _the platform_ primarily controls the main thread and calls into the user's code. I developed a changset to fix this (#237) which does stack switching trickery: the user's code calls into MacOS while _also_ letting MacOS call into the user's code. I maintained it for a while but it never got merged, and while I recently indicated I'd be happy to update it again, the maintainers have yet to express interest.

That PR would fix the problem on MacOS, which would eliminate the only technical blocker in this issue. Having said that, the fundamental design disagreement remains: some platforms want to run the event loop themselves, and winit assumes they don't. We can fix MacOS by dancing around with coroutines, but is that the only instance of this problem? Web browsers don't much like infinite loops in JS or WebAssembly…

willglynn on 22 Jan 2018

@Boscop No matter how the flag is set that will lead to 100% CPU usage. poll_events doesn't handle waiting.

portstrom on 22 Jan 2018

@willglynn Doesn't run_forever solve that problem without using stack switching if you remove poll_events? If so, I would suggest removing poll_events and adding an 'idle' event and a 'timeout' event. The 'idle' event would be emitted after all other pending events have been processed. Or for even better platform support, instead of the 'idle' event, there could be a 'paint' event and a method to explicitly request the 'paint' event to happen, which will call the requestAnimationFrame function on the web platform. On other platforms requestAnimationFrame can be mimicked as well, by not emitting the 'paint' event when the window is not visible.

portstrom on 22 Jan 2018

@portstrom Sure, that would work. Removing poll_events() and having the user do everything in a callback would fix the problem. This forces everyone to turn the main thread over to winit, which could either run its own loop or defer to the platform event loop as appropriate. This also forces every single user of winit to significantly restructure their code, which is why I opened a PR for coroutines on MacOS and opened this issue to propose a smaller API change.

willglynn on 22 Jan 2018

I agree that there is kind of a problem with the current API. This is especially noticeable on the emscripten port of winit where paint messages are being dispatched by the JS engine in order to make it work.

However I cannot think of any reasonable solution at the moment.

tomaka on 22 Jan 2018

One option is to totally invert the control flow:

struct MyApp();

impl winit::App for MyApp {
  fn create_windows(&mut self, window_builder: &mut winit::WindowBuilder) {
    // use window_builder
  }

  fn on_event(&mut self, event: winit::Event) {
    match event {
      WindowEvent::Closed => {
        // tell winit to shut down
      }
      WindowEvent::KeyboardInput { ... } => {
        // change state and re-paint
      }
      // …
    }
  }
}

fn main() {
  winit::Main(MyApp());
}

The user's fn main() would only exist to call into winit::Main(), which might be a Rust event loop that runs forever (most platforms), a native event loop that runs forever (MacOS), or a function that sets up callbacks and returns (web). winit need only guarantee that each platform would somehow call back into the user's code after initialization.

willglynn on 22 Jan 2018

👍2

It's not sufficient to invert the control flow - you also need to provide a way to send custom events, otherwise there's no way to wake up the main thread, winit already supports this though I think. If you can send custom events then it will be possible to implement futures-compatible reactor. Even then, the callback based approach may not always be appropriate, as sometimes you just do need more control.

I propose that winit supports two APIs:

1) The synchronous API. This API is based on the fundamental get_event(Timeout) operation. On macos this is implemented via stack switching.

2) The asynchronous API. This is based on winit::Main(MyApp()) and has an adaptor to make it compatible with futures. This adaptor implements task notifications as custom events.

Diggsey on 24 Jan 2018

👍1

@Diggsey I'm can imagine some situations when callbacks might be _inconvenient_, but what is the specific use case you have in mind?

The web platform in particular makes me think that the user might need to surrender the main thread to winit, since the typical JS/WebAssembly contract requires winit to return back to the browser in order to process events. There are workarounds of varying practicality (much like how stack switching is a workaround for MacOS), but it seems to me that "the platform calls you" works on every platform without trickery, whereas "you call the platform" doesn't.

willglynn on 24 Jan 2018

There are any number of reasons: porting existing code which is not written in a callback style, to building libraries on top of winit to name a couple. If two libraries make the decision to control the event loop then those libraries are forever incompatible.

Inconvenience is also an important factor - I may not even care about the resizing issue on macos, as for many applications it doesn't matter if the loop blocks for that period of time, and yet to implement the basic primitive of "wait with timeout" the universal operation support by synchronous APIs, I have to:

Launch a second thread
Setup a concurrent queue between the two threads
Pass a reference to my WindowProxy to the other thread
Each time I receive a normal event on the main thread, send a message across the queue to the second thread, with a value indicating how long I am willing to wait before being woken up
Each time I receive a wake-up event on the main thread, send a "confirm" message across the queue
On the second thread, loop while performing a wait-with-timeout on the concurrent queue, using the timeout from the last message
Each time the second thread doesn't get a message in the time, send a wake-up message to the WindowProxy, and then wait indefinitely for a "confirm" message

Even then, there's still a race condition between the "wake up" and the next event coming in - I'm not really sure how you avoid that if you're forced (by winit and not the platform) to use multiple threads!

The stack swapping feature is a nice-to-have workaround for a relatively minor platform-specific bug: replacing winit's old API with the current one because of this bug was a terrible mistake.

Diggsey on 24 Jan 2018

👍2

If two libraries make the decision to control the event loop then those libraries are forever incompatible.

winit already controls the event loop. It provides no wait handles with which to link to other event loops, which means you can either a) let winit block inside the windowing system or b) block yourself somewhere outside the windowing system. If your goal is to service events promptly, you must use run_forever().

I opened this issue to propose poll-with-timeout as a more useful common denominator: this would let you block inside the windowing system and regain control again after a fixed period. This doesn't change the fact that winit is still blocking in its own event loop; there's no way to tell winit to e.g. wait on a windowing event _and_ wait on a socket becoming readable.

IMO, the main issue here is control over the main thread. Right now, users call winit and winit returns back to the user. This isn't how MacOS works; you're not _supposed_ to constantly bail out of the platform event loop, and there are situations (like resizing) where the platform's event loop takes over anyway. It's also not how the web is designed to work, and it's much more difficult to paper over that.

Poll-with-timeout is a more useful common denominator, but it's not the lowest common denominator. The common interface that _all_ platforms can share is to have the user surrender their thread to winit, and have winit call back into the user's code.

Inverting control flow would absolutely be a giant breaking change, but that doesn't avoid the question: which is the right model? Should the user call winit, or should winit call the user? Does the main thread belong to the user, or does it belong to the platform? @tomaka?

If the user should call winit, then we should merge coroutines support for MacOS, since MacOS's design is at odds with winit's design. We should also give up on having winit's web platform work the same as the other platforms, since… it can't.

If winit should call the user, then we should figure out what that API should look like, and we should figure out how to get there from here. It's worth noting that I used coroutines to merge two incompatible event loop models in #237; we could do the same thing to provide a user-calls-winit compatibility layer on top of a new winit-calls-user API.

willglynn on 24 Jan 2018

👍1

In the past I was against inverting flow control for "ideological" reasons. After all in a pre-emptive multitasking model, from the point of view of the application everything happens linearly. My point of view is that the API provided by the OS was wrong and existed for legacy reasons, and that we should abstract over it. Newer models such as wayland seem to comply to this model after all.

However on the other hand I'm generally in favour of exposing the lowest-level possible API on top of the OS, in order to not add any overhead.

In addition to this, the pre-emptive multitasking model of execution isn't compatible with the cooperative model of Emscripten, Android and iOS.

Maybe one possibility would be to use the new upcoming coroutines feature of Rust and ask the user to yield whenever they want to query for events.

tomaka on 25 Jan 2018

I finally found a way to make a meaningful program with Winit. It only works on X11. An answer on Stack Overflow suggested using select on the X connection to wait for events with a timeout.

I found out I can get the connection number of a window like this:

let connection_number = {
    let connection = window.get_xlib_xconnection().unwrap();
    let display = window.get_xlib_display().unwrap();
    unsafe {
        (connection.xlib.XConnectionNumber)(display as _)
    }
};

Then I can use it to set up Mio:

let mut events = mio::Events::with_capacity(1);
let poll = mio::Poll::new().unwrap();
poll.register(
    &mio::unix::EventedFd(&connection_number),
    mio::Token(0), mio::Ready::readable(),
    mio::PollOpt::edge()
).unwrap();

When done calling poll_events on the event loop and painting, and calling poll_events again after painting, I use Mio to wait until more events are available or it's time to paint the next frame:

let current_time = chrono::offset::Local::now();
let _ = poll.poll(&mut events, Some(::std::time::Duration::new(0, 1000000000 - current_time.timestamp_subsec_nanos())));

I hope the event loop in Winit will be fixed soon, so I don't have to do this manually in a way that uses unsafe code and only works with X11.

portstrom on 1 Feb 2018

Unfortunately, using the X connection fd like this does not work reliably, because OpenGL/vulkan implementations will do IO on that fd internally, causing unrelated events to be read early and read-readiness notifications to be missed.

Ralith on 2 Feb 2018

👍1

@Ralith Could you expand on it? I was just searching for more information on polling XConnectionNumber and found this issue.

Could you give an example when it happens? What can one do then? Just to do a busy loop? Or send to yourself a fake event to wake it? But then it just needs to be on separate thread.

hadrianw on 13 Feb 2018

So, if I'm understanding things correctly, combining what's been outlined here with the problems we have with Windows, then doesn't this mean that our current design is at odds with most platforms?

francesca64 on 18 Apr 2018

😄1

This issue is a total showstopper. It's amazing that the maintainers don't seem to care. I'm still using the workaround I posted above, which only works on X11. I never had the problem that @Ralith mentioned. I've now published an example program made with winit, which unfortunately only works on X11.

portstrom on 31 Jan 2019

It's amazing that the maintainers don't seem to care.

mitchmindtree on 31 Jan 2019

❤6

Honestly, I understand being frustrated about the lack of visible progress - there are several issues that can't be fixed without updating the main event loop API, and the changes have been in the pipeline for... uh... oh jeez has it really been over six months? Those changes should land Very Soon, though.

Anyhow, I'm going to close this issue in favor of #459 since that's where the discussion on the event loop API ended up getting finished.