Currently bevy depends on rayon for multithreaded dispatching of systems. @lachlansneff and @aclysma have iterated on a prototype with feedback from @kabergstrom and @cart. This issue is meant to:
@lachlansneff and @aclysma implemented a prototype using multitask. It’s a small executor based on async-task, which is used within async-std. The dependencies are:
├── multitask
│ ├── async-task
│ ├── concurrent-queue
│ │ └── cache-padded
│ └── fastrand
├── num_cpus
│ └── libc
├── parking
└── pollster
The API does three things:
We have a prototype of ParallelExecutor that uses it instead of rayon, allowing us to remove rayon as a dependency from bevy. This repo is a prototype, and we intend to add it directly to bevy as a module (bevy_tasks). This will allow us to do more sophisticated instrumentation and profiling than if we were using an externally managed thread pool.
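To make the "dispatch a batch of systems, then wait" pattern concrete, here is a minimal sketch that uses only `std::thread::scope` as a stand-in for a task pool. It is not the bevy_tasks or multitask API; `System` and `run_stage` are hypothetical names invented for this illustration, and a real pool would reuse worker threads instead of spawning one per system.

```rust
use std::thread;

/// Hypothetical stand-in for a system: any closure that may run on another thread.
pub type System<'a> = Box<dyn FnMut() + Send + 'a>;

/// Runs every system of one stage in parallel and returns once all have finished.
/// `thread::scope` joins all spawned threads before it returns, which plays the
/// role of the blocking "join" at the end of a stage.
pub fn run_stage(systems: &mut [System<'_>]) {
    thread::scope(|scope| {
        for system in systems.iter_mut() {
            // Each system gets its own scoped thread (assumption: the systems
            // in a stage do not conflict with each other).
            scope.spawn(move || system());
        }
    });
}
```

Because the scope guarantees that every thread has finished before `run_stage` returns, the systems may borrow data from the caller's stack, which is the same property a scoped task pool needs to provide.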
Finish a PR to add bevy_tasks as a module, remove rayon as a dependency, and update ParallelExecutor to work with bevy_tasks. In principle these steps are already done, but we may want to polish a bit first.
We have a feature branch for this underway here: https://github.com/lachlansneff/bevy/tree/bevy-tasks
Thread management is clearly a key problem that needs to be solved in a scalable game engine. In my opinion there are three main uses of threads in a modern, ECS-based game engine, from high-level to low-level:
We plan to apply this solution to #2 now, and longer term expose the same underlying system to solve #3. (#1 is out of scope for now, but might also be able to use this system)
We discussed #3 in the #rendering channel in discord:
@lachlansneff and @aclysma also discussed the need to assign threads to the proposed buckets (IO, async compute, and compute). We considered several approaches:
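As a purely illustrative sketch of what assigning threads to the three buckets could look like, the function below splits a total thread count by fixed shares. The policy (a quarter each for IO and async compute, the remainder for compute, at least one thread per bucket) and the names `ThreadBuckets` and `assign_threads` are assumptions made for this example, not necessarily one of the approaches that were considered.

```rust
/// Thread counts for the three proposed buckets.
#[derive(Debug, PartialEq, Eq)]
pub struct ThreadBuckets {
    pub io: usize,
    pub async_compute: usize,
    pub compute: usize,
}

/// Splits `total` threads across the buckets (hypothetical policy):
/// a quarter each for IO and async compute, the rest for compute.
/// Every bucket gets at least one thread, so on very small machines
/// the sum can exceed `total`.
pub fn assign_threads(total: usize) -> ThreadBuckets {
    let io = (total / 4).max(1);
    let async_compute = (total / 4).max(1);
    let compute = total.saturating_sub(io + async_compute).max(1);
    ThreadBuckets { io, async_compute, compute }
}
```

With this policy, 8 threads would be split as 2 for IO, 2 for async compute, and 4 for compute, while a single-core machine would still get one thread per bucket.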
This is awesome! I'm so glad we have people like you contributing to bevy.
It seems like there is consensus to move this forward. I think we are quite close to having a PR ready :)
We had a bit more discussion on discord tonight from a number of participants, mainly about how this would work within a browser:
You can see an example here of how amethyst integrated web workers into rayon. Rayon allows specifying a spawn_handler, which we use to run rayon inside web workers.
The main blocker for amethyst was join (or install, or anything that waits for other threads) calls on the main thread, which are not allowed on the web. The main use of join was waiting for all the dispatched systems to finish executing. The only way to solve this is to make join an async function. There are multiple ways to implement this:
postMessage to the main thread to wake the async task. It would be faster than the polyfill as no new threads are spawned, but is incompatible with the upcoming async atomics proposal.

The implications of async join are quite far-reaching. The whole call stack up to the executor join has to be async. In amethyst this involved the application, the whole gamestate code including user code, and finally the ECS dispatcher. The shallower this stack is, the less code needs to be async.
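A rough sketch of what an async join could look like, using only the standard library: workers decrement a shared counter, and the last one wakes the task that is awaiting the join. All names here (`async_join`, `Completion`, `AsyncJoin`, `block_on`) are hypothetical and not taken from amethyst, rayon, or bevy. The `Waker` is the point where a web build would plug in its wake mechanism (for example postMessage); the small `block_on` helper parks the thread and is only a native stand-in for trying the sketch out, since the browser's event loop would drive the future on the web.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// State shared between the workers and the future that awaits them.
struct Pending {
    remaining: usize,
    waker: Option<Waker>,
}

/// Hypothetical handle that each worker uses to report that it has finished.
#[derive(Clone)]
pub struct Completion(Arc<Mutex<Pending>>);

impl Completion {
    pub fn done(&self) {
        let waker = {
            let mut state = self.0.lock().unwrap();
            state.remaining -= 1;
            // Only the last worker to finish wakes the awaiting task.
            if state.remaining == 0 { state.waker.take() } else { None }
        };
        if let Some(waker) = waker {
            waker.wake();
        }
    }
}

/// Future that resolves once every worker has called `done`.
pub struct AsyncJoin(Arc<Mutex<Pending>>);

impl Future for AsyncJoin {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut state = self.0.lock().unwrap();
        if state.remaining == 0 {
            Poll::Ready(())
        } else {
            // Return instead of blocking; the last worker will wake us.
            state.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Creates a join for `count` workers.
pub fn async_join(count: usize) -> (Completion, AsyncJoin) {
    let shared = Arc::new(Mutex::new(Pending { remaining: count, waker: None }));
    (Completion(shared.clone()), AsyncJoin(shared))
}

/// Waker that unparks the thread which is driving the future.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Native-only helper: polls the future and parks the thread while it is pending.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

In this sketch the caller of the join has to `.await` it, so every function between the application entry point and the executor becomes async, which is exactly the call-stack cost described above.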
@chemicstry I looked into the proposal for waitAsync and I'm not sure it'll actually be useful here. I was thinking that it would close over a function that calls atomics.wait at some point, and then turn that into a promise, so you could use atomics.wait on the main thread. However, it looks like it's just a way of calling atomics.wait in javascript in an async context, so not too useful as far as I can tell.
One way of completely sidestepping the "no blocking the main thread" problem is to only run bevy in workers when on the web. Those can block no problem. We can render with an offscreen canvas.
@lachlansneff async is the only way to achieve "blocking" execution in a web context. Any wasm execution must return immediately to avoid blocking the main thread, and this inevitably loses the stack, so you can't resume execution. winit hacks around this by throwing an exception and leaking the whole stack; afterwards all execution is driven by browser events.
You can emulate this async behavior as I mentioned in the second solution, but it would be less performant than async atomics because of worker message overhead.
We explored the option of running everything in web workers, but it has its downsides:
winit does not support OffscreenCanvas. See: https://github.com/rust-windowing/winit/issues/1518
window API does not exist, even though some browsers provide it. Hence this is not available in wasm-bindgen. This includes things like performance.now().

The list is probably longer. It would require a lot of glue to make everything run in web workers, but if this could be implemented as a separate glue logic instead of a bunch of cfg macros, then it may very well be a desired solution. Although, I would still say that avoiding locks in main thread and relying on ECS for ensuring safe shared access is a good pattern from performance and code quality perspective.
@chemicstry I think most people are in agreement that Bevy should be focusing on targeting the web as it stands to exist sometime around when Bevy hits 1.0 (or more generally: targeting some future state of the web.) This means avoiding development of suboptimal solutions in an attempt to make something work today.
So with that in mind, and recognizing that you did a lot of research in this area:
No disagreement here regarding solutions that avoid this problem altogether!