Bevy: cpu usage

Created on 11 Aug 2020  ·  22 Comments  ·  Source: bevyengine/bevy

For example, in the "button" example, CPU usage is sometimes about 10-20% in release mode, which is relatively high.
Are there any plans to improve this? Thanks.
Edit: I'm on Windows 10 with a 2.40GHz Intel CPU.

performance

All 22 comments

CPU usage is between 300% and 400% for me with the simplest examples (button, sprite...)

EDIT: Using Mint 20 on Linux 5.4 on Intel i7-8705G and dual video card Radeon RX Vega M GL + HD Graphics 630

I confirm this.
On an i7-7700HQ CPU @ 2.80GHz × 8 with an iGPU, htop output indicates that e.g. the breakout example takes ~40% on all cores.

@coolit @joseluis @martin-fl What operating systems, hardware, and render backend do you all use?

@skreborn
I'm using PopOS 20.04 with Linux 5.4 on an Intel® Core™ i7-7700HQ CPU @ 2.80GHz × 8 and Mesa Intel® HD Graphics 630 (KBL GT2). The render backend is Vulkan, I suppose.

I have anywhere between 50% and 70% usage on Windows 10 with an Intel® Core™ i7-9750H CPU @ 2.60GHz × 12 and an NVIDIA GeForce RTX 2080 Max-Q for button.

It's worth noting that release mode reduces that to slightly above 30%.

It's not just low-end hardware
EDIT:
Even in release mode, the CPU usage is at 60%.
Apparently the logic updates every "frame" with no fixed rate, so the faster the CPU, the more frames it runs.

We should lock logic frames to a sane number like 90 or 180.
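For anyone wondering what "locking logic frames" could look like, here is a minimal std-only sketch of a capped update loop. It is not Bevy code: `tick_interval` and `run_capped` are hypothetical names made up for illustration, and the 90 updates per second is just the number suggested above.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Duration of one tick at the given rate (hypothetical helper, not a Bevy API).
/// Assumes `ticks_per_second` is greater than zero.
fn tick_interval(ticks_per_second: u32) -> Duration {
    Duration::from_secs(1) / ticks_per_second
}

/// Calls `update` exactly `ticks` times, sleeping after each call so the loop
/// runs no faster than `ticks_per_second`.
fn run_capped<F: FnMut(u32)>(ticks_per_second: u32, ticks: u32, mut update: F) {
    let interval = tick_interval(ticks_per_second);
    let mut next_deadline = Instant::now() + interval;
    for tick in 0..ticks {
        update(tick);
        // Sleep for whatever is left of this tick; skip the sleep if we are late.
        let now = Instant::now();
        if next_deadline > now {
            thread::sleep(next_deadline - now);
        }
        next_deadline += interval;
    }
}

fn main() {
    let start = Instant::now();
    let mut count = 0;
    run_capped(90, 9, |_| count += 1);
    println!("{} updates in {:?}", count, start.elapsed());
}
```

The point of sleeping until a deadline, instead of looping as fast as possible, is that the thread gives the core back to the OS between updates.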

It's worth noting that release mode reduces that to slightly above 30%.

Compiling in release mode does in fact reduce CPU usage to ~10% on the button and breakout example on my computer. Still a bit high for a single button though.

Release mode had the sprite example running at around 30% for me, based on htop. Bit disappointing such a simple example isn't performant, but the library looks like a great start so I feel silly even making the complaint. I figure this will be fixed quickly.

here is a flamegraph, hopefully it's helpful

I expect there to be a _ton_ of low hanging fruit when it comes to optimization. So far the focus has been on api surface and building solid foundations. For example, right now Bevy is way more hash-ey per-frame than I would like it to be. We can fix most of the CPU getting eaten there by using the new "change detection" features in Bevy ECS.

On top of that, I think some persistent CPU usage is expected, as we (currently) use rayon under the hood and other projects have encountered similar behavior. I think the most important metric bevy can optimize is frame_time (which you can measure by adding the FrameTimeDiagnosticsPlugin and PrintDiagnosticsPlugin to your app).

This is the sort of issue that will never fully be resolved. There will always be optimization work to do.

I am inclined to close this issue (and reference it whenever a new one comes up). Feel free to open issues for specific cases where you have isolated slow parts of Bevy.

(Also, I won't close this for a small period of time. Feel free to respond here with rationale if you think leaving this open is better.)

This is a problem with rayon. The more cores you have, the more CPU you burn. My 3950x pins the CPU at 100% in the window_settings example. I've seen this behavior on several projects that use rayon.

https://github.com/rayon-rs/rayon/issues/642

As a workaround for anyone having an unpleasant time with this, you can use the environment variable RAYON_NUM_THREADS to limit the number of threads. I would recommend anyone with more than 8 cores set it to something <=8.
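If you would rather script the workaround than export the variable by hand, a rough sketch follows. `RAYON_NUM_THREADS` is the variable mentioned above; `capped_threads` and `example_command` are hypothetical helpers, and the `cargo run --release --example button` invocation assumes you are inside a checkout of the Bevy repository.

```rust
use std::process::Command;
use std::thread;

/// Hypothetical helper: cap the worker count at `limit`, never going below 1.
fn capped_threads(logical_cores: usize, limit: usize) -> usize {
    logical_cores.min(limit).max(1)
}

/// Builds (but does not run) a command that launches a Bevy example with
/// RAYON_NUM_THREADS set for the child process only.
fn example_command(example: &str, threads: usize) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(&["run", "--release", "--example", example])
        .env("RAYON_NUM_THREADS", threads.to_string());
    cmd
}

fn main() {
    // Fall back to 1 if the core count cannot be determined.
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let cmd = example_command("button", capped_threads(cores, 8));
    // Print the command; call `.status()` on it to actually launch the example.
    println!("{:?}", cmd);
}
```

Setting the variable on the child process keeps the cap local to that one run, so other rayon-based tools on the machine are unaffected.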

Just to be clear, this isn't "bevy is using a lot of CPU" it's "rayon doesn't properly idle threads that have no work to do". A 32 logical core CPU will pin at 100%, but when forced to use 8 logical cores, will run at about 13% utilization (despite that being 25% of the cores). That is still higher than it ought to be. I don't think bevy's examples generate enough workload to justify saturating multiple cores.
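The numbers in this thread mix two conventions: htop-style figures where 100% means one fully busy core, and whole-machine figures where 100% means every core is busy. A small sketch of the arithmetic, using hypothetical helper names, in case it helps compare reports:

```rust
/// Converts a per-core-summed figure (htop style, 100% = one busy core)
/// into a share of the whole machine. Assumes `logical_cores` > 0.
fn machine_share(per_core_sum_percent: f64, logical_cores: u32) -> f64 {
    per_core_sum_percent / f64::from(logical_cores)
}

/// Upper bound on whole-machine usage when at most `threads` cores can be busy.
/// Assumes `logical_cores` > 0.
fn ceiling_percent(threads: u32, logical_cores: u32) -> f64 {
    100.0 * f64::from(threads.min(logical_cores)) / f64::from(logical_cores)
}

fn main() {
    // 8 worker threads on a 32-logical-core machine can use at most this much.
    println!("ceiling: {}%", ceiling_percent(8, 32));
    // An htop reading of 400% on an 8-thread machine, as a whole-machine share.
    println!("share: {}%", machine_share(400.0, 8));
}
```

With 8 threads on 32 logical cores the ceiling works out to 25% of the machine, which is the "25% of the cores" figure above; the observed ~13% sits below that ceiling.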

Would something like Unity's Application.targetFrameRate be related to this issue?

Rayon 1.4.0 has been released with the fix! https://github.com/rayon-rs/rayon/issues/784

Release notes - "Implemented a new thread scheduler, RFC 5, which uses targeted wakeups for new work and for notifications of completed stolen work, reducing wasteful CPU usage in idle threads."
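To illustrate the general idea of waking a worker only when there is work, here is a std-only sketch using a `Condvar`. This is not rayon's scheduler or anything from RFC 5; the `Queue` type and `sum_on_worker` function are hypothetical and only show an idle thread blocking instead of spinning.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// A tiny job queue whose idle worker blocks instead of spinning.
struct Queue {
    jobs: Mutex<VecDeque<Option<u64>>>, // `None` tells the worker to shut down
    wakeup: Condvar,
}

impl Queue {
    fn push(&self, job: Option<u64>) {
        self.jobs.lock().unwrap().push_back(job);
        // Wake one sleeping worker now that there is something to do.
        self.wakeup.notify_one();
    }

    fn pop_blocking(&self) -> Option<u64> {
        let mut jobs = self.jobs.lock().unwrap();
        loop {
            if let Some(job) = jobs.pop_front() {
                return job;
            }
            // Releases the lock and blocks until notified; the loop re-checks
            // the queue, which also handles spurious wakeups.
            jobs = self.wakeup.wait(jobs).unwrap();
        }
    }
}

/// Sums the inputs on one worker thread that sleeps whenever the queue is empty.
fn sum_on_worker(inputs: &[u64]) -> u64 {
    let queue = Arc::new(Queue {
        jobs: Mutex::new(VecDeque::new()),
        wakeup: Condvar::new(),
    });
    let worker_queue = Arc::clone(&queue);
    let worker = thread::spawn(move || {
        let mut total = 0;
        while let Some(n) = worker_queue.pop_blocking() {
            total += n;
        }
        total
    });
    for &n in inputs {
        queue.push(Some(n));
    }
    queue.push(None); // shutdown marker
    worker.join().unwrap()
}

fn main() {
    println!("{}", sum_on_worker(&[1, 2, 3]));
}
```

A worker that polls the queue in a tight loop would show up as a fully busy core even with nothing to do, which matches the symptom reported in this thread.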

I ran the breakout example on my old quadcore machine using master and also using master + rayon bumped to 1.4. Both were run in release mode:

Master CPU usage: 220%
Master + rayon 1.4 CPU usage: 55%

Huge improvement but still _extremely_ high for only rendering a few rectangles. Hope to see even more improvements in the future! 🎉

The extreme CPU usage requires more than 4 cores to reproduce.

I roughly reproduce your results if forcing RAYON_NUM_THREADS=4
Debug 1.3 limited to 4 cores: 150%
Release 1.3 limited to 4 cores: 50%

On a 3950x (32 logical cores) using Bevy 0.1.3:
Debug Rayon 1.3: 900%
Release Rayon 1.3: 450%

I saw no difference with rayon 1.4:
Debug Rayon 1.4: 900%
Release Rayon 1.4: 450%

Moving to the new task system I get:
Debug Task system: 110%
Release Task system: 40%

(For whatever reason I'm not seeing the all-cores-100% I was seeing the other night but I've seen it in other projects using rayon.)

The new task system (to replace Rayon) was just merged in https://github.com/bevyengine/bevy/pull/384

Can anybody who previously experienced high CPU usage confirm that the new task system fixes this issue?

I will test it out on my work computer (the one I ran it on before) in about 9 hours.

Examples no longer saturate all cores on my computer. They use far less than a single core now. Seems good to me on mac. 👍

The CPU usage while running the Button example has been fixed on my work computer as well.

(The CPU usage is mostly connected to other stuff I'm running.)
The usage is still high, but that could be due to an unoptimised example.

Alrighty I think that's enough evidence to close this out. CPU usage will always be a moving target and there's still plenty of optimization potential, but we've made enough progress here that I think we can move from "generic CPU usage issue" to "specific issues for specific optimization cases".
