Godot: Sprite rendering performance regression

Created on 19 Dec 2018 · 18 Comments · Source: godotengine/godot

Introduced with bec76cfa19684cbfce3677044331805fd2a54d8b

The change (i.e. not using the static quad buffer) makes sprite rendering slower. See the project attached to #19943, i.e. polygon2d_performance.zip. Just make sure to use the GLES2 renderer.

For me it's 110 vs 350 FPS on a release_debug build for "Sprite x1" mode when running the project on Intel 4600.

bug high priority rendering

All 18 comments

It's either this or nvidia bugs, so pick your poison.

I can imagine batching will improve performance again, but not a lot more can be done.

In that case what do you think about bringing back the draw calls batching that's been implemented in #20965 (and then reverted with #21204)?

If it can be done well, fine.
But I have a ton of high-priority bugs to look at at the moment.

Considering most games aim for 60fps (and more importantly most physics implementations - 2D as well as 3D - nowadays target 60fps), is it really that important?

Yes. It's pretty laggy on phones with GLES2 now, even in a simple scene.
PS: it's noticeably laggy on PC too.

I can have a look into batching again when the GLES2 implementation is feature-complete.

@dragmz GLES2 is now finally feature-complete (at least for 2D).

@dragmz any good news?

Apologies, but at this rate this will have to be kicked to 3.2. As soon as anyone can contribute batching support, will be glad to merge it. Otherwise, I will have to do it myself sometime after 3.1 is out (and likely after Vulkan back-end is implemented). What I can offer is to undo the workaround only for mobile, which should increase performance a bit.

The fact that we are still so low on rendering contributors remains a problem; we will need to find a way to tackle it after 3.1 is out.

@volzhs @reduz I've been doing some initial tests with a very scope-limited draw calls batching but that does not increase performance much for the use cases I'm mostly interested in (e.g. drawing sprites or polygons).

My previous experiments show that what the GLES2 renderer lacks is batching of buffer updates (i.e. glBufferSubData calls). This was submitted as a pull request and rejected in #20077, and at the moment I don't have any new idea on how to reimplement it so that it gets accepted.

I hope you find a way.
I am not sure, but this regression seems to affect the GUI as well.
The GUI is very laggy on phones, to the point of being unusable with GLES2.

@volzhs: It used to be the case that every letter drawn in gui controls was a draw call. Not sure if it's still an issue because one of the batching attempts closed it and I don't think it was reopened, but it might be the case still.
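To illustrate why one draw call per letter is costly and what batching would change, here is a minimal Python sketch. It is not Godot's implementation and not the code from #20965; the `Quad` type, the texture names, and `batch_quads` are hypothetical. It only shows the general idea of merging consecutive quads that share a texture into a single batch while keeping draw order.

```python
from itertools import groupby
from typing import List, NamedTuple, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


class Quad(NamedTuple):
    texture: str  # hypothetical texture identifier, e.g. a font atlas
    rect: Rect


def batch_quads(quads: List[Quad]) -> List[Tuple[str, List[Rect]]]:
    """Merge runs of consecutive quads that share a texture.

    Only neighbouring quads are merged, so the original draw order
    (and therefore overlap between items) is preserved.
    """
    return [
        (texture, [quad.rect for quad in run])
        for texture, run in groupby(quads, key=lambda quad: quad.texture)
    ]


# Five letters from one font atlas, an icon, then two more letters.
letters = [Quad("font_atlas", (10.0 * i, 0.0, 8.0, 12.0)) for i in range(5)]
icon = [Quad("icon", (60.0, 0.0, 16.0, 16.0))]
more_letters = [Quad("font_atlas", (80.0 + 10.0 * i, 0.0, 8.0, 12.0)) for i in range(2)]

commands = letters + icon + more_letters
batches = batch_quads(commands)
print(len(commands), "quads ->", len(batches), "batches")
```

In this toy case eight per-quad submissions collapse into three batches. A real renderer would also have to break batches on other state changes (shader, blend mode, clipping), which this sketch ignores.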

@Zireael07 It's still the case, because #20965, which batched the draw calls, was reverted by #21204.

This is an issue for us on an android app that shows a lot of text, running on KitKat. Any pointers or hints on how to implement this ourselves?

6d8083ea should solve this issue on OpenGL ES2/WebGL 1 platforms by restricting the new rendering path to OpenGL 2.1 (since it's only meant to work around Nvidia driver issues, it is only relevant on desktop).

For desktop platforms, it would be good to have some metrics of how big the performance difference is on some projects like the platformer 2D, some GUI-heavy demos, etc.

You can test locally by compiling current master to test the "after" performance, then revert 6d8083ea and bec76cf and compile again to test the "before" performance. Or instead of reverting, you can edit https://github.com/godotengine/godot/blob/6d8083ea656d1dce5c00257f308e464d1d8feae2/drivers/gles2/rasterizer_canvas_gles2.cpp#L460 and replace it by #if 0.

Please mention OS, GPU, drivers and the projects tested with the before/after framerate.

Some test results:

"After" state: f4ac678 unmodified (i.e. the Nvidia hack for #9913)
"Before" state: f4ac678 with rasterizer_canvas_gles2.cpp:L460 changed to #if 0 (back to the original "fast" method)

Linux x86_64, Nvidia GTX 670MX with drivers 410.93. Drivers Vsync off for testing.
Editor, debug build (scons p=x11 tools=yes target=debug).

Running the demos in fullscreen with godot --video-driver GLES2 --print-fps --fullscreen and then averaging the FPS output (minus the first 2-3 ramp-up frames) over ~30 s of doing nothing in-game (I initially tried to stress-test by moving around, but I would get too much variance).
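For anyone who wants to repeat this kind of measurement, the averaging step can be sketched in Python as below. This is an illustration, not the script used for the numbers in this comment. It assumes the output of --print-fps was saved to a text file with lines of the form "FPS: 372"; both that line format and the `summarize_fps` helper are assumptions, so adjust the pattern if your build prints something different.

```python
import re
from statistics import mean

# Assumed log line format: "FPS: 372". Change the pattern if needed.
FPS_LINE = re.compile(r"FPS:\s*(\d+)")


def summarize_fps(log_text, warmup=3):
    """Return (average, lowest, highest) FPS, skipping warm-up samples.

    `warmup` is the number of initial samples to drop, matching the
    "minus the first 2-3 ramp-up frames" approach described above.
    """
    samples = [int(match.group(1)) for match in FPS_LINE.finditer(log_text)]
    steady = samples[warmup:]
    if not steady:
        raise ValueError("not enough FPS samples after warm-up")
    return mean(steady), min(steady), max(steady)


if __name__ == "__main__":
    import sys

    # Usage: python summarize_fps.py captured_output.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        average, lowest, highest = summarize_fps(handle.read())
    print(f"{average:.0f} FPS ({lowest} to {highest})")
```

Reporting the lowest and highest samples next to the average, as in the results in this comment, makes it easier to judge how much variance a run had.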

I also put GLES3 results for comparison.

Testing demos from https://github.com/godotengine/godot-demo-projects, commit 916c9c9.

Note: These are quick tests on a single configuration and should be taken with a pinch of salt, yet they should show some trend.


2d/platformer:

Before: 550 FPS (506 to 565)
After: 372 FPS (366 to 377) 32% performance drop
GLES3: 408 FPS (400 to 412)

2d/lights_and_shadows: (only one light works in GLES2 out of 3)

Before: 637 (608 to 650)
After: 467 (436 to 491) 27% performance drop
GLES3: 516 (503 to 525)

2d/isometric:

Before: 1037 (1017 to 1043)
After: 519 (509 to 521) 50% performance drop
GLES3: 717 (703 to 723)

gui/rich_text_bbcode:

Before: 810 (764 to 825)
After: 218 (217 to 219) 73% performance drop for text rendering
GLES3: 560 (554 to 567)

misc/joypads:

Before: 881 (846 to 892)
After: 402 (394 to 404) 54% performance drop
GLES3: 798 (776 to 811)

https://github.com/KOBUGE-Games/jetpaca, branch godot-3-port:
Ran with argument res://stages/world_1/intro.tscn to be directly in-game.

Before: 555 (552 to 563)
After: 289 (287 to 292) 48% performance drop
GLES3: 398 (396 to 400)
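The percentage drops quoted above can be rechecked with a few lines of Python. The sketch assumes the drop is measured relative to the "Before" average, i.e. (before - after) / before, which reproduces the figures given; the `percent_drop` helper is only for illustration.

```python
# (before, after) average FPS for the GLES2 renderer, as listed above.
RESULTS = {
    "2d/platformer": (550, 372),
    "2d/lights_and_shadows": (637, 467),
    "2d/isometric": (1037, 519),
    "gui/rich_text_bbcode": (810, 218),
    "misc/joypads": (881, 402),
    "jetpaca": (555, 289),
}


def percent_drop(before, after):
    """Performance drop in percent, relative to the 'before' value."""
    return (before - after) / before * 100


for name, (before, after) in RESULTS.items():
    print(f"{name}: {percent_drop(before, after):.0f}% drop")
```

The output ranges from 27% for 2d/lights_and_shadows to 73% for gui/rich_text_bbcode, matching the per-project figures above.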


Those are simple tests, and different configurations might show different results, but to:

"It's either this or nvidia bugs, so pick your poison."

I'd definitely pick the Nvidia bugs kind of poison over a ~50% performance regression, or more for text.

Yes this made my app usable on android again. Increased the FPS around 5 times
