Godot: Slow initial instancing in _process()

Created on 23 Nov 2019 · 18 Comments · Source: godotengine/godot

Godot version: v3.2.beta2
OS/device including version:
Intel Core i7-4790K CPU
16GB ram
Windows 8.1 64bit

When I measure the time it takes to instance scenes (or do .new()), under certain circumstances it's taking much longer in beta2 than beta1. For my computer in beta1 it takes 15ms, in beta2 it's taking 80ms. It seems subsequent instancing doesn't have this delay, so there's a possibility that it's not a bug at all and is just how the engine handles initial instances; I wouldn't know.

This only affects certain nodes:

  • MeshInstance with a mesh assigned
  • Light2D
  • Particles2D

Possibly others, I only tested a dozen or two.

Steps to reproduce:
This takes 80ms on my computer:

func _process(delta):
    var test = OS.get_ticks_msec()
    var id = Light2D.new()
    print('Code time: '+str(OS.get_ticks_msec()-test)+'ms')
    set_process(false)

This takes 0ms on my computer:

func _ready():
    var test = OS.get_ticks_msec()
    var id = Light2D.new()
    print('Code time: '+str(OS.get_ticks_msec()-test)+'ms')

Minimal reproduction project:
SlowInstancing.zip
Here is a project that tests the speed of instancing MeshInstance scenes in succession. If you uncomment the code in _ready(), all speed tests will drop to 0ms. I have no idea what's going on.

Labels: archived, core

All 18 comments

Which graphics card model do you have? Do you have a lot of USB devices connected when starting the project? I wonder if this is tangentially related to #20566 (even if less extreme).

GTX 970, unplugging USB devices made no difference.

I tried running the test on my other, older PC (E8400 CPU) with a fresh Godot installation. The result was 342ms. I'm sure this bug is consistently reproducible by anyone.

I can't reproduce the issue on Ubuntu 19.10, also with an Nvidia GTX 970, using master 636bc5c32f07050fb387a7f8f5f78f7dc9aef7be.
My times are usually:

Code time1: 1ms
Code time2: 1ms
Code time3: 1ms
Code time4: 1ms

but I use a slightly modified test script:

extends Node2D
var scene = load("res://MeshInstance_Scene.tscn")

func _ready():
    var test1 = OS.get_ticks_msec()
    for i in range(100):
        var id1 = MeshInstance.new()
    print('Code time1: '+str(OS.get_ticks_msec()-test1)+'ms')

    var test2 = OS.get_ticks_msec()
    for i in range(100):
        var id2 = scene.instance()
    print('Code time2: '+str(OS.get_ticks_msec()-test2)+'ms')

func _process(delta):

    var test3 = OS.get_ticks_msec()
    for i in range(100):
        var id3 = MeshInstance.new()
    print('Code time3: '+str(OS.get_ticks_msec()-test3)+'ms')

    var test4 = OS.get_ticks_msec()
    for i in range(100):
        var id4 = scene.instance()
    print('Code time4: '+str(OS.get_ticks_msec()-test4)+'ms')

    set_process(false)

SlowInstancing.zip

Remove the _ready() code and run it again; the issue only occurs without _ready(). Sorry if I wasn't clear.

With your test project I get this when I remove your _ready():

Code time3: 1ms
Code time4: 88ms

Without _ready() I still get very similar times:

Code time3: 0ms
Code time4: 2ms

It seems the problem occurs only on Windows.

Can confirm on Windows 10.

--- Debugging process started ---
Godot Engine v3.2.beta.Xrayez.636bc5c32 - https://godotengine.org
OpenGL ES 3.0 Renderer: GeForce GTX 750 Ti/PCIe/SSE2

Code time2: 185ms
Code time3: 0ms

Using _physics_process results in ~0ms of delay.
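For reference, a minimal sketch of what that variant might look like, assuming the Light2D snippet from the report is simply moved into _physics_process() (this is not taken from the project zip):

func _physics_process(delta):
    # Same timing test as the _process() version above, but run from the
    # physics callback, which reportedly shows ~0ms.
    var test = OS.get_ticks_msec()
    var id = Light2D.new()
    print('Code time: '+str(OS.get_ticks_msec()-test)+'ms')
    set_physics_process(false)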

Could one of you bisect between 3.2 beta 1 (077b5f6c2c06bb2c0af525ee25f87e0db719f9d2) and 3.2 beta 2 (b7ea22c5d203da1b592a743a4c893de25cd34408)?

Given that it seems to affect rendering-related nodes, I guess it might be a regression from changes to rendering code. Might be good to check if it happens only with GLES3 or also with GLES2.

git bisect reported a regression from cd40154890ba9791b219d66beaf187a7d8dcdba5 (CC @clayjohn) 🙂

@Xrayez that bisect must be mistaken. That commit mostly changes shader code. The only CPU-side code it changes runs only when instancing a sky resource.

It will have absolutely no effect on instancing nodes.

@clayjohn It could be that the shaders take more time to compile initially though, especially since this slowdown is only reproduced on the first instantiation.

@akien-mga Potentially, but it still shouldn't affect any 2D nodes. And the changes made to scene.glsl were very minor.

Potentially, but it still shouldn't affect any 2D nodes

The reproduction project does use MeshInstance specifically. Opening up the scene is also slow, but I don't use 3D to tell the difference...

The shader compilation bottleneck has always been an issue indeed; it's perhaps going to become more noticeable in cases like these. Let's hope Vulkan solves this in the future: #13954.

I speed tested 100-200 nodes/resources using .new() (a rough sketch of the kind of timing loop involved is shown after the lists below).

Here are the ones that exhibit this behaviour; they all consistently take 70-80ms on my PC in _process (the expected time is usually 0-1ms):

  • Light2D
  • Particles2D
  • Particles
  • World
  • CubeMap
  • CPUParticles2D
  • CPUParticles
  • Texture3D
  • TextureArray
  • Skeleton2D
  • OccluderPolygon2D
  • CameraFeed
  • ViewportTexture
  • StreamTexture
  • ProxyTexture
  • NoiseTexture
  • ImageTexture
  • GradientTexture
  • CurveTexture
  • AnimatedTexture

Out of the Textures group specifically, here are the textures that don't exhibit this behaviour:

  • CameraTexture
  • AtlasTexture
  • LargeTexture
  • MeshTexture
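For reference, a rough sketch of the kind of timing loop that could produce such a list. This is not the script used in the report; the class list, the use of ClassDB.instance() instead of .new(), and the freeing logic are assumptions:

extends Node2D

# Hypothetical subset of classes to test; extend as needed.
var classes_to_test = ["Light2D", "Particles2D", "CPUParticles", "ImageTexture"]

func _process(delta):
    for cls in classes_to_test:
        if not ClassDB.can_instance(cls):
            continue
        var start = OS.get_ticks_msec()
        var obj = ClassDB.instance(cls)  # equivalent to calling <Class>.new()
        print(cls + ': ' + str(OS.get_ticks_msec() - start) + 'ms')
        # Nodes are not reference-counted, so free them explicitly;
        # Resources are released automatically when they go out of scope.
        if obj is Node:
            obj.free()
    set_process(false)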

By the way, MeshInstance only has this issue when it has a mesh assigned. Here's the same issue without needing to spawn a scene:

func _process(delta):
    var test1 = OS.get_ticks_msec()
    var newMesh = MeshInstance.new()
    newMesh.mesh = CubeMesh.new()
    print('Code time: '+str(OS.get_ticks_msec()-test1)+'ms')
    set_process(false)

MeshInstance.new() and CubeMesh.new() by themselves have no issue.
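Separately from the snippet above, here is a hypothetical control-test sketch of that claim; it presumes the mesh assignment is the step that triggers the delay, which is an inference rather than something measured in the report:

func _process(delta):
    var t1 = OS.get_ticks_msec()
    var node = MeshInstance.new()
    print('MeshInstance.new(): '+str(OS.get_ticks_msec()-t1)+'ms')  # ~0ms on its own

    var t2 = OS.get_ticks_msec()
    var mesh = CubeMesh.new()
    print('CubeMesh.new(): '+str(OS.get_ticks_msec()-t2)+'ms')  # ~0ms on its own

    var t3 = OS.get_ticks_msec()
    node.mesh = mesh
    print('Assigning mesh: '+str(OS.get_ticks_msec()-t3)+'ms')  # presumably the slow step
    node.free()
    set_process(false)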

Here are some speed tests using the MeshInstance with CubeMesh code I just posted above (run 3 times each):

GLES3
Godot_v3.0-stable_win64: 81ms, 75ms, 81ms
Godot_v3.1.1-stable_win64: 86ms, 86ms, 78ms
Godot_v3.1.2-rc1_win64: 81ms, 73ms, 88ms
Godot_v3.2-alpha1_win64: 15ms, 16ms, 15ms
Godot_v3.2-alpha2_win64: 15ms, 15ms, 15ms
Godot_v3.2-alpha3_win64: 15ms, 15ms, 15ms
Godot_v3.2-beta1_win64: 15ms, 16ms, 15ms
Godot_v3.2-beta2_win64: 77ms, 76ms, 82ms

GLES2
Godot_v3.1.1-stable_win64: 2ms, 2ms, 2ms
Godot_v3.2-alpha3_win64: 2ms, 2ms, 2ms
Godot_v3.2-beta1_win64: 2ms, 2ms, 2ms
Godot_v3.2-beta2_win64: 2ms, 2ms, 2ms

I also did the tests with ImageTexture.new() and got the same results.

GLES2 is displaying an issue here as well, with its consistent 2ms, sometimes 3ms. When I do it in _ready() it's 0ms across the board. If it's a shader compiling issue, why is it so fast in _ready()?

A few important things to note. The issue is present in 3.0, 3.1+ and 3.2 beta 2, so it is unlikely that a PR introduced in 3.2 beta 2 is solely responsible.

Second, the fact that it appears only in _process and only in GLES3 makes it seem overwhelmingly likely that it is a shader compile issue.

Shaders are compiled on the first frame that an object is drawn, so when the object is instantiated in _process and drawn for the first time, the shaders have to compile right away. When you instantiate in _ready, the shaders would compile later.
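Based on that explanation, and on the report's observation that uncommenting the _ready() code makes all later tests drop to 0ms, one possible workaround is to "warm up" the expensive types once at startup. A minimal, hypothetical sketch of that idea (the choice of classes and the explicit free() calls are assumptions, not something confirmed in this thread):

extends Node2D

func _ready():
    # Creating the expensive types once up front reportedly makes later
    # instancing in _process() drop to ~0ms (see the reproduction project).
    var warmup_light = Light2D.new()
    var warmup_mesh = MeshInstance.new()
    warmup_mesh.mesh = CubeMesh.new()
    # Nodes are not reference-counted; freeing them right away is an assumption,
    # the original report simply let the locals go out of scope.
    warmup_light.free()
    warmup_mesh.free()

func _process(delta):
    var test = OS.get_ticks_msec()
    var id = Light2D.new()  # should now be cheap
    print('Code time: '+str(OS.get_ticks_msec()-test)+'ms')
    set_process(false)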

I'm way ahead of you, clayjohn.

so there's a possibility that it's not a bug at all and just how the engine works in handling initial instances

lol but seriously, thank you for the answer.

I'm confused; why was this reported as a regression between 3.2 beta 1 and beta 2 then? User error?

I didn't call it a regression and give it high priority; that was your interpretation.

For my computer in beta1 it takes 15ms, in beta2 it's taking 80ms.

This is still true. And weird. So your interpretation of a regression is fine. Don't let your user error bum you out too much. haha.

between 3.2 beta 2 and beta 3

You mean beta 1 and beta 2. Nice user error. lol

FINE I MESSED UP BIG TIME I'M CRYING NOW ALL THOSE SPEEDTESTS FOR NOTHING AAAAAAHHHHH

@rainlizard in either case, we just have to accept the possible slowdown and implement #13954.

When I bisected the issue between 3.2 beta 1 (077b5f6) and 3.2 beta 2 (b7ea22c), the times fell into only two values, 30ms and 180ms, and I interpreted those as "good" and "bad" on my machine.
