Godot version:
3.2.3 custom build (with official octotree fix)
OS/device including version:
5.8.11-1-MANJARO, GLES2
Issue description:
Mesh flickering. Really noticeable on player hands/weapon in fps games.
I was asked to make dedicated issue by https://github.com/godotengine/godot/issues/40059#issuecomment-701650694
Steps to reproduce:
Move around a few hundreds of small objects.
Minimal reproduction project:
For Godot 3.2.x : octobreak.zip
~500 meshes, sticks to the right flicker
I tried this sample scene(i had to add "3D" to Camera and MeshInstance, addding light) in 4.0 nightly build and sticks slowly disappeared.
octobreak_godot40.zip
cc @lawnjelly
@gdesecrate Can you reproduce this in 3.2.3 without the octree fix?
Ah sorry my mistake, didn't see there were 2 reprod projects. Trying to do too many things at a time! :grin:
Yes can confirm flickering, I'll see if I can work out what is causing it.
@Calinou In 3.2.3 without fix (the stable godot) no flickering, just memory leak that leads to terrible performance degradation.
In samozbor id:heaven flickering affects the weapon in player hands (very noticeable)
In Kotel Ne Gori the finall boss sometimes missing face model if you corner him in particular place of map. In previous experimental lawnjelly's fix the boss was missing not the face but lower torso.
I haven't experienced any flickering on weapon when was testing naithar's fix.
Well it's not the cached_lists, and it seems dependent on the project_settings->rendering->quality->spatial_partitioning->render_tree_balance
. It occurs with the default (0.17, which translates to 6 minimum per octant), but not with 1.0 or 0.0. 0.0 being closest to the old behaviour.
In fact set at 0.0, apart from the cached lists, the behaviour I think is identical to the old behaviour, because the minimum per octant before splitting is zero.
It could actually be another pre-existing problem with the tree, but at least it is not to do with the pairing. :grinning:
I'm seeing if I can pin it down to an even simpler scenario for debugging in detail.
@lawnjelly by "old behavior" you mean memory leak?
Or can I fix it by fine-tuning the value in project settings?
Also, does your octo-tree fix affects 3d sounds by any chance?
After 3.2.3+ I got 3d sound randomly stopping (re)playing like on https://youtu.be/WicQ5gm8Xi8?t=392 saw stopped looped sound but in stable 3.2.3 it is perfectly looped. Do I need to hunt it as a different issue?
You can effectively 'fix' it by settings the render_tree_balance
to 0.0. Or possibly 1.0 which may be faster in your game, which relies on brute force and will have most of the elements in only a few octants.
The bug is highly likely to be pre-existing in the octree, but certain conditions more likely in the modified version may be more likely to bring it on. The changes to the octree are actually quite limited:
1) Bug fix for leaked octants
2) Cached lists for faster runtime (no change to logic)
3) Minimum number of elements required in an octant before splitting
This bug seems to have been brought on by (3), which is a small change to make the tree balance more sensible, and shouldn't (in theory) affect the functionality. However there is always the possibility of an unforeseen consequence, something that relied on the previous behaviour, or also very likely a pre-existing bug that occurs only under certain conditions.
Debugging such problems is quite painstaking, it is a case of reproducing the behaviour in as simple a situation as possible, and debug outputting the tree and trying to debug what is going on.
The problem with octree is that the current design is not a very elegant solution to the problem, and is probably over complex. Elements are kept in multiple octants, which is incredibly error prone as far as management is concerned in an octree.
Depending on time I can always simply revert the behaviour to remove the minimum elements per octant, however, that re-exposes another bug, #38142 I think is the one, the overly deep recursion in the octree problem.
An alternative is to simply replace the current octree template with a better (simpler) spatial partitioning scheme. That is what is intended for 4.0 .. we were intending to keep what 'worked' for 3.2, however there are a lot of problems with the current octree.
Replacements could include loose octrees (where elements are not kept in multiple octants) or bounding volume hierarchies which have tended to become a more preferred solution these days. Grids also work, but we need a scalable solution because of the general purpose nature of Godot, we don't know the scene size in advance, for example.
There is also the pairing to deal with, which is a major pain with the current octree. It may be possible to outright replace the innards of this with a simpler more traditional system that still meets the contract of the existing interface.
I'll only know after some further debugging. Although it sounds quite involved, simply writing a more rigorous fresh structure could be less time consuming than working out all the bugs in the old. There are also off the shelf open source trees that could potentially be an option, or used for reference.
On the sound, from a look I don't think the 3d sound uses the octree (90% sure?), but I can't say for absolute definite. All I can do is FindInFiles for the functions, and looking through AudioStreamPlayer3D. If you want to look through the source code for audio_stream_player_3d.cpp you are as likely as me to see a reference to an octree, even if you don't know c++.
The sound may however be using the physics, which could be related to your sound bug. If you are using godot physics (rather than bullet) it does use the octree indirectly, but the previous version, not the modified.
I think I've tracked this down. There is another pre-existing bug in the old octree. It involves an optimized path for moving objects without changing their octants. As I suspected this is a consequence of trying to keep track of elements in multiple octants, which is very error prone.
In OCTREE_FUNC(void)::move:
// it still is enclosed in the same AABB it was assigned to
if (e.container_aabb.encloses(p_aabb)) {
e.aabb = p_aabb;
if (use_pairs)
_element_check_pairs(&e); // must check pairs anyway
return;
}
If this entire path is removed (by e.g. commenting out) forcing it to erase the element from the octree and reinsert for each move, the bug no longer exhibits.
So the bug is not introduced by the octree PR, but having larger octants makes it more likely to exhibit.
I think it is occurring because although the element is moving within the container_aabb (highest parent octant), the actual octants that the element is in are changing, so when it does the cull tests it is not hitting the octants that the elements should now be in.
I'll see if I can work out a fix. It may slow down the octree a bit doing it properly .. or maybe not. Maybe I can do a quick check for the most common path where octants do not change.
I'm still of the opinion that replacing the octree may be the best move, but will wait and have a look at that once this is fixed in 3.2. Maybe I can make an experimental simple BVH and see how it works.
But why in 4.0 the object don't flicker - they just disappear?
But why in 4.0 the object don't flicker - they just disappear?
I don't know much about 4.0 yet, as I'm primarily working on 3.2. I'm assuming it is still using the old unfixed octree (or a slight variation of). As such there will be loads of bugs in 4.0 octree, but it is due to be replaced. In the unlikely event that we don't replace it for 4.0, I'll forward port the 3.2 changes.
For future reference this is the slightly modified version of the min reproduction project I'm using:
octobreak.zip
This is quite a major bug actually, it is a good thing few people were using Godot physics in 3D, this could have caused all kinds of buggy behaviour. Well done for spotting it! :+1:
I am slightly concerned that there may not be a way of fixing this that runs fast. It is quite easy to detect when an element has moved outside its immediate owner octants, just do an AABB check with the new AABB.
The problem comes that the element may enter any new child (or grandchild etc) octant of the common parent. All these AABBs need to be checked (which isn't a problem) however we need to detect whether the element is ALREADY held by octants that pass the AABB check.
There is, at the moment, no fast way of checking whether an element is in an octant. You can do the reverse, lookup the octant from the element, but not the check the other way around. The elements stored in the octant are linked lists, which are very slow to iterate, and this would be need to done on every octant concerned for every element moving on boundaries between octants. Even if these were changed to flat arrays it would be slow.
The problem is possibly less when we have a render_tree_balance
shifted towards 1.0, because there are likely to be less of these problematic 'boundary elements'. However that doesn't help the physics octree or the visibility octree, both of which have the balance (minimum elements) set at 0.0.
There may be some data structure that can deal with this, but if not, this may require the octree being replaced.
So to fix flickering I just need to adjust project setting? Is it doable by gdscript line of code?
For now you can just change the project setting I mentioned above, render_tree_balance
. No gdscript required. This will 'fix' your project.
When I say fix it, you can't see visual problems in your test project. However the problem is not properly fixed, there is a fundamental flaw in the octree, as detailed above, which will need to be fixed.
I've already got a proper fix, but it will drastically affect performance.
If a proper fix cannot be made without being detrimental to performance, the tree structure may need rewriting.
This may not be a massive deal, and the same rewrite could be used in 4.x. I will have to consult with reduz later. We may come up with a way to solve the above problem in a more performant manner, perhaps using bitfields or somesuch.
Thanks! You were right - 1.0 setting works for me.
It not just my test project. The "test project" is modeled after actual game situation.
Would be a pain to record video for new trailer while the weapon and hands are blinking...
Most helpful comment
I think I've tracked this down. There is another pre-existing bug in the old octree. It involves an optimized path for moving objects without changing their octants. As I suspected this is a consequence of trying to keep track of elements in multiple octants, which is very error prone.
In OCTREE_FUNC(void)::move:
If this entire path is removed (by e.g. commenting out) forcing it to erase the element from the octree and reinsert for each move, the bug no longer exhibits.
So the bug is not introduced by the octree PR, but having larger octants makes it more likely to exhibit.
I think it is occurring because although the element is moving within the container_aabb (highest parent octant), the actual octants that the element is in are changing, so when it does the cull tests it is not hitting the octants that the elements should now be in.
I'll see if I can work out a fix. It may slow down the octree a bit doing it properly .. or maybe not. Maybe I can do a quick check for the most common path where octants do not change.
I'm still of the opinion that replacing the octree may be the best move, but will wait and have a look at that once this is fixed in 3.2. Maybe I can make an experimental simple BVH and see how it works.