Godot: Android GLES3 - Environment is black [regression in 3.2 beta 2]

Created on 25 Nov 2019  Â·  14Comments  Â·  Source: godotengine/godot

Godot version:
3.2 Beta 2
Not in 3.1.1 Release! (Regression)

OS/device including version:
Google Pixel 3a
Android 10 (Latest Beta)

Issue description:
The environment appears as black, other rendering seems to work fine

Android Log: (No errors from what I see)
https://pastebin.com/g2fBW80X

Editor:
image

Phone:
a

Steps to reproduce:
Place a camera in a brand new project and export to Android
The project is GLES3, Doesn't happen in GLES2

Minimal reproduction project:
Test2.zip

bug confirmed high priority regression rendering

Most helpful comment

@akien-mga I've reverted all the GLES3 changes related to buffer management and the problem is still there. So this particular regression is not caused by the glBufferSubData PR. Looking at the code, it is most probably caused by the previous PR entitled "Fix issues with environment mapping" which heavily deals with the environment, the sky and all the symptoms reported here.

@clayjohn Could you please check on that? I am having a bit of a hard time reverting the environment mapping changes to save you some time, sorry about that.

All of which doesn't invalidate my analysis above which implies reverting some bogus glBufferSubData replacements - tbd.

All 14 comments

Is it the same as 3.1.1?

Also, are any errors reported?

Thanks, I've edited my post to include that 3.1.1 Works fine, the bug is not reproducible there (It's a regression)

No errors are reported in 3.2, I will attach a log to OP shortly

I can confirm the issue with Godot 3.2 beta 2 on Xiaomi Pocophone F1: http://opengles.gpuinfo.org/displayreport.php?id=3448

It's a regression in beta 2, the environment works fine in 3.2 beta 1.

GLES3 commits between beta 1 (077b5f6c2c06bb2c0af525ee25f87e0db719f9d2) and beta 2 (b7ea22c5d203da1b592a743a4c893de25cd34408):

commit 203fb1b3484f621984e4595986cc77c9b9174d6a
Author: clayjohn <[email protected]>
Date:   Thu Nov 21 07:44:09 2019 -0800

    Fix GL error by properly using float uniform

commit 4d6737ec730a1f620c20768c1792099683aca4f3
Author: clayjohn <[email protected]>
Date:   Wed Nov 20 22:54:44 2019 -0800

    Fix bugs introduced by IBL fixes

commit 3be6e76f220f702d41ae5efc526064f87059539b
Merge: 1bd32388ae cd40154890
Author: Rémi Verschelde <[email protected]>
Date:   Wed Nov 20 08:45:53 2019 +0100

    Merge pull request #33668 from clayjohn/Fix_environment_mapping_issues

    Fix issues with environment mapping

commit cd40154890ba9791b219d66beaf187a7d8dcdba5
Author: clayjohn <[email protected]>
Date:   Sat Nov 16 16:56:34 2019 -0800

    Fix issues with environment mapping

commit 6536105af26bb3a6a18369b4647ac479816090b7
Merge: cc3b7d2ee2 1253a33423
Author: Rémi Verschelde <[email protected]>
Date:   Tue Nov 19 09:36:30 2019 +0100

    Merge pull request #33527 from clayjohn/GLES2-bufferdata_optimization

    Improve glBufferSubData usage where safe

commit 1253a3342381e15ff95074f31a465a18e6459991
Author: clayjohn <[email protected]>
Date:   Sun Nov 10 10:51:56 2019 -0800

    Improve glBufferSubData usage where safe

commit 4e2343160c6a6aefa5f6422620a6f90de255eeb1
Author: Bastiaan Olij <[email protected]>
Date:   Sun Nov 10 19:30:20 2019 +1100

    Add special external MSAA modes for GLES2 Rift S/Quest and OpenXR optimisation

commit 47389c3a167797ef6e5756937e9d888f6a7719b7
Author: Rémi Verschelde <[email protected]>
Date:   Thu Nov 7 15:37:18 2019 +0100

    Partial revert of #32657, undoing line shifting by 0.5

    As discussed in #32657, this can't be done here as lines can be used
    with a canvas scale, and this breaks them.
    A suggestion is to do the pixel shifting at matrix level instead.

    Fixes #33393.
    Fixes #33421.

@clayjohn @oeleo1 The regression is caused by 1253a334 / #33527.

Edit: See below, looks like my bisect was mistaken and the regression is from #33668.

I'll fix it tonight. I may just revert the 3D pipeline related changes. The initial bug reported was related to 2D issues anyway.

Thanks for reporting this promptly! And mea culpa -- what we did looks innocent and safe but is a very hard core change, and we obviously messed up despite the vigilance.

A quick fix is to revert everything. But that's a quickie. We have to understand where we sinned and selectively revert the 11 locations where we replaced glBufferSubData with glBufferData to identify the problem and assess the buffer usage wrt the rendering cycle. Somewhere we are allocating a new buffer where we should be operating on the same buffer.

I owe you spending time investigating this and fix it properly. Bound to learn the hard way.

@clayjohn I think I figured out what's wrong even without looking at the code... If you go back to my original comment on the rationale for safely replacing glBufferSubData with glBufferData in the isolated code patterns, there is a flaw in the reasoning. It says bind/upload data/_[eventually draw]_/unbind. This is wrong and should read _[must draw]_. Issuing a draw command signals that we are done with the buffer after writing to it from the start at offset 0. Not drawing means the buffer can be used or be overwritten elsewhere so it is __not__ safe to use glBufferData in these cases. Please revert those instances first as primary candidates for the fix.

The same logic applies to the buffer orphaning but I think the glBufferSubData sequences all end up with a draw command after writing from the start of the buffer so these should be safe. Yet, I will now scrutinize and double check each and every change again then retest.

The same logic applies to the buffer orphaning …

Note that it works (for few ms), if you e.g. change the Top Color via the remote debugger:

image

si cambias el MODE : SKY por el MODE : CLEAR COLOR funciona por ahora es lo que funciona :c

@akien-mga I've reverted all the GLES3 changes related to buffer management and the problem is still there. So this particular regression is not caused by the glBufferSubData PR. Looking at the code, it is most probably caused by the previous PR entitled "Fix issues with environment mapping" which heavily deals with the environment, the sky and all the symptoms reported here.

@clayjohn Could you please check on that? I am having a bit of a hard time reverting the environment mapping changes to save you some time, sorry about that.

All of which doesn't invalidate my analysis above which implies reverting some bogus glBufferSubData replacements - tbd.

Here is a patch relative to today's master which reverts 7 out of the 11 replacements of glBufferSubData calls which _may_ be unsafe. But better safe than sorry. They are all in GLES3.

@clayjohn This contains again the bug fix in _gui_draw_primitive(), where the size parameter is not multiplied by sizeof(float). Note that the exact same code in GLES2 is correct and does contain the sizeof(float). The reasoning is simple: we copy a buffer of (2 + 2 + 4) * 4 floats, where 2 + 2 + 4 is the max size of the stride. So we shall copy stride * 4 * sizeof(float) bytes.

This doesn't fix the issue raised here but is a step forward wrt. reducing the risk of the buffer changes. It also may not be the final version at it needs more testing on mobile, but I'd like to see [#33668] (https://github.com/godotengine/godot/pull/33668) reverted/fixed first, which is a little too much for me to do at this point. I'll resume testing when I can get my hands on such a branch.

Let me know if I can be of any further assistance.
glBufferSubData_reverted.patch.txt

@oeleo1 I'll look into it as soon as I can.

@akien-mga @oeleo1 I can confirm that reverting 1253a33 / #33527. does not fix the issue

reverting 203fb1b3484f621984e4595986cc77c9b9174d6a then 4d6737ec730a1f620c20768c1792099683aca4f3 then cd40154890ba9791b219d66beaf187a7d8dcdba5 makes the problem go away. So it is definitely my fault. I will figure out the reason ASAP.

@clayjohn Deal :) I can only add that in the meantime I tested the sizeof(float) mod. My rant about it is void because with the sizeof(float) GLES3 crashes my iPhones/Pads at the first drawLine or similar, and without it is doesn't. Which hits that the bug is probably the other way around since the same code is in GLES2 and contains the sizeof(float).

That said, please don't revert any buffer changes for now until you sort it out on your side. We'll take it from there.

@clayjohn @akien-mga Thanks. 1/ Re: buffers, we wont revert anything unless there is evidence we need to revert. I tried to challenge myself intellectually and almost succeeded, but the point is that we replaced glBufferSubData because it always writes to the start of the buffer at index 0. Whether this happens on an old buffer or a newly allocated one with glBufferData is irrelevant for any followup code that may use the buffer to append data or draw. 2/ Re: sizeof(float) the crashes I get on iOS are due to something else, so I believe this is still relevant for GLES3 vs GLES2.

Was this page helpful?
0 / 5 - 0 ratings