What the game (Pokemon AS) does:
Pokemon will try to render outlines by rendering the scene normally (to color and depth buffer) and writing a different stencil value per object. The depth/stencil buffer is located at 0x18419800 [256x512], format: D24S8.
Once the scene has been fully rendered there will be a fullscreen quad which is using the previously rendered depth/stencil buffer as texture 0: 0x18419800 [256x512], format: RGBA8 (actually ABGR8?).
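To make the aliasing concrete, here is a minimal C++ sketch of how the same four bytes could be read either as a D24S8 texel or as an RGBA8 texel. The byte order is an assumption that follows the guess above (depth in the three low bytes, stencil in the last byte, color stored as A, B, G, R in memory). The type and function names are hypothetical and are not taken from Citra.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only.
struct DepthStencil {
    std::uint32_t depth;   // 24-bit depth value
    std::uint8_t stencil;  // 8-bit stencil value
};

struct Color {
    std::uint8_t r, g, b, a;
};

using Texel = std::array<std::uint8_t, 4>;

// ASSUMPTION: depth occupies bytes 0..2 (little endian), stencil is byte 3.
inline DepthStencil ReadAsD24S8(const Texel& bytes) {
    const std::uint32_t depth = static_cast<std::uint32_t>(bytes[0]) |
                                (static_cast<std::uint32_t>(bytes[1]) << 8) |
                                (static_cast<std::uint32_t>(bytes[2]) << 16);
    assert(depth <= 0xFFFFFFu);  // three bytes always fit in 24 bits
    return {depth, bytes[3]};
}

// ASSUMPTION: the "RGBA8" texel is laid out as A, B, G, R in memory,
// so the red channel is read from byte 3.
inline Color ReadAsAbgr8(const Texel& bytes) {
    return {bytes[3], bytes[2], bytes[1], bytes[0]};
}
```

Under these assumptions, the red channel of the color view and the stencil value of the depth/stencil view come from the same byte, which would explain why the game samples red.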
It appears that the game then subtracts the red color value of each pixel from the red color value of a nearby pixel (I believe the 8 red color value bits would map to the stencil 8 bits in the existing framebuffer). If the difference is high enough the pixel is colored as part of the outline.
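The comparison described above can be sketched on the CPU as follows. This is only an illustration: the choice of neighbors (right and lower), the threshold, and the function name `MarkOutline` are assumptions, since the game's actual shader has not been decoded here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Marks a pixel as outline (1) when its stencil value differs from the right
// or lower neighbor by at least `threshold`.
// ASSUMPTION: neighbor choice and threshold are illustrative only.
std::vector<std::uint8_t> MarkOutline(const std::vector<std::uint8_t>& stencil,
                                      int width, int height, int threshold) {
    assert(width > 0 && height > 0);
    assert(stencil.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    std::vector<std::uint8_t> mask(stencil.size(), 0);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            const int here = stencil[i];
            bool edge = false;
            if (x + 1 < width) {
                edge = std::abs(here - stencil[i + 1]) >= threshold;
            }
            if (!edge && y + 1 < height) {
                edge = std::abs(here - stencil[i + width]) >= threshold;
            }
            mask[i] = edge ? 1 : 0;
        }
    }
    return mask;
}
```

For a single row of stencil values `{1, 1, 2, 2}` and a threshold of 1, only the second pixel is marked, because that is where two objects with different stencil values meet.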
The issue:
The problem with the current texture forwarding is that the D24S8 surface in the cache is not considered when the game looks up an RGBA8 texture at the same address. As a result, the surface is flushed to 3DS memory and re-uploaded in a different format shortly after.
Note that the texture downloading / uploading is very slow. Additionally, flushing the surface into 3DS memory reduces its resolution to the native resolution, which is what actually results in the pixelated graphics.
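The decision that triggers the slow path can be pictured with a heavily simplified sketch. Everything here is hypothetical (the types, `LookupTexture`, and the `check_format` switch); the real cache matches on address ranges and more parameters, so treat this only as a model of the format check.

```cpp
#include <cstdint>
#include <vector>

enum class PixelFormat { RGBA8, D24S8 };

// Hypothetical, stripped-down cache entry.
struct CachedSurface {
    std::uint32_t addr;
    PixelFormat format;
};

enum class LookupResult {
    Reuse,             // cached GL texture can be sampled directly
    FlushAndReupload,  // same address, different format: the slow path
    Upload,            // nothing cached at this address
};

// ASSUMPTION: exact address match only; `check_format` is an illustrative
// switch for experimenting with skipping the format comparison.
LookupResult LookupTexture(const std::vector<CachedSurface>& cache,
                           std::uint32_t addr, PixelFormat wanted,
                           bool check_format = true) {
    for (const CachedSurface& surface : cache) {
        if (surface.addr != addr) {
            continue;
        }
        if (check_format && surface.format != wanted) {
            return LookupResult::FlushAndReupload;
        }
        return LookupResult::Reuse;
    }
    return LookupResult::Upload;
}
```

With a D24S8 entry cached at 0x18419800, a lookup for RGBA8 at the same address returns `FlushAndReupload`, which is the case this issue is about.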
Solutions:
I've tested this in the truck scene at the start of Pokemon AS which usually renders at around 30ms on my Intel HD 4000.
I've then tried a couple of solutions to solve this issue:
The first quick test was to disable the format check. This avoids the texture flush & re-upload.
The D24S8 cache entry is then passed to GL as an RGBA8 texture. The red lookup ends up reading the normalized depth value, which means no outlines are rendered. However, because the flush / texture re-upload is skipped, the frametime drops drastically to 7ms (!) in the truck scene.
The second test was to download the surface into CPU memory (glGetTexImage to std::vector) in the cached resolution, then re-uploading a second cache entry for the same address in the better resolution. Additionally the texture was still flushed to system memory too. This caused frame times of around 130ms (!) but fixed all visual errors.
To speed up the fix, I've tried downloading into a PBO and re-uploading to the texture from that. Additionally I've disabled flushing the cache for testing. This still only gave me around 100ms frametime.
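For readers who think in frames per second rather than frame times, the conversion is a one-liner. The helper name below is hypothetical.

```cpp
// Converts a frame time in milliseconds to frames per second.
constexpr double FramesPerSecond(double frame_time_ms) {
    return 1000.0 / frame_time_ms;
}
```

Applied to the numbers above: 30ms is roughly 33 FPS (baseline), 7ms is roughly 143 FPS (format check disabled), 130ms is roughly 7.7 FPS (glGetTexImage round trip) and 100ms is 10 FPS (PBO).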
Ideally we'd find something like solution 1 to avoid all texture downloads / uploads. Additionally, texture forwarding doesn't seem to be able to handle 2 cache entries for the same memory region correctly (a separate issue should be created?!), so temporarily interpreting data differently or converting back and forth between formats should be an option.
As the PBO is too slow we'll probably need a different way to convert the data.
I considered GL texture views, but they don't allow re-interpreting GL_DEPTH_STENCIL textures as RGBA textures & are not available in GL3.3.
stencil_texturing is also not in GL3.3 so we can't easily access the stencil buffer for a conversion in shaders either.
Maybe someone else has a better idea? ( @tfarley ? )
(Note that Vulkan seems to allow reinterpreting data differently, so that might be a solution in the future.)
Screenshots for Pokemon OR/AS: http://imgur.com/a/c8ENP
You can find my repo where I tested stuff here. Note that the AccelerateFill hack is not necessary anymore once #2218 has been merged.
The current revision at the time of writing is d3b7f57b2b45752084eb31dab8eeaa644e964935
Hello, not sure if this is the right place to ask this, but
would you mind sharing the exact code you described in your first attempt to solve the problem, or is there any place where I can find it?
I don't have the fastest computer and I wouldn't mind some missing outlines and a slightly altered/worse look in exchange for a higher framerate... I'm still a programming beginner and even though you described what you did, I couldn't implement it by myself.
I would fully understand if you won't do it, after all it's additional work for you and more of a dirty hotfix for a single individual than the ideal solution you seek... but I thought it's at least worth trying to ask.
I really appreciate the work of everyone that's contributing to this project
@JayFoxRox: I'd say leave in method 3 for now, and see if any more optimizations can be done. Besides, what might be a 100 ms frame time for you could be a 33 ms or less frame time for someone with a better GPU. I think we could just call Pokemon a high-intensity game for now (which it is) and continue investigating new ways to make texture forwarding more correct but yet faster.
@NaveTK if people on gbatemp, 4chan etc. were more ethical and not stupid idiots who take any code and run with it, I'd have released it. But unfortunately I know that showing the code would only lead to "Pokemon 1337"-builds. As it's a dirty hack it's not acceptable imo. It would draw Pokemon players away who might become devs of Citra otherwise (they'd just play the version which already plays Pokemon fine, not caring about this repo).
@MoochMcGee you do realize that this would drop the framerate from around 30 FPS to around 10 FPS for a minor graphic improvement? I think such a fix is not acceptable. It would be the bottleneck in all of Citra.
@JayFoxRox okay, I just found this repo today and have no clue about the overall situation, if that's the case then that's understandable... I'll just try to figure it out by myself then, but thanks a lot for replying :)
@JayFoxRox while we search for alternatives that target GL3.3, is it an option to enable stencil_texturing if GL>4.2 is present?
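A capability check along these lines could look like the sketch below. It relies on stencil texturing being core in OpenGL 4.3 and otherwise exposed through the `GL_ARB_stencil_texturing` extension; the function name and the way the version and extension list are passed in are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true when the stencil component of a depth/stencil texture can be
// sampled in a shader: either GL >= 4.3 (core) or the ARB extension.
// ASSUMPTION: the caller has already queried the version and extension list.
bool CanSampleStencil(int major, int minor,
                      const std::vector<std::string>& extensions) {
    const bool core = major > 4 || (major == 4 && minor >= 3);
    const bool ext = std::find(extensions.begin(), extensions.end(),
                               "GL_ARB_stencil_texturing") != extensions.end();
    return core || ext;
}
```

In a real renderer the inputs would typically come from `glGetIntegerv` with `GL_MAJOR_VERSION` / `GL_MINOR_VERSION` and from `glGetStringi(GL_EXTENSIONS, i)`, with the GL3.3 path kept as the fallback.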
Fixed by #3281