After porting my code to work with the latest master, buffers use much more VRAM than before. One buffer of length 805306368 (i.e. 768MB) seems to be single-handedly consuming almost 6000MB of VRAM, according to my NVIDIA X Server Settings app. I did not have this problem with the older wgpu-rs version 0.5.0 from crates.io.
I'm running on native Linux/Vulkan with a GeForce RTX 2070. Relevant log section below:
[2020-07-04T21:15:22Z INFO wgpu_core::device] Create buffer BufferDescriptor { label: 0x55d58dd3d080, size: 805306368, usage: VERTEX | STORAGE, mapped_at_creation: false } with ID PhantomData
[2020-07-04T21:15:22Z INFO wgpu_core::device] Created buffer (3, 1, Vulkan) with BufferDescriptor { label: 0x55d58dd3d080, size: 805306368, usage: VERTEX | STORAGE, mapped_at_creation: false }
That's unexpected. Thank you for reporting!
I guess the next step here would be to take a Vulkan API trace/capture and double-check how much memory we allocate.
Is this what you mean? https://gist.github.com/bch29/3f64e9f3051a51ec243d1e28ab7ca576
If not, could you provide some guidance on how to get the data you need?
Thank you, that's the wgpu API trace.
What I'm talking about is the lower-level one - from Vulkan. See https://vulkan.lunarg.com/doc/view/1.2.135.0/windows/trace_tools.html for vkTrace. There is now a newer tool from LunarG as well - https://github.com/LunarG/gfxreconstruct
Finally, taking a capture with https://renderdoc.org/ would also help.
Here's the vkTrace and WGPU trace from a new run. This time the buffers I use should total ~224MB, but the NVIDIA app reports ~1600MB instead. It seems like these buffers are using exactly 8 times as much memory as necessary.
trace.zip
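For reference, here is a quick sanity check of the numbers reported so far, as a minimal Rust sketch. The helper names `bytes_to_mib` and `overhead_ratio` are made up for illustration and are not part of wgpu. It assumes "MB" in the reports means MiB (2^20 bytes) and that the NVIDIA readings are approximate, so the ratios only roughly confirm the estimated 8x.

```rust
/// Bytes per MiB (2^20). Assumption: the "MB" figures in this thread are MiB.
const MIB: u64 = 1024 * 1024;

/// Hypothetical helper: converts a byte count to whole MiB (truncating).
fn bytes_to_mib(bytes: u64) -> u64 {
    bytes / MIB
}

/// Hypothetical helper: ratio of observed VRAM usage to the requested size.
/// Both arguments must use the same unit.
fn overhead_ratio(observed: f64, requested: f64) -> f64 {
    observed / requested
}

fn main() {
    // Buffer size taken from the BufferDescriptor in the log.
    let requested_mib = bytes_to_mib(805_306_368);
    println!("requested: {} MiB", requested_mib);

    // First report: ~6000MB observed for the single 768MB buffer.
    println!(
        "first run overhead: {:.2}x",
        overhead_ratio(6000.0, requested_mib as f64)
    );

    // Second report: ~1600MB observed for ~224MB of buffers.
    println!("second run overhead: {:.2}x", overhead_ratio(1600.0, 224.0));
}
```

Both runs come out between 7x and 8x, which fits the estimate above given how rough the reported readings are.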
@bch29 sorry for not getting back to you. I was trying to set up vkTrace, but it turns out to be quite painful :(
Anyhow, I'm able to reproduce this rather trivially, and I see what the issue is. Working on a fix in gfx-extras.
Great to hear, thank you.
As of the fix, the situation has improved a lot but I am still seeing more VRAM utilization than I expect. Previously it was 8x, now it's 2x.
@bch29 that is correct, now it's 2x. This only happens when you have one large allocation that is a big outlier; it's not a general "wgpu consumes 2x of VRAM". More work will need to follow to make this even better.
Makes sense, thank you for explaining.