Wgpu-rs: Example `shadow` panics on window resize [Windows, DX12]

Created on 6 Jun 2020 · 21 Comments · Source: gfx-rs/wgpu-rs

I'm on the latest commit d06470e, running the shadow example on Windows 10. I believe I'm using DX12.

When I resize the window I get the following panic trace:

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `Extent { width: 1024, height: 769, depth: 1 }`,
 right: `Extent { width: 1024, height: 768, depth: 1 }`: Extent state must match extent from view', C:\Users\Krooq\main\tools\rust\rustup\toolchains\nightly-x86_64-pc-windows-gnu\lib/rustlib/src/rust\src\libstd\macros.rs:16:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2020-06-06T12:44:33Z ERROR gfx_memory::heaps] Heaps still have 8 types live on drop
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(11206656) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(3276800) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(3801088) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(256) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(2048) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(16384) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(131072) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::general] Memory leak: SizeEntry(1024) is still used
[2020-06-06T12:44:33Z ERROR gfx_memory::allocator::linear] Not all allocation from LinearAllocator was freed
[2020-06-06T12:44:33Z ERROR gfx_descriptor::allocator] DescriptorAllocator is dropped
error: process didn't exit successfully: `target\debug\examples\shadow.exe` (exit code: 101)

When I run with a full backtrace, it points at the drop of RenderPass on line 814.

bug urgent

All 21 comments

As reported in #403, same problem is on Vulkan.

Is this a problem with the examples themselves or a problem with wgpu-rs? I see the same thing in my app and would like to know if there is anything I can do to fix it.

I'm not entirely sure what I do differently, but my app has never had resizing issues even when the examples did.

This is some combination of the example framework, winit, and the gfx backends. The framework thinks the window has one size, while the backend thinks the associated surface has a different size. It could be caused by a lag between the window resize and the associated surface being updated.

So unfortunately, while recreating the swapchain appears to fix the instant panic, i.e.:

diff --git a/examples/framework.rs b/examples/framework.rs
index f4339599..f66dc24b 100644
--- a/examples/framework.rs
+++ b/examples/framework.rs
@@ -240,6 +240,7 @@ fn start<E: Example>(
                 sc_desc.width = size.width;
                 sc_desc.height = size.height;
                 example.resize(&sc_desc, &device, &queue);
+                swap_chain = device.create_swap_chain(&surface, &sc_desc);
             }
             event::Event::WindowEvent { event, .. } => match event {
                 WindowEvent::KeyboardInput {

playing around more with it, slow resizes can trigger validation errors, and then subsequent panics:

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `54`,
 right: `53`', <::std::macros::panic macros>:5:6
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/libunwind.rs:86
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:78
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1063
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:204
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:224
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:470
  11: rust_begin_unwind
             at src/libstd/panicking.rs:378
  12: std::panicking::begin_panic_fmt
             at src/libstd/panicking.rs:332
  13: wgpu_core::track::ResourceTracker<S>::remove_abandoned
             at ./<::std::macros::panic macros>:5
  14: wgpu_core::device::life::LifetimeTracker<B>::triage_suspected
             at /home/m4b/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/43c67ac/wgpu-core/src/device/life.rs:403
  15: wgpu_core::device::Device<B>::maintain
             at /home/m4b/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/43c67ac/wgpu-core/src/device/mod.rs:303
  16: wgpu_core::device::queue::<impl wgpu_core::hub::Global<G>>::queue_submit
             at /home/m4b/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/43c67ac/wgpu-core/src/device/queue.rs:536
  17: wgpu::backend::direct::<impl wgpu::Context for wgpu_core::hub::Global<wgpu_core::hub::IdentityManagerFactory>>::queue_submit
             at ./src/backend/direct.rs:19
  18: wgpu::Queue::submit
             at ./src/lib.rs:2250
  19: <shadow::Example as shadow::framework::Example>::render
             at examples/shadow/main.rs:813
  20: shadow::framework::start::{{closure}}
             at examples/shadow/../framework.rs:275
  21: winit::platform_impl::platform::sticky_exit_callback
             at /home/m4b/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.22.1/src/platform_impl/linux/mod.rs:698
  22: winit::platform_impl::platform::x11::EventLoop<T>::run_return
             at /home/m4b/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.22.1/src/platform_impl/linux/x11/mod.rs:312
  23: winit::platform_impl::platform::x11::EventLoop<T>::run
             at /home/m4b/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.22.1/src/platform_impl/linux/x11/mod.rs:390
  24: winit::platform_impl::platform::EventLoop<T>::run
             at /home/m4b/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.22.1/src/platform_impl/linux/mod.rs:645
  25: winit::event_loop::EventLoop<T>::run
             at /home/m4b/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.22.1/src/event_loop.rs:149
  26: shadow::framework::start
             at examples/shadow/../framework.rs:208
  27: shadow::framework::run
             at examples/shadow/../framework.rs:285
  28: shadow::main
             at examples/shadow/main.rs:818
  29: std::rt::lang_start::{{closure}}
             at /rustc/8d69840ab92ea7f4d323420088dd8c9775f180cd/src/libstd/rt.rs:67
  30: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  31: std::panicking::try::do_call
             at src/libstd/panicking.rs:303
  32: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:86
  33: std::panicking::try
             at src/libstd/panicking.rs:281
  34: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  35: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  36: std::rt::lang_start
             at /rustc/8d69840ab92ea7f4d323420088dd8c9775f180cd/src/libstd/rt.rs:67
  37: main
  38: __libc_start_main
  39: _start

with validation errors like:

[2020-07-10T07:22:24Z ERROR gfx_backend_vulkan] 
    VALIDATION [VUID-VkSwapchainCreateInfoKHR-imageExtent-01274 (2094043421)] : Validation Error: [ VUID-VkSwapchainCreateInfoKHR-imageExtent-01274 ] Object 0: handle = 0x5603ba21ed80, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x7cd0911d | vkCreateSwapchainKHR() called with imageExtent = (873,579), which is outside the bounds returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR(): currentExtent = (873,607), minImageExtent = (873,607), maxImageExtent = (873,607). The Vulkan spec states: imageExtent must be between minImageExtent and maxImageExtent, inclusive, where minImageExtent and maxImageExtent are members of the VkSurfaceCapabilitiesKHR structure returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR for the surface (https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VUID-VkSwapchainCreateInfoKHR-imageExtent-01274)
    object info: (type: DEVICE, hndl: 94574007676288)

Interestingly, I see this identical validation error if I try to set the window's inner size in the builder (on x11), e.g.:

diff --git a/examples/framework.rs b/examples/framework.rs
index f4339599..989076ea 100644
--- a/examples/framework.rs
+++ b/examples/framework.rs
@@ -68,7 +68,9 @@ struct Setup {
 async fn setup<E: Example>(title: &str) -> Setup {
     let event_loop = EventLoop::new();
     let mut builder = winit::window::WindowBuilder::new();
-    builder = builder.with_title(title);
+    builder = builder
+        .with_title(title)
+        .with_inner_size(winit::dpi::PhysicalSize::new(1100, 700));
     #[cfg(windows_OFF)] // TODO
     {
         use winit::platform::windows::WindowBuilderExtWindows;

I wonder if they're related? Unfortunately this does now appear to be some unusual race condition between resizing, winit and wgpu/gfx.

Unfortunately this does now appear to be some unusual race condition with resizing, winit and wgpu/gfx

Yeah, that's what I thought was happening here. We probably need to dive into winit and see how the inner size gets propagated.

This is very telling to me. On x11 I'm more convinced there's some race condition between the size winit reports on resize, when the event arrives, and what vkGetPhysicalDeviceSurfaceCapabilitiesKHR returns.

So I added these logs:

@@ -203,6 +205,9 @@ fn start<E: Example>(
     let mut last_update_inst = Instant::now();

     log::info!("Entering render loop...");
+    let mut resize = 0;
+    let mut rs = vec![];
+
     event_loop.run(move |event, _, control_flow| {
         let _ = (&instance, &adapter); // force ownership by the closure
         *control_flow = if cfg!(feature = "metal-auto-capture") {
@@ -219,6 +224,22 @@ fn start<E: Example>(
         };
         match event {
             event::Event::MainEventsCleared => {
+                if resize > 0 {
+                    let size = window.inner_size();
+                    log::error!("Last {}x{}", sc_desc.width, sc_desc.height);
+                    log::error!("Count: {}", resize);
+                    log::error!("Resizes: {:#?}", rs);
+                    log::error!("{:?}", size);
+                    resize = 0;
+                    rs.clear();
+
+                    log::info!("Resizing to {:?}", size);
+                    sc_desc.width = size.width;
+                    sc_desc.height = size.height;
+                    example.resize(&sc_desc, &device, &queue);
+                    swap_chain = device.create_swap_chain(&surface, &sc_desc);
+                    log::error!("DONE SWAPCHAIN");
+                }

Now at some point during resize, we can see the validation layer output an error, which is, as expected, when we recreate the swapchain (it's after the first logs, before the DONE SWAPCHAIN log):

[2020-07-10T17:27:14Z ERROR shadow::framework] Last 808x573
[2020-07-10T17:27:14Z ERROR shadow::framework] Count: 1
[2020-07-10T17:27:14Z ERROR shadow::framework] Resizes: [
        PhysicalSize {
            width: 808,
            height: 591,
        },
    ]
[2020-07-10T17:27:14Z ERROR shadow::framework] PhysicalSize { width: 808, height: 591 }
[2020-07-10T17:27:14Z ERROR gfx_backend_vulkan] 
    VALIDATION [VUID-VkSwapchainCreateInfoKHR-imageExtent-01274 (2094043421)] : Validation Error: [ VUID-VkSwapchainCreateInfoKHR-imageExtent-01274 ] Object 0: handle = 0x55bf7a96dcf0, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x7cd0911d | vkCreateSwapchainKHR() called with imageExtent = (808,591), which is outside the bounds returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR(): currentExtent = (808,596), minImageExtent = (808,596), maxImageExtent = (808,596). The Vulkan spec states: imageExtent must be between minImageExtent and maxImageExtent, inclusive, where minImageExtent and maxImageExtent are members of the VkSurfaceCapabilitiesKHR structure returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR for the surface (https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VUID-VkSwapchainCreateInfoKHR-imageExtent-01274)
    object info: (type: DEVICE, hndl: 94280883821808)

[2020-07-10T17:27:14Z ERROR shadow::framework] DONE SWAPCHAIN
[2020-07-10T17:27:14Z ERROR shadow::framework] Last 808x591
[2020-07-10T17:27:14Z ERROR shadow::framework] Count: 1
[2020-07-10T17:27:14Z ERROR shadow::framework] Resizes: [
        PhysicalSize {
            width: 808,
            height: 596,
        },
    ]
[2020-07-10T17:27:14Z ERROR shadow::framework] PhysicalSize { width: 808, height: 596 }

Notice the size it wants to set it to is our resize value winit gave us: vkCreateSwapchainKHR() called with imageExtent = (808,591).

However, very interestingly, notice that the size it expects, vkGetPhysicalDeviceSurfaceCapabilitiesKHR(): currentExtent = (808,596), minImageExtent = (808,596), maxImageExtent = (808,596), is identical to the NEXT winit resize value being reported!

[2020-07-10T17:27:14Z ERROR shadow::framework] Resizes: [
        PhysicalSize {
            width: 808,
            height: 596,
        },
    ]

If I had to guess, that resize in the next loop should have been given to us in the previous loop?

recreating the swapchain appears to fix the instant panic

FWIW, this works for me, too (Wayland, GNOME 3, Intel UHD Graphics).

@m4b winit has a queue of events, right? So what if the window gets a resize, Resize event gets generated, then MainEventsCleared gets generated, then window gets resized and another Resize event gets generated, and then we process the first event? Effective window size will be different from the resize parameters.

What if, instead of resizing the swapchain to the resize parameters, we ask Window for its size and use that?
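A minimal sketch of that idea, assuming a reduced, hypothetical `Event` enum and a `ResizeTracker` helper that are not part of winit or wgpu: the size carried by the resize event is ignored, and the window is asked for its size only once the event queue has drained. In the real framework the closure passed in would call `window.inner_size()`.

```rust
/// Window size in physical pixels (hypothetical stand-in for winit's `PhysicalSize<u32>`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Size {
    pub width: u32,
    pub height: u32,
}

/// Hypothetical, reduced event type; the real loop receives winit events.
pub enum Event {
    Resized(Size),
    MainEventsCleared,
}

/// Remembers whether any resize arrived since the queue was last drained.
pub struct ResizeTracker {
    pending: bool,
}

impl ResizeTracker {
    pub fn new() -> Self {
        Self { pending: false }
    }

    /// Feed one event. `window_size` returns the window's size right now.
    /// Returns `Some(size)` when the swap chain should be recreated.
    pub fn on_event(&mut self, event: &Event, window_size: impl Fn() -> Size) -> Option<Size> {
        match event {
            // The payload may already be stale, so only note that a resize happened.
            Event::Resized(_) => {
                self.pending = true;
                None
            }
            // Queue drained: query the live size once, however many resizes arrived.
            Event::MainEventsCleared if self.pending => {
                self.pending = false;
                Some(window_size())
            }
            Event::MainEventsCleared => None,
        }
    }
}
```

This only narrows the window for a mismatch. The size can still change between the query and the swap chain creation.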

So yeah, those were my thoughts too; if you notice, in the patch above I actually use window.inner_size() for the size I set the swapchain to, and only print the sizes that the resize event gave.

I don't see another way to read the window size; perhaps you had something else in mind?

Oh bummer, it sounds like we need to ask winit folks again

I changed my swapchain rebuilding code to use window.inner_size() instead of the event size and that solves it for me.

@bch29 what system are you on/testing this? E.g., the patch above does this, and yes, it makes it better (I see fewer validation errors per resize), but you can still trigger what I pasted above (at least on x11). For me, if I make a slow resize, dragging slowly, then very fast in the opposite direction, then slow again, I'll see validation errors occasionally. One can also see artifacts in the framebuffer, since it won't occupy the resized space, but this only lasts for a fraction of a second.

I'm on X11/Gnome 3. My main concern was the panicking but you are right, I can still get it to produce validation errors.

Ah, I know what's going on; it's not winit's fault.

We are currently blindly using the window size winit provides us which may at any time become outdated compared to the values provided in VkSurfaceCapabilitiesKHR.

Instead our logic for creating the swapchain should look something like:

get VkSurfaceCapabilitiesKHR
if minImageExtent <= user provided extent <= maxImageExtent
    use user provided extent
else if currentExtent != (0xFFFFFFFF, 0xFFFFFFFF)
    use currentExtent
else
    use minImageExtent
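The pseudocode above could be sketched in Rust roughly as follows. The `Extent2D` and `SurfaceCaps` types and the `choose_extent` name are illustrative assumptions, not gfx-hal or wgpu API; `SurfaceCaps` only mirrors the three `VkSurfaceCapabilitiesKHR` fields the logic reads.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Extent2D {
    pub width: u32,
    pub height: u32,
}

/// Hypothetical mirror of the `VkSurfaceCapabilitiesKHR` fields used here.
pub struct SurfaceCaps {
    pub current_extent: Extent2D,
    pub min_image_extent: Extent2D,
    pub max_image_extent: Extent2D,
}

/// Vulkan reports (0xFFFFFFFF, 0xFFFFFFFF) when the surface size is
/// determined by the swapchain rather than by the surface.
pub const UNDEFINED_EXTENT: Extent2D = Extent2D {
    width: u32::MAX,
    height: u32::MAX,
};

/// Picks the swapchain extent from the caller's request and the surface caps.
pub fn choose_extent(caps: &SurfaceCaps, requested: Extent2D) -> Extent2D {
    let (min, max) = (caps.min_image_extent, caps.max_image_extent);
    let in_bounds = (min.width..=max.width).contains(&requested.width)
        && (min.height..=max.height).contains(&requested.height);

    if in_bounds {
        requested
    } else if caps.current_extent != UNDEFINED_EXTENT {
        caps.current_extent
    } else {
        min
    }
}
```

With the values from the validation error earlier in the thread (request 808x591, all three surface extents 808x596), this returns 808x596.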

Then the user should recreate their resolution-dependent structures when swapchain.get_current_frame() returns a different resolution, as there is no guarantee that it will have the user-requested resolution. (When compiling for web the requested resolution is just thrown away anyway, as it's not part of the WebGPU spec.)

There's still a race condition between getting the VkSurfaceCapabilitiesKHR and creating the swapchain, which is weird, but I don't see a way around that; at least it's better than before.

If you assign this to me I'll have a go at implementing the fix, but I may not be able to get to it until next weekend.

@rukai right, that is what we figured out so far. WSIs are async, and there is simply no way we can guarantee that we are creating the proper size of a swapchain.

As to your proposal, I don't think it's going to work. The minImageExtent and maxImageExtent can change too with the window asynchronously, so there is no way for us to guarantee that we'll fit in there.

I believe we can have a simpler solution. If we detect the min/max extent violation, just issue a log::warn and continue instead of panicking. This will work in practice, and we expect the Vulkan validation layers to eventually stop reporting this as an error as well (cc @Ralith).
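A rough sketch of the warn-and-continue approach, assuming hypothetical helper names (`extent_warning`, `check_extent`) that do not come from wgpu-core. It prints with `eprintln!` to stay dependency-free; the comment above suggests `log::warn` for the real code.

```rust
/// Returns a warning message when `requested` lies outside `[min, max]`,
/// or `None` when it fits. Extents are `(width, height)` pairs.
pub fn extent_warning(
    requested: (u32, u32),
    min: (u32, u32),
    max: (u32, u32),
) -> Option<String> {
    let fits = (min.0..=max.0).contains(&requested.0)
        && (min.1..=max.1).contains(&requested.1);
    if fits {
        None
    } else {
        Some(format!(
            "requested swap chain extent {:?} is outside the surface bounds {:?}..={:?}",
            requested, min, max
        ))
    }
}

/// Reports a mismatch and carries on instead of asserting.
/// Returns `true` when the extent was within bounds.
pub fn check_extent(requested: (u32, u32), min: (u32, u32), max: (u32, u32)) -> bool {
    match extent_warning(requested, min, max) {
        Some(msg) => {
            // Stand-in for `log::warn!`; execution continues either way.
            eprintln!("warning: {}", msg);
            false
        }
        None => true,
    }
}
```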

It sounds like you know that the vulkan spec will change to allow using dimensions outside of minImageExtent/maxImageExtent? Do you have a link or anything? That would certainly remove any possibility of a race condition.

To be clear: I was suggesting getting VkSurfaceCapabilitiesKHR just before creating the swapchain, so it definitely provides a better "way for us to guarantee that we'll fit in there" than what we have currently.

There is the discussion - https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/1340
The last answer from LunarG stated that the check will be removed.

I was suggesting getting VKSurfaceCapabilities just before creating the swapchain

It's still not guaranteeing anything since, again, WSI is asynchronous. By the time you get back the answer, and before your request to create a swapchain reaches the WSI, the extent may have already changed (during resize).

I think I'm currently hitting this problem and it makes me wonder whether anything has been done towards what @kvark suggested doing. I don't really see a workaround here from the application side of things which is a bit annoying to be honest.

What I suggested (changing an assert to a warning) was implemented, you must be facing something related but not exactly this issue. Also, looks like you found a solution on your side - https://github.com/parasyte/pixels/pull/122

Yes, indeed. I basically ported the same solution and it does indeed work. Probably we should just close this issue here then?
