Wgpu: DirectX12 backend exhibits different behaviour than other backends; scene will not render on some machines

Created on 22 Oct 2020 · 14 Comments · Source: gfx-rs/wgpu

Description
My application is simply supposed to draw a grid of lines on the screen. On some Windows 10 machines using the DirectX12 backend, when the iced-wgpu backend is initialized before my custom rendering, the grid disappears. This does not happen in DirectX11, or Vulkan on Linux. On some machines the DirectX12/iced-wgpu combo does work and the grid displays; luckily it seems to be reproducible in a Windows 10 VM.

Repro steps
See my repository here

Platform
Windows 10 build 19042.572
DirectX12

upstream

All 14 comments

Thank you for a great issue! What real vendors/models suffer from this issue? I tested on AMD Ryzen 3500U so far, unable to reproduce.

Same for Intel Iris 550. Going to run on VM now...

My GTX 1080 doesn't seem to suffer the issue, I'll check with the other people I had test my app

Edit: Looks like it exhibited the issue on someone else's 1080, and they were also on Win10

So far, it looks like the problem occurs with WARP device only (software-ish implementation of D3D12 that's a part of the SDK). I found https://github.com/gfx-rs/gfx/issues/3432 while looking at the memory in RenderDoc that is all zeroes, but fixing this doesn't address the bug.

I narrowed it down a bit with this code:

if use_iced {
    //let _renderer = Renderer::new(Backend::new(&mut device, Settings::default()));
    let constant_layout =
        device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
            label: None,
            entries: &[wgpu::BindGroupLayoutEntry {
                binding: 0,
                visibility: wgpu::ShaderStage::VERTEX,
                ty: wgpu::BindingType::UniformBuffer {
                    dynamic: false,
                    min_binding_size: None,
                },
                count: None,
            }],
        });

    let constants_buffer = device.create_buffer(&wgpu::BufferDescriptor {
        label: None,
        size: 64,
        usage: wgpu::BufferUsage::UNIFORM,
        mapped_at_creation: false,
    });

    let _constants = device.create_bind_group(&wgpu::BindGroupDescriptor {
        label: None,
        layout: &constant_layout,
        entries: &[wgpu::BindGroupEntry {
            binding: 0,
            resource: wgpu::BindingResource::Buffer(
                constants_buffer.slice(..),
            ),
        }],
    });
}

It would be great to see if on master the situation is any different (since this test no longer relies on iced)

Looking at this case leads me nowhere. The stuff created in this snippet is getting successfully freed, and doesn't affect anything that happens per frame...
It would help if you could ask a person who can reproduce this on real hardware the following:

  1. print the adapter.get_info() that they get
  2. run from Visual Studio and get us the debug warnings/errors from the runtime. On real adapters I tested it doesn't complain about anything, and on WARP it doesn't, either, for some reason.

@msiglreith I've been looking at this for longer than I hoped, trying to narrow down the bug. I believe I got to the bottom of it, but I'm not sure how to explain or fix it yet, so I'm asking for your opinion.

Here is what happens in a test app:

  1. A uniform buffer A is created, a descriptor set for it is created and filled with data.
  2. A uniform buffer B is created, similar to A, then a similar descriptor set is created and filled (with write_descriptor_sets).
  3. Rendering starts using A, every frame copying something into it and making a draw call. Nothing shows up on screen when on WARP, but it should and does on non-WARP, and no validation warnings are reported.
  4. After the first frame the descriptor set of B is freed, then B itself is freed. This probably doesn't matter.

I narrowed the problem down to the descriptor update pool in our DX12 backend. It's a CPU-only descriptor heap for views (UAV/SRV/CBV). What write_descriptor_sets does is:

  1. it creates a view for each written descriptor (that needs a view) in this CPU-only heap
  2. it issues a CopyDescriptors from this heap into the shader-visible heap of the descriptor
  3. it instantly clears the CPU heap

What happens is that for both writes (of buffer A descriptor and buffer B descriptor), even though the target shader descriptors are different, and the source buffers are different, the intermediate CPU heap we use is the same, and we use the same index 0 of this heap for temporarily storing the CBV.

If I disable step (3) - clearing of buffer_desc_pool, then the problem goes away (!). It forces the temporary CBV to be a different handle for the copy between A and B.

I wondered whether CopyDescriptors is asynchronous in any way, but nothing in the docs suggests that it is, if I'm reading them correctly, and adding a wait for idle after the copy doesn't help either.

A few more things I tried:

  • recreating the buffer A - works
  • sleep_ms(1000) after CopyDescriptors - doesn't work

Hmm, I forced the WARP adapter and it seems to show correct results - do I need to set anything specific?

All I needed to reproduce was specifying the "dx12 iced" parameters. I see that you are on dx12, but are you sure you got iced enabled as well? If true, it would be even more weird :/

Ah, no I missed that now it's also blank..

Agreed, this looks like a bug in the WARP implementation to me. The memory regions of the non-shader-visible descriptors should be immediately reusable after CopyDescriptors. The implementation seems to keep only references.
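
The behaviour discussed in this thread can be modelled with a small, self-contained Rust sketch. It is a toy simulation and uses neither D3D12 nor gfx-hal: `StagingHeap`, `GpuDescriptor`, `copy_descriptor`, `address_seen_for_a` and the buffer addresses are all hypothetical names invented for illustration. The `keeps_reference` flag encodes the hypothesis above (that the copy only remembers where the source descriptor lived); it is an assumption, not a confirmed description of WARP internals.

```rust
// Made-up addresses standing in for uniform buffers A and B.
const BUFFER_A: u64 = 0xA000;
const BUFFER_B: u64 = 0xB000;

/// A descriptor in the shader-visible heap. The `Reference` variant models
/// the hypothesised faulty behaviour; it is an assumption, not real D3D12.
#[derive(Clone, Copy, Debug, PartialEq)]
enum GpuDescriptor {
    /// Holds its own copy of the buffer address (expected behaviour).
    Copied(u64),
    /// Only remembers which CPU staging slot it was copied from.
    Reference(usize),
}

/// Toy stand-in for the CPU-only descriptor heap used for staging.
struct StagingHeap {
    slots: Vec<u64>,
    next: usize,
}

impl StagingHeap {
    fn new(capacity: usize) -> Self {
        StagingHeap { slots: vec![0; capacity], next: 0 }
    }

    /// Step 1: create a view in the next free CPU slot.
    fn create_view(&mut self, buffer_address: u64) -> usize {
        let index = self.next;
        self.slots[index] = buffer_address;
        self.next += 1;
        index
    }

    /// Step 3: make every slot reusable again, starting from index 0.
    fn clear(&mut self) {
        self.next = 0;
    }
}

/// Step 2: copy from the staging heap into a shader-visible descriptor.
fn copy_descriptor(heap: &StagingHeap, index: usize, keeps_reference: bool) -> GpuDescriptor {
    if keeps_reference {
        GpuDescriptor::Reference(index)
    } else {
        GpuDescriptor::Copied(heap.slots[index])
    }
}

/// What a shader would read through the given descriptor.
fn resolve(heap: &StagingHeap, descriptor: GpuDescriptor) -> u64 {
    match descriptor {
        GpuDescriptor::Copied(address) => address,
        GpuDescriptor::Reference(index) => heap.slots[index],
    }
}

/// Write descriptors for buffer A and then buffer B, and return the address
/// that A's shader-visible descriptor resolves to afterwards.
fn address_seen_for_a(keeps_reference: bool, clear_after_write: bool) -> u64 {
    let mut heap = StagingHeap::new(4);

    let index_a = heap.create_view(BUFFER_A);
    let descriptor_a = copy_descriptor(&heap, index_a, keeps_reference);
    if clear_after_write {
        heap.clear();
    }

    // With clearing enabled, this view lands in slot 0 again.
    let index_b = heap.create_view(BUFFER_B);
    let _descriptor_b = copy_descriptor(&heap, index_b, keeps_reference);
    if clear_after_write {
        heap.clear();
    }

    resolve(&heap, descriptor_a)
}
```

In this model, `address_seen_for_a(true, true)` resolves A's descriptor to `BUFFER_B`, because the second write reuses staging slot 0. `address_seen_for_a(true, false)` resolves to `BUFFER_A`, which mirrors the observation that skipping the clear makes the problem go away. With `keeps_reference` set to `false`, the result is `BUFFER_A` whether or not the heap is cleared.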

Thanks for looking into this, @msiglreith! Let's figure out a workaround, or otherwise WARP becomes totally unusable (and we'd need to block it).

Workaround landed in gfx-backend-dx12-0.6.10
