Wgpu-rs: Semaphore error with map_read

Created on 9 Jun 2020 · 17 Comments · Source: gfx-rs/wgpu-rs

Not sure how to name this one.

My render workflow is to render the 3D scene to both the framebuffer and a second output, which contains a value associated with the pixel under the mouse cursor. It's a depth value between the 3D point and a world position. I then read back the value and show it in the 2D overlay.

I get the following error:

    [2020-06-09T18:01:01Z ERROR gfx_backend_vulkan] [Validation] Validation Error: [ VUID-vkQueuePresentKHR-pWaitSemaphores-03268 ] Object 0: handle = 0x1bd68ec4ec8, type = VK_OBJECT_TYPE_QUEUE; Object 1: handle = 0xd76249000000000c, type = VK_OBJECT_TYPE_SEMAPHORE; | MessageID = 0x251f8f7a | VkQueue 0x1bd68ec4ec8[] is waiting on VkSemaphore 0xd76249000000000c[] that has no way to be signaled. The Vulkan spec states: All elements of the pWaitSemaphores member of pPresentInfo must reference a semaphore signal operation that has been submitted for execution and any semaphore signal operations on which it depends (if any) must have also been submitted for execution. (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-vkQueuePresentKHR-pWaitSemaphores-03268).

The error occurs here https://github.com/Yamakaky/joy/blob/44d101892877c3379bfb827c5e7798bbcc30b309/src/render/mod.rs#L390-L391 if I switch these two lines. To be exact, the map_read call must be executed before the command buffer is submitted to avoid the error.

From the error message, maybe it's because map_read issues a read semaphore, but it can't be executed since the command buffer is already executed?

wgpu 0.5
vulkan

bug

All 17 comments

What are you trying to map? The buffer containing mouse picking info?
You should then be requesting the map after the submit. Otherwise, the map will not wait for the submission (obviously), and you'll get wrong results (and ideally - a proper panic).

After the 3D render, I copy the pixel under the mouse cursor to a buffer, map the buffer and read it (since I don't think you can directly read from a texture?). Calling map_read before the submit seems to work correctly, and calling it after the submit also seems to work but prints the warning. I'll try to look at the call trace in RenderDoc to see which semaphore is problematic.

Also, what can I do to improve the ugly self.device.poll(wgpu::Maintain::Wait); in the middle of my render process? Run it in a separate thread?

It should be noted that for now I have a shitty laptop with no debug vulkan layer, so there may be some errors hidden. If we can't find the solution, I'll try with my new laptop in 1-2 weeks.

> Calling map_read before the submit seems to work correctly

It doesn't make sense. You want to read the data produced by the submission, so you need to submit first.

> Also, what can I do to improve the ugly self.device.poll(wgpu::Maintain::Wait); in the middle of my render process? Run it in a separate thread?

WebGPU doesn't allow your buffer to be used by both CPU and GPU. So if you want to use the buffer in the next submission, it has to wait till the GPU is done, then till CPU is done, etc. That's definitely not what you need.
If you, however, have a list of staging buffers, you can avoid the wait entirely. Suppose you keep a list of available buffers that aren't in use by the GPU. You grab one during command recording, issue commands to copy data into it, and then request to map it after the submission. That should remove the buffer from the list. Once the mapping callback comes back, you read the data, do something with it (e.g. process mouse picking), and return the buffer to the list of available buffers.
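The buffer list described above can be sketched roughly as follows. This is a minimal model in plain Rust with no GPU involved: `StagingBuffer`, `StagingPool`, `simulate_frames` and the `latency` parameter are all hypothetical names made up for illustration, and the "mapping callback" is simulated by assuming it fires a fixed number of frames after the request.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a GPU staging buffer; a real one would wrap a wgpu buffer.
struct StagingBuffer {
    id: usize,
    data: Vec<u8>,
}

/// Buffers that are neither used by the GPU nor waiting on a map request.
struct StagingPool {
    available: VecDeque<StagingBuffer>,
    created: usize,
    buffer_size: usize,
}

impl StagingPool {
    fn new(buffer_size: usize) -> Self {
        StagingPool { available: VecDeque::new(), created: 0, buffer_size }
    }

    /// Take a free buffer, or create a new one if every buffer is in flight.
    fn acquire(&mut self) -> StagingBuffer {
        if let Some(buffer) = self.available.pop_front() {
            return buffer;
        }
        let id = self.created;
        self.created += 1;
        StagingBuffer { id, data: vec![0; self.buffer_size] }
    }

    /// Give a buffer back once its data has been read on the CPU.
    fn release(&mut self, buffer: StagingBuffer) {
        self.available.push_back(buffer);
    }
}

/// Simulates `frames` frames, assuming (for illustration only) that a map
/// request completes `latency` frames after it was issued.
/// Returns the number of buffers created and the buffer id used each frame.
fn simulate_frames(frames: u8, latency: usize) -> (usize, Vec<usize>) {
    let mut pool = StagingPool::new(4);
    let mut in_flight = VecDeque::new();
    let mut used_ids = Vec::new();
    for frame in 0..frames {
        // Command recording: grab a buffer and "copy" the picking value into it.
        let mut buffer = pool.acquire();
        buffer.data[0] = frame;
        used_ids.push(buffer.id);
        // After the submit: request the map; the buffer is now in flight.
        in_flight.push_back(buffer);
        // Simulated mapping callback for the oldest request.
        if in_flight.len() > latency {
            let done: StagingBuffer = in_flight.pop_front().unwrap();
            let _picked = done.data[0]; // process mouse picking here
            pool.release(done);
        }
    }
    (pool.created, used_ids)
}
```

With a simulated latency of two frames, the pool settles at three buffers that are reused in rotation, and the render loop never has to block on the GPU.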

Yes, it must read the data from the last frame.

As you say, it's not efficient to wait for GPU + CPU. However, I would at least expect not to have the validation warning above if I call this function (https://github.com/Yamakaky/joy/blob/44d101892877c3379bfb827c5e7798bbcc30b309/src/render/mod.rs#L309-L318) after the submission. Any idea why it does warn?

Sorry for not making this clear: wgpu-rs is developed to be a completely safe library in Rust sense. No native API validation errors are supposed to show up. If something goes off rails, we need to at least panic. So your case of getting a validation error is a bug on our side, no matter how you look at it. I'm just trying to explain that it's caused by something you aren't doing right, and when we fix the bug on our side, you'll get a panic and will have to fix your client code anyway.

It would help to know if master behaves differently. I understand that porting to master may be challenging on your side. From ours, the issue would be much less important if it's already fixed on master.

OK ^^

I use iced which depends on 0.5, so it may be easier to write a small test program.

Hum, so I tried on my new linux laptop with intel igpu and all vulkan debug layers, and I don't get the warning if I put the buffer copy before the submit and the map_read after the submit, like you suggested. Maybe a bug on windows only?

I use the mesa driver, not the intel one.

Haven't dug into it yet, but to add a potentially useful data point right away: my project gets this as well, but interestingly I didn't hit it before updating from 0.5 to a (still fairly recent) master version.
Latest repo just now tested on Windows @ Nvidia at https://github.com/Wumpf/blub/commit/f6bc4b051ff934fc8b386af806cd0f9524543d62. To repro run via cargo run and take a screenshot via Print key (which will cause the app to print out an error).

The exact validation error:

    VALIDATION [VUID-vkQueuePresentKHR-pWaitSemaphores-03268 (622825338)] : Validation Error: [ VUID-vkQueuePresentKHR-pWaitSemaphores-03268 ] Object 0: handle = 0x1c781243100, type = VK_OBJECT_TYPE_QUEUE; Object 1: handle = 0x6fcb860000000f2a, type = VK_OBJECT_TYPE_SEMAPHORE; | MessageID = 0x251f8f7a | VkQueue 0x1c781243100[] is waiting on VkSemaphore 0x6fcb860000000f2a[] that has no way to be signaled. The Vulkan spec states: All elements of the pWaitSemaphores member of pPresentInfo must reference a semaphore signal operation that has been submitted for execution and any semaphore signal operations on which it depends (if any) must have also been submitted for execution. (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-vkQueuePresentKHR-pWaitSemaphores-03268)
    object info: (type: QUEUE, hndl: 1956376752384), (type: SEMAPHORE, hndl: 8055679693040389930)

Nothing much different from what @Yamakaky reported though, but my code looks a little bit different. Rough overview on relevant things the app is doing:

  • start frame
  • do rendering and whatnot with offscreen target A
  • copy A to frame texture
  • submit (one and only) encoder to queue
  • do a map_read on A
  • poll(wgpu::Maintain::Wait)
  • block for map_read to finish
  • drop frame

See: https://github.com/Wumpf/blub/blob/master/src/screen.rs#L192
The issue goes away if I drop the frame at any point before doing the map_read. Which (apart from the whole "we shouldn't get validation errors" issue) doesn't entirely make sense, since there was a poll-wait before.

Ok, I see what's going on.
We are trying to wait for a semaphore on present(), but the relevant submission has already been waited for on CPU using a fence. So we shouldn't be waiting for a semaphore. This is a bug on our side, rightfully pointed out by the validation error.

However, there is no reason why your applications need to keep the frame around while waiting for the GPU. It doesn't make any sense. A frame is supposed to live for as short a time as possible.
Needless to say, the fact you are doing poll(wgpu::Maintain::Wait) is also undesirable: it ruins your performance, and it's not portable to the Web.

Makes sense!
Btw, I have the poll there only for quick & dirty screenshot functionality. Going forward I do want to keep a queue of futures around that writes the screenshots out when ready, but this is good enough for a start. I'm of course not doing that Wait when there's no screenshot being taken :)

What would be a good pattern for poll? Wait in a different thread?

This is async API, you aren't supposed to wait for anything. Once the buffer is available, you'll get the callback.

Would you be able to test https://github.com/gfx-rs/wgpu/pull/722 and see if it helps?
I'm fairly sure it's the fix, but haven't tested it myself.

Do you have an example of a good integration of an async runtime? All the examples use block_on.

The example framework only blocks at the top level, i.e. waits for run_async to finish because that one does all the things. Everything inside run_async is truly asynchronous.

So for example have:

  • tokio run the main loop instead of futures
  • run poll in a loop via tokio::task::spawn_blocking
  • use tokio::spawn and winit::EventLoopProxy to map the buffer + read it + modify the application state?

What if the mapping takes more than one frame to run for whatever reason? Would it be better to only copy the texture and map the buffer if the previous future has already finished?

All of these suggestions sound very reasonable, and it would be great to have them represented in some way!

> What about if the mapping takes more than one frame to run for whatever reason?

It can easily take a few frames. It's basically waiting till GPU is done using the buffer.
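Since the result may arrive several frames later, the render loop can check for it each frame instead of blocking. A minimal sketch of that idea in plain Rust follows; it is an assumption-laden model, not wgpu code. The hypothetical `request_readback` stands in for "request the map, and let something else drive the device until the callback fires"; here a std thread and a channel play that role instead of tokio, and the 5 ms delay and the value 42 are made up.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a map request. The worker thread plays the role
/// of the GPU plus the polling loop, and sends the value once it is "mapped".
fn request_readback(value: u32) -> Receiver<u32> {
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(5)); // pretend the GPU is busy
        let _ = sender.send(value); // the "mapping callback"
    });
    receiver
}

/// Runs up to `max_frames` frames and returns the read-back value as soon as
/// it is available, without ever blocking on the pending request.
fn run_frames(max_frames: usize) -> Option<u32> {
    let mut pending = Some(request_readback(42));
    for _frame in 0..max_frames {
        // Non-blocking check: is the result of the earlier request ready?
        let status = pending.as_ref().map(|receiver| receiver.try_recv());
        match status {
            Some(Ok(value)) => return Some(value),
            // The worker went away without answering; stop checking.
            Some(Err(TryRecvError::Disconnected)) => pending = None,
            // Not ready yet (or nothing pending): keep rendering.
            Some(Err(TryRecvError::Empty)) | None => {}
        }
        thread::sleep(Duration::from_millis(1)); // stand-in for rendering a frame
    }
    None
}
```

The frame loop keeps running while the request is pending and picks the value up on whichever frame it happens to arrive.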

> Would it be better to only copy the texture and map the buffer if the previous future already finished?

You can't do anything with a buffer while the mapping request is in flight. You should have a number of different buffers that are used in different phases, a thing we call "staging belt", which I described in short above. Will be happy to provide more details, but it's better to talk about this on #wgpu:matrix.org instead of here.
