The upstream WebGPU spec is written (currently in the discussion logs, not yet in the actual spec) in a way that makes two bind group layouts interchangeable as long as the content passed into their creation is exactly the same ("by value" semantics). This also matches all the low-level APIs behavior.
We need to change our validation rules to compare them by value instead of by ID.
I'd like to tackle this. I'll post updates/questions in this thread.
Okay, I think I've got a really rough version implemented. Since this relaxes the current constraints, I need to write some tests that to be more sure I didn't do something totally terrible - the current examples working doesn't prove anything. While implementing, I of course ran into things I'd like to understand a bit better:
The Hub Access DAG
I understand this is meant to prevent deadlocks by only allowing the programmer to access subsequent nodes in the graph, based on what was previously accessed, but I am unsure whether the current code enforces this. Because the registry methods don't consume the token and only take it by &mut, it is legal to use the same token twice (or even only use the root token always) and access various parts of the graph:
// This is not prevented currently
let (mut pass_guard, _) = hub.render_passes.write(&mut token);
// ... possibly somewhere else in code ...
let (mut pass_guard2, _) = hub.render_passes.write(&mut token);
I this could be made more difficult to accidentally violate by always consuming the access token, modifying its drop impl, and providing a method for transitioning a general token into a more specific one. The token would probably need to store the count by which to decrement the ACTIVE_TOKEN on drop, or something similar...
Bind Group Layout Compatibility
What does it mean for two Bind Group Layouts to be compatible? I currently assume they are compatible iff their BindGroupLayoutBindings vectors are equal, because everything else is just implementation structures derived from these descriptors. Am I missing something?
I also do have a couple of more specific, implementation questions:
PartialEq for BindGroupLayoutBinding or is it not implemented on purpose for future proofing?I will publish a draft PR soon.
Because the registry methods don't consume the token and only take it by &mut, it is legal to use the same token twice (or even only use the root token always) and access various parts of the graph:
They borrow the token mutably, which means you can't have two storages accessed using the same token at the same time. In your example, either the first guard gets dropped before the second one is acquired, or the compiler will yell at you.
I currently assume they are compatible iff their BindGroupLayoutBindings vectors are equal, because everything else is just implementation structures derived from these descriptors. Am I missing something?
That's good for now. We can come back to this later.
Is it ok to derive PartialEq for BindGroupLayoutBinding or is it not implemented on purpose for future proofing?
Yes, it's ok and useful.
Is it ok, if I propagate the storage for bind group layouts down the call stack?
It sounds fine, we'll get to this when your PR is reviewed.
In your example, either the first guard gets dropped before the second one is acquired, or the compiler will yell at you.
I see your point and it indeed should prevent 2 borrows, but the current implementation doesn't. You can borrow two guards and deadlock with the current API, one doesn't even have to explicitly drop the tokens like I did in the above snippet.
I think this is because the input lifetime of the token is not tied to the output lifetime of either the token or the lock guard. I experimented by modifying the read and write functions a bit and managed to get at least the token lifetimes tied so that one really gets the borrow-checker error for invalid use.
However, even with my tweak, if you explicitly drop the tokens (but not the guards) returned from read and write, the original token will once again be unborrowed, leaving you free to deadlock :smile:
I think this could be possible to solve on the type level, but I realize this is all really academical and only concerns an internal API, so I understand if we don't want to spend more time on this :smile:
That's good for now. We can come back to this later.
Yeah, I just realized the BindGroupLayoutBinding for a bind group can specify a superset of shader stage visibility flags than what the BindGroupLayoutBinding for a pipeline specified, and they should still be compatible. We can improve this later.
@yanchith looks like you are on to something very important!
Please review the fix in #403
We had to disable that for now: https://github.com/gfx-rs/wgpu/blob/e39aaa9cb309a9326ac69e2767681a00dda347a8/wgpu-core/src/device/mod.rs#L882