Dxvk: d912pxy

Created on 2 Feb 2019 · 9 Comments · Source: doitsujin/dxvk

Any idea why d912pxy has lower overhead than native? Is the paradigm partially applicable to DXVK? Thanks.

All 9 comments

DXVK does not implement D3D9.

Any idea why d912pxy has lower overhead than native?

  • d912pxy is optimized specifically for GW2, does it even work with most other games?
  • Windows D3D9 drivers might not be as good as the D3D11 ones.
  • Windows D3D12 drivers might be faster than the Vulkan ones. (don't think so though)
  • D3D9's shader bytecode translates better to SM4 DXBC than to SPIR-V, so maybe the driver can do a better job with that (SM4 shader compilers are pretty mature now compared to SPIR-V ones).
  • D3D11 has a lot more features than D3D9, and modern games work almost entirely differently from a lot of the old D3D9 ones.

Is the paradigm partially applicable to DXVK

No.

D3D9's shader bytecode translates better to SM4 DXBC than to SPIR-V

d912pxy doesn't even do a dxbc->dxbc conversion; it does dxbc->hlsl->hlsl->dxbc.

We know what it is, still irrelevant for DXVK.

@artivision what is your point?

There's a slight difference between having a D3D9 implementation that is designed to run one game, and one that is designed to run hundreds of games. It should be obvious that focussing on only Guild Wars 2 gives d912pxy the ability to implement incorrect performance hacks.

If you want better performance, feel free to make a pull request.

It's worth noting that d3d -> d3d is a lot easier than d3d -> not d3d.
d3d12 was also designed with the ability to support wrapping d3d9 in mind.
(they actually use this on ARM64 platforms, in both ARM64 and the WoW x86 JIT stuff, probably because Qualcomm didn't want to write a d3d9 driver :frog:)
d3d12 also supports unmap ranges and other such things.

Lots of the benefits that d3d9on12 and d912pxy bring to d3d9 are not applicable to dxvk because of the way d3d11+ works compared to d3d9.

There is also the fact that d3d9 drivers are getting more broken and receiving less love as time goes on, and they have to deal with a lot of hacks and support a lot of things that are now illegal in modern d3d.

Also, lots of things that were implemented in hardware (such as GetDC returning something other than a software copy, and other such fun things) are now implemented in software (this was changed when we moved from GDI -> LDDM/WDDM iirc).

d3d9 will continue to get slower as time goes on without a proper solution, as graphics vendors simply don't care anymore and there is a lot of cruft and old legacy shit to deal with.

d3d11 does not have this issue: it's still receiving updates/upgrades, is still supported for developers, makes a lot more sense (as much as d3d can), and has a lot fewer edge cases than d3d9.

Hopefully this answers your question, @artivision, as to why some of the paradigms and ideas in d912pxy would not be applicable to dxvk or would actually hurt its performance.

@artivision what is your point?

There's a slight difference between having a D3D9 implementation that is designed to run _one_ game, and one that is designed to run hundreds of games. It should be obvious that focussing on only Guild Wars 2 gives d912pxy the ability to implement incorrect performance hacks.

If you want better performance, feel free to make a pull request.

Good day sir and thank you for your hard work. I don't make demands here, just polite suggestions:

a) I don't prefer a native implementation for an old API; uplifting is better. Examples (and please don't dwell on them, they are just examples): When I run Nvidia's blob, I see that there is an Immediate_Thread that is not tanked and six Deferred_Threads that all call the driver module. When I run AMD's blob, I see that the Immediate_Thread sparks like crazy and the one and a half Deferred_Threads do not even call the driver module. Just a 2.3x performance difference. Another example: chopping the graphics into slices to use multi-GPU. I know that a slice could be small or take up the entire screen, but at least standard_processing and post_processing could be different slices. That is why we call it "post" because it comes after.

b) Maybe DXVK's D3D11 implementation was wrong from the beginning. Because MS released a lot of tools in the past for those reasons, it would cost 1/10 the manpower to uplift D3D9-10-11 to D3D12 and then translate D3D12 to Vulkan.

c) And if you think that those things are crazy, I can even propose a Motion Interpolation Filter for DXVK. If the fake frame is ready on the GPU a millisecond after the real one, with distance data as well, then there are no major artifacts, and the original frame can be uploaded half a frame-time after the fake one, which can be viewed immediately. Just 8 ms latency for 60-to-120 fps translation.

d) Please kill the competition.
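
As a rough check on the 8 ms figure in (c): assuming it refers to half of one frame time at the 60 fps source rate (the symbols $t_{60}$ and $t_{120}$ are introduced here for illustration and do not come from the thread), the arithmetic works out as:

```latex
\begin{align*}
t_{60}  &= \frac{1000\ \text{ms}}{60} \approx 16.67\ \text{ms} \\
t_{120} &= \frac{1000\ \text{ms}}{120} = \frac{t_{60}}{2} \approx 8.33\ \text{ms}
\end{align*}
```

Holding the real frame back by half a source frame time would therefore add roughly 8.33 ms, which is consistent with the quoted figure. This says nothing about whether an interpolated frame could actually be produced in that window.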

Good day sir

Cringe.

a)...

What are you talking about? None of what you said makes any sense.

That is why we call it "post" because it comes after.

Thank you for explaining a word in the English language. Who is "we"? What do you make or contribute that allows you to say that?

b)...

d3d11on12 does exist but why would it be any better? You're just talking out your ass with no evidence here.

c)...

:frog: Go for it. Implement it, please. This isn't VR, where you have all the information in OpenVR and can calculate differentials and whatever. I'm not entirely sure how OpenVR does it, but I am fairly sure it does not apply here, and doing something like that would require modifying the game, as we can't magically reproject.
