Rpcs3: Tales of Vesperia performance regression - Windows timer resolution

Created on 14 May 2018  ·  48Comments  ·  Source: RPCS3/rpcs3

After the Windows 10 1803 update, I was seeing a regression in performance with Tales of Vesperia. This was also reported by another user.

Sleeping dog 45 fps => 25 fps
Town 18-19 fps => 14 -15 fps

I was able to revert the 1803 update and the performance regression went away. Re-installed 1803, the performance regression returned.

After some debugging, the problem is not related to the 1803 update, but instead it appears to be related to the Windows timer resolution.
https://msdn.microsoft.com/en-us/library/windows/desktop/dd757633(v=vs.85).aspx

By default, timers have a 15 msec resolution in Windows. Multimedia applications can set the timer resolution lower if needed (e.g. for animation).

RPCS3 sets the timer to 1 msec.

If I force the timer resolution to the minimum (0.5 msec), then the Vesperia frame rates return to normal.

Sleeping dog 25 fps => 45 fps
Town 14 -15 fps => 18-19 fps

You can get/set the timer resolution by using this small app.
https://vvvv.org/contribution/windows-system-timer-tool

I'm guessing prior to Win10 1803, there was an OS component or possibly another app that was setting a 0.5 msec timer resolution. Note that the timer resolution is system wide.

I'd recommend that we change RPCS3 to use a 0.5 msec timer resolution. Also, we'd probably need to look at removing any dependencies on these timers (if possible) as it is definitely affecting multiprocessor performance.

Setup
Windows 10 (1803 update)
rpcs3-v0.0.5-6718-737db900_win64

Most helpful comment

I reported this issue in the Feedback Hub.

Upvote, s'il vous plait.
https://aka.ms/AA1a3ta

All 48 comments

timeBeginPeriod() only supports >1 msec resolution. NtSetTimerResolution() needs to be used to set 0.5 msec.

It appears the QT5CORE library is also setting the timer resolution to 1 msec. We need to make sure the QT5 setting doesn't overwrite the RPCS3 setting.

Whats wrong with 1ms? Does setting the timer to 1ms not work but only 0.5 works? If timebeginperiod isnt working as expected it might be worth notifying microsoft.

timeBeginPeriod() with 1 msec is working fine, but a lower timer resolution should help performance especially when synchronization is a major bottleneck (e.g. as seen with Vesperia).

Setting the system timer resolution to a lower value helps

  • Sleep() resolution
  • timeouts of WaitForSingleObject

I know what it does, I'm asking about the bug you mentioned about older windows automatically setting resolution to 0.5ms. Remember this is a regression afterall.
I'm very curious about this claim as I would like to isolate the bug instead of merely hiding the symptoms.
Some considerations:

  1. Wouldn't the calls to timeBeginPeriod from rpcs3 just reset the timer to 1ms? This makes it unlikely that a system component set the timer resolution unless it did so continously.
  2. Is there a reason that our call to timeBeginPeriod is not working?
  3. Is there really a system service setting resolution lower on older windows? (It can be tested using NtQueryTimerResolution, I'll do this one)
  4. Does setting the timer to 1ms with your override application work when rpcs3 does not?

P.S I'm not against the change, just being thorough. I believe in properly documenting anomalies like these so that we avoid voodoo fixes where the symptom was cured without diagnosing the disease.

image
Creator's update (Ryzen 1700). Looks like default timer resolution here is 1ms making rpcs3's timer config a nop. It is possible that timeBeginPeriod is not working on rpcs3 or we are mistakenly disabling the set period shortly after enabling it. I suggest starting rpcs3 + vesperia then checking the current value and see if it really was changed to 1ms. I should retest with a hard reset and see if I get the same value.

EDIT: 1ms after clean reboot although windows seems to spawn some processes anyway because it thinks I want them opened. Hard to determine this behaviour on newer versions of windows 10 but I think 1ms seems to be what it starts as.

The slow down you observe might be due to spectre and meltdown velnurability update to amd and intel cpus.

Accrding to report this can cause 20-80% slowdown.

I dont remember Spectre and Meltdown patches being exclusives for Windows 10...

@kd-11 - I agree, it would be good to confirm that the pre 1803 update was setting a lower timer resolution.

I made the assumption because

  • the FPS was exactly the same
  • I was lazy to revert the Win10 update again :-)

I'll try it today.

The OS should have a default 15 msec timer resolution (for lower power consumption). We can use powercfg to see who is setting it lower.

Wouldn't the calls to timeBeginPeriod from rpcs3 just reset the timer to 1ms? This makes it unlikely that a system component set the timer resolution unless it did so continously.

The timer resolution is system wide. If two processes set it to 1 msec and then 15 msec, the lower value is used so both processes are satisfied.

Is there a reason that our call to timeBeginPeriod is not working?

No, it works properly. I set a breakpoint on NtSetTimerResolution to confirm.

Is there really a system service setting resolution lower on older windows? (It can be tested using NtQueryTimerResolution, I'll do this one)

I'll give it a try too.

Does setting the timer to 1ms with your override application work when rpcs3 does not?

rpcs3 does set the timer properly.

I think all thats left is to have rpcs3 running with the poor performance and then check the current timer interval setting.

My concern is that even with 0.5 if some other application sets it back to 15 were back to square 1.

On Windows 10 1709 i have:

Desktop: 15 ms
RPCS3 Open: 1 ms
Tales of vesperia boot: 0.5 ms

Windows isn't letting me revert the update now.

@Illynir - thanks for doing the test.

Can you see who is setting the 0.5 msec timer resolution by

  • open an administrator command prompt
  • run “powercfg -energy duration 5”.

An HTML file will be generated showing all the processes that modified the timer resolution.

No problem but the report is in french though. :-P

energy-report.zip

I took 2 years of French in high school. :-)

But I don't see a warning section that shows "Platform Timer Resolution:Outstanding Timer Request".

It should look something like this:

Warnings

Platform Timer Resolution:Platform Timer Resolution
The default platform timer resolution is 15.6ms (15625000ns) and should be used whenever the system is idle. If the timer resolution is increased, processor power management technologies may not be effective. The timer resolution may be increased due to multimedia playback or graphical animations.
Current Timer Resolution (100ns units) 10000
Maximum Timer Period (100ns units) 156250

Platform Timer Resolution:Outstanding Timer Request
A program or service has requested a timer resolution smaller than the platform maximum timer resolution.
Requested Period 10000
Requesting Process ID 5824
Requesting Process Path \Device\HarddiskVolume3\Users\Paul\Documents\ps3_emu\rpcs3-v0.0.5-6718-737db900_win64\rpcs3.exe

I have a bunch of changes, from chrome, discord, calculator, microsoft.lockapp, devenv all setting to 10000 units. Looks like everything including the windows lockscreen is requesting 1ms.
I'm on windows 10 1709

I wondering if it's related to Windows 10 game mode. Can you run a game and see if the timer resolution is modified?

the game mode option on 1803 is completely useless for me.
the link in the settings is broken and there is nothing that let's me enable/disable it

little info:
image

@pauls-gh

I don't know, I used the command you gave me as administrator.

the game mode option on 1803 is completely useless for me.

It works for me. You might need to enable the Game Bar first. Settings/Gaming/Game bar.

I do not have game mode enabled, also I'm fairly sure any game will modify the timer interval on their own anyway so that cannot be a good test. However..
Looking at the callstacks, one thing jumps at me - all these apps load d3dx libraries, often via Windows.UI.Xaml.dll which is used for the windows 10 WPF style UIs. d3d9 and d3d11 dlls are setting the timer duration on load, thats for sure, however, I feel my situation is a red herring, what we really need is the energy report on 1803 setups. What is resetting the timer to default after rpcs3 sets it to 1ms? If this isnt happening, then what is the real cause of the regression? Does 0.5ms on 1803 equal 1ms on 1709 (obviously no)?

On my 1803 setup, the energy report shows that rpcs3 set the timer resolution to 1 msec and that the current timer resolution is 1 msec. I captured the report while Vesperia was running.

Where can I find a 1709 ISO?

So what you're saying is that on 1803, 0.5ms timer gives the same performance as older windows with 1ms timer if I understand correctly, right?

Actually I noticed something interesting - the timer value changed after rpcs3 finished loading shaders! It changed to 0.5ms, I think this might be caused by windows detecting cpu load?

Yes.

Desktop: 15 ms
RPCS3 open: 1ms
Shader loading: 1 ms
Ingame: 0.5 ms

Wouldn't it be possible to use RDTSC or QPC as a timer controlled by RPCS3 directly?

@kd-11 - now that you can reproduce it...can you capture an energy report? Then we can see who is setting the timer resolution to 0.5 msec.

I checked, the only request in the report by rpcs3 is for 10000 units but the platform timer is 5004 units. Its done implicitly by the system, I believe its detecting cpu load as there is an entry in there about high cpu utilization by rpcs3.exe

The energy report for 1803 also shows the same high CPU utilization warnings for rpcs3.exe. But of course, the timer resolution is not set to 0.5 msec.

can confirm it stays on 1msec on 1803

Looking at the situation again, I am doubtful it is implicit. If I stop the game, the timer rests to 1. If I pause the game (8% cpu load) it stays at 0.5. Suspending the exe (0% usage) also does not reset the timer (remains at 0.5). This means the runtime is changing this value intentionally. It also does the same when using interpreters so it is not something being applied by llvm for example. It does not happen with all games, just some, p5 guarantees it will drop to 0.5 for me but other games do not trigger this.

Sounds like a CPU benchmark optimization (hack). :-)

Yea, I have also removed timeBeginPeriod calls and verified that there are no imports of NtSetTimerResolution used anywhere in rpcs3 (it is declared but unused in the code). I have also noticed that the timer resolution fluctuates depending on what I'm doing on the desktop between 0.5 and 1ms sometimes only lasting for a second on one of the two. This hints at windows (probably ryzen scheduling?) is dynamically changing this value on its own and the timeBeginPeriod call is meaningless.

timeBeginPeriod call is meaningless

I wouldn't say that. It is a ceiling. It guarantees the timer resolution is at most 1 msec.

According to this link, Windows 10 supports dynamic timer resolution. Maybe this was disabled or tweaked for the 1803 update.

https://forum.warthunder.com/index.php?/topic/316978-improve-game-performance-through-dpc-latency-adustment/

can confirm it stays on 1msec on 1803

Can also confirm. Cold boot, and just idling around on Windows (or opening the occasional app) has it stay at 1ms. This is an Intel based build (8700k). Funnily enough, this issue has been brought up over on Nvidia's forum as well. https://forums.geforce.com/default/topic/1046797/geforce-drivers/windows-10-version-1803-feedback/post/5335732/#5335732

Obviously as the OP stated, setting it to 0.5 has fixed performance in Vesperia at least. As far as I can tell, it never changes from 1ms. May be worth reporting it in the Feedback Hub.

I wouldn't say that. It is a ceiling. It guarantees the timer resolution is at most 1 msec.

Fair enough, but that doesnt change the fact that windows seems to be messing with the values on its own and on older windows (pre-1803) it does detect rpcs3 utilization and adjusts accordingly. I'm surprised they suddenly changed this behaviour without warning, probably some kind of bug or maybe only applies when using balanced/power saver mode.

Obviously as the OP stated, setting it to 0.5 has fixed performance in Vesperia at least.

This change was done via an external program, need to confirm using it in rpcs3 works the same way. @pauls-gh The declaration for NtSetTimerResolution is defined but currently unused, so just use it directly and see if it fixes the performance problem. Ofc you have to reset the timer on exit as well if you plan on making it a PR.

I'm surprised they suddenly changed this behaviour without warning, probably some kind of bug or maybe only applies when using balanced/power saver mode.

Probably yes, on high performance it's good i think. On Spring creator update (1803) Micosoft has added Ultimate performance profile. There must have been changes made to all profiles.

One of the first things I tried on 1803 was Ultimate performance power mode. Didn't help. Both 1803 and 1709 are using Balanced power mode by default.

I agree that it might be worth reporting this to Microsoft.

I'll work on a PR for this. It's a little bit tricky because QT5 is also setting the timer resolution (for logging). I'll need to set the timer resolution just as the game is loaded.

Prob just a note, I am on the high performance profile. I did check the others and i was still getting the same 1ms when starting/playing RPCS3

I reported this issue in the Feedback Hub.

Upvote, s'il vous plait.
https://aka.ms/AA1a3ta

I've been saying on discord I have this issue on 1803 update with most of the games... While some games, like Dragon Ball Raging Blast, worked full speed on 1709, updating to 1803 would cause a performance regresion. I'll have to try it out with this build.

@kd-11

My concern is that even with 0.5 if some other application sets it back to 15 were back to square 1.

Maybe I'm misunderstanding your concern, but the timer resolution is one global resource, and it's set to MIN(all_requests). So if RPCS3 were to request 0.5, then it would be forced to 0.5 globally for the duration of that request regardless of what anything else does.

Paul, were there any changes between 6742 and 6770 that may have messed with Vesperia's performance? Reverting to 6742, fps is at 12, and goes back to 23 when using the timertool to set it to 0.5ms. On 6770, even with your timer fix, it seems FPS is down to 14 now, and forcing the Windows timer back to 1ms drops the FPS in half to 7 fps.

I'm on top of tree now and it does seem performance has regressed. It appears to be in the scenes that are SPU limited. For example the sleeping dog scene at the start used to be 45 fps on my system, now it's ~35 fps. It was ~25 fps when the timer resolution was 1 msec.

In town, the fps is about the same (18-19 fps), but this is GPU limited.

Looks like build 6750 has the regression.
PR - https://github.com/RPCS3/rpcs3/pull/4580

6748 + timer resolution = 0.5, sleeping dog = 45 fps
6750 + timer resolution = 0.5, freezes are seen, sleeping dog = 35 fps
6755 + timer resolution = 0.5, freezes are seen, sleeping dog = 35 fps

6784 - sleeping dog = 35 fps (latest build as of today, 5/17/18)

I'll open a new issue for this new performance regression.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Luffykun007 picture Luffykun007  ·  3Comments

XeClutch picture XeClutch  ·  3Comments

LokiGrants picture LokiGrants  ·  3Comments

Asinin3 picture Asinin3  ·  3Comments

AniLeo picture AniLeo  ·  3Comments