Describe the bug
Internally, sleep_until calls _To_xtime_10_day_clamped to convert time point of any clock to time point of System clock. Then _Thrd_sleep sleeps on this system time point:
https://github.com/microsoft/STL/blob/31419650d472932dd24b9c5602707b0454f61fb2/stl/inc/thread#L154-L166
_To_xtime_10_day_clamped uses system_clock
https://github.com/microsoft/STL/blob/31419650d472932dd24b9c5602707b0454f61fb2/stl/inc/chrono#L629-L652
If system_clock is adjusted between _To_xtime_10_day_clamped and _Thrd_sleep, the sleep interval is wrong.
sleep_for is also subject to this problem as it is based on sleep_until
https://github.com/microsoft/STL/blob/31419650d472932dd24b9c5602707b0454f61fb2/stl/inc/thread#L168-L171
Though standard says
The functions whose names end in _Âfor take an argument that specifies a duration. These functions produce relative timeouts. Implementations should use a steady clock to measure time for these functions.
Command-line test case
Hard to build stable repro, but will try if needed.
Expected behavior
Neither sleep_until relative to a steady clock nor sleep_for should depend on system_clock.
GetTickCount64 is good enough steady clock, as Sleep API does not have granularity less than a millisecond anyway (calling undocumented NtDelayExecution is unacceptable). GetTickCount64 is very fast clock, it is basically reading a global variable.
STL version
Microsoft Visual Studio Professional 2019 Preview
Version 16.6.0 Preview 2.1
Additional context
Suspect that tests for #593 are failing due to timeout because other tests simultaneously tamper with system clock. Ok, this theory is defeated now, but the bug still looks valid
Yep, this is a known bug that affects bincompat, thanks for reporting it.
After 5 minutes of additional thought, we've realized that this might be fixable by adding new functions to the import library (without changing the behavior of the DLL, which must remain unchanged for previously compiled code). We hadn't considered this solution before, for the thread sleep issue, despite using the import lib for other features/fixes. ðŸ§
My observations here:
this_thread::sleep_{for|until}condition_variable::sleep_{for|until}condition_variable_any::sleep_{for|until}
Others, like timed_mutex are based on one of CVs.
For consistency, all need to be fixed by a single change.
As I mentioned, imprecise GetTickCount64 will do. Currently, _Xtime_get_ticks calls __crtGetSystemTimePreciseAsFileTime, which has unnecessary overhead.
While converting chrono time to imprecise time, must round up, not round down. Sleep should sleep "at least for".
There is unnecessary conversion back and forth. this_thread::sleep_for goes to this_thread_sleep_until, then sleep_until forms system time point and ultimately uses for Windows API (as there are no public "until" APIs). I think extra conversion is fine as long as it does not impact performance and does not cause interval to be shorter (that is 2 and 3 are addressed).
I've exposed GetTickCount64 as a part of #593, Sleep is also internally used there and can be exposed. Yet, I think import library solution is better to fix this bug, instead of waiting satellite. (Moreover, if fix is committed before waiting, #593 can be switched to use new import library functions.
There is reverse version of this bug. System clock based sleep_until may not be sensitive to system clock adjustment. See [thread.req.timing]/4. But I think that whereas this bug could be fixed, it is impractical. The fix would mean that all waits on non-steady clock are proxied via another variable that is waked on system clock forward adjustment. Should this still be filled as a bug?
6. There is reverse version of this bug. System clock based
sleep_untilmay not be sensitive to system clock adjustment. See [thread.req.timing]/4. But I think that whereas this bug could be fixed, it is impractical. The fix would mean that all waits on non-steady clock are proxied via another variable that is waked on system clock forward adjustment. Should this still be filled as a bug?
Is sounds more like a Standard defect to me. Standard defines this hard to implement feature, but acknowledges it is not useful:
[ Note: If the clock is not synchronized with a steady clock, e.g., a CPU time clock, these timeouts might not provide useful functionality. — end note ]
So here are three functions that needs to be replaced with other functions:
_CRTIMP2_PURE void __cdecl _Thrd_sleep(const xtime*);
_CRTIMP2_PURE int __cdecl _Mtx_timedlock(_Mtx_t, const xtime*);
_CRTIMP2_PURE int __cdecl _Cnd_timedwait(_Cnd_t, _Mtx_t, const xtime*);
Apparently they can be re-implemented in a new translation unit injected into the import library.
The most complex is _Mtx_timedlock, but still copying existing function should do the trick.
New functions could be made taking either time point relative to GetTickCount64 or duration. Duration seems even better to me, as there are no unnecessary conversions, when program passes duration.
Duration can be unsigned long milliseconds (that directly match current Windows API) or long long nanoseconds (that provides potential for precision improvement for future, if more precise Windows APIs are ever exposed)
AFAIK they can't be implemented in the import library because we might choose ConcRT backing for the synchronization primitives, and ConcRT depends on global state in the STL and ConcRT DLLs that will be damaged trying to do this from outside.
Fixing this_thread:: things might be doable in an import library change because they don't depend on anything else. What I did in our vMajorNext branch was call NtWaitForSingleObject with a handle of GetCurrentThread(). We know that the current thread is running so the handle will never be signaled, so NtWaitForSingleObject must time out. NtWaitForSingleObject was chosen over WaitForSingleObject because the NtXxx version accepts both absolute and relative timeouts (according to the sign of the supplied LARGE_INTEGER).
I recommend just waiting until we can do all the vNext things though. Sleeping a thread isn't a super useful tool anyway.
I think the use of NtWait-whatever for sleep is great idea, but isn't it cheating in sight of #705 ?
I believe NtWaitForSingleObject is documented: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-zwwaitforsingleobject
It is driver kit, not SDK. In theory it might mean higher risk for this to be broken in future OS. In practice I don't think so, not at least for this function.
AFAIK they can't be implemented in the import library because we might choose ConcRT backing for the synchronization primitives, and ConcRT depends on global state in the STL and ConcRT DLLs that will be damaged trying to do this from outside.
I actually don't get this problem. Say there's exactly the same function as mtx_do_lock, which takes duration or time point relative to a steady click, coexisting with old function. What's the problem?
The implementation of the new function needs to do waits / wake waiters / etc. in a way that existing .objs that don't call the new function aren't broken. I don't know exactly how the ConcRT backend works but I'm pretty sure it has some globals that will be inaccessible from the import library.
I think you are referring to this:
These APIs are already in relative units.
xtime* - system time functions are built on top of them.
So new functions can be built on top of them too, there's no problem.
In fact the decision is made when mutex or condition_variable[_any] is created.
The timed functions just perform virtual function calls.
If you can get it to work I'm not going to dispute an existence proof
Any update on this? We've run into this issue
Not from me. I'm not interested in contributing a fix or other changes in near time.
Although due to cleanup in mutexes area recently performed by @StephanTLavavej and me, there's no more case __stl_sync_api_modes_enum::concrt: case, so the fix I have in mind would be a bit easier to test out.
Actually I think @BillyONeal would not question my idea anymore, as his concern was ConCRT:
AFAIK they can't be implemented in the import library because we might choose ConcRT backing for the synchronization primitives, and ConcRT depends on global state in the STL and ConcRT DLLs that will be damaged trying to do this from outside.
and like I said, it was ConCRT case was recently wiped out.
Though Billy will not participate in near time either.
But it is fixable, possibly if you want the fix that much, you can contribute it....