Runtime: Nanosecond time

Created on 22 Aug 2018 · 39 comments · Source: dotnet/runtime

I found myself writing a software PWM recently, for which high-precision time is required. But there doesn't seem to be any reliable method to get system time in nanoseconds in .NET Core. The only available solutions use system ticks, and more than one highly ranked Google result on this doesn't work correctly.

Could an API be added to get nanotime (preferably not using a struct like DateTime, to improve performance)?

api-suggestion area-System.Runtime

All 39 comments

Is

System.Diagnostics.Stopwatch.GetTimestamp();

Any use?

Also the properties

Stopwatch.IsHighResolution
Stopwatch.Frequency

https://docs.microsoft.com/en-us/dotnet/api/system.diagnostics.stopwatch
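For reference, the conversion those properties enable can be sketched in a few lines. This is a hypothetical helper (the name `NanoTime` is made up for illustration); the resolution is still bounded by the underlying counter, not by the unit:

```csharp
using System;
using System.Diagnostics;

static class NanoTime
{
    // Stopwatch.Frequency is raw ticks per second; precompute the
    // ratio so each reading is a single multiply.
    private static readonly double NsPerTick =
        1_000_000_000.0 / Stopwatch.Frequency;

    // Monotonic nanoseconds since an arbitrary origin (usually boot).
    public static long GetNanoseconds() =>
        (long)(Stopwatch.GetTimestamp() * NsPerTick);
}

class Program
{
    static void Main()
    {
        long start = NanoTime.GetNanoseconds();
        // ... timed work ...
        long elapsedNs = NanoTime.GetNanoseconds() - start;
        Console.WriteLine($"elapsed: {elapsedNs} ns");
    }
}
```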

@benaadams Most hacks I found online to calculate nanosecond time used the properties you mention. But I think it would be much more useful to expose a single property than to make everyone calculate the whole thing themselves. Most other languages (even high-level ones like Java) have functions for this.

....I think I'm more curious as to why a password manager needs a timestamp at all, much less a high-precision one.

Also, depending on where this is being deployed to, the physical computer clock may not be able to give you what you need (although honestly, I really doubt you're going to encounter that particular problem...).

@Clockwork-Muse Haha guess I should have been more clear - PWM is Pulse Width Modulation (in the context of a GPIO pin on a Raspberry Pi). There is also hardware assisted PWM which is very accurate, but is only available on a few pins. Software PWM is an alternative that works on all pins. So high accuracy time is useful for that.

@shravan2x Not sure if this helps. When it comes to PWM, you might want to reach out to @edgardozoppi and review his work here: https://github.com/dotnet/corefxlab/issues/2426

@shaggygi Thanks, I'll take a look. But I still think a nanosecond API would be good to add. A google search shows a lot of people looking for the same thing.

Lack of a nanosecond API has the advantage that people calculate a conversion factor, which gets them in one step from the Stopwatch timestamp to the unit they care about.

Software PWM

Have you observed visible/audible artifacts from the GC? :smile:

Have you observed visible/audible artifacts from the GC?

I thought there was a proposal to offer an API to allow the app to tell the GC to not run for a while, but I can't find it.

That's better than a proposal...

I don't believe, even with Stopwatch.GetTimestamp(), you are guaranteed nanosecond granularity. If such granularity is important, you should likely be checking Stopwatch.Frequency yourself and accounting for any variations.

The Stopwatch instance (which itself uses the more general GetTimestamp() static method) uses 100-nanosecond granularity (IIRC)

Spec question - do you need an actual timestamp (that is, convertible to human time/calendars), or are you fine with since-start-of-process (or system, or...) time?

Lack of a nanosecond API has the advantage people calculate a conversion factor which gets them in one step from the StopWatch TimeStamp to the unit they care about.

Converting nanos <-> micros <-> millis is easier. But I agree that this would primarily be convenience oriented.

Have you observed visible/audible artifacts from the GC?

Rarely, but I hadn't considered using GC.TryStartNoGCRegion. I'll try that.
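For anyone else trying this, a minimal sketch of the no-GC-region pattern. The 16 MB budget is an arbitrary example; it must cover every allocation made inside the region:

```csharp
using System;
using System.Runtime;

class Program
{
    static void Main()
    {
        // Ask the GC to commit enough memory up front so no collection
        // happens inside the region; returns false if the budget can't be met.
        if (GC.TryStartNoGCRegion(16 * 1024 * 1024))
        {
            try
            {
                // ... latency-sensitive loop, allocating little or nothing ...
            }
            finally
            {
                // EndNoGCRegion throws if the region already ended
                // (e.g. the allocation budget was exceeded).
                if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
                    GC.EndNoGCRegion();
            }
        }
    }
}
```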

I don't believe, even with Stopwatch.GetTimestamp(), you are guaranteed nanosecond granularity. If such granularity is important, you should likely be checking Stopwatch.Frequency yourself and accounting for any variations.

Interesting point. Most systems change clock frequency as load changes. How does Java manage to do it in System.nanoTime()?

Spec question - do you need an actual timestamp (that is, convertible to human time/calendars), or are you fine with since-start-of-process (or system, or...) time?

I think ideally we would combine both. Timestamp is useful for those that need it, but when the system doesn't support it fully (or the granularity is bad), we could maintain since-start-of-process and add this to the timestamp-start-of-process.

Interesting point. Most systems change clock frequency as load changes. How does Java manage to do it in System.nanoTime()?

Timestamp frequencies are almost never tied to the CPU frequency (anymore). They are generally done using an independent clock that updates monotonically and at a static frequency, regardless of the CPU frequency.

I think ideally we would combine both. Timestamp is useful for those that need it, but when the system doesn't support it fully (or the granularity is bad), we could maintain since-start-of-process and add this to the timestamp-start-of-process.

You're not getting both out of the same method, though. Stuffing a nanosecond-precise timestamp into a long limits it to ~300 years (assuming signed).

The reason I'm asking, though, has more to do with intended use; for something like what you're doing, I really doubt that either end of the system needs an actual human-relatable timestamp, as opposed to a monotonic tick source.
Note also that if you're getting a human-relatable timestamp, you _usually_ (but not always) want the clock to be updated to match some external source (ie, NTP). For what you're trying to do, I'm assuming clock updates would have poor effects on things. I doubt that any modern client implementations perform any of the known pathological behaviors (say, actually setting the clock back in time), as opposed to slightly slewing time as required, but it's possible...
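The ~300-year figure is easy to check with a back-of-the-envelope calculation (using a 365.25-day year):

```csharp
using System;

class Program
{
    static void Main()
    {
        // A signed 64-bit nanosecond counter overflows at 2^63 - 1 ns.
        const double NsPerYear = 1e9 * 60 * 60 * 24 * 365.25;
        double years = long.MaxValue / NsPerYear;
        Console.WriteLine($"{years:F0} years"); // ~292 years
    }
}
```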

Rarely, but I hadn't considered using GC.TryStartNoGCRegion. I'll try that.

Probably you want to PWM for a long time? This method allows you to postpone GC but eventually you need to GC.

To escape the GC the software PWM could be implemented in a native lib which has its own thread that can't be paused by .NET Core.

Probably you want to PWM for a long time? This method allows you to postpone GC but eventually you need to GC.

It's on the order of minutes, maybe a low number of hours. But my allocations are reasonably small and there's enough RAM in the microcontroller, so maybe I can get away with it this time?

The reason I'm asking, though, has more to do with intended use; for something like what you're doing, I really doubt that either end of the system needs an actual human-relatable timestamp, as opposed to a monotonic tick source.

I didn't realize this limited us to 300 years. Since-start-of-process definitely sounds better then. Java seems to do something similar https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime().

...note that, for me at least, QueryPerformanceFrequency (which is what drives this kind of thing on Windows) returns 10000000... which is 100ns. Which means that I'm not getting nanosecond-precise time regardless, and starting a Stopwatch is the easiest thing for me to do. Given the way Windows is querying the hardware, I doubt Linux would be able to do any better.

I don't have an RPi or similar, so I can't check if they can do any better, but given the discussion the MSDN pages go into, I also doubt they can do any better there, either... Possible if it had a dedicated clock board, but that strikes me as overkill in most situations.

Which means that I'm not getting nanosecond-precise time regardless, and starting a Stopwatch is the easiest thing for me to do.

Do you mean that (Stopwatch doesn't support nanosecond precision either) or that (it does, but QueryPerformanceFrequency doesn't)?

@shravan2x for your PWM use-case do you need nanosecond precision?

QueryPerformanceFrequency may support nanosecond precision, or it may not (Stopwatch uses QueryPerformanceFrequency and QueryPerformanceCounter behind the scenes).

QueryPerformanceFrequency will get the high-resolution performance counter frequency for your hardware (which means it can and will vary from machine to machine). If the call to QueryPerformanceFrequency fails (which will only happen on Win2k and earlier), Stopwatch will fall back to using DateTime.UtcNow.Ticks.

If you actually need nanosecond resolution, you probably need some piece of dedicated hardware. You also probably need a dedicated OS/etc., as there are likely too many factors to guarantee that your program will even execute once every x nanoseconds (given that most OSes use a round-robin scheduling system that defaults to a specific time slice per process).

Given that it sounds like you are running on a Raspberry PI, you will almost certainly not get (or need) nanosecond precision and using/working with the high-resolution performance counter that comes with your hardware is likely more than sufficient.

for your PWM use-case do you need nanosecond precision?

For my current project, I was able to get away without it. But I can imagine future projects where I might not be able to.

You also probably need a dedicated OS/etc., as there are likely too many factors to guarantee that your program will even execute once every x nanoseconds (given that most OSes use a round-robin scheduling system that defaults to a specific time slice per process).

I solve this part using core pinning and no yielding. Since the microcontroller is dedicated to running this program, I simply use a while (true) loop and continue; if enough time hasn't passed.

Given that it sounds like you are running on a Raspberry PI, you will almost certainly not get (or need) nanosecond precision and using/working with the high-resolution performance counter that comes with your hardware is likely more than sufficient.

I think exposing an API that works with units of time is more straightforward than ticks. Maybe a potential API could use the highest resolution clock available and add another method long GetResolution() that returns the number of ns to which accuracy is available?

Maybe a potential API could use the highest resolution clock available and add another method long GetResolution() that returns the number of ns to which accuracy is available?

@shravan2x, that is exactly what Stopwatch.IsHighResolution indicates support for and what Stopwatch.Frequency/Stopwatch.GetTimestamp() allow you to get (all are static).

  • IsHighResolution indicates the hardware has some kind of high-resolution monotonic counter
  • Frequency reports the resolution of that counter
  • GetTimestamp() gives you the current value of that counter

The Stopwatch instance methods use these values and normalize them to 100ns units of time (a tick), regardless of the actual frequency. If you don't want the normalized values, then you can use the raw values and normalize them yourself.

TimeSpan (and DateTime) also operates on 100ns units of time, so you can directly convert Stopwatch.ElapsedTicks to a TimeSpan.

If you use the raw values and normalize them to some other unit of time, you would also need to renormalize them to 100ns "ticks" to use TimeSpan (or DateTime).

Since the microcontroller is dedicated to running this program

You are probably using a Linux distro or Windows 10 IoT on your RPi 3, which isn't a microcontroller.

Only .NET MF and LLILUM run on microcontrollers (closest to bare metal you can get with C#).

x86 machines implement a high resolution (more than BCM2837 at least) dedicated timer which Windows and Linux have APIs for (which are consumed by .NET Core for System.Diagnostics.Stopwatch).

But ARM itself doesn't require a high performance hardware timer as part of the core spec.
However, the BCM2837 microprocessor (ARMv8 architecture) in your RPi 3 has a hardware timer with a resolution of a microsecond (1MHz clock speed) implemented - (but the actual precision will definitely be lower than the resolution because it takes some units of time, depending on your hardware, to access the counter value itself.)

So I'm guessing that QueryPerformanceFrequency on Windows 10 IoT running on an RPi 3 should return the equivalent of 1000 ns/1 µs (1,000,000) instead of 100 ns (10,000,000) like it does on most x86 Windows machines. I don't have an RPi 3, so I can't check for myself.

.NET Core System.Diagnostics.Stopwatch on Unix calls clock_gettime, which is implemented by RPi Linux kernel to use the BCM2837 HW Timer I mentioned above, just like on x86.

Stopwatch.Frequency will still return 10^9 (i.e., 1 ns) on Raspbian even though it is actually 10^6.
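To see what a given platform actually reports, and to probe the real granularity empirically (since, as noted, the reported frequency can overstate the true resolution), something like this works:

```csharp
using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        Console.WriteLine($"IsHighResolution: {Stopwatch.IsHighResolution}");
        Console.WriteLine($"Frequency: {Stopwatch.Frequency} ticks/s");
        Console.WriteLine($"Tick length: {1e9 / Stopwatch.Frequency:F1} ns");

        // Empirically probe the granularity: the smallest observed
        // non-zero delta between two consecutive readings.
        long min = long.MaxValue;
        for (int i = 0; i < 100_000; i++)
        {
            long t0 = Stopwatch.GetTimestamp();
            long t1 = Stopwatch.GetTimestamp();
            if (t1 > t0 && t1 - t0 < min) min = t1 - t0;
        }
        Console.WriteLine($"Smallest observed step: {min} ticks");
    }
}
```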

I solve this part using core pinning and no yielding. Since the microcontroller is dedicated to running this program, I simply use a while (true) loop and continue; if enough time hasn't passed.

This is not an Arduino-like/MCU environment. Processor affinity/thread yielding still puts the OS scheduler in control of the execution, not the thread. Modern OSes have a preemptive multitasking model, not cooperative. It might also consume more power.

Your code in the current setup will be executed by .NET Core (not compiled to native) on a GPOS (i.e., not an RTOS) on an x86/x64/ARM system. There is a lot of overhead to be considered here at every stage (.NET CLR, OS, hardware).

So, generating a software PWM signal with nanosecond accuracy is not realistically possible, nor is using such a signal from your software environment.

You can have nanoseconds instead of microseconds as the unit of time in your 64 bit data structures, but the last few bits will likely be padded with 0s depending on the hardware and APIs used because there simply isn't enough accuracy or resolution available for reliable timekeeping.

The documentation for QPC (QueryPerformanceCounter) using the TSC (Time Stamp Counter) on Windows calculates a resolution of ~333 nanoseconds for an x86 PC with a 3 GHz processor clock (~30 ns if you use the assembly instruction directly). QPC is Windows-only, but .NET Core's Stopwatch (AFAIK) uses it on Windows, and most Linux kernels use the TSC in their time implementations, which .NET Core uses in turn.

The closest you'll get to nanosecond accuracy is by counting processor cycles, either using your own kernel module like this to access the counter directly or by using the ARM PMU API, but that's a bit out of scope for .NET, I think.

I don't see a point in adding an API for something that is so specialized and misleading, especially if more useful stuff like specialized collections isn't going to be added to corefx.

How does Java manage to do it in System.nanoTime()?

System.nanoTime() has this warning in the API docs:
"This method provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes) - no guarantees are made except that the resolution is at least as good as that of currentTimeMillis()."

System.nanoTime() is also implemented using the same APIs that Stopwatch uses on both Windows and Linux, so the value will be the same as Stopwatch's.

These articles are very informative, I suggest you read them if you want to know more about timers:

@tannergooding

AFAIK SystemNative_GetTimestampResolution always returns 10^9 as the resolution on Unix instead of using the implementation-specific clock_getres() (if HAVE_CLOCK_MONOTONIC is defined).

However, not all implementations define it accurately - clock_getres() returns 1 ns on the RPi, but the clock_gettime() implementation uses the BCM 1 MHz timer, which has 1 µs resolution, as I mentioned in my comment above.

I think the advice should be to not rely on clock_getres() or Stopwatch.Frequency at all since the clock precision might be assumed wrongly. Should I open a new issue for this, or is this intended behavior?

AFAIK SystemNative_GetTimestampResolution always returns 10^9 as resolution on Unix instead of using implementation specific clock_getres() (if HAVE_CLOCK_MONOTONIC is defined)

Regardless of the underlying resolution, clock_gettime returns a timespec that is specified in terms of nanoseconds, so the code is still correct. The code could be made slightly more complicated (and slower) if reporting the actual resolution and returning a correctly normalized timestamp was desired, but that doesn't buy much (IMO).

clock_getres() returns 1ns on RPi

The Linux implementation for clock_getres(CLOCK_MONOTONIC) (which is posix_get_hrtimer_res) will always return 1 once the high resolution timer has been successfully configured/enabled (likely for the same reason as above, which is that timespec is specified in terms of nanoseconds).

I'd like to add another reason for such an API to exist: working with multimedia and/or VoIP. For example, when playing voice over a network, you need to provide audio packets at really precise intervals, or you get tearing.

So for windows the proper way to do this is to use Multimedia Timers.

But what to do for Linux? We will have to find out while porting our app to .NET Core and Linux, but it surely would be great to have a unified API.

@mikhail-barg, could you please elaborate?

Most modern applications (especially games, etc) utilize the underlying monotonic hardware timers for the system. This is QueryPerformanceFrequency and QueryPerformanceCounter on Windows, clock_gettime on Linux, and mach_absolute_time on macOS (and typically have resolution up to 1ns). The Multimedia Timers, to my knowledge, are considered legacy and only have up to 1ms resolution.

.NET exposes these values directly via the static Stopwatch.Frequency and Stopwatch.GetTimestamp methods (and as of the current .NET Core iterations, only supports Stopwatch.IsHighResolution being true).

Stopwatch.Frequency reports the number of "monotonic ticks" per second and Stopwatch.GetTimestamp reports the number of "monotonic ticks" that have elapsed since some point (generally computer boot). Using these two methods you can determine the number of seconds elapsed via:

long startTicks = Stopwatch.GetTimestamp();
// ...
long endTicks = Stopwatch.GetTimestamp();

long deltaTicks = endTicks - startTicks;
double elapsedSeconds = deltaTicks / (double)Stopwatch.Frequency;

The Stopwatch instance methods take care of normalizing the system ticks into DateTime/TimeSpan ticks, which are 1 tick per 100ns (which basically just requires another step or two on top of the above: https://source.dot.net/#System.Private.CoreLib/Stopwatch.cs,d42cb3f35f8b35dc).
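Sketched out, that extra normalization step looks roughly like this, mirroring the double-based scaling Stopwatch does internally (for very long intervals, beware the rounding this introduces):

```csharp
using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        long startTicks = Stopwatch.GetTimestamp();
        // ... timed work ...
        long endTicks = Stopwatch.GetTimestamp();
        long deltaTicks = endTicks - startTicks;

        // Scale raw Stopwatch ticks to TimeSpan's 100 ns ticks.
        double tickRatio = (double)TimeSpan.TicksPerSecond / Stopwatch.Frequency;
        var elapsed = new TimeSpan((long)(deltaTicks * tickRatio));
        Console.WriteLine($"{elapsed.TotalMilliseconds} ms");
    }
}
```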

@tannergooding

The Multimedia Timers, to my knowledge, are considered legacy and only have up to 1ms resolution.

As far as I understand, it's not that Multimedia Timers are deprecated. It's more that they have a significant cost in terms of CPU usage, and they were frequently abused for needs other than media processing because of their high precision. See for example this. But up until now we have found no way to get around using them for media processing.

.NET exposes these values directly via the static Stopwatch.Frequency and Stopwatch.GetTimestamp methods

That's really nice that Stopwatch could be used to measure time intervals, but the real need I'm talking about is to have a callback invoked at very precise intervals. And it's not clear how to create a timer based on Stopwatch that would provide such a guarantee. See for example this attempt.

the current .NET Core iterations, only supports Stopwatch.IsHighResolution being true

That's nice to know, thanks.

@mikhail-barg

I wonder how you deal with GC pauses?

The hardware audio device has a buffer of a couple of ms which you need to keep filled. You don't need nanosecond precision to do this.
You do need a good timer (like Stopwatch) to calculate clock drift between sender and receiver.

I wonder how you deal with GC pauses?

Don't allocate when you are doing latency sensitive operations 😉 If you don't allocate you won't pause.

And when your app performs latency sensitive operations all the time, that is: never allocate.

the real need I'm talking about is to have a callback called at very precise intervals

@mikhail-barg, the point I was trying to make is that the Multimedia timers aren't very precise. That is, they only have accuracy down to 1 millisecond, and while that might be suitable for some purposes, it won't be for all (in fact, as called out above, people are asking for nanosecond resolutions, which is only possible via the mechanisms we already expose).

QueryPerformanceFrequency/Counter, on the other hand, are considered "high resolution" and have a tick interval of less than 1 microsecond. https://docs.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps goes into some of the differences between the various timing APIs.

That's really nice that Stopwatch could be used to measure time intervals

The underlying methods used by Stopwatch are frequently used for dispatching events at specific intervals. You just have to track the delta between each "frame" and fire when that delta accumulates to be the desired interval.
Your typical UI based application (or long running service) has some kind of loop keeping the executable alive. This loop is typically checking and dispatching events so that the main application stays alive. It is also an ideal place to track the accumulated delta and to fire off events needed at specific intervals (this is in fact how many of the other timers work under the covers).
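A minimal sketch of that accumulated-delta pattern (`StepAccumulator` and `OnTick` are illustrative names, and the 20 ms interval is an arbitrary example):

```csharp
using System;
using System.Diagnostics;

class StepAccumulator
{
    private readonly long _intervalTicks;
    private long _accumulated;

    public StepAccumulator(long intervalTicks) => _intervalTicks = intervalTicks;

    // Feed in the delta since the last call; returns how many whole
    // intervals have elapsed, i.e. how many times to fire the event.
    public int Advance(long deltaTicks)
    {
        _accumulated += deltaTicks;
        int fires = (int)(_accumulated / _intervalTicks);
        _accumulated -= fires * _intervalTicks;
        return fires;
    }
}

class Program
{
    static void Main()
    {
        long intervalTicks = Stopwatch.Frequency / 50; // 20 ms in raw ticks
        var stepper = new StepAccumulator(intervalTicks);
        long previous = Stopwatch.GetTimestamp();

        while (true) // the application's main/dispatch loop
        {
            long now = Stopwatch.GetTimestamp();
            int fires = stepper.Advance(now - previous);
            previous = now;
            for (int i = 0; i < fires; i++)
                OnTick(); // dispatch the periodic event

            // ... process other events, render, etc. ...
        }
    }

    static void OnTick() { /* ... */ }
}
```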

@tmds

The hardware audio device has a buffer of a couple of ms which you need to keep filled. You don't need nanosecond precision to do this.

While this is true for hardware audio, it's not true for other scenarios, like VoIP where you need to send media via network.

@tannergooding

The underlying methods used by Stopwatch are frequently used for dispatching events at specific intervals. You just have to track the delta between each "frame" and fire when that delta accumulates to be the desired interval.
Your typical UI based application (or long running service) has some kind of loop keeping the executable alive. This loop is typically checking and dispatching events so that the main application stays alive. It is also an ideal place to track the accumulated delta and to fire off events needed at specific intervals

The thing here is that while it's possible to write a tight loop in a thread, it would still be affected by the thread scheduler running underneath. So you would be unable to assure precise timing for events.

Let's imagine you have a tight loop like this (some ugly pseudo-code):

long currentTimestamp = SuperHiresTimeSource.GetNs();
long nextTimeStamp = currentTimestamp + 20_000_000; // 20 ms in the future
while (currentTimestamp < nextTimeStamp)
{
   currentTimestamp = SuperHiresTimeSource.GetNs();
}

So, imagine that currentTimestamp at start is 0 and nextTimeStamp is 20,000,000.
And then at some moment while running the loop you get currentTimestamp == 19,999,999.

So, what would the next value of currentTimestamp be in the next loop iteration?
What if thread re-scheduling happens in between? The Windows thread scheduler has a 16 ms time slice, so you have a real possibility of getting something like 35,999,999, and that is really not precise at all.

Multimedia timers, on the other hand, do assure such precise timings.

The point I was trying to make is that the Multimedia timers aren't very precise. That is, they only have accuracy down to 1 millisecond, and while that might be suitable for some purposes, it won't be for all (in fact, as called out above, people are asking for nanosecond resolutions, which is only possible via the mechanisms we already expose).

While this is certainly true, I still fail to understand how to use the Stopwatch mechanisms to issue events at precise intervals, given that the looping code would be affected by the thread scheduler.

While this is true for hardware audio, it's not true for other scenarios, like VoIP where you need to send media via network.

My experience tells me different. I've built low latency RTP-based voice-and video over IP systems for 5+ years.

Like what if thread re-scheduling would happen in between? Windows thread scheduler has a 16ms time slice size, so you have a real possibility of getting something like 35999, and this is really not precise at all.

I don't believe this is a concern in real world/modern applications. Games can run just fine at 144+ frames per second all while dealing with disk IO, networking, graphics rendering, audio buffering, etc (and still fully saturating a system).
Modern step timers may also take things like scheduling concerns into consideration and may trigger an event early if the elapsed time is "close enough". That is, in your example above, if the current timestamp is just short of the target, a smart step timer may trigger anyway, because it will clearly pass the target timestamp before the next check, and the sub-millisecond difference won't matter in practice. A common example of this is when dealing with vsync and rendering, where the target framerate might be 60 fps but the actual framerate is 59.94 (NTSC).
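One way to sketch that "close enough" check (the `StepTimer` name and the ~0.5 ms tolerance are illustrative choices, expressed in raw Stopwatch ticks):

```csharp
using System;
using System.Diagnostics;

class StepTimer
{
    // Fire slightly early rather than a whole scheduler quantum late:
    // if less than the tolerance remains, the next chance to check
    // would overshoot the target by far more than this slack.
    private readonly long _toleranceTicks;

    public StepTimer(long toleranceTicks) => _toleranceTicks = toleranceTicks;

    public bool ShouldFire(long nowTicks, long targetTicks) =>
        nowTicks >= targetTicks - _toleranceTicks;
}

class Program
{
    static void Main()
    {
        var timer = new StepTimer(Stopwatch.Frequency / 2000); // ~0.5 ms slack
        long target = Stopwatch.GetTimestamp() + Stopwatch.Frequency / 50; // 20 ms out
        Console.WriteLine(timer.ShouldFire(Stopwatch.GetTimestamp(), target));
    }
}
```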

Multimedia timers, on the other hand, do assure such precise timings.

AFAIK, the multimedia timers do not pre-empt the Windows scheduling service, instead it just creates and schedules a higher priority thread.

Well I'm having a hard time arguing, because I have only 2+ years of working with VoIP ;) And actually have no deep insight into windows scheduling.

Still what I do have, is a setup where we were trying to substitute multimedia timers with something based on Stopwatch and WaitOne calls — mostly because we are planning to move to Linux and are looking for a cross-platform solution.

For now we have failed to get stable performance. Instead we got audio tearing — not always, and even when no other media-related application was working in the background. Which is rather weird in itself.

Anyway, I'd be grateful if you point me at some reliable code that could be used instead of multimedia timers. I'd be really glad to move there, for reasons explained above.
