Sentry-javascript: Timing issues using Performance API

Created on 14 May 2020  路  9Comments  路  Source: getsentry/sentry-javascript

_There could have been three separate issues since each browser has a slightly different misbehavior, though I decided to group everything together as the solution will likely address all problems at once._

tl;dr

Assumptions made in how the SDK computes timestamps are challenged by browsers. We need to revisit the implementation.

Introduction: How span duration and timestamps are computed

The SDK uses performance.now() to calculate span durations with high precision. When sending data to Sentry, we need to convert those durations to absolute timestamps, and in order to keep relative times accurate, we use a single reference timestamp. The preferred reference timestamp is performance.timeOrigin and the fallback is performance.timing.navigationStart.

When the Performance API is not available, the SDK eventually uses Date.now().

The assumption is that reference_timestamp + performance.now() should be close to new Date(), except when the computer's clock changed or the current tab went into background during a timing measurement (start/end span). The idea was that timestamps would retain relative correctness.

Firefox: timeOrigin != navigationStart

_Tested with Firefox 75.0 on macOS._

The equivalence between performance.timeOrigin and performance.timing.navigationStart is discussed in https://github.com/w3c/navigation-timing/issues/43.

However, https://bugzilla.mozilla.org/show_bug.cgi?id=1401406 reports a case in which those clocks do not agree. I've observed this behavior today.

image

  • 馃憤 navigationStart + performance.now() is aligned with the current date and time.
  • 馃憥 timeOrigin is stuck at yesterday's date, even when I create new browser tabs.
  • 馃憥 The odd timeOrigin in Firefox produces events with incorrect timestamps (arbitrarily in the past).

Safari: navigationStart + now far in the past

_Tested with Safari 13.1 (15609.1.20.111.8) on macOS_

Since Safari doesn't support performance.timeOrigin, the SDK falls back to performance.timing.navigationStart.

image

  • 馃憥 navigationStart + now is over 13 hours off from the current date.

Chrome: timeOrigin + now far in the past

_Tested with Chrome 81.0.4044.138 on macOS_

image

  • 馃憤 timeOrigin and navigationStart are the same.
  • 馃憥 timeOrigin + now is over 24h in the past.

Most helpful comment

I have experimented with different workarounds, so far it seems to be best to go back to Date.now().

A more complex implementation would use Date.now() and performance.now() to keep track of both timers and make adjustments when a transaction is finished, readjusting all spans, but that complexity doesn't seem justifiable.

All 9 comments

All 4 are correct for me on Chrome 83 / macOS.

I don't know if this is useful or not, but I could reproduce this issue on Firefox on macOS (10.15.5). I have now updated it to the latest version (Developer Edition), and the issue seems to be gone.

The difference in timing was 12 days.

I have the same problem with sentry (Same symptoms as https://github.com/getsentry/sentry-javascript/issues/2679) and I don't think that timeOrigin + performance.now() or navigationStart + performance.now() can ever be trusted to be the current date.

The easiest way to reproduce it is to run a script to log the values and how they differ from Date.now() on a computer and put it to sleep. At the sleep wakeup the date computed from the performance subsystem is wrong on all browsers, and for both timeOrigin and navigationStart.

Results (Tested on MacOS 10.15.5) for each browser I kept the values just before sleep and just after, the diff/diff2 fields show how much the time computed from performance drift from Date.now :

// Chrome 83
{
    dateNow: 1594219315498,
    diff: 11.7451171875,
    diff2: 7.23193359375,
    navigationStart: 1594219230275,
    perfNow: 85234.7450000234,
    precisionNow: 1594219315509.745,
    precisionNow2: 1594219315505.232,
}
{
    dateNow: 1594219353909,
    diff: -35292.705078125,
    diff2: -35297.21826171875,
    navigationStart: 1594219230275,
    perfNow: 88341.29500004929,
    precisionNow: 1594219318616.295,
    precisionNow2: 1594219318611.7817,
}

// Firefox dev edition 78
{
    dateNow: 1594219316904,
    鈥媎iff: 11,
    鈥媎iff2: 3,
    鈥媙avigationStart: 1594219244796,
    鈥媝erfNow: 72119,
    鈥媝recisionNow: 1594219316915,
    鈥媝recisionNow2: 1594219316907,
}
{
    dateNow: 1594219355211,
    diff: -35293,
    diff2: -35301,
    navigationStart: 1594219244796,
    perfNow: 75122,
    precisionNow: 1594219319918,
    precisionNow2: 1594219319910,
}

// Safari 13.1.1
{
    dateNow: 1594219315204,
    diff: 9,
    diff2: NaN,
    navigationStart: 1594219185914,
    perfNow: 129299,
    precisionNow: 1594219315213,
    precisionNow2: NaN,
}
{
    dateNow: 1594219354491,
    diff: -35295,
    diff2: NaN,
    navigationStart: 1594219185914,
    perfNow: 133282,
    precisionNow: 1594219319196,
    precisionNow2: NaN,
}

Code:

function dbg() {
    const perfNow = performance.now();
    const dateNow = Date.now();
    const navigationStart = performance.timing.navigationStart;
    const precisionNow = navigationStart + perfNow;
    const dateNowValue = dateNow.valueOf();
    const diff = precisionNow - dateNowValue;
    const timeOrigin = performance.timeOrigin;
    const precisionNow2 = timeOrigin + perfNow;
    const diff2 = precisionNow2 - dateNowValue;
    console.log({diff, diff2, precisionNow, precisionNow2, dateNowValue, dateNow, perfNow, navigationStart });
}

setInterval(dbg, 3000)

Bumping this -- we're seeing bugs triggered from firefox as coming from ~71 hours in the past.

Screen Shot 2020-08-13 at 1 47 28 PM

I found https://github.com/w3c/hr-time/issues/65. Over a year ago, a note about the behavior of monotonic clocks and performance.now() was added to the spec, but seems that browsers don't follow it. Even if they do in the future, we cannot trust performance.now() to derive correct timestamps.

OpenTelemetry tracks the same issue in their JS SDK in https://github.com/open-telemetry/opentelemetry-js/issues/852.

This blog post https://medium.com/@mihai/performance-now-sleep-16a5b774560c comes with a web page that can easily demo the skewing of the clocks when the computer sleeps.

http://persistent.info/web-experiments/performance-now-sleep/

I have experimented with different workarounds, so far it seems to be best to go back to Date.now().

A more complex implementation would use Date.now() and performance.now() to keep track of both timers and make adjustments when a transaction is finished, readjusting all spans, but that complexity doesn't seem justifiable.

Since some folks are following here, another update:

A difficulty with using Date.now() alone is that we still depend on the Performance API to compute timestamps for performance entries (all those spans: browser, resource, mark, paint, etc):

image

Having two clocks that are out of sync is no good. For example, we need to decide what performance entries are part of a navigation transaction.

Was this page helpful?
0 / 5 - 0 ratings