Lighthouse: feature: Measure a warm load

Created on 11 Aug 2016 · 5 comments · Source: GoogleChrome/lighthouse

WebPagetest computes metrics like Speed Index for both the first view and the repeat view.

(screenshot captured 2016-08-11 12:17:25)

What do you think about also computing some metrics for the repeat view? It would help to find out whether caching strategies using the SW cache actually pay off. We could know if the PWA is _Lie-Fi proof_™!

(screenshot captured 2016-08-11 12:19:45)

cc @jakearchibald

P2 feature

All 5 comments

We currently do multiple loads, we just don't surface the data that way. Our Service Worker tests in particular happen after the initial load so that it's had a chance to install, and we run tests (though they can always be more robust!) to try to determine whether the user would get a reasonable offline experience.

When it comes to performance testing, we currently focus on the initial load, since in many cases this is where the vast majority of improvements normally reside. Subsequent loads tend to be a function of Service Workers and caching headers. I _think_ we have an audit planned (and if not, we should add it!) for caching headers to make sure they look reasonable, but the key there will be not just having them for the sake of it, i.e. just checking headers, but surfacing how they impact things like Time to Interactive, First Paint, etc.
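
To make that concrete, a cache-headers check could start from the page's network records rather than from configuration. Here's a rough TypeScript sketch; the `NetworkRecord` shape and the one-week threshold are illustrative assumptions, not Lighthouse internals:

```ts
// Hypothetical sketch: flag 200 responses whose caching lifetime looks too
// short to help a repeat view. The record shape below is an assumption for
// illustration, not Lighthouse's internal artifact format.
interface NetworkRecord {
  url: string;
  statusCode: number;
  responseHeaders: Record<string, string>;
}

const ONE_WEEK_SECONDS = 7 * 24 * 60 * 60; // placeholder threshold

function maxAgeSeconds(cacheControl: string | undefined): number {
  const match = /max-age=(\d+)/.exec(cacheControl ?? '');
  return match ? Number(match[1]) : 0;
}

/** Returns responses whose max-age looks too short to matter on a repeat view. */
function poorlyCachedResponses(records: NetworkRecord[]): NetworkRecord[] {
  return records.filter(record => {
    if (record.statusCode !== 200) return false;
    const cacheControl = record.responseHeaders['cache-control'];
    return maxAgeSeconds(cacheControl) < ONE_WEEK_SECONDS;
  });
}
```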

I'm NOT looking to surface data the same way that WPT does. Here are a few proposals to continue the discussion:

  • Add a note specifying that FMP, TTI, Speed Index, etc. are computed for the first view
  • Add a section about HTTP caching that would report whether important metrics (FMP, TTI, Speed Index, etc.) are _significantly_ improved on repeat view.

    • We need to compare first view and repeat view with the SW disabled. A PWA has a SW, but because it's progressive, it should also have the best HTTP caching strategy for browsers that don't support SW.

  • Add a section about SW caching that would report whether important metrics (FMP, TTI, Speed Index, etc.) are _significantly_ improved on repeat view.

    • We need to compare first view (while SW installs) and repeat view (with SW in control).

This would also push us to check actual results rather than cache settings; here are a few ideas (a rough sketch follows this list):

  • total loaded size
  • number of 200 requests that become 304s
  • number of 200 requests that are served directly from the cache
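
As a strawman, diffing the first-view and repeat-view request logs could produce exactly those numbers. A rough TypeScript sketch; the `RequestRecord` shape is an illustrative assumption, not Lighthouse's artifact format:

```ts
// Rough sketch: compare first-view and repeat-view network records to see
// whether HTTP caching actually changed the outcome.
interface RequestRecord {
  url: string;
  statusCode: number;
  transferSize: number; // bytes over the wire
  fromCache: boolean;   // served directly from the HTTP cache, no network request
}

interface RepeatViewSummary {
  totalBytesFirst: number;
  totalBytesRepeat: number;
  becameNotModified: number; // 200 on first view -> 304 on repeat view
  becameCacheHit: number;    // 200 on first view -> served from cache on repeat view
}

function summarizeRepeatView(first: RequestRecord[], repeat: RequestRecord[]): RepeatViewSummary {
  const firstByUrl = new Map(first.map(r => [r.url, r] as const));
  let becameNotModified = 0;
  let becameCacheHit = 0;

  for (const record of repeat) {
    const before = firstByUrl.get(record.url);
    if (!before || before.statusCode !== 200) continue;
    if (record.statusCode === 304) becameNotModified++;
    if (record.fromCache) becameCacheHit++;
  }

  return {
    totalBytesFirst: first.reduce((sum, r) => sum + r.transferSize, 0),
    totalBytesRepeat: repeat.reduce((sum, r) => sum + r.transferSize, 0),
    becameNotModified,
    becameCacheHit,
  };
}
```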

These proposals raise tricky questions:

  • How do we define a _significant_ improvement on metrics like FMP, TTI, Speed Index, etc.? (one possible answer is sketched after this list)

    • Do we have some kind of percentage improvement like _HTTP caching should reduce FMP by X%_?

    • Do we have some kind of fixed target for each metric, like _FMP on repeat view (with HTTP cache) should reach X ms_?

    • ...or something else entirely?

  • We currently only compare the first load against a repeat load of a single page, but some cache headers affect performance during navigations between pages. Can we analyse this? Should we? How?

    • This would be tricky since we target a specific URL.
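
For the first question, one deliberately simplistic answer is a relative threshold per metric, as in this sketch (the 20% default is a placeholder, not a recommendation):

```ts
// Sketch of one possible definition of "significant improvement":
// a relative per-metric threshold comparing first view vs repeat view.
interface MetricComparison {
  metric: string;       // e.g. 'FMP', 'TTI', 'SpeedIndex'
  firstViewMs: number;
  repeatViewMs: number;
  improvement: number;  // 0.25 means 25% faster on the repeat view
  significant: boolean;
}

function compareMetric(
  metric: string,
  firstViewMs: number,
  repeatViewMs: number,
  threshold = 0.2, // arbitrary placeholder
): MetricComparison {
  const improvement = (firstViewMs - repeatViewMs) / firstViewMs;
  return {metric, firstViewMs, repeatViewMs, improvement, significant: improvement >= threshold};
}

// Example: FMP drops from 3000 ms (first view) to 1800 ms (repeat view) -> 40% improvement.
console.log(compareMetric('FMP', 3000, 1800));
```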

from the duped issue:

It may be useful to separate performance testing into two passes (a rough sketch follows):

1) when the web app is "first run" (before/while the Service Worker is registered)
2) a repeat run, after Service Worker registration (all resources were cached by the service worker during the first run)
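
A minimal sketch of that two-pass idea, assuming the lighthouse and chrome-launcher Node modules and a storage-reset opt-out along the lines of the existing --disable-storage-reset CLI flag (exact flag names and result shapes vary between Lighthouse versions):

```ts
// Rough two-pass sketch: a cold run with storage cleared, then a warm run
// that keeps whatever the first run left behind (HTTP cache, service worker).
import * as chromeLauncher from 'chrome-launcher';
import lighthouse from 'lighthouse';

async function coldAndWarm(url: string) {
  const chrome = await chromeLauncher.launch({chromeFlags: ['--headless']});
  try {
    // Pass 1: default behaviour, caches/storage/SW cleared before the run ("first run").
    const cold = await lighthouse(url, {port: chrome.port});

    // Pass 2: reuse the state from pass 1 ("repeat run").
    const warm = await lighthouse(url, {port: chrome.port, disableStorageReset: true});

    return {cold, warm};
  } finally {
    await chrome.kill();
  }
}
```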

Huge +1. In fact, I think this should be implemented as scripting, like WPT does. While there is potential for improvements on a cold load, the nature of the modern web means there are many pathways into a web app, and the cold, initial load isn't always relevant, or not the most relevant. For example, measuring (and demonstrating) and optimizing the speed of:

  1. Installed WebAPKs
  2. An AMP page where the AMP runtime is already cached (which it should generally be)
  3. An AMP page when pre-rendered by Search
  4. A "PWA" where the service worker has been warmed up by a previous page. One partner I work with has AMP SEM landing pages and then a conversion form flow that can't be AMPified, but 90% of the time it comes sequentially after the AMP LP. In this case it's important to see the value of, and focus optimization effort on, the things that will actually affect the majority of (Chrome) users.

There are a lot of scenarios to deal with here, so I'd like to see scripting like WPT's. Maybe less complex: something where you can perform a few actions and then start the Lighthouse test without clearing the cache. Something as easy as:

LOAD https://site.com/landingpage.amp
WAIT DOMContentLoaded
WAIT SW-INSTALLED
START_LIGHTHOUSE https://site.com/form_flow_app CACHE_CLEAR=NO

But probably something that looks more like Puppeteer than QuickBASIC.
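
Something like this, perhaps (a rough sketch only; the disableStorageReset option and the DevTools-port hand-off between Puppeteer and Lighthouse are assumptions on my part, and the URLs are the placeholders from the pseudo-script above):

```ts
// Hypothetical Puppeteer-flavoured version of the pseudo-script above:
// warm up the landing page, wait for the SW, then audit the next URL
// without clearing storage.
import puppeteer from 'puppeteer';
import lighthouse from 'lighthouse';

async function warmThenAudit() {
  // Expose a DevTools port so Lighthouse can reuse the same browser profile.
  const browser = await puppeteer.launch({args: ['--remote-debugging-port=9222']});
  const page = await browser.newPage();

  // LOAD + WAIT DOMContentLoaded
  await page.goto('https://site.com/landingpage.amp', {waitUntil: 'domcontentloaded'});

  // WAIT SW-INSTALLED: resolves once a service worker is active for this scope.
  await page.evaluate(async () => {
    await navigator.serviceWorker.ready;
  });

  // START_LIGHTHOUSE ... CACHE_CLEAR=NO: reuse the warmed-up browser, keep storage.
  const result = await lighthouse('https://site.com/form_flow_app', {
    port: 9222,
    disableStorageReset: true,
  });

  await browser.close();
  return result;
}
```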

@vinamratasingal As discussed... but this came up again today so wanted to surface it here too.

Idea: separate opportunities based on cold load vs warm load.
