Almanac.httparchive.org: Performance 2020

Created on 27 Jun 2020 · 75 Comments · Source: HTTPArchive/almanac.httparchive.org

Part II Chapter 9: Performance

Content team

| Authors | Reviewers | Analysts | Draft | Queries | Results |
| ------- | --------- | -------- | ----- | ------- | ------- |
| @thefoxis | @dimension85 @borisschapira @estelle @zeman @rviscomi @obto @noamr @ashrith-kulai @Zizzamia @exterkamp | @max-ostapenko @dooman87 | Doc | *.sql | Sheet |

Content team lead: @thefoxis

Welcome chapter contributors! You'll be using this issue throughout the chapter lifecycle to coordinate on the content planning, analysis, and writing stages.

The content team is made up of the following contributors:

New contributors: If you're interested in joining the content team for this chapter, just leave a comment below and the content team lead will loop you in.

_Note: To ensure that you get notifications when tagged, you must be "watching" this repository._

Milestones

0. Form the content team

  • [x] Jul 6th: Project owners have selected an author to be the content team lead
  • [x] Jul 13th: The content team has at least one author, reviewer, and analyst (minimally viable team formed)

1. Plan content

  • [x] Jul 20th: The content team has completed the chapter outline in the draft doc
  • [x] Jul 27th: Analysts have triaged the feasibility of all proposed metrics

2. Gather data

  • [x] Jul 27th: Analysts have added all necessary custom metrics and drafted a PR to track query progress
  • Aug 1 - 31: August crawl
  • [x] Sep 7th: Analysts have queried all metrics and saved the output to the results sheet

3. Validate results

4. Draft content

  • [x] Nov 12th: Authors have completed the first draft in the doc
  • [x] Nov 26th: The content team has prototyped all data visualizations

5. Publication

  • [ ] Nov 26th: The content team has reviewed the final draft, converted to markdown, and filed a PR to add it to the 2020 content directory
  • Dec 9th: Target launch date
2020 chapter ASAP writing

All 75 comments

I am interested in the reviewer role for this topic - Phil

I'm afraid I'm not fluent enough in English to create the content. I can help reviewing, though.

@thefoxis thank you for agreeing to be the lead author for the Performance chapter! As the lead, you'll be responsible for driving the content planning and writing phases in collaboration with your content team, which will consist of yourself as lead, any coauthors you choose as needed, peer reviewers, and data analysts.

The immediate next steps for this chapter are:

  1. Establish the rest of your content team. Several other people were interested or nominated (see below), so that's a great place to start. The larger the scope of the chapter, the more people you'll want to have on board.
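  2. Start sketching out ideas in your draft doc (https://docs.google.com/document/d/1EeUJ88PS8Ms9XUrNIpM2tpm5Ad_K682ZMVfc24TM02w/edit?usp=sharing).
  3. Catch up on last year's chapter (https://almanac.httparchive.org/en/2019/performance) and the project methodology (https://almanac.httparchive.org/en/2019/methodology) to get a sense for what's possible.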

There's a ton of info in the top comment, so check that out and feel free to ping myself or @obto with any questions!

To everyone else who has been nominated:

@logicalphase
@Zizzamia
@noamr
@dimension85

we'd still love to have you contribute as a peer reviewer or coauthor as needed. Let us know if you're still interested!

@dimension85 @borisschapira thank you! I've added you both as reviewers.

I can also help out.

Thanks Estelle! I've also added you as a reviewer.

Excellent, thanks for adding me to the team - this is the first time I have been involved with this project and am looking forward to contributing.

Would love to co-author this section. Please let me know if I can help.

Happy to review again this year!

An all-star team again this year. Really excited for this 🎉

Great! Would love to review.

I would like to be a reviewer for this topic.

Happy to join as a reviewer, looking forward to collaborating with the whole content team. 🌲 🚀 🌕

👋 hi everyone!

thanks again @rviscomi and @obto for selecting me to lead this effort. looks like we have a solid team of reviewers but we're short on analysts. should we be looking for people swapping areas of responsibility or look for new people to join?

I'm going to re-read last year's chapter and reflect on what has happened in perf within the year. I think it'd be nice to speak to some new developments in the space, for example core web vitals or the shift in performance score algorithm. but it also depends on what sort of trends/data we're able to discover via the archive. I'll jot some notes down in the google doc within the next couple of days. JavaScript is also always good to cover since it affects perf/UX greatly, so we could be looking at TBT & TTI. none of these were really covered last year so there would be no duplication (the previous report mentioned FID though, which is only relevant to people using RUM solutions/not Lighthouse).

if anyone has any ideas/suggestions, I'm all 👂

@thefoxis yeah I agree 100%! 🔥

Other points on top of my mind are:

What else? 🤔
One thing I'm curious to reflect on is the different implications of TBT and LCP performance between SPA and SSR, and how websites are handling those performance trade-offs.

That's it for now, but I am sure the rest of the team has more angles we could deep dive into 🌲🚀🌕

Count me in as an analyst ;)

@max-ostapenko Added you as an analyst to the chapter :)

I added some notes to the doc clarifying my earlier points about lighthouse scoring and core web vitals. it's a bit hard to determine specific queries without seeing the data (I'll do more digging) but I think these will yield interesting findings.

I see a lot of questions surrounding the perf score, where it usually falls and now especially the new algorithm impact on the score. it's really a huge source of confusion to teams and I'd love to cover that first.

I have some reservations with core web vitals but I think with all the attention on them and their importance it'd be fascinating to uncover where those metrics usually land, especially by device type. I've been observing this data myself and you can see really fascinating findings in some cases.

@Zizzamia I also added your point about the element timing api as I think it ties nicely with the general theme being new metrics and new ways of measuring. do you have any further ideas and notes there? feel free to add to the doc :+1: we can also consider other subjects you've mentioned but I thought this particular one tied very obviously to the rest so I picked it first 😸

if anyone has any feedback or suggestions at this stage, let me know! 🙌

@thefoxis - this looks great.

I believe most of the 2019 Performance chapter used Chrome User Experience Report data, but I agree that including Lighthouse scoring and web vitals would be great to include in the chapter.

A while back I wrote a discussion forum post on analyzing lighthouse scores at a high level, which you can see here. https://discuss.httparchive.org/t/analyzing-lighthouse-scores-across-the-web/1600

The lighthouse data in the archive contains all the details for each individual metric as well. Here's an example of @rviscomi extracting the scoring details for a single audit here - https://discuss.httparchive.org/t/how-and-where-is-document-write-used-on-the-web/1006.

Happy to help if you are struggling with what data is available. For Lighthouse in particular, sometimes it helps to browse a JSON report to get an idea of what can be queried
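To make the "browse a JSON report" suggestion concrete, here is a minimal sketch that pulls a few fields out of a report-shaped document. The sample values are invented for illustration. The field names (`lighthouseVersion`, `categories.performance.score`, `audits[...].numericValue`) follow the Lighthouse report format as commonly documented, and how the report is stored in the HTTP Archive tables is deliberately left out.

```python
import json

# Hypothetical, heavily trimmed Lighthouse report; real reports hold many more audits.
SAMPLE_REPORT = json.dumps({
    "lighthouseVersion": "6.1.1",
    "categories": {"performance": {"score": 0.42}},
    "audits": {
        "largest-contentful-paint": {"numericValue": 5210.0},
        "total-blocking-time": {"numericValue": 830.0},
    },
})


def summarize_report(raw_report):
    """Return the version, the 0-100 performance score, and per-audit numeric values."""
    report = json.loads(raw_report)
    score = report["categories"]["performance"]["score"]  # stored as 0..1, may be null
    metrics = {
        audit_id: audit.get("numericValue")
        for audit_id, audit in report["audits"].items()
    }
    return {
        "version": report["lighthouseVersion"],
        "performance_score": None if score is None else round(score * 100),
        "metrics": metrics,
    }


summary = summarize_report(SAMPLE_REPORT)
```

Looking at one report this way is mostly useful for discovering which audit ids and fields exist before writing a query over the full dataset.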

@thefoxis You may have already, but here's a link to the metric results for last year's performance chapter if you want to get an idea of what's possible: https://docs.google.com/spreadsheets/d/1zWzFSQ_ygb-gGr1H1BsJCfB7Z89zSIf7GX0UayVEte4/edit?usp=sharing

But also want to second what Paul said above. It's super helpful to look at the raw data that's being collected (and what isn't but should be). More than happy to help you here as well.

Hey @thefoxis, looks like things are moving along pretty smoothly. Is there anything you need from me to keep things moving forward, and have the chapter outline and metrics settled on by the end of the week?

Hi @obto and @thefoxis, I would like to get my hands dirty in the analysis if you are still looking :)

@dooman87 marked you down as an analyst for the chapter :)

@dooman87 We also have many other chapters still looking for analysts. If any of them interest you, we'd love to have you help out there as well!

@obto I think we'll be able to settle by end of the week. @rviscomi has been providing really helpful advice on what's possible 👍 I think we're close to finalising the outline.

thanks for the pointers, @paulcalvano 🙌

@Zizzamia can you add some information to the doc about one of your ideas, namely: Element Timing API and the potential of creating custom metrics? it's not 100% my area of expertise, so if you'd like to propose how to tackle getting some data here, that'd be great.

otherwise, we have plenty of data there already with core web vitals, LH scores and comparisons of FCP + TTFB to 2019 almanac 😇

I'm interested in being a reviewer, not sure if this is already filled up :-)

@thefoxis What's the plan to start with queries? Are we going to create a ticket for each metric in the doc? Sorry for noob questions - trying to get a sense of the workflow :)

@dooman87 The Chapter document is where each metric will be listed and progress will be tracked. Right now the next step is for analysts to look through all the metrics which have been proposed and verify they are possible to query.

> Sorry for noob questions

No worries! In fact, @bazzadp just made a fantastic post about good next steps for analysts to take here: https://github.com/HTTPArchive/almanac.httparchive.org/issues/914#issuecomment-659205330 . I think you'd find it super helpful.

Also don't forget to join our #web-almanac slack so @paulcalvano can invite you to our Analysts channel. It's a great place to ask questions :)

Hi, I'm an Analyst in the SEO chapter (so is your @max-ostapenko). I've been practising my SQL for a common area, Core Web Vitals by device. I got it mostly working and then checked your own and saw I'd gone the same way 👍

I thought it might be helpful to share? We also want to look into this data by country. It should not need much rework to do that.

@dooman87 I've also written up a short summary of the recommended analysis workflow in the Analysts' Guide.

Hey @Tiggerito, yes it's a good idea to reuse queries whenever possible and this seems like a perfect opportunity for that. I'd suggest starting the draft PR described in the workflow above and adding the Core Web Vitals queries so we can take a look and add comments as needed.

> Hey @Tiggerito, yes it's a good idea to reuse queries whenever possible and this seems like a perfect opportunity for that. I'd suggest starting the draft PR described in the workflow above and adding the Core Web Vitals queries so we can take a look and add comments as needed.

Instructions made sense. Hopefully I got it right.

> @dooman87 I've also written up a short summary of the recommended analysis workflow in the Analysts' Guide.

That looks awesome, thanks! Just to check my understanding, we are going to have one branch/PR where all analysts for a particular chapter will work, right? That would make sense to me, given we will split the work and shouldn't step on each other's toes.

@thefoxis, I was reflecting on what we could monitor around Element Timing API and I think the best option we have is to take a couple of steps back and instead measure how many websites have the new PerformanceObserver in their JS minified files. Looking at this could help us understand how many websites are doing Performance Field Data measurements, which could be related to the use of some sort of field data library from Calibre, Web Vitals, Perfume.js, etc.

Then we can open up the conversation to other recent opportunities like measuring Element Timing API and we can mention all the other field data options.
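As a rough sketch of the measurement proposed above, the snippet below checks script bodies for `new PerformanceObserver(` and reports the share of pages that contain it. Everything here is illustrative: the page URLs, the script bodies, and the helper names are hypothetical, and the real analysis would run as a query over the crawl data rather than over an in-memory dictionary. Plain string matching is also only an approximation, since it misses aliased references and counts dead code.

```python
import re

# `PerformanceObserver` is a browser global, so the name survives minification.
OBSERVER_PATTERN = re.compile(r"\bnew\s+PerformanceObserver\s*\(")


def uses_performance_observer(script_bodies):
    """True if any script body on the page constructs a PerformanceObserver."""
    return any(OBSERVER_PATTERN.search(body) for body in script_bodies)


def adoption_share(pages):
    """pages maps a page URL to the list of script bodies found on that page."""
    if not pages:
        return 0.0
    hits = sum(1 for scripts in pages.values() if uses_performance_observer(scripts))
    return hits / len(pages)


# Hypothetical sample of four pages.
PAGES = {
    "https://a.example/": ['const o=new PerformanceObserver(l=>{});o.observe({type:"element",buffered:true})'],
    "https://b.example/": ["console.log('no observers here')"],
    "https://c.example/": ["var PerformanceObserverPolyfill = 1;"],
    "https://d.example/": ["window.PerformanceObserver&&new PerformanceObserver(function(e){})"],
}

share = adoption_share(PAGES)
```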

@thefoxis @max-ostapenko @dooman87 do you think this is something we could add to the metrics?

@Zizzamia as I mentioned in the doc, I think at this point it's a bit late to add this to the scope since we're already working on analysing the data and I committed to a certain amount of work :) I'm happy to mention what was suggested in the doc but I don't want to commit to another sub-chapter at this point.

@thefoxis @max-ostapenko @dooman87 for the two milestones overdue on July 27 could you check the boxes if:

  • the outline has been reviewed and all feasible metrics have been identified
  • any necessary custom metrics have been created and you've created a draft PR to track which feasible metrics have had their queries implemented (we've updated the milestone description to clarify this)

Keeping the milestone checklist up to date helps us to see at a glance how all of the chapters are progressing. Thanks for helping us to stay on schedule!

I am interested in the reviewer role for this topic if we still have an open spot.

Hi @Soham-S-Sarkar. This chapter has 9 reviewers already, but I'll defer to @thefoxis if more are needed.

Meanwhile you can browse the chapter's outline to get a sense for the content direction.

@noamr @ashrith-kulai please request edit access to the planning doc to start your review of the initial outline.

thanks for your interest @Soham-S-Sarkar but I think at this point we do not need more reviewers. it will be hard to process feedback with even more people involved.

@obto I won't be making any changes to the outline at this point, unless there are some metrics/data we can't obtain or the data proves non-conclusive/useful. the draft pr looks mostly complete but I defer to @max-ostapenko and @dooman87 to be able to provide more accurate progress status 👍

I still need to finish my requests related to LH performance score. Will work on them this week.

Hi performers (is that right?),

The SEO chapter wants to get some Core Web Vitals data. Your web_vitals_by_device.sql covers the base of their requirements.

They are also interested in a report related to a page's overall CWV score. I believe the overall score for a page is the worst case of good/ni/poor for all the factors.

Are you gathering CWV overall page scores at all?
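The "worst case" rule described in this comment can be sketched in a few lines. This follows the commenter's description only; it is not taken from the chapter's queries, and the metric keys and the `overall_cwv_rating` helper are hypothetical names.

```python
# Ordered from best to worst so that the position doubles as a severity rank.
RATINGS = ("good", "ni", "poor")


def overall_cwv_rating(metric_ratings):
    """Return the worst rating among the per-metric ratings of one page."""
    if not metric_ratings:
        raise ValueError("at least one metric rating is required")
    # tuple.index raises ValueError for labels outside RATINGS, which surfaces typos.
    return max(metric_ratings.values(), key=RATINGS.index)


page = {"lcp": "good", "fid": "good", "cls": "ni"}
overall = overall_cwv_rating(page)
```

Under this rule a page is only "good" overall when every metric is rated good, and a single poor metric makes the whole page poor.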

I've updated the chapter metadata at the top of this issue to link to the public spreadsheet that will be used for this chapter's query results. The sheet serves 3 purposes:

  1. Enable authors/reviewers to analyze the results for each metric without running the queries themselves
  2. Generate data visualizations to be embedded in the chapter
  3. Serve as a public audit trail of this chapter's data collection/analysis, linked from the chapter footer

@rviscomi @obto do we need to move some deadlines around because of the webpagetest snafu with throttling? since we'll be waiting for september data to come in, instead of finishing off with the august dataset. I can still maybe start writing some parts before the data is fully available but at the end of the day, we'll need the full analysis. thoughts?

Why do we need to go forward to September? Why not go backwards to July?

Was Lighthouse 6 fully rolled out for July run? I think it was but could be wrong. Obviously you'd want that for this chapter so may have to wait if it was only half rolled out to the HTTP Archive crawlers for this run.

Are there any custom metrics for performance added for August run? If so they obviously won't be there for July but can't imagine there would be any for Lighthouse part anyway.

So maybe could get started on that and then rerun stuff for September to confirm your findings are still accurate and perhaps update the relevant stats for that?

Though when did the bug come into play? I was assuming only for August crawl as that's when it was noticed but maybe affected July too (is it a Lighthouse 6 bug?) in which case going back a month is obviously a non-runner.

> Was Lighthouse 6 fully rolled out for July run? I think it was but could be wrong. Obviously you'd want that for this chapter so may have to wait if it was only half rolled out to the HTTP Archive crawlers for this run.

From what I can see in sample data, which I believe was generated from July run, there is a mix of version 5 and 6.

@bazzadp has a good point. @dooman87 is also correct that there's a mix of versions, but the majority of tests are using v6:

| Lighthouse version | Count |
| ------------------ | ----- |
| 6.1.1 | 4,506,136 |
| 6.1.0 | 1,825,717 |
| 5.6.0 | 16,600 |
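For a sense of scale, the counts in the table work out to roughly 99.7% of tests on Lighthouse 6.x. A small sketch of that arithmetic, using only the three counts above (the `share_by_major` helper is a hypothetical name):

```python
# Test counts per Lighthouse version, copied from the table above.
VERSION_COUNTS = {"6.1.1": 4_506_136, "6.1.0": 1_825_717, "5.6.0": 16_600}


def share_by_major(counts):
    """Aggregate each version's share of all tests by major version number."""
    total = sum(counts.values())
    shares = {}
    for version, count in counts.items():
        major = version.split(".")[0]
        shares[major] = shares.get(major, 0.0) + count / total
    return shares


shares = share_by_major(VERSION_COUNTS)
for major, share in sorted(shares.items()):
    print(f"Lighthouse {major}.x: {share:.2%}")
```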

The offending bug (https://github.com/WPO-Foundation/wptagent/pull/366) was merged on July 21 and AFAICT the July crawl ended on the ~23rd, so assuming the HTTP Archive agents were immediately updated (@pmeenan is that true?) the bug would have only affected a small proportion of the crawl.

@thefoxis I'll defer to you if you want to use the July crawl, which would contain some buggy and older-versioned Lighthouse data. If you opt for the September crawl, we can absolutely move the deadlines to a more realistic date.

Yes, the agents update the agent code hourly and the Lighthouse code daily.

@rviscomi I'm easy! I think I'd defer to you to make the decision on which dataset would be most reliable (since you have much more experience here than I do 😸 ). happy to go either way 👍

How about we pull only the v6 data from July, which would contain some of the unthrottled results (unless there's a way to detect and filter those out) and include that analysis in the initial draft, then we rerun the analysis when the September crawl is complete and update the draft as needed?

@thefoxis FYI this analysis by the Lighthouse team is closely aligned with the data you were looking to get: https://github.com/GoogleChromeLabs/lh-metrics-analysis/blob/gh-pages/reports/monthly/2020-07/report.md

cc @brendankenny

Yes! And I believe this backs up the patterns we all were seeing both in the overall distribution and what individual sites were experiencing.

Looking at the year-over-year charts (which are roughly Lighthouse 6.1 vs 5.1):

  • in the first chart, the overall distribution of scores didn't change much, actually seeing improvements of several points at the lower end of things (e.g. sites at the 10th percentile in LH 6.1 score about 5 points better than 10th percentile sites in LH 5.1):
    [Figure: July 2019 vs July 2020 Performance Scores]
  • In the second chart, which looks at the distribution of how scores changed for each _individual_ site, it's clear that there were significant changes for some sites, with 10% of sites losing 18 points or more, and another 10% of sites gaining 25 points or more (a lot of these changes are due to LCP's prominence in LH 6 scoring).
    [Figure: July 2019 and July 2020 Performance Score differences]

The discrepancy between the two breakdowns seems to be that the sites losing points are balanced out by sites gaining points in LH 6.
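A toy example may help show how both observations can hold at once. The three sites and their scores below are invented purely for illustration and are not taken from the Lighthouse team's report: the overall distribution is identical in both years even though two of the three sites moved by 40 points.

```python
# Hypothetical scores for three sites; not real data.
SCORES_2019 = {"site-a": 30, "site-b": 50, "site-c": 70}
SCORES_2020 = {"site-a": 70, "site-b": 50, "site-c": 30}


def overall_distribution(scores):
    """The population-level view: sorted scores, ignoring which site has which."""
    return sorted(scores.values())


def per_site_changes(before, after):
    """The per-site view: score delta for every site present in both years."""
    return {site: after[site] - before[site] for site in before if site in after}


same_distribution = overall_distribution(SCORES_2019) == overall_distribution(SCORES_2020)
changes = per_site_changes(SCORES_2019, SCORES_2020)
```

Here the gains and losses cancel out exactly, which is the extreme version of the balancing effect described above.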

@max-ostapenko @dooman87 I saw that the queries PR got merged; yay! how are we going on the results sheet? or are you putting the data someplace I haven't looked? 👀

once that's there, I definitely should be kicking off the writing since it's behind the schedule 😬

@thefoxis I started putting together stats for LH. Please have a look and let me know if anything doesn't make sense. I think Max also started filling those in.

@thefoxis I've added the rest of data and the charts.

@thefoxis in case you missed it, we've adjusted the milestones to push the launch date back from November 9 to December 9. This gives all chapters exactly 7 weeks from now to wrap up the analysis, write a draft, get it reviewed, and submit it for publication. So the next milestone will be to complete the first draft by November 12.

However if you're still on schedule to be done by the original November 9 launch date we want you to know that this change doesn't mean your hard work was wasted, and that you'll get the privilege of being part of our "Early Access" launch.

Please see the link above for more info and reach out to @rviscomi or me if you have any questions or concerns about the timeline. We hope this change gives you a bit more breathing room to finish the chapter comfortably and we're excited to see it go live!

@obto thank you for letting me know! I didn't see it. I doubt I can do this by Nov 9, I've been sick for the past week and still I am not well (fortunately, it's not covid 🤪 ) so it delays my ability to write a bit. that being said, I shouldn't need a lot of time to produce a draft, I'll post here once it's ready 👍

Sounds great, hope you feel better!

@thefoxis get well soon!
We are looking forward to receiving your feedback again.

@obto @rviscomi I'll have a complete draft ready sometime next week. there's a bit more to be written than I initially expected. of course, you can keep an eye on the draft in the doc, as I'm writing directly there.

@max-ostapenko @dooman87 I'd appreciate some help with the charts, since some are only representing a single metric versus complete readings and my spreadsheet skills turn out to fall a bit short here 😅 I'm going off the 2019 almanac, so the charts for each metric are:

  • metric by device
  • metric by geo
  • metric by effective connection type

The connection type chart wasn't portraying quite what I wanted (as per 2019 chapter), I tried playing with it, but didn't go far. This is what I'd love to have for each metric:

[Screenshot: Screen Shot 2020-11-11 at 1 19 45 pm]

I'm not entirely sure how to generate the charts for geo + device for other metrics than LCP so if you could point me in the right direction or help out here, that'd be great 😸

Hi @thefoxis I'm excited to see this coming along! I'm also happy to help with the data viz, feel free to @ me on whichever charts you'd like help with.

@thefoxis As per your question: please check the chart here, if I understood you correctly: https://docs.google.com/spreadsheets/d/164FVuCQ7gPhTWUXJl1av5_hBxjncNi0TK8RnNseNPJQ/edit#gid=306222260
(I changed columns to the corresponding ones in a data range)

I'd appreciate it if we could move to the sheet comments to discuss the exported results. Please add comments on what needs adjusting and whether any visualisations are missing.

@rviscomi can you point me to the CrUX's categorisations of good|needs improvement|poor? I know where those ranges live for LH/perf score calculation but I'm not quite sure if it's identical in CrUX's dataset 🤔 I just want to be able to reference it for each metric since you can't tell from the charts.

@thefoxis the thresholds used by CrUX for the CWV are defined at https://web.dev/vitals/#core-web-vitals

We could also edit the legends of the charts to say something like "Good (<100ms)" and "Poor (>=300ms)".

@rviscomi gotcha! that's exactly what I had in mind :) what about FCP and TTFB values? where do those come from?

FYI I only have the geo sections (pending charts) + conclusion left, so if you want to read the draft as is and leave comments, you're more than welcome to! cc @obto

FCP and TTFB are more subjective and don't have official documentation. For reference the thresholds used by CrUX for fast/slow are:

FCP: 1.5s, 2.5s
TTFB: 0.5s, 1.5s
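Putting the thresholds from this exchange in one place, a minimal classification sketch might look like the following. The FCP and TTFB pairs are the ones quoted above; the LCP, FID, and CLS pairs are the Core Web Vitals thresholds published at web.dev/vitals. The boundary handling (`<` for the lower bound, `>=` for the upper) mirrors the legend wording suggested in the thread and is an assumption, as are the metric keys and the `classify` helper itself. For simplicity the same three labels are used for every metric, even though FCP and TTFB are described above as fast/slow.

```python
# (good_below, poor_from) per metric. Times are in milliseconds, CLS is unitless.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),   # Core Web Vital
    "fid_ms": (100, 300),     # Core Web Vital
    "cls": (0.1, 0.25),       # Core Web Vital
    "fcp_ms": (1500, 2500),   # CrUX fast/slow thresholds quoted in the thread
    "ttfb_ms": (500, 1500),   # CrUX fast/slow thresholds quoted in the thread
}


def classify(metric, value):
    """Bucket a metric value into good / needs improvement / poor."""
    good_below, poor_from = THRESHOLDS[metric]
    if value < good_below:
        return "good"
    if value >= poor_from:
        return "poor"
    return "needs improvement"


examples = {
    "fid_ms": classify("fid_ms", 80),
    "lcp_ms": classify("lcp_ms", 3100),
    "ttfb_ms": classify("ttfb_ms", 1500),
}
```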

A quick word to say that I'm starting the review and posting comments on the doc.
I am French 🇫🇷, and our culture teaches us to be quite direct. So I'll try to formulate things in a way that is appropriate, but I may not do it well. But I don't mean to be rude. Even if I comment some things as food for thought, it's only out of constructive criticism. Because I find that the work that has been done in the data analysis and writing is already exceptional.

@borisschapira no sweat 🙂 English is my second language too and the Poles are apparently also very direct so I know what you mean. any feedback appreciated! I already responded / acted on some of your comments. all good finds 👍 thank you! ⚡

Thanks for looking at my comments @thefoxis! I had a small nitpick with CLS, and ended up reading the whole chapter because it was such a good read. Really well written all round 💯

Thanks for the review @exterkamp. Could you add yourself to the contributors list? (alphabetical by first name)

A great read! I'm reviewing the material now, prepare for some comments :)

thanks @noamr and @exterkamp! good comments all around, I'm addressing them along with @rviscomi's feedback :) a few more bits to go, but I can see the end now 🎉 glad you enjoyed reading so far!

@obto @rviscomi I believe I addressed all existing feedback and added a conclusion. FYI I removed captions since I figured with the markdown format I'll just submit them with the PR; let me know if that's suitable. feeling pretty good about it! let me know if there's anything else.

@dimension85 @borisschapira @estelle @zeman @rviscomi @obto @noamr @ashrith-kulai @Zizzamia @exterkamp if you'd like to add any more/first thoughts to the draft, you're welcome to. I'm not sure how much ability I will have to address big shifts in content, but fixes / clarifications / smaller suggestions shouldn't be an issue with the schedule I reckon 😅

@thefoxis Sounds like a solid plan to me. I'll look at it again this evening to see if I pick up on anything before we start the editing process :)

@thefoxis I'm excited to see this moving along! Thank you for your hard work getting it to this point. Can you open a PR to submit the markdown version of the draft? (the final milestone in the initial checklist)

I've stubbed out the markdown file already with some early metadata and you can see the 2019 version for reference. I'm happy to help convert the data viz to the correct figure format, if you leave them as TODOs/placeholders in the markdown file. Please also update the chapter metadata to remove anyone who hasn't contributed, add your bio, add featured quote/stats, etc.
