Almanac.httparchive.org: Third Parties 2020

Created on 27 Jun 2020  ·  24 comments  ·  Source: HTTPArchive/almanac.httparchive.org

Part I Chapter 6: Third Parties

Content team

| Authors | Reviewers | Analysts | Draft | Queries | Results |
| ------- | --------- | -------- | ----- | ------- | ------- |
| @simonhearne | @tammyeverts @jzyang @exterkamp | @max-ostapenko | Doc | *.sql | Sheet |

Content team lead: @simonhearne

Welcome chapter contributors! You'll be using this issue throughout the chapter lifecycle to coordinate on the content planning, analysis, and writing stages.

The content team is made up of the following contributors:

New contributors: If you're interested in joining the content team for this chapter, just leave a comment below and the content team lead will loop you in.

_Note: To ensure that you get notifications when tagged, you must be "watching" this repository._

Milestones

0. Form the content team

  • [x] Jul 6th: Project owners have selected an author to be the content team lead
  • [x] Jul 13th: The content team has at least one author, reviewer, and analyst (minimally viable team formed)

1. Plan content

  • [x] Jul 20th: The content team has completed the chapter outline in the draft doc
  • [x] Jul 27th: Analysts have triaged the feasibility of all proposed metrics

2. Gather data

  • [x] Jul 27th: Analysts have added all necessary custom metrics and drafted a PR to track query progress
  • Aug 1 - 31: August crawl
  • [x] Sep 7th: Analysts have queried all metrics and saved the output to the results sheet

3. Validate results

4. Draft content

  • [ ] Nov 12th: Authors have completed the first draft in the doc
  • [ ] Nov 26th: The content team has prototyped all data visualizations

5. Publication

  • [ ] Nov 26th: The content team has reviewed the final draft, converted to markdown, and filed a PR to add it to the 2020 content directory
  • Dec 9th: Target launch date
2020 chapter ASAP writing

All 24 comments

I'd like to participate as an analyst in this chapter.

@simonhearne thank you for agreeing to be the lead author for the Third Parties chapter! As the lead, you'll be responsible for driving the content planning and writing phases in collaboration with your content team, which will consist of yourself as lead, any coauthors you choose as needed, peer reviewers, and data analysts.

The immediate next steps for this chapter are:

  1. Establish the rest of your content team. The larger the scope of the chapter, the more people you'll want to have on board.
  2. Start sketching out ideas in your draft doc.
  3. Catch up on last year's chapter and the project methodology to get a sense for what's possible.

There's a ton of info in the top comment, so check that out and feel free to ping myself or @obto with any questions!

To anyone else interested, we'd still love to have you contribute as a peer reviewer, data analyst, or coauthor as needed. Let us know!

I'd love to be a reviewer for this chapter!




Great thank you @tammyeverts, it's great to have you on board again! I'm also super excited because this is the first chapter to have all three author/reviewer/analyst roles filled 🥳

Hey @simonhearne, just checking in:

  1. How is the chapter coming along? We're trying to have the outline and metrics settled by the end of the week so we have time to configure the web crawler to track everything you need.
  2. Can you remind your team to properly add and credit themselves in your chapter's Google Doc?
  3. Anything you need from me to keep things moving forward?

@simonhearne - Please see this thread https://discuss.httparchive.org/t/how-many-and-which-resources-have-timing-allow-origin-for-resource-timing/152/10.

I think it would be a good idea to call out the most-used third parties without Timing-Allow-Origin headers in this year's Third Parties chapter, in the hope that some of them start to pay attention. What do you think?
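As a rough illustration of the check suggested above, here is a minimal Python sketch that counts third-party responses whose `Timing-Allow-Origin` header does not admit the embedding page. The function names, the `(domain, headers)` input shape, and the sample origin are assumptions made for illustration, not part of the HTTP Archive pipeline, and the header parsing is simplified to "comma-separated list of origins or `*`".

```python
from collections import Counter


def exposes_resource_timing(headers, page_origin):
    """Return True if Timing-Allow-Origin admits page_origin (simplified check)."""
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get("timing-allow-origin")
    if value is None:
        return False
    # The header value is a comma-separated list of origins, or "*".
    allowed = [item.strip() for item in value.split(",")]
    return "*" in allowed or page_origin in allowed


def missing_tao_by_domain(responses, page_origin):
    """Count responses per third-party domain that hide detailed timing.

    `responses` is a hypothetical iterable of (domain, headers) pairs.
    Returns (domain, count) pairs, most frequent first.
    """
    counts = Counter()
    for domain, headers in responses:
        if not exposes_resource_timing(headers, page_origin):
            counts[domain] += 1
    return counts.most_common()
```

Sorting by request count keeps the focus on widely used third parties, which matches the intent of calling out the most-used ones.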

@simonhearne I'd love to help out with reviewing :)

Hey all, back(-ish) from PTO today.

Thanks to those offering help, could you please add yourself to the relevant line in the Content Team section of the doc:

@tammyeverts - Reviewer
@jzyang - Reviewer
@max-ostapenko - Analyst

I think this gives us enough folks to start preparing the content! I would appreciate feedback on the outline section, comments in the doc preferred.

For ongoing communication, what would you prefer:

👍🏻 use this issue
🚀 comments & chat in google doc
👀 slack channel in the HTTPArchive slack

Sent a file access request.

@simonhearne - moving this from slack. @rviscomi suggested this chapter for this topic.

It would be good to cover tag manager usage in this chapter. For example:

  • % of sites using more than one tag manager (e.g. GTM / Adobe DTM / Signal on the home page)
  • % of sites using multiple instances of the same tag manager provider (for example, https://www.schuh.co.uk/ has two different instances of Google Tag Manager). I think you are aware why they do this.
  • % of third parties initiated from tag managers vs. directly, on sites where a tag manager is in use.

I see there is a Tag Manager category in Wappalyzer (only 5 tag managers so far, but it can easily be improved).

Also, we will see more tag managers taking a server-side forwarding approach this year, which should improve performance, so that can also be added to the chapter as something to look forward to (https://twitter.com/simoahava/status/1222459714614841346?lang=en).

Thoughts?
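The first two tag manager metrics proposed above could be computed along these lines. This is only a sketch in Python: the `(page, tag_manager, container_id)` detection tuples are a hypothetical input shape, and the denominator is assumed to be pages with at least one tag manager detected, which the thread does not specify.

```python
from collections import defaultdict


def tag_manager_stats(detections):
    """Summarise tag manager usage per page.

    `detections` is a hypothetical iterable of
    (page_url, tag_manager_name, container_id) tuples.
    Percentages are relative to pages with at least one tag manager.
    """
    managers = defaultdict(set)    # page -> distinct tag manager products
    containers = defaultdict(set)  # page -> distinct (product, container) pairs
    for page, manager, container in detections:
        managers[page].add(manager)
        containers[page].add((manager, container))

    total = len(managers)
    if total == 0:
        return {"pct_multiple_managers": 0.0, "pct_multiple_instances": 0.0}

    # Pages that load more than one tag manager product.
    multi_manager = sum(1 for names in managers.values() if len(names) > 1)

    # Pages that load several containers of the same product.
    multi_instance = 0
    for pairs in containers.values():
        per_manager = defaultdict(int)
        for manager, _container in pairs:
            per_manager[manager] += 1
        if any(count > 1 for count in per_manager.values()):
            multi_instance += 1

    return {
        "pct_multiple_managers": 100 * multi_manager / total,
        "pct_multiple_instances": 100 * multi_instance / total,
    }
```

The third metric (third parties initiated from tag managers vs. directly) would need request initiator data, so it is left out of this sketch.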

Size of tag manager JS. They can quickly get out of control and get massive when old tags (that often aren’t ever even fired!) continue to hang around clogging up the JS.

@simonhearne @max-ostapenko for the two milestones overdue on July 27 could you check the boxes if:

  • the outline has been reviewed and all feasible metrics have been identified
  • any necessary custom metrics have been created and you've created a draft PR to track which feasible metrics have had their queries implemented (we've updated the milestone description to clarify this)

Keeping the milestone checklist up to date helps us to see at a glance how all of the chapters are progressing. Thanks for helping us to stay on schedule!

Thanks for the heads-up.
I'll go and update the state.

Is this section still in need of reviewers? I'd be happy to help.

Thanks @exterkamp! @simonhearne can you help onboard Shane?

I've updated the chapter metadata at the top of this issue to link to the public spreadsheet that will be used for this chapter's query results. The sheet serves 3 purposes:

  1. Enable authors/reviewers to analyze the results for each metric without running the queries themselves
  2. Generate data visualizations to be embedded in the chapter
  3. Serve as a public audit trail of this chapter's data collection/analysis, linked from the chapter footer

I learned about some really interesting work @patrickhulce has been doing on correlating third parties to Core Web Vitals performance. I suggested that could be an interesting area of exploration for this chapter and he's open to joining the team as a reviewer. @simonhearne is that something you'd be interested in?

Happy to help but I also understand if we don't want too many (repeat) cooks in the kitchen :) let me know what would be most helpful here!

@simonhearne first data and charts are ready for review.

> Happy to help but I also understand if we don't want too many (repeat) cooks in the kitchen :) let me know what would be most helpful here!

@patrickhulce did you have any particular query in mind providing correlation analysis?

> did you have any particular query in mind providing correlation analysis?

The correlation analysis @rviscomi was referring to in https://github.com/HTTPArchive/almanac.httparchive.org/issues/901#issuecomment-685964320 was separate from HTTP Archive and involved blocking, so I don't have any specific query suggestions for correlation.

The only suggestion I have on the current queries is that several of them focus on a metric that is completely normalized by frequency, so they end up yielding mostly obscure, really uncommon third parties that might not be the most interesting to analyze (and that have "unknown" categories as a result). For example, it might be more useful for most readers to look at "most popular 100 sorted by median body size" instead of "top 100 by median body size".
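To make the difference between the two rankings concrete, here is a small Python sketch. The `(third_party, body_size_bytes)` input shape and the function name are assumptions for illustration; the actual chapter queries are SQL and may be structured quite differently.

```python
from collections import defaultdict
from statistics import median


def rank_by_median_size(requests, top_n, popular_n=None):
    """Rank third parties by median body size, largest first.

    `requests` is a hypothetical iterable of (third_party, body_size_bytes).
    If `popular_n` is given, only the `popular_n` most frequently requested
    third parties are considered before sorting by median size.
    """
    sizes = defaultdict(list)
    for party, size in requests:
        sizes[party].append(size)

    candidates = list(sizes)
    if popular_n is not None:
        # Pre-filter by popularity (request count) before ranking by size.
        candidates = sorted(
            candidates, key=lambda party: len(sizes[party]), reverse=True
        )[:popular_n]

    ranked = sorted(candidates, key=lambda party: median(sizes[party]), reverse=True)
    return [(party, median(sizes[party])) for party in ranked[:top_n]]
```

Without `popular_n`, a third party seen once with a very large response can top the list. With it, the ranking is restricted to third parties that readers are likely to encounter.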

If there are big ones missing from categorization we can also try to plug that gap, we have ~98% coverage by request count last I checked but coverage as % of all possible third-parties is much, much lower.

Great work here everyone!

@simonhearne in case you missed it, we've adjusted the milestones to push the launch date back from November 9 to December 9. This gives all chapters exactly 7 weeks from now to wrap up the analysis, write a draft, get it reviewed, and submit it for publication. So the next milestone will be to complete the first draft by November 12.

However if you're still on schedule to be done by the original November 9 launch date we want you to know that this change doesn't mean your hard work was wasted, and that you'll get the privilege of being part of our "Early Access" launch.

Please see the link above for more info and reach out to @rviscomi or me if you have any questions or concerns about the timeline. We hope this change gives you a bit more breathing room to finish the chapter comfortably and we're excited to see it go live!

I've added @exterkamp as a reviewer, per his offer to help in the Slack channel. Shane, can you create a PR to add your info to the 2020.json config file? The first draft is coming along but not done yet, so there are still opportunities to help. Thanks!
