Fenix: Telemetry counts for Follow-on and Organic Search

Created on 12 Nov 2019  ·  19 Comments  ·  Source: mozilla-mobile/fenix

Meta bug

We need to count in-web-content searches, such as follow-on searches (the user continues to search from an existing search). This relies on matching URL loads against a whitelist of search engines, partner codes, and URL formats. Caveat: the whitelist only covers the top partners, is incomplete, and search engine URLs may change.

Acceptance Criteria

  • [ ] Get Mobile search engines/codes/urls whitelist from Mike Connor @mconnormoz
  • [ ] Data review (here is the Desktop data review for organic/follow-on search)
  • [ ] Tests for this telemetry (see Desktop tests)
  • [ ] In-content searches create telemetry that matches the format <provider>.in-content:[sap|sap-follow-on|organic]:[code|none] (see Terms for definitions)

    • For example, an organic search might look like google.in-content:organic:none

    • Here is part of the Desktop implementation using url search param matching and follow-on cookies
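The key format named in the criteria above can be sketched as a small helper. This is only an illustration of the `<provider>.in-content:[sap|sap-follow-on|organic]:[code|none]` shape, not the Desktop or Fenix code; the function name `build_in_content_key` and the validation step are assumptions.

```python
from typing import Optional

# The three search types named in the acceptance criteria.
VALID_TYPES = {"sap", "sap-follow-on", "organic"}


def build_in_content_key(provider: str, search_type: str,
                         code: Optional[str] = None) -> str:
    """Build '<provider>.in-content:<type>:<code|none>' (hypothetical helper)."""
    if search_type not in VALID_TYPES:
        raise ValueError(f"unknown search type: {search_type!r}")
    # A missing or empty partner code is reported as the literal 'none'.
    return f"{provider}.in-content:{search_type}:{code or 'none'}"


print(build_in_content_key("google", "sap", "firefox-b"))
print(build_in_content_key("google", "organic"))
```

Rejecting unknown types early keeps malformed keys out of the telemetry, which matters because downstream queries group on the exact string.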

Helpful context

  • Terms

    • sap: directly from search access point (I don't think we have any of these)

    • sap-follow-on: user continues to search from an existing search

    • organic: search that didn't come from a SAP

    • code: partner search code (or none if organic)

  • Blog post explaining Desktop follow-on search.
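To make the terms above concrete, here is a rough sketch of how a loaded URL could be sorted into sap, sap-follow-on, or organic by matching query parameters. It is not the Desktop implementation: the `ProviderRule` fields, the host `search.example.com`, and every parameter name and prefix below are hypothetical placeholders, not entries from the real whitelist.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse


@dataclass(frozen=True)
class ProviderRule:
    host: str                          # host serving the results page
    query_param: str                   # parameter carrying the search terms
    code_param: str                    # parameter carrying the partner code
    code_prefixes: Tuple[str, ...]     # prefixes that mark a partner code
    follow_on_params: Tuple[str, ...]  # parameters present only on follow-on searches


# Hypothetical rule, for illustration only.
EXAMPLE_RULE = ProviderRule(
    host="search.example.com",
    query_param="q",
    code_param="client",
    code_prefixes=("partner-",),
    follow_on_params=("oq",),
)


def classify(url: str, rule: ProviderRule) -> Optional[Tuple[str, str]]:
    """Return (type, code) for a results URL, or None if it is not a search."""
    parsed = urlparse(url)
    if parsed.hostname != rule.host:
        return None
    params = parse_qs(parsed.query)
    if rule.query_param not in params:
        return None
    code = params.get(rule.code_param, [""])[0]
    if not code.startswith(rule.code_prefixes):
        # No recognised partner code: the search did not come from a SAP.
        return ("organic", "none")
    if any(p in params for p in rule.follow_on_params):
        return ("sap-follow-on", code)
    return ("sap", code)


print(classify("https://search.example.com/search?q=cats&client=partner-m", EXAMPLE_RULE))
```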

Talk to Mark Banner for additional help, who did the Desktop implementation.

E8 Search Telemetry engverified

All 19 comments

I'm in the process of asking mconnor for these lists. If someone picks up this issue and the info isn't ready yet, please ping me!

Whoops, forgot to unassign myself here. I started work on the polish sprint just after assignment here, and have not done any work on it.

@liuche are these codes available or are we waiting on them? 😄

Sent the codes to sawyer!

From the desktop implementation, I see that in this task we need to access the URL's cookies for our Bing telemetry. There are two ways to proceed: either a web-extension in AC using the Cookies API, or a GV API for cookies similar to CookieManager for WebView.

@pocmo What do you think would be a good approach? I saw that you worked on the cookies topic.

References: GV CookieManager request and discussion about the DownloadManager implementation

@st3fan could you help out here? We'd like to get these search telemetry probes in soon.

Reping for @st3fan and @pocmo :) This is something that is blocking the Search team getting necessary data.

Moved it to AC triage.

@BranescuMihai Do you think the desktop JS code could be shared with mobile?

@st3fan I'll share the docs with you, with links. Take a look at the Desktop implementation too (also is linked in comment 0).

Is this something that your team will pick up, or would you want @BranescuMihai's help on this going forward? Either is probably fine, and there is plenty of other Fenix work to be done.

Status update:
I've decided to go with the cookies API and created a web-extension, which is functional and correctly retrieves the cookies. Currently working on actually sending the pings, hopefully a PR will be up by Friday.

@st3fan the code that I used in the extension is mostly a duplicate of the web implementation.
To share the same codebase, I think we would need to call an endpoint with an object (such as {url: "urlValue", cookies: [arrayOfCookies]}), have the backend check it against the whitelisted search engines, and get back an object such as {provider, type, code}. That seems complicated and potentially expensive for only one duplicated class.

If there is indeed a common codebase in the Fennec repo done in a different way, maybe we can do the same, but for now I think we should just duplicate it.
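For illustration, the {url, cookies} to {provider, type, code} mapping described above could look roughly like this when the follow-on signal comes from a cookie. This is a sketch under stated assumptions, not the extension's code: the cookie name `SEARCH_SRC`, the `PC=<code>` value layout, the `PARTNER` prefix, and the host mapping are all placeholders.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Hypothetical host-to-provider mapping and cookie rule.
HOST_TO_PROVIDER = {"search.example.com": "example"}
COOKIE_RULE = {
    "name": "SEARCH_SRC",            # assumed cookie name
    "code_param": "PC",              # assumed key inside the cookie value
    "code_prefixes": ("PARTNER",),   # assumed partner code prefixes
}


def follow_on_code_from_cookies(cookies: List[Dict[str, str]]) -> Optional[str]:
    """Return the partner code from a matching cookie, or None."""
    for cookie in cookies:
        if cookie.get("name") != COOKIE_RULE["name"]:
            continue
        # Assumes the cookie value has the form '<param>=<code>'.
        key, sep, value = cookie.get("value", "").partition("=")
        value = value.strip()
        if (sep and key.strip() == COOKIE_RULE["code_param"]
                and value.startswith(COOKIE_RULE["code_prefixes"])):
            return value
    return None


def handle_request(payload: dict) -> Optional[dict]:
    """Map {url, cookies} to {provider, type, code}, or None if nothing matches."""
    provider = HOST_TO_PROVIDER.get(urlparse(payload["url"]).hostname)
    if provider is None:
        return None
    code = follow_on_code_from_cookies(payload.get("cookies", []))
    if code is None:
        # No cookie match; URL-parameter matching would take over here.
        return None
    return {"provider": provider, "type": "sap-follow-on", "code": code}


print(handle_request({
    "url": "https://search.example.com/search?q=cats",
    "cookies": [{"name": "SEARCH_SRC", "value": "PC=PARTNER1"}],
}))
```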

@liuche on desktop they use histograms for reporting this. However, in the docs I see usage examples only in JS or C++, and I can't find a suitable distribution metric in Glean either.
Should we use a labeled_counter for this as well, like in the other search-related metrics? If yes, what should be the key?

Yes! labeled_counter is the correct substitute for histograms. For your other question, I've asked and cc-ed you to a metrics person, but if they don't have any opinions, is there something from desktop that we can imitate? And if not, would in_content_search work, because we're collecting this search telemetry from in-content searching? Also happy for you to choose a key as well, since you've been closest to this implementation.

Is this a new labeled counter or an extension of the existing metrics_search_count? engine.in-content (or engine.in-content:[sap|...] if that's being tracked as well) would be best to match desktop so there's no need to look up the right key.

For reference, sources on desktop are grouped like this: https://github.com/mozilla/bigquery-etl/blob/master/sql/search_derived/search_clients_daily_v8/query.sql#L53-L100
But mobile definitely doesn't need to match exactly.

@Ben-Wu I used a new one, in the same group as the new ads metrics: browser.search.in_content, with labels like engine.in-content.[sap|...].[code|none]

Note: The label is slightly different from desktop because we cannot use : inside them, replaced with dots.
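The colon-to-dot substitution in the note above amounts to a one-line conversion. The sketch below pairs it with a plain `collections.Counter` standing in for the labeled counter; the names `to_glean_label` and `record_search` are made up for this example and are not Glean APIs.

```python
from collections import Counter

# Stand-in for the browser.search.in_content labeled counter.
in_content_counts = Counter()


def to_glean_label(desktop_key: str) -> str:
    """Convert a Desktop-style key into a dotted label without ':' characters."""
    return desktop_key.replace(":", ".")


def record_search(desktop_key: str) -> None:
    """Increment the counter for the dotted form of the key (hypothetical helper)."""
    in_content_counts[to_glean_label(desktop_key)] += 1


record_search("google.in-content:sap-follow-on:firefox-b-m")
record_search("google.in-content:sap-follow-on:firefox-b-m")
print(dict(in_content_counts))
```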

Second note: there are some differences between mobile/desktop regarding codes:
duckduckgo - codePrefixes = fpas instead of ff
baidu - queryParam = word instead of wd (the code param is also different, but I can't figure out which one it is; it might be good for someone who knows that stuff to recheck all provider codes)
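One way to keep such differences visible is to store the mobile values as overrides on top of the Desktop rules. The sketch below uses only the two differences listed above; the dictionary layout and the name `mobile_rules` are assumptions, and the real provider definitions contain more fields than shown.

```python
# Desktop values as quoted in the note above (other fields omitted).
DESKTOP_RULES = {
    "duckduckgo": {"code_prefixes": ("ff",)},
    "baidu": {"query_param": "wd"},
}

# Mobile differences as quoted in the note above.
MOBILE_OVERRIDES = {
    "duckduckgo": {"code_prefixes": ("fpas",)},
    "baidu": {"query_param": "word"},
}


def mobile_rules(desktop: dict, overrides: dict) -> dict:
    """Return a copy of the Desktop rules with mobile overrides applied per provider."""
    return {
        name: {**rule, **overrides.get(name, {})}
        for name, rule in desktop.items()
    }


print(mobile_rules(DESKTOP_RULES, MOBILE_OVERRIDES))
```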

Hi, I've just checked this on Fenix Beta 5.0.0-beta.1 from 4/29 using a Google Pixel 3a (Android 10).

| Search engine | sap | sap-follow-on | organic |
| ------------- | ------------- | ------------- | ------------- |
| Google | ✔️ Ping 24bf180b-255c-4bfd-9351-baea24cbb130 | ✔️ Ping 89b145c5-4923-4cec-b135-43e965f639c6 | ✔️ Ping 673c260f-c0fa-46be-ad5f-260c31394190 |
| Bing | ✔️ Ping 5a7c71bf-ec7b-46e2-a513-3f60c7311f49 | ❌ | ✔️ Ping b59dfd4a-dbd1-41b8-a3c0-9c181a13a512 |
| Duckduckgo | ✔️ Ping 38f6204b-76f5-4a69-838c-b07f6eb45c34 | ❓ records it as a sap | ✔️ Ping 4439d42d-ef98-4fad-a0ce-8fe0f1884581 |
| Yahoo | ❓ records it as an organic | ❌ | ✔️ Ping f341e31c-37ed-445f-9237-d8c0cafc100b |
| Baidu | ❌ | ❌ | ❌ |

✔️ The Metrics Ping d0694fa1-6617-4dc0-af4b-e8d2dea82fab properly generated:

"labeled_counter": {
  "browser.search.in_content": {
    "bing.in-content.organic.none": 1,
    "bing.in-content.sap.mozb": 1,
    "duckduckgo.in-content.organic.h_": 1,
    "duckduckgo.in-content.sap.fpas": 2,
    "google.in-content.organic.none": 1,
    "google.in-content.sap-follow-on.firefox-b-m": 1,
    "google.in-content.sap.firefox-b-m": 1,
    "yahoo.in-content.organic.none": 2
  }
}
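When checking a ping like the one above, it can help to split each label back into its parts and total the counts per search type. A small sketch follows, with the counts copied from the ping; the helper names are made up for this example.

```python
from collections import Counter
from typing import Dict, Tuple

# Counts copied from the metrics ping excerpt.
PING_COUNTS = {
    "bing.in-content.organic.none": 1,
    "bing.in-content.sap.mozb": 1,
    "duckduckgo.in-content.organic.h_": 1,
    "duckduckgo.in-content.sap.fpas": 2,
    "google.in-content.organic.none": 1,
    "google.in-content.sap-follow-on.firefox-b-m": 1,
    "google.in-content.sap.firefox-b-m": 1,
    "yahoo.in-content.organic.none": 2,
}


def split_label(label: str) -> Tuple[str, str, str]:
    """Split '<provider>.in-content.<type>.<code>' into (provider, type, code)."""
    # maxsplit=3 keeps any dots inside the code part intact.
    provider, _source, search_type, code = label.split(".", 3)
    return provider, search_type, code


def totals_by_type(counts: Dict[str, int]) -> Dict[str, int]:
    """Sum the counts per search type across all providers."""
    totals = Counter()
    for label, n in counts.items():
        _provider, search_type, _code = split_label(label)
        totals[search_type] += n
    return dict(totals)


print(totals_by_type(PING_COUNTS))
```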

Logcat
Glean dashboard

@BranescuMihai - Please review and share your thoughts. ☺️
I'll remove the QA needed label until further notice

@AndiAJ so:

  • For Google everything seems fine
  • For Bing there is a problem: the sap-follow-on should work. I tracked down the problem and I will add a bug on this
  • For DuckDuckGo, it's ok for now that we don't have sap-follow-on, because we don't have those params available on Desktop either. However, it may not be ok that the code is not none for the organic search, will add a bug for it
  • For Yahoo, that's all the info we got, should be the same as on Desktop
  • For Baidu, the codes need to be rechecked, as none of them is similar to Desktop (for example word instead of wd)

As per @BranescuMihai's comment, I'll mark this issue verified as fixed.
The remaining inconsistencies will be tracked in separate issues.
