Fenix: [Telemetry] `string_list` metric type is being misused

Created on 30 Jun 2020  ·  3 Comments  ·  Source: mozilla-mobile/fenix

We just saw this in #11118 and #11446: it looks like `string_list` is being used to record single values that are bools.

This makes it hard for tools to understand the intent and pushes the burden of parsing the data (which is now string data) onto the analysis side, increasing both the cost and the likelihood of bad data.
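To make that burden concrete, here is a minimal Python sketch of what an analyst has to do when a toggle arrives as a one-element string list rather than as a native boolean. The preference key `preferences.toggle_example` and the payload shapes are hypothetical and only illustrate the idea; they are not taken from the actual Fenix pings.

```python
from typing import List


def decode_bool_from_string_list(values: List[str]) -> bool:
    """Recover a boolean that was recorded as a one-element string list.

    Every failure branch below is a way the data can go bad that does not
    exist when the metric is recorded as a native boolean.
    """
    if len(values) != 1:
        raise ValueError(f"expected exactly one value, got {len(values)}")
    text = values[0].strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"not a boolean string: {values[0]!r}")


# Hypothetical payload fragments for the same preference (assumed shapes).
as_string_list = {"preferences.toggle_example": ["true"]}
as_boolean = {"preferences.toggle_example": True}

# The string_list form needs parsing and validation before it can be used.
decoded = decode_bool_from_string_list(as_string_list["preferences.toggle_example"])

# The boolean form can be used as-is.
native = as_boolean["preferences.toggle_example"]
```

Both paths end with the same value, but only the first one can fail at analysis time.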

The code states the following:

        // We purposefully make all of our preferences the string_list format to make data analysis
        // simpler. While it makes things like booleans a bit more complicated, it means all our
        // preferences can be analyzed with the same dashboard and compared.

Setting aside the fact that this will break GLAM, how does a list of strings holding a single value of the wrong type make analysis simpler?

Telemetry 🐞 bug

All 3 comments

cc @sblatz looks like this was a recent addition - do you know why we did this?

We had a conversation over Slack with @sblatz and Marissa Gorlick. Let me try to capture what was said, for future reference.

The decision to use string_list was made in order to make all "data types the same" to ease future dashboards, in response to Marissa's request for standardization.

While the intent was noble, we all agreed that this was not the right path to pursue, as "standardization" does not necessarily imply "all the data is sent as strings", which is a problem both for tools and for storage.

The path forward here is:

  • boolean values (toggles) can be boolean type
  • metrics that only record 1 string value are string type
  • metrics that record more than 1 string value (e.g. accessibility services) are string_list type
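The rule in the list above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `metric_type_for` and the sample values are hypothetical, and the returned strings are the metric type names used in this thread.

```python
from typing import List, Union

# A preference value as the app sees it before recording (assumed shapes).
PreferenceValue = Union[bool, str, List[str]]


def metric_type_for(value: PreferenceValue) -> str:
    """Pick the metric type that matches the shape of the recorded value."""
    # bool is checked first: a toggle should never be recorded as text.
    if isinstance(value, bool):
        return "boolean"
    # Exactly one string value.
    if isinstance(value, str):
        return "string"
    # More than one string value, e.g. a set of enabled services.
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return "string_list"
    raise TypeError(f"unsupported preference value: {value!r}")


# Hypothetical sample values, one per bullet above.
toggle_type = metric_type_for(True)
single_type = metric_type_for("example_value")
multi_type = metric_type_for(["service_a", "service_b"])
```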

While this is a very important change to make, right now we're in code freeze so it can't happen right away.

Any standardization can happen on the analysis side, if needed, through UDFs.
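As a rough sketch of what such analysis-side standardization could look like, the Python function below normalizes correctly typed values into one common shape for a shared dashboard. A real UDF would live in the query engine; the function name `to_string_list` and the sample values are hypothetical.

```python
from typing import List, Optional, Union

TypedValue = Optional[Union[bool, str, List[str]]]


def to_string_list(value: TypedValue) -> List[str]:
    """Normalize a typed metric value into a list of strings.

    The data is stored with its real type; the uniform shape is produced
    only at query time, where it is cheap to change.
    """
    if value is None:
        return []  # metric was not recorded
    if isinstance(value, bool):
        return ["true" if value else "false"]
    if isinstance(value, str):
        return [value]
    if isinstance(value, list):
        return [str(v) for v in value]
    raise TypeError(f"unsupported metric value: {value!r}")


# Hypothetical values of the three metric types, normalized for one dashboard.
normalized = [
    to_string_list(True),
    to_string_list("example_value"),
    to_string_list(["service_a", "service_b"]),
]
```

The conversion is the same one the app was doing before recording, but here it is lossless: the stored data keeps its real type and the string form can be dropped or changed without touching the client.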

I'm happy to provide support with reviews as needed!

cc @mdboom
