Cht-core: Update users-meta database to have consistent month in all records.

Created on 22 Aug 2020 · 6 Comments · Source: medic/cht-core

An issue where telemetry data was not saved under the correct month was fixed in the 3.8.0 release.

Problem Statement
Any partner that was using cht-core prior to 3.8.0 and then upgraded to 3.8.0 or later ends up with data in two different states:

  1. All feedback documents and telemetry data generated after 3.8.0 use the correct month format (1-12).
  2. Any telemetry data generated before 3.8.0 uses the zero-based month format (0-11).

This confuses end users when they query the medic-users-meta database for telemetry data.

Expected Behavior
Update all existing documents so that the date and month format is consistent and correct for every document within a database.

Bug

All 6 comments

Just for transparency, telemetry documents have a metadata field that holds the app version when the doc was created.
https://docs.communityhealthtoolkit.org/apps/guides/performance/telemetry/#metadata

So, as a workaround, end users can check whether metadata.versions.app is earlier than 3.8.0 and calculate the correct month within their data workflow.
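
A minimal sketch of that workaround in Python, for illustration only. It assumes the telemetry doc exposes `metadata.month` and `metadata.versions.app` as described in the linked documentation; the helper names `parse_version` and `normalized_month` are hypothetical, and treating a pre-release such as `3.8.0-beta.1` as 3.8.0 is an assumption, not something confirmed in this thread.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.7.1' (or '3.8.0-beta.1') into a comparable tuple of ints."""
    core = version.split("-", 1)[0]  # assumption: ignore any pre-release suffix
    return tuple(int(part) for part in core.split("."))


def normalized_month(doc: dict) -> int:
    """Return the calendar month (1-12) for a telemetry doc."""
    metadata = doc["metadata"]
    month = metadata["month"]
    if parse_version(metadata["versions"]["app"]) < (3, 8, 0):
        return month + 1  # docs created before 3.8.0 stored months as 0-11
    return month  # 3.8.0 and later already store 1-12


old_doc = {"metadata": {"month": 0, "versions": {"app": "3.7.1"}}}
new_doc = {"metadata": {"month": 12, "versions": {"app": "3.8.0"}}}
january = normalized_month(old_doc)
december = normalized_month(new_doc)
```

Because the stored documents are left untouched, this correction can be applied on every read without risk of shifting a month twice. A version string that is not purely numeric would raise a `ValueError` here and would need separate handling.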

@yrimal Unfortunately CouchDB is slow at running these sorts of updates. In this case it may be possible because it only affects a small number of docs (one per user per month). Another complication is that if we add this migration to 3.11.0, you still won't have access to normalized data until the project upgrades to 3.11.0. Does this suit your timeframe or do you need something sooner?

I'd like to look at workarounds (like the one Diana suggested) to avoid both of the problems above.

  1. Are you querying the data in CouchDB or Postgres?
  2. If CouchDB, are you using a view or just querying _all_docs?
  3. Do you need this for all projects, or would it be acceptable to write a script that could be manually executed on projects that require it?

Adding to v3.11 as a placeholder until further direction by @yrimal

Hi Gareth (@garethbowen), responding to your questions:

  1. I was querying CouchDB, since not all telemetry data is synced for projects. We're still exploring the option of using telemetry data for metrics calculation and wanted to raise this issue.
  2. CouchDB, but just for exploration. The long-term plan is to query Postgres once this information is synced there.
  3. A script would definitely work until we are able to fix this issue.

It won't be very difficult to update the instances we host through a script. But as existing partners move to self-hosted environments, it would be helpful to have consistent data so that either we or the partners can use a single query to pull the required information.

@MaxDiz, this doesn't impact any partners who started using cht-core for the first time after 3.8.0. If we have an idea of the number of partners that are self-hosted, that may help in deciding whether we need to solve this issue or not.

We're pushing a view update to medic-couch2pg that normalizes these values (https://github.com/medic/medic-couch2pg/pull/91) by adding 1 to the month when the version that recorded the telemetry is lower than 3.8.0.

That means that if we update records in the database directly, this view will start failing with this error: https://github.com/medic/medic-couch2pg/issues/87 (because we're going to add +1 to 12 at some point).
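
To make the conflict concrete, here is a small Python sketch of the rule described above (add 1 when the recorded version is lower than 3.8.0). It is not the actual medic-couch2pg view, which is SQL; `view_month` is a hypothetical stand-in, and the scenario assumes an in-place fix that rewrites the month but leaves the recorded app version unchanged.

```python
def view_month(month: int, app_version: str) -> int:
    """Mimic the normalization rule: add 1 when the doc predates 3.8.0."""
    core = app_version.split("-", 1)[0]  # assumption: ignore pre-release suffix
    version = tuple(int(part) for part in core.split("."))
    return month + 1 if version < (3, 8, 0) else month


# December telemetry recorded by a 3.7.x client is stored as 11 (0-11 scheme).
untouched = view_month(11, "3.7.2")

# The same record after a hypothetical in-place fix set the month to 12
# while the recorded version stayed at 3.7.2: the rule shifts it again.
migrated = view_month(12, "3.7.2")

print(untouched, migrated)
```

The untouched record normalizes to 12, while the migrated one becomes 13, which is not a valid month. So the two fixes should not both be applied to the same records.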

Right. I think that view update, together with the ability to query this data in Postgres, makes this change unnecessary.

@yrimal What do you think? Can we close this issue?
