Cht-core: Evaluate third party telemetry tools

Created on 27 Oct 2020 · 11 Comments · Source: medic/cht-core

This issue is to set up a test instance and deploy 3rd party telemetry tools for evaluation.

_Purpose of the evaluation_
The purpose of this evaluation is to understand if a 3rd party telemetry tool can help us to turn various types of telemetry data into actionable insights for product decision making and app deployment improvement outcomes. We are interested in evaluating whether they can provide necessary visibility into the health of CHT systems and trends in app usage and user behavior.

_Functional areas we are interested in evaluating_
1) Automation of client-side error alerts
2) Visualization of metrics to understand historic trends, correlate diverse factors, and measure changes in performance, consumption, or error rates
3) Tracing of a deployed application's data flow and progression to identify user activities that help to build context for system errors and changes in user interaction
4) Ease of extensibility and team use for retrieving, interacting with, and reporting on data from individual and aggregate project deployments

_System requirements_
Must Haves:

  • Low device data usage
  • Low device CPU usage
  • Low device memory usage to contain any app performance impact
  • Device offline storage capabilities
  • Open source solution that can be integrated into the CHT tech stack and be offered to partners as a self-hosted solution
  • Automation of reports
  • Automation of alerts
  • Evaluate data trends over time
  • Feature level usage and A/B testing
  • Ability to define user cohorts

Nice to Haves:

  • Accessible to non/less-technical team members (i.e. does not require understanding of SQL data queries)
  • Ability to set up defined workflows so we can follow user activity through complete sequences
  • Analyze both magnitude and ratios
  • Download to CSV

_Additional context_
Related initiative: "Client-side telemetry and monitoring for improved app deployment and ability to analyze how users engage with the product". It contains additional context and requirements from across the organization.

Monitoring 2 - Medium Investigation

All 11 comments

Assigning to @mrjones-plip based on conversation with @abbyad

Tools that we would like to deploy for evaluation:

Thanks @MaxDiz & @abbyad! I'll dive into this after my CMoHM CI effort is wrapped up in CMoHM#120 (private repo)

FYI @michaelkohn

@MaxDiz - In reviewing this ticket, it looks to be research to solve specifically the problems outlined in #6679, yes? I didn't see #6679 mentioned in this ticket (or vice versa), so I just wanted to be sure we're all on the same page ;)

For this:

setup a test instance and deploy 3rd party telemetry tools for evaluation

What does that look like? A local dev instance I run and expose via DIY ngrok? A more formal, less ephemeral, SRE blessed instance to which they hand over root access? Or something altogether different? (Maybe @abbyad's ears are burning?)

Tools that we would like to deploy for evaluation: https://count.ly/

So, interesting note, count.ly is blocked by default lists in the Pi-hole project, specifically the Unified hosts = (adware + malware) list in StevenBlack Hosts. This hopefully will be minor as it looks to be just niche privacy nerds like me that bump into this (eg Quad9, Google and CloudFlare all resolve this OK on 9.9.9.9, 8.8.8.8 and 1.1.1.1 respectively). Will be entirely moot if we're self hosting.

Correct re #6679 -- my error for not referencing.
@abbyad may have ideas about instance setup. If not, we can talk with SRE together.

Awesome sauce, thanks @MaxDiz !

Re hosting: I'll start with a local dev instance and see how much headway I make. This is all very new to me and I don't know whether 3rd party tools will integrate with ease or need heavier changes to CHT.

Ah - a little slow on the uptake on this - sorry. The "deploy" very clearly states "deploy 3rd party telemetry". I'll check these two out using personal resources I have at my disposal (home lab FTW!) and see where I end up.

The offline support pre-req came up in Eng standup today, so I did some initial research on Countly and Elastic's offerings.

*Note that we cited Elastic Observability as the product of interest, but I _think_ we actually want Application Performance Monitoring (APM)? Elastic's offerings and features are confusing enough that I'll just contact them and see what they recommend. So handy that this ticket is public!

FYI @abbyad @craig-landry @garethbowen

Email off to Elastic (for which I got no email confirmation after submitting the web form, but fingers crossed). Note I cite a forum post in the email that suggests our desired offline model is not supported:

Medic Mobile is considering adding some third party telemetry to our offline first application. We're 100% open source, so you can see the ticket we're working on here (https://github.com/medic/cht-core/issues/6696).

Two questions:

  1. Based off how you read the ticket, does your APM product seem like the best match for what we're looking for?
  2. Does APM support an offline first model? We're looking to track metrics in an Android WebView which is offline most of the time. Our app uses PouchDB <-> CouchDB to solve for this currently. I'm concerned that this is not supported after reading this post.

Thanks!
-mrjones
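For anyone unfamiliar with what "offline first" means for telemetry, here is a minimal, tool-agnostic sketch of the behavior asked about in the email: events are buffered on the device and only uploaded in batches once a connection is available. It is not how Elastic APM, Countly, or PouchDB work internally; the class name `OfflineTelemetryQueue`, the `send_batch` callback, and the default limits are all hypothetical and exist only for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class OfflineTelemetryQueue:
    """Buffers telemetry events locally and uploads them in batches when possible."""

    def __init__(
        self,
        send_batch: Callable[[List[Dict]], bool],
        max_events: int = 1000,
        batch_size: int = 50,
    ) -> None:
        # Hypothetical uploader: returns True if the batch reached the server.
        self._send_batch = send_batch
        # Bounded buffer: once full, the oldest events are dropped, which
        # caps device memory/storage use while offline.
        self._events: Deque[Dict] = deque(maxlen=max_events)
        self._batch_size = batch_size

    def record(self, name: str, value: float) -> None:
        """Store one event locally; never touches the network."""
        self._events.append({"name": name, "value": value})

    def pending(self) -> int:
        """Number of events waiting to be uploaded."""
        return len(self._events)

    def flush(self) -> int:
        """Upload buffered events in batches; stop at the first failed batch.

        Returns the number of events successfully sent. Events in a failed
        batch stay in the buffer for the next attempt.
        """
        sent = 0
        while self._events:
            count = min(self._batch_size, len(self._events))
            batch = [self._events[i] for i in range(count)]
            if not self._send_batch(batch):
                break
            for _ in range(count):
                self._events.popleft()
            sent += count
        return sent
```

Calling `flush()` while offline leaves the buffer untouched, so it is safe to call on a timer or on connectivity changes. Batching keeps the number of requests down, and the bounded buffer trades completeness for a predictable footprint on the device, which lines up with the low data and memory usage requirements in the issue description.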

@craig-landry - per our call today, I thought maybe you'd have two cents to offer on this ticket? Mainly in the product selection

Following on our Dec-2020 Roadmap Planning meeting, we are requesting support from a current funding partner for initial system design and tooling selection. We should hear back from them by mid-Jan. @mrjones-plip let's sync again before moving forward.
