Cht-core: Evaluate third party telemetry tools

Created on 27 Oct 2020 · 11 Comments · Source: medic/cht-core

This issue is to set up a test instance and deploy 3rd party telemetry tools for evaluation.

_Purpose of the evaluation_
The purpose of this evaluation is to understand if a 3rd party telemetry tool can help us to turn various types of telemetry data into actionable insights for product decision making and app deployment improvement outcomes. We are interested in evaluating whether they can provide necessary visibility into the health of CHT systems and trends in app usage and user behavior.

_Functional areas we are interested in evaluating_
1) Automation of client-side error alerts
2) Visualization of metrics to understand historic trends, correlate diverse factors, and measure changes in performance, consumption, or error rates
3) Tracing of a deployed application's data flow and progression to identify user activities that help to build context for system errors and changes in user interaction
4) Ease of extensibility and team use for retrieving, interacting with, and reporting on data from individual and aggregate project deployments

_System requirements_
Must Haves:

Low device data usage
Low device CPU usage
Low device memory usage to contain any app performance impact
Device offline storage capabilities
Open source solution that can be integrated into the CHT tech stack and be offered to partners as a self-hosted solution
Automation of reports
Automation of alerts
Evaluate data trends over time
Feature level usage and A/B testing
Ability to define user cohorts

Nice to Haves:

Accessible to non/less-technical team members (i.e. does not require understanding of SQL data queries)
Ability to set up defined workflows so we can follow user activity through complete sequences
Analyze both magnitude and ratios
Download to CSV

_Additional context_
Related initiative: Client-side telemetry and monitoring for improved app deployment and ability to analyze how users engage with the product contains additional context and requirements from across the Organization

Monitoring 2 - Medium Investigation

MaxDiz

Most helpful comment

Email off to Elastic (for which I got no email confirmation after submitting the web form, but fingers crossed). Note I cite a forum post in the email that suggests our desired offline model is not supported:

Medic Mobile is considering adding some third party telemetry to our offline first application. We're 100% open source, so you can see the ticket we're working on here (https://github.com/medic/cht-core/issues/6696).

Two questions:

Based off how you read the ticket, does your APM product seem like the best match for what we're looking for?

Does APM support an offline first model? We're looking to track metrics in an Android WebView which is offline most of the time. Our app uses PouchDB <-> CouchDB to solve for this currently. I'm concerned that this is not supported after reading this post.

Thanks!
-mrjones

mrjones-plip on 13 Nov 2020

👍2

All 11 comments

Assigning to @mrjones-plip based on conversation with @abbyad

MaxDiz on 27 Oct 2020

Tools that we would like to deploy for evaluation:

MaxDiz on 27 Oct 2020

Thanks @MaxDiz & @abbyad! I'll dive into this after my CMoHM CI effort is wrapped up in CMoHM#120 (private repo)

FYI @michaelkohn

mrjones-plip on 27 Oct 2020

@MaxDiz - In reviewing this ticket, it looks to be research to solve specifically the problems outlined in #6679, yes? I didn't see #6679 mentioned in this ticket (or vice versa), so I just wanted to be sure we're all on the same page ;)

For this:

setup a test instance and deploy 3rd party telemetry tools for evaluation

What does that look like? A local dev instance I run and expose via DIY ngrok? A more formal, less ephemeral, SRE blessed instance to which they hand over root access? Or something altogether different? (Maybe @abbyad's ears are burning?)

Tools that we would like to deploy for evaluation: https://count.ly/

So, interesting note: count.ly is blocked by default lists in the Pi-hole project, specifically the Unified hosts = (adware + malware) list in StevenBlack Hosts. This will hopefully be minor, as it looks to be just niche privacy nerds like me who bump into this (e.g. Quad9, Google and Cloudflare all resolve this OK on 9.9.9.9, 8.8.8.8 and 1.1.1.1 respectively). It will be entirely moot if we're self-hosting.
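For anyone who wants to repeat the blocklist check, here is a minimal sketch of testing a domain against a hosts-format list. It assumes the list uses `0.0.0.0 <domain>` lines as its block entries; the sample text and the helper names `blocked_domains` and `is_blocked` are made up for illustration and are not copied from the real list or from any Pi-hole API.

```python
from __future__ import annotations


def blocked_domains(hosts_text: str) -> set[str]:
    """Collect domains from hosts-format lines such as '0.0.0.0 example.com'."""
    domains: set[str] = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        parts = line.split()
        # Assumption: only 0.0.0.0 entries are block entries; 127.0.0.1 lines
        # (e.g. localhost) are ordinary host mappings and are skipped.
        if len(parts) >= 2 and parts[0] == "0.0.0.0":
            domains.update(name.lower() for name in parts[1:])
    return domains


def is_blocked(domain: str, blocked: set[str]) -> bool:
    """Exact-name match only: hosts files have no wildcard or subdomain rules."""
    return domain.lower().rstrip(".") in blocked


# Hypothetical excerpt for illustration, not copied from the real list.
SAMPLE_HOSTS = """
# sample blocklist
127.0.0.1 localhost
0.0.0.0 count.ly
0.0.0.0 tracker.example  # inline comment
"""

BLOCKED = blocked_domains(SAMPLE_HOSTS)
```

To check the real list, the downloaded hosts file contents would be passed to `blocked_domains` in place of `SAMPLE_HOSTS`.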

mrjones-plip on 10 Nov 2020

Correct re #6679 -- my error for not referencing.
@abbyad may have ideas about instance setup. If not, we can talk with SRE together.

MaxDiz on 10 Nov 2020

Awesome sauce, thanks @MaxDiz !

Re hosting: I'll start with a local dev instance and see how much headway I make - this is all very new to me and I don't know whether 3rd party tools will integrate with ease or need heavier changes to CHT.

mrjones-plip on 10 Nov 2020

👍1

Ah - a little slow on the uptake on this - sorry. The "deploy" very clearly states "deploy 3rd party telemetry". I'll check these two out using personal resources I have at my disposal (home lab FTW!) and see where I end up.

mrjones-plip on 11 Nov 2020

The offline support pre-req came up in Eng stand today, so I did some initial research on Countly and Elastic's offerings.

Countly: Seems to natively support this via the documented Offline Mode.
Elastic APM*: Support seems questionable. I found a flag called async-hooks which the docs very clearly say is based on experimental Node features. There are some interesting tickets, specifically someone trying to get it to work with AWS Lambda, which in turn references another open ticket in core APM.

*Note that we cited Elastic Observability as the product of interest, but I _think_ we actually want Application Performance Monitoring (APM)? Elastic's offerings and features are confusing enough that I'll just contact them and see what they recommend. So handy that this ticket is public!
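To make the offline pre-req concrete, here is a rough sketch of the behaviour we would need from any tool: buffer events while the device is offline, then send them in order once a connection is available. This is not Countly's or Elastic's API; `OfflineTelemetryQueue`, `max_events` and the `send` callback are hypothetical names, and the buffer lives in memory only, whereas a real device would need to persist it to local storage.

```python
from collections import deque
from typing import Callable


class OfflineTelemetryQueue:
    """Buffers events while offline and sends them once a connection exists."""

    def __init__(self, max_events: int = 1000) -> None:
        # A bounded deque drops the oldest event when full, capping memory use.
        self._events: deque = deque(maxlen=max_events)

    def record(self, event: dict) -> None:
        """Store an event locally; no network access happens here."""
        self._events.append(event)

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send queued events oldest first and stop at the first failure.

        `send` is a hypothetical callback that returns True on success.
        Returns the number of events delivered.
        """
        sent = 0
        while self._events:
            if not send(self._events[0]):
                break  # still offline: keep the event for the next attempt
            self._events.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._events)
```

The bounded queue trades completeness for the low memory requirement in the ticket: when the device stays offline for a long time, the oldest events are discarded first.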

FYI @abbyad @craig-landry @garethbowen

mrjones-plip on 12 Nov 2020

@craig-landry - per our call today, I thought maybe you'd have two cents to offer on this ticket? Mainly in the product selection

mrjones-plip on 23 Nov 2020

Following on our Dec-2020 Roadmap Planning meeting, we are requesting support from a current funding partner for initial system design and tooling selection. We should hear back from them by mid-Jan. @mrjones-plip let's sync again before moving forward.

MaxDiz on 22 Dec 2020

👍1
