I have an aggregator service that listens to client requests, talks to a few upstream services, consolidates their responses, and returns the aggregated response to the client. I want to visualize the server activity so that every active session has a horizontal line on the timeline, starting at the point where the request was received by the server and ending at the point where the response was returned to the client. Hovering over any line segment should reveal the time taken by each sub-process to complete the task. The Y-axis could perhaps be annotated with the IP addresses of clients, although I will have to consider how to stack multiple overlapping requests from the same IP address. I would also like to visualize the number of requests being served at any time, as well as the average time taken by the last N requests or in the last T seconds.
Would it be possible to prepare a plugin for this? If so, how would I prepare the data?
The chart you are looking for cannot currently be done by netdata.
But these are charts that can be supported:
You need just a counter. Every time you receive a request you increment the counter. You don't need anything more for that. Netdata will figure out how many requests per second you serve.
If you have a limited number of clients (<20-30), you could also maintain such a counter per client and have a stacked chart (the clients will be the dimensions). If you have hundreds of clients a stacked chart is not practical.
Again, an incrementing counter.
If you keep such a counter per upstream service you could also have a stacked chart (the dimensions will be the upstream services you use).
Every time you get a response from an upstream server you add its duration to a variable. If the incremental difference of this number (i.e. Value@t2 - Value@t1) is divided by the count of responses received in the same timeframe, you will have the average response time for that timeframe (t2 - t1).
Again, by maintaining such a number per upstream provider, you can have a chart with each upstream provider as a dimension.
Similarly to the above, every time you respond to a client you add the duration it took to a number. Dividing the incremental difference of this number by the incremental difference of the number of responses gives the average response time in any timeframe.
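To make the arithmetic concrete, here is a minimal sketch in Python. The names `Sample` and `average_duration` are made up for illustration and are not part of netdata; the two samples stand for the running totals read at t1 and t2.

```python
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    """Running totals read from the application at one point in time."""
    total_duration_ms: float  # sum of all response durations so far
    responses: int            # number of responses so far


def average_duration(earlier: Sample, later: Sample) -> Optional[float]:
    """Average response time between two samples, or None if nothing was served."""
    delta_responses = later.responses - earlier.responses
    if delta_responses <= 0:
        return None  # no responses in this timeframe, so no average to report
    delta_duration = later.total_duration_ms - earlier.total_duration_ms
    return delta_duration / delta_responses


t1 = Sample(total_duration_ms=1200.0, responses=10)
t2 = Sample(total_duration_ms=1500.0, responses=14)
print(average_duration(t1, t2))  # 75.0 (300 ms spread over 4 responses)
```

The application only ever adds to the two totals; the division happens wherever the two samples are compared.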
Of course you can add more such incremental counters in your application to measure other aspects of it.
You can also add absolute numbers, like "currently active client requests" (i.e. every time you receive a request increment a counter, every time you respond to a request decrement the same counter), "currently active upstream requests", etc. If you have different types of requests, you could have multiple such counters (one per type) to be visualized as a stacked chart.
In your application you should not care about data collection frequency. Just increment/decrement the counters at fixed places in your code. Of course take care of concurrency (if your application is multi-threaded) while manipulating the counters.
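As a rough sketch of what this could look like in a multi-threaded Python application (the class `Counters` and its field names are hypothetical, chosen only for this example), a single lock guards every update and a snapshot method hands out a consistent copy:

```python
import threading


class Counters:
    """Hypothetical in-process counters, guarded by one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.requests_total = 0       # only ever incremented
        self.active_requests = 0      # incremented on receive, decremented on respond
        self.response_ms_total = 0.0  # running sum of response durations

    def request_received(self) -> None:
        with self._lock:
            self.requests_total += 1
            self.active_requests += 1

    def response_sent(self, duration_ms: float) -> None:
        with self._lock:
            self.active_requests -= 1
            self.response_ms_total += duration_ms

    def snapshot(self) -> dict:
        """Return a consistent copy of all counters for the collector to read."""
        with self._lock:
            return {
                "requests_total": self.requests_total,
                "active_requests": self.active_requests,
                "response_ms_total": self.response_ms_total,
            }
```

The request handlers call `request_received()` and `response_sent()` at fixed places; whatever exposes the numbers to the outside calls `snapshot()` whenever it is asked, at any frequency.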
Once you have all these counters in your application, you will need to find a way to let netdata collect them. Netdata can start a program of yours as a plugin. This plugin reads the counters from your application and writes them to stdout in the format netdata expects.
That's all. You will have full performance monitoring for your application, by just adding a few counters to it.
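For illustration only, here is a sketch of the text such a plugin might print, based on my understanding of netdata's external plugin protocol (a `CHART` line and its `DIMENSION` lines once, then a `BEGIN` / `SET` / `END` block per update). The chart id `aggregator.requests`, the titles and the priority are assumptions made up for this example; please check the exact line syntax against the netdata plugin documentation before relying on it.

```python
import sys

CHART_ID = "aggregator.requests"  # hypothetical chart id (type.id)


def chart_definition() -> str:
    """Lines sent once at startup to declare the chart and its dimension."""
    return "\n".join([
        # CHART type.id name title units family context charttype priority update_every
        f"CHART {CHART_ID} '' 'Requests served' 'requests/s' aggregator {CHART_ID} line 90000 1",
        # 'incremental' asks netdata to turn the ever-growing counter into a rate
        "DIMENSION requests '' incremental 1 1",
    ])


def chart_update(requests_total: int) -> str:
    """Lines sent on every iteration with the current counter value."""
    return "\n".join([
        f"BEGIN {CHART_ID}",
        f"SET requests = {requests_total}",
        "END",
    ])


def emit(text: str) -> None:
    """Write to stdout and flush, so netdata sees the data immediately."""
    sys.stdout.write(text + "\n")
    sys.stdout.flush()


emit(chart_definition())
emit(chart_update(1042))  # a real plugin would repeat this in a loop, once per update interval
```

A real plugin would fetch the current counter value from the application on each iteration (how it does so is up to you) and sleep for the update interval between iterations.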
I edited my response. Please read it on github.
I am closing this. If you need help, just post it here.