Node_exporter: Add Per Process Information Capture

Created on 20 Jan 2016 · 8 Comments · Source: prometheus/node_exporter

There is a lot of valuable per-process information available in:

/proc/<pid>/status

as well as I/O metrics in:

/proc/<pid>/io

I would suggest aggregating, by default, the data for all processes with the same name to avoid blowing up the number of time series, so that there is only one "mysqld" or "java" data point covering all of those processes together.

It is often very helpful to see the history of the most active processes on the system in terms of CPU, I/O, or memory usage. Many incidents can be traced to some "side load", for example a cron job starting a script that takes all the memory.
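To make the proposal concrete, here is a minimal Python sketch of aggregating by process name as suggested above. It is not node_exporter code. The function names and the choice of fields (VmRSS from the status file, read_bytes and write_bytes from the io file) are assumptions made for illustration.

```python
import os
from collections import defaultdict


def parse_proc_fields(text):
    """Parse 'Key: value' lines as found in /proc/<pid>/status and /proc/<pid>/io."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sample_from_text(status_text, io_text):
    """Build one per-process sample from the contents of the two files."""
    status = parse_proc_fields(status_text)
    io = parse_proc_fields(io_text)
    # VmRSS looks like "1234 kB"; kernel threads have no VmRSS line at all.
    rss_kb = int(status.get("VmRSS", "0 kB").split()[0])
    return {
        "name": status.get("Name", "unknown"),
        "rss_bytes": rss_kb * 1024,
        "read_bytes": int(io.get("read_bytes", 0)),
        "write_bytes": int(io.get("write_bytes", 0)),
    }


def aggregate_by_name(samples):
    """Sum the samples of all processes that share a name."""
    totals = defaultdict(
        lambda: {"processes": 0, "rss_bytes": 0, "read_bytes": 0, "write_bytes": 0}
    )
    for sample in samples:
        total = totals[sample["name"]]
        total["processes"] += 1
        for key in ("rss_bytes", "read_bytes", "write_bytes"):
            total[key] += sample[key]
    return dict(totals)


def read_samples(proc_root="/proc"):
    """Yield a sample per PID directory, skipping processes that exit or deny access."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                status_text = f.read()
            with open(os.path.join(proc_root, entry, "io")) as f:
                io_text = f.read()
        except OSError:
            continue
        yield sample_from_text(status_text, io_text)
```

On a Linux host, `aggregate_by_name(read_samples())` would return one entry per process name. Reading another user's /proc/<pid>/io normally requires elevated privileges, so an unprivileged run only sees part of the system. The kernel also truncates the Name field to 15 characters, so distinct long names can collapse into one group.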

All 8 comments

Process names aren't always a good logical grouping. This is why people are focused on containers and cgroups. They're a much more logical way to monitor subsystem resource usage.

https://github.com/google/cadvisor provides generic container monitoring, including Prometheus metrics.

We've discussed this several times previously, and have no plans to support this. If you want to monitor individual processes you should either directly instrument them, or write an exporter.

Metrics based on PIDs are very difficult to deal with in practice (a single java data point is useless if you have more than one java process, for example), can't be handled in a generic way, risk Prometheus stability, and don't follow the Prometheus philosophy.

Incidentally, we are okay with adding /proc/pid stats to exporters via the standard exports. The haproxy exporter does this for example.

Thanks for the clarification of your position.
Whatever I might wish, many of our customers are not using containers at this point, and having process information has proven helpful in many troubleshooting cases.

Does the fact that you're not interested in these metrics mean that you're not interested in implementing them, or that you would not accept a patch from us that adds them to node_exporter as an option disabled by default?

Yeah, pretty sure it wouldn't go into the node exporter at all due to the above-mentioned problems with tracking things by their PIDs or process names. Better to do this kind of special-purpose tracking yourself outside of the node exporter (either via the textfile collector or some other means).
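As a rough illustration of the textfile collector route mentioned above: an external script can write metrics in the Prometheus text format to a `.prom` file in the directory the node exporter is pointed at with `--collector.textfile.directory`. The sketch below is only one way to do that. The metric name `process_group_resident_memory_bytes`, the `name` label, and the helper functions are hypothetical.

```python
import os
import tempfile


def escape_label(value):
    """Escape backslash, double quote, and newline for a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(rss_by_name):
    """Render a {process name: bytes} mapping in the Prometheus text format."""
    metric = "process_group_resident_memory_bytes"  # hypothetical metric name
    lines = [
        "# HELP %s Resident memory summed over processes with the same name." % metric,
        "# TYPE %s gauge" % metric,
    ]
    for name in sorted(rss_by_name):
        lines.append('%s{name="%s"} %d' % (metric, escape_label(name), rss_by_name[name]))
    return "\n".join(lines) + "\n"


def write_prom_file(directory, filename, text):
    """Write to a temporary file, then rename, so a scrape never sees a partial file."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(text)
    # mkstemp creates the file with mode 0600; the exporter may run as another user.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, os.path.join(directory, filename))
```

A cron job or systemd timer could call `write_prom_file(directory, "processes.prom", render_metrics(totals))` at a regular interval. The temporary file uses a `.tmp` suffix because the collector only reads files ending in `.prom`.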

Specifically for MySQL, it would make sense to understand mysqld's pidfiles and collect process information on those from the MySQL exporter.

We'd not accept code to add this feature to the node exporter, as it's an application rather than a machine-level metric. The node exporter is only for pure machine metrics; it's a non-goal to have the node exporter as a general clearing house for metrics.

@brian-brazil @juliusv How about getting the process PID from supervisord, since the node_exporter already gets service status from supervisord?
