As raised at https://github.com/prometheus/node_exporter/issues/1201#issuecomment-540264723, it isn't possible to enable --collector.systemd.enable-restarts-metrics without also pulling in the many other metrics that come with --collector.systemd.
For example, in our Prometheus deployment, the node_systemd_unit_state metric accounts for 14% of our total metric key space. That is wasteful, as we only need node_systemd_service_restart_total (which is 2% of the key space).
non-solution: "filter at the server side". We want to prevent these sizable, unneeded metrics from affecting the leaves and intermediaries of the system.
questionable-solution: "use systemd exporter". Why must we learn, configure, and deploy another exporter when node_exporter already has the metric we need?
Proposal: add a new flag to disable the default metrics of --collector.systemd (those emitted by the collector's collectUnitStatusMetrics method).
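To make the proposal concrete, here is a minimal sketch of how such a flag could gate the unit-state metrics independently of the restart metrics. It is not node_exporter's actual code: the flag name collector.systemd.enable-unit-state-metrics, the unit struct, and the collect function are all hypothetical, the metric labels are simplified, and the standard library flag package stands in for the flag library node_exporter really uses.

```go
package main

import (
	"flag"
	"fmt"
)

// unit is a simplified, hypothetical stand-in for the per-unit data
// the systemd collector reads.
type unit struct {
	Name        string
	ActiveState string
	NRestarts   uint32
}

// collect returns the sample lines that would be exposed for the given
// units. Each metric family is gated by its own boolean, so restart
// counts can be emitted without the much larger unit-state series.
func collect(units []unit, unitState, restarts bool) []string {
	var out []string
	for _, u := range units {
		if unitState {
			out = append(out, fmt.Sprintf(
				"node_systemd_unit_state{name=%q,state=%q} 1", u.Name, u.ActiveState))
		}
		if restarts {
			out = append(out, fmt.Sprintf(
				"node_systemd_service_restart_total{name=%q} %d", u.Name, u.NRestarts))
		}
	}
	return out
}

func main() {
	// Hypothetical flag: defaults to true so existing behavior is unchanged.
	unitState := flag.Bool("collector.systemd.enable-unit-state-metrics", true,
		"Emit per-unit state metrics (hypothetical flag for illustration).")
	// Mirrors the existing restart-metrics option mentioned in this issue.
	restarts := flag.Bool("collector.systemd.enable-restarts-metrics", false,
		"Emit per-service restart counters.")
	flag.Parse()

	units := []unit{{Name: "sshd.service", ActiveState: "active", NRestarts: 2}}
	for _, line := range collect(units, *unitState, *restarts) {
		fmt.Println(line)
	}
}
```

Defaulting the new flag to true would keep current deployments unchanged, while users who only want restart counters could turn the unit-state series off.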
cc: @pgier
> questionable-solution: "use systemd exporter". Why must we learn, configure, and deploy another exporter when node_exporter already has the metric we need?
We are planning to deprecate all "process supervisor" collectors and remove them in node_exporter 2.0. Once https://github.com/povilasv/systemd_exporter/issues/6 is complete, we will be marking the systemd collector officially deprecated. So keep that in mind.
Thank you.
Since it's hopefully a small change, I'd like to proceed by opening a node_exporter PR and coordinating with the systemd_exporter maintainer on parity and the command-line API.
I don't think we should add that, since we're going to deprecate the collector anyway.
Please don't deprecate the systemd metrics until the systemd_exporter has reached a 1.0.0 release.