Prometheus-operator: node_exporter should use recording rules to publish node name

Created on 9 Nov 2017 · 29 comments · Source: prometheus-operator/prometheus-operator

Quoting @brancz from https://github.com/coreos/prometheus-operator/issues/135#issuecomment-341131667:

As I mentioned above, the problem is solvable with recording rules today. I have been playing with this lately and will probably publish the recording rules and updated dashboards accordingly, although I believe the node name is most useful here, as it's modifiable to contain whatever you want and typically defaults to the hostname.

Creating this issue to track this! Thank you.


All 29 comments

Any updates on this? The stats are quite unusable with IPs...

I apologize for still not having gotten to this. We are currently reworking our Grafana dashboards; once the transition is done, we will be able to take care of this easily.

Still looking forward to getting this done (although I've already implemented a bunch of workarounds, none of them seems to completely fix the issue).

Will recording rules result in performance problems?
IMHO, this could also be implemented as described in: https://github.com/coreos/prometheus-operator/issues/135#issuecomment-383617438.

Can we have an update on this issue?

@brancz If you want you can point me to your WIP implementation. Happy to do the last missing steps.

Coincidentally I'm actually working on this currently. Updates coming soonish.

Still no progress on this?

Actually this is now done, the dashboards allow you to select by node name now instead of node IP, and there are recording rules to do this in the expression browser as well. Closing here :tada:.

Should I update to master to get this? Could you point me at the commit, please?

I've just deleted my Prometheus stack (the whole monitoring namespace) and recreated it with the newest manifests from the repo, and the Nodes dashboard still shows virtual IP addresses.

It uses whatever the name of your node objects is, the same as when you do `kubectl get nodes`.

Should I somehow regenerate the manifests to get this?

My nodes are named after their AWS private DNS names, but as I said, the dashboard still shows the node-exporter pods' IP addresses.

Somewhat depends on how you installed your stack, but even if you just apply the manifests in `contrib/kube-prometheus/manifests/` against your cluster, you should see this in the "K8s / USE Method / Node" dashboard.

Correct, "USE Method / Node" shows the proper node name, but the 'Nodes' dashboard still uses the old values. ;-) This is why I got confused.

Yes, there is still some cleaning up/migrating to do, but the goal is to have one set of dashboards that can be shared among the whole community! :slightly_smiling_face: This effort is being worked on in this repo: https://github.com/kubernetes-monitoring/kubernetes-mixin

@brancz where is the magic line to convert the instance to the node name? I want to do the same for my dashboards, but can't find it so far...

@dimm0 for example, the "K8s / USE Method / Node" dashboard does selection based on `kube_node_info`, and then queries this recording rule: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/297f40ffc7dc07b690bf715f3952a8b729ba9197/dashboards/use.libsonnet#L90

And the recording rule is defined here: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/rules/rules.libsonnet#L128-L134

Great, thanks!

Still trying to figure this out. Your examples map from pod and namespace; I also see that conversion is possible based on private pod IPs.
In my case, the daemons providing the GPU stats run directly on the nodes, and I have `instance="real_ip:9114"` labels. I can remove the port, but so far I have found no metric that allows converting a real IP to a node name.
Is https://github.com/coreos/prometheus-operator/issues/135#issuecomment-383617438 helpful in my case, then?

It somewhat depends on how these daemons are deployed. Are they just a daemonset? If that's the case, then they are just pods, and you can use the `kube_pod_info` metric to figure out the relationship between private IP and node name: https://github.com/kubernetes/kube-state-metrics/blob/master/Documentation/pod-metrics.md

The daemons are not running in containers at all; they are deployed directly on the nodes. I'm manually setting the endpoints for those:

https://gitlab.com/dimm/prp_k8s_config/blob/master/monitoring/rules.jsonnet#L145-211

https://prometheus.nautilus.optiputer.net/targets#job-gpu-mon

https://grafana.nautilus.optiputer.net/d/lhzgeA3zk/all-cuda-gpu?

You have two options: you can either specify entirely your own Prometheus scrape config using the `additionalScrapeConfigs` feature, or you re-use the kubelet endpoints object and set the port you want to scrape manually by just specifying the `targetPort` as the port number you want to scrape.

you can either specify entirely your own Prometheus scrape config using the additionalScrapeConfigs feature

Can I add just my scrape config, or do I have to repeat the whole config you already have? And what exactly should I add to it?

or you re-use the kubelet endpoints object and set the port you want to scrape manually by just specifying the targetPort as the port number you want to scrape

How would that help having the node name show up in the dashboard?

Can I add just my scrape config, or do I have to repeat the whole config you already have? And what exactly should I add to it?

Whatever you put into `prometheus.spec.additionalScrapeConfigs` will be added to the `scrape_configs` section of your Prometheus configuration.

Ok, great. But I still don't have the `__meta_kubernetes_node_name` field after setting `targetRef.kind` to `Node`...

https://github.com/coreos/prometheus-operator/issues/135#issuecomment-383617438

What is depicted in https://github.com/coreos/prometheus-operator/issues/135#issuecomment-383617438 is not accurate, as that PR never got merged. The following meta labels will be added instead, with the respective values:

  • `__meta_kubernetes_endpoint_address_target_kind`: will be set to `Node`
  • `__meta_kubernetes_endpoint_address_target_name`: will be set to the node's name

I don't see those either...

I solved this by running GPU exporters as pods.
