VictoriaMetrics: [vmagent] error logs showing both skipped target and the kept one

Created on 24 Jul 2020 · 9 Comments · Source: VictoriaMetrics/VictoriaMetrics

Is your feature request related to a problem? Please describe.
I am trying to get rid of the "skipping duplicate scrape target with identical labels" error log spam,
but the logs only show the skipped target and not the one it was skipped in favor of.
So it's hard to find out what is wrong.

Describe the solution you'd like
Also add the description of the kept target, e.g.:

timestamp error VictoriaMetrics/lib/promscrape/scraper.go:269 skipping duplicate scrape target with identical labels; endpoint=http://xxx.xxx.xxx.xxx:10254/metrics, labels={app="nginx-ingress", component="controller", . . .}; _for; endpoint=http://yyy.yyy.yyy.yyy:10254/metrics, labels={app="nginx-ingress", component="controller", . . .};_ make sure service discovery and relabeling is set up properly

Describe alternatives you've considered
Sorry, no idea.

enhancement vmagent

All 9 comments

I think it would be great to print all the original __meta_* labels for the skipped target. That would make it easier to determine the root cause of scrape targets ending up with identical labels.

BTW, the most frequent case for duplicate scrape targets in kubernetes service discovery is multiple open ports per pod. See https://victoriametrics.github.io/vmagent.html#troubleshooting for the solution.
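As a rough sketch of what that troubleshooting entry describes (reconstructed from memory of the vmagent docs, so treat the label names as assumptions that only apply when pods carry a prometheus.io/port annotation), a rule like the following keeps only the discovered port that matches the annotated one. Note that keep_if_equal is a VictoriaMetrics-specific relabeling action:

```yaml
relabel_configs:
  # Assumption: pods declare their metrics port via the prometheus.io/port annotation.
  # keep_if_equal keeps a target only when all listed source labels hold the same value,
  # so the extra per-port targets discovered for the same pod are dropped.
  - action: keep_if_equal
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_container_port_number
```

Pods that lack the annotation would most likely fail the comparison and be dropped as well, so the rule may need adjusting for jobs that rely on a different discovery convention.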

Note that Prometheus silently accepts duplicate scrape targets with identical labels. This results in duplicate scrapes and duplicate datapoints in the storage.

Agreed, I'm aware of that. The funny thing is that the pod with multiple open ports (myapp-svc) doesn't cause any errors, but if I add your solution from the troubleshooting section (keep_if_equal on the Prometheus port vs. the container port), it drops all of the myapp-svc pod's metrics, which are important to me.

  • The errors relate only to kube-dns and nginx-ingress-controller.
    Even if I drop all scrape_configs jobs and leave just kubernetes-pods (ingress) or kubernetes-service-endpoints (kube-dns), the error still occurs. But if I remove them from the standard kubernetes_sd_configs in the Helm chart and add a static config for them instead, there is no error, so the targets don't overlap with other jobs at all.
  • Overall, it would be great to have some options for debugging these errors and more visibility into this topic :)

The following relabeling rule could help debugging the issue:

- action: labelmap
  regex: "__meta_(.+)"
  replacement: "meta_${1}"

It must be placed at the end of target relabeling rules.

It adds all the __meta_* labels to the target, so you could inspect these labels for duplicated targets at http://vmagent:8429/targets page.

Should it just be placed at the end of the target relabeling rules, or should the existing - action: labelmap be replaced with it?

This depends on which labels are mapped or dropped with existing labelmap, labeldrop or labelkeep actions. If unsure, try just replacing all these relabeling rules with a single labelmap rule outlined above.
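For illustration, a minimal sketch of where the debugging rule could sit within a job. The job name kubernetes-pods is taken from this thread; the pod role and the placeholder comment standing in for existing rules are assumptions, not a copy of anyone's actual config:

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod            # assumed role for this job
    relabel_configs:
      # ... existing keep / drop / replace rules stay here ...

      # Debug rule, placed last: copies every __meta_* label to a meta_* label
      # so it survives relabeling and shows up on the /targets page.
      - action: labelmap
        regex: "__meta_(.+)"
        replacement: "meta_${1}"
```

Since the copied labels end up on every series scraped from the target, the rule is probably best removed once the duplicates have been tracked down.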

The following relabeling rule could help debugging the issue:

- action: labelmap
  regexp: "__meta_(.+)"
  replacement: "meta_${1}"

It must be placed at the end of target relabeling rules.

It adds all the __meta_* labels to the target, so you could inspect these labels for duplicated targets at http://vmagent:8429/targets page.

Thanks @valyala. It is really helpful.
There is a typo in your example which was hard to spot: regexp should be regex.

@ssh2n maybe it will also help you.
I placed the above example at the end of relabel_configs: in several jobs.
After fixing that typo I was able to see all the labels in the logs.
In my case the duplicate targets differed in only one label: __meta_kubernetes_pod_container_init: "true"
I fixed it by adding the following rule from the troubleshooting guide to the end of the "kubernetes-pods" job:

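- action: drop
  source_labels: [__meta_kubernetes_pod_container_init]
  # Without an explicit regex the default matches every value, which would drop all targets.
  regex: true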

Thanks for spotting the typo! Fixed it in the original comment.

vmagent should show the original labels in the duplicate scrape target error message after commit 71ea4935de4b09e552c979faaa04c977462fb6c8. These labels could make it simpler to detect the root cause in relabeling that led to the duplicate targets.

This commit also adds the -promscrape.suppressDuplicateScrapeTargetErrors command-line flag, which suppresses such error messages for those who don't want to fix the original issue in their relabel configs.
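If vmagent runs in Kubernetes, the flag could be passed through the container arguments. A sketch, assuming a container named vmagent and a config path of /config/scrape.yml (both hypothetical); only the two flag names come from vmagent itself:

```yaml
containers:
  - name: vmagent                                   # hypothetical container name
    args:
      - -promscrape.config=/config/scrape.yml       # hypothetical config path
      - -promscrape.suppressDuplicateScrapeTargetErrors
```

This only hides the log messages; the duplicate targets are still skipped, so fixing the relabeling remains the cleaner option.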

This functionality can be tested by building vmagent from the commit 71ea4935de4b09e552c979faaa04c977462fb6c8 according to these docs.

Features mentioned above have been included in vmagent and single-node VictoriaMetrics starting from v1.44.0. Closing the feature request as done.
