Beats: Metricbeat hints builder discovers same hosts multiple times

Created on 1 May 2019 · 13 Comments · Source: elastic/beats

I was able to reproduce this on HEAD. The underlying problem is caused by this check:
https://github.com/elastic/beats/blob/master/metricbeat/autodiscover/builder/hints/metrics.go#L173

It can be reproduced as follows:

Come up with a pod spec that has two containers, one exposing port 8080 and one exposing no ports at all.

Have annotations like:

co.elastic.metrics/module: prometheus
co.elastic.metrics/hosts: ${data.host}:8080

This causes the same endpoint to be polled twice. It is discovered once because container 1 explicitly defines the port in its container spec, and a second time because container 2 has no port definition while the annotation specifies a port.

A temporary workaround is to pin the annotation to a single container.
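The double discovery can be illustrated with a small model. This is only a sketch of the behaviour described above, not the Go code behind the linked check: the function name `resolve_hosts`, the use of port 0 to mean "no port exposed", and the exact matching condition are all assumptions made for illustration.

```python
def resolve_hosts(hint_hosts, container_port, pod_ip):
    """Simplified model of the host-selection check (an assumption, not the Go source).

    A hint host is kept when it references ${data.port}, when it names the
    container's exposed port, or when the container exposes no port at all
    (modelled here as container_port == 0).
    """
    selected = []
    for host in hint_hosts:
        matches_port = container_port != 0 and host.endswith(f":{container_port}")
        if "${data.port}" in host or matches_port or container_port == 0:
            selected.append(host.replace("${data.host}", pod_ip))
    return selected


hint = ["${data.host}:8080"]
pod_ip = "172.17.0.8"  # both containers share the Pod IP

# Container 1 exposes 8080; container 2 exposes nothing (port 0 in this model).
prometheus_hosts = resolve_hosts(hint, 8080, pod_ip)
redis_hosts = resolve_hosts(hint, 0, pod_ip)

print(prometheus_hosts)
print(redis_hosts)
```

Under these assumptions both containers resolve to the same host list, so two identical module configurations are produced for one Pod.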

Metricbeat Integrations Platforms bug containers in progress

All 13 comments

@exekias any thoughts on this? My suggestion would be to require containers to expose ports explicitly in order to be monitored.

The part I don't understand is: shouldn't autodiscover detect that the final configuration is the same and launch just one instance of the module?

It doesn't seem to work that way. We should investigate that, then.

It wouldn't be the same, as the meta would be different, right?

That sounds correct. Hey @ChrsMark, I think you have played with autodiscover recently; any chance you could confirm this behavior?

Hey, yep, I could check it soonish!

Hey I was able to reproduce this behaviour.

2 runners launched

Pod config:

apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics/hosts: ${data.host}:8080
spec:

  restartPolicy: Never

  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
      - containerPort: 8080

  - name: redis-container
    image: redis

Metricbeat logs:

2019-11-26T08:55:07.348Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:166    Got a start event: map[config:[0xc000ccb290] host:172.17.0.8 id:582f6f2e-677a-4691-88a4-f64b61fc65e3.prometheus-container kubernetes:{"annotations":{"co":{"elastic":{"metrics/hosts":"${data.host}:8080","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"d5f46003c6cc5960a22e77ff3877487ae585f1d9db90fcc03e90aa79c5aaee33","image":"prom/prometheus","name":"prometheus-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}} meta:{"kubernetes":{"container":{"image":"prom/prometheus","name":"prometheus-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}}} port:8080 provider:e6c5fbe1-560f-439a-96ea-4bdf830d2cac start:true]
2019-11-26T08:55:07.348Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:191    Generated config: map[enabled:true hosts:[172.17.0.8:8080] metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T08:55:07.348Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:253    Got a meta field in the event
2019-11-26T08:55:07.348Z    DEBUG   [autodiscover]  cfgfile/list.go:62  Starting reload procedure, current runners: 0
2019-11-26T08:55:07.348Z    DEBUG   [autodiscover]  cfgfile/list.go:80  Start list: 1, Stop list: 0
2019-11-26T08:55:07.348Z    DEBUG   [autodiscover]  cfgfile/list.go:101 Starting runner: prometheus [metricsets=1]
2019-11-26T08:55:07.349Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:166    Got a start event: map[config:[0xc00092e9f0] host:172.17.0.8 id:582f6f2e-677a-4691-88a4-f64b61fc65e3.redis-container kubernetes:{"annotations":{"co":{"elastic":{"metrics/hosts":"${data.host}:8080","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"7632494bbf7fd078a3551c7d6c3847b13d2a7f3c1092e925a1e6fce3b9f226d5","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}}} provider:e6c5fbe1-560f-439a-96ea-4bdf830d2cac start:true]
2019-11-26T08:55:07.349Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:191    Generated config: map[enabled:true hosts:[172.17.0.8:8080] metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T08:55:07.349Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:253    Got a meta field in the event
2019-11-26T08:55:07.349Z    DEBUG   [autodiscover]  cfgfile/list.go:62  Starting reload procedure, current runners: 1
2019-11-26T08:55:07.350Z    DEBUG   [autodiscover]  cfgfile/list.go:80  Start list: 1, Stop list: 0
2019-11-26T08:55:07.350Z    DEBUG   [autodiscover]  cfgfile/list.go:101 Starting runner: prometheus [metricsets=1]

Pin the annotation to one container

Pod config:

  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics.prometheus-container/hosts: ${data.host}:8080

Metricbeat launches only one runner:

Generated config: map[enabled:true hosts:[172.17.0.5:8080] metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T09:05:41.450Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:253    Got a meta field in the event
2019-11-26T09:05:41.450Z    DEBUG   [autodiscover]  cfgfile/list.go:62  Starting reload procedure, current runners: 0
2019-11-26T09:05:41.451Z    DEBUG   [autodiscover]  cfgfile/list.go:80  Start list: 1, Stop list: 0
2019-11-26T09:05:41.456Z    DEBUG   [autodiscover]  cfgfile/list.go:101 Starting runner: prometheus [metricsets=1]
2019-11-26T09:05:41.456Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:166    Got a start event: map[config:[0xc000deae40] host:172.17.0.5 id:dfacd6e3-0b4d-47a1-b1f5-d694c2a0592c.redis-container kubernetes:{"annotations":{"co":{"elastic":{"metrics":{"prometheus-container/hosts":"${data.host}:8080"},"metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics.prometheus-container/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"9d5221ad274adde255ee22cf27a71abdfcdfa85c711f7fa23272e9d0324ae5b8","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"dfacd6e3-0b4d-47a1-b1f5-d694c2a0592c"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"dfacd6e3-0b4d-47a1-b1f5-d694c2a0592c"}}} provider:7481c04b-7042-402a-9be6-683580dface8 start:true]
2019-11-26T09:05:41.456Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:191    Generated config: map[enabled:true metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T09:05:41.456Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:253    Got a meta field in the event
2019-11-26T09:05:41.456Z    ERROR   [autodiscover]  autodiscover/autodiscover.go:205    Auto discover config check failed for config &{{<nil> } <nil> 0xc000b59b40}, won't start runner: 1 error: host parsing failed for prometheus-collector: error parsing URL: empty host
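The logs above suggest how pinned annotations are scoped: the Pod-wide `module` hint still reaches `redis-container`, while the pinned `hosts` hint does not, which leaves that container with a config that has no hosts. A rough model of that scoping follows; `hints_for_container` is a hypothetical helper written for this illustration and is not the Beats implementation.

```python
PREFIX = "co.elastic.metrics"


def hints_for_container(annotations, container):
    """Merge Pod-wide hints with hints pinned to one container (simplified model)."""
    hints = {}
    pinned = {}
    for key, value in annotations.items():
        scope, _, name = key.partition("/")
        if scope == PREFIX:
            hints[name] = value  # applies to every container in the Pod
        elif scope == f"{PREFIX}.{container}":
            pinned[name] = value  # applies to this container only
    hints.update(pinned)  # assumption: container-specific hints win
    return hints


annotations = {
    "co.elastic.metrics/module": "prometheus",
    "co.elastic.metrics.prometheus-container/hosts": "${data.host}:8080",
}

prometheus_hints = hints_for_container(annotations, "prometheus-container")
redis_hints = hints_for_container(annotations, "redis-container")

print(prometheus_hints)
print(redis_hints)
```

In this model `redis-container` ends up with a module but no hosts, which matches the generated config and the "empty host" error shown in the log.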

I think this is a bug; we should not be launching the same configuration twice for the same Pod. Something must be failing, as I was under the impression that this is checked here: https://github.com/elastic/beats/blob/d8cd8c437fe7891addb56f3cd7a048aed422b644/libbeat/autodiscover/autodiscover.go#L211

eventID

In this case this cannot work since we have different events for the two different containers of the same Pod:

2019-11-26T09:48:44.093Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:210    eventID: 878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.prometheus-container
2019-11-26T09:48:44.093Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:211    hash: 2490033313956307158

2019-11-26T09:48:44.107Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:210    eventID: 878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.redis-container
2019-11-26T09:48:44.107Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:211    hash: 2490033313956307158

This is something that we want in some cases like Filebeat, where we want to handle each container explicitly.

In Metricbeat we can skip this since all containers in the same Pod have the same IP. So this current issue can be resolved if we create the eventIDs without the container name as a suffix.

What we should verify is whether unifying the eventID in Metricbeat events has any side effects on metadata etc. I will open a PR with this approach and we can discuss it there.
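To make the eventID argument concrete, here is a minimal sketch that assumes runners are de-duplicated on the pair (eventID, config hash), as the debug output above suggests. The function `start_runners` and its `strip_container_suffix` flag are hypothetical; the IDs and the hash are copied from the log lines.

```python
def start_runners(events, strip_container_suffix):
    """Return the set of (event_id, config_hash) keys that would start a runner."""
    running = set()
    for event_id, config_hash in events:
        if strip_container_suffix:
            # Hypothetical change: drop the ".<container name>" suffix so that
            # all containers of one Pod share the same eventID.
            event_id = event_id.rsplit(".", 1)[0]
        running.add((event_id, config_hash))  # a repeated key starts nothing new
    return running


events = [
    ("878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.prometheus-container",
     2490033313956307158),
    ("878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.redis-container",
     2490033313956307158),
]

with_suffix = start_runners(events, strip_container_suffix=False)
without_suffix = start_runners(events, strip_container_suffix=True)

print(len(with_suffix), len(without_suffix))
```

With the container suffix the two events produce two distinct keys even though the config hash is identical; without it they collapse into one.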

Pinging @elastic/integrations-platforms (Team:Platforms)

Hey @exekias, @vjsamuel, @jsoriano ! Think this might be resolved now by #18564.

I tried to verify the fix and it seems to have worked for me. Deploying:

apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics/hosts: ${data.host}:8080
spec:

  restartPolicy: Never

  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
      - containerPort: 8080

  - name: redis-container
    image: redis

Here is what I get:

2020-05-21T12:30:08.672Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:196    Generated config: {
  "enabled": true,
  "hosts": [
    "xxxxx"
  ],
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "1m",
  "timeout": "3s"
}
2020-05-21T12:30:08.672Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:258    Got a meta field in the event
2020-05-21T12:30:08.672Z    DEBUG   [autodiscover]  cfgfile/list.go:62  Starting reload procedure, current runners: 0
2020-05-21T12:30:08.673Z    DEBUG   [autodiscover]  cfgfile/list.go:80  Start list: 1, Stop list: 0
2020-05-21T12:30:08.673Z    DEBUG   [autodiscover]  cfgfile/list.go:101 Starting runner: RunnerGroup{prometheus [metricsets=1]}
2020-05-21T12:30:08.674Z    DEBUG   [module]    module/wrapper.go:127   Starting Wrapper[name=prometheus, len(metricSetWrappers)=1]
2020-05-21T12:30:08.674Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:174    Got a start event: map[config:[0xc00102f3e0] host:172.17.0.7 id:cf08c19e-c973-4f82-8461-a61e106fa0a0.redis-container keystore:0xc00004da00 kubernetes:{"annotations":{"co":{"elastic":{"metrics/hosts":"${data.host}:8080","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"8696be13ef36f460c48f74942955c3213c2040bf4235c4d3ca5f0b80ebce4677","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"cf08c19e-c973-4f82-8461-a61e106fa0a0"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"cf08c19e-c973-4f82-8461-a61e106fa0a0"}}} provider:877348a2-a905-4e4d-ae1f-10589fb0eea2 start:true]
2020-05-21T12:30:08.674Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:196    Generated config: {
  "enabled": true,
  "hosts": [
    "xxxxx"
  ],
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "1m",
  "timeout": "3s"
}
2020-05-21T12:30:08.674Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:258    Got a meta field in the event
2020-05-21T12:30:08.674Z    DEBUG   [autodiscover]  cfgfile/list.go:62  Starting reload procedure, current runners: 1
2020-05-21T12:30:08.675Z    DEBUG   [autodiscover]  cfgfile/list.go:80  Start list: 0, Stop list: 0
2020-05-21T12:30:08.674Z    DEBUG   [module]    module/wrapper.go:181   prometheus/collector will start after 2.336432641s
2020-05-21T12:30:11.012Z    DEBUG   [module]    module/wrapper.go:189   Starting metricSetWrapper[module=prometheus, name=collector, host=172.17.0.7:8080]

Moreover, testing with 2 target containers:

apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics/hosts: ${data.host}:9090
    co.elastic.metrics.redis-container/module: redis
    co.elastic.metrics.redis-container/hosts: ${data.host}:6379
spec:

  restartPolicy: Never

  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
      - containerPort: 9090

  - name: redis-container
    image: redis
    ports:
      - containerPort: 6379

Here is the result:

2020-05-21T12:44:44.103Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:196    Generated config: {
  "enabled": true,
  "hosts": [
    "xxxxx"
  ],
  "metricsets": [
    "collector"
  ],
  "module": "prometheus",
  "period": "1m",
  "timeout": "3s"
}
2020-05-21T12:44:44.103Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:258    Got a meta field in the event
2020-05-21T12:44:44.103Z    DEBUG   [autodiscover]  cfgfile/list.go:62  Starting reload procedure, current runners: 0
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  cfgfile/list.go:80  Start list: 1, Stop list: 0
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  cfgfile/list.go:101 Starting runner: RunnerGroup{prometheus [metricsets=1]}
2020-05-21T12:44:44.104Z    DEBUG   [module]    module/wrapper.go:127   Starting Wrapper[name=prometheus, len(metricSetWrappers)=1]
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:174    Got a start event: map[config:[0xc0003a04e0] host:172.17.0.7 id:b290597a-7950-4fa2-a85a-3f64b1637b41.redis-container keystore:0xc00004da00 kubernetes:{"annotations":{"co":{"elastic":{"metrics":{"redis-container/hosts":"${data.host}:6379","redis-container/module":"redis"},"metrics/hosts":"${data.host}:9090","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics.redis-container/hosts\":\"${data.host}:6379\",\"co.elastic.metrics.redis-container/module\":\"redis\",\"co.elastic.metrics/hosts\":\"${data.host}:9090\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":9090}]},{\"image\":\"redis\",\"name\":\"redis-container\",\"ports\":[{\"containerPort\":6379}]}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"c4513734dbd609928b16b0483c3f7b0c443d38af19aad8815e894f97d00e0542","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"b290597a-7950-4fa2-a85a-3f64b1637b41"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"b290597a-7950-4fa2-a85a-3f64b1637b41"}}} port:6379 provider:877348a2-a905-4e4d-ae1f-10589fb0eea2 start:true]
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:196    Generated config: {
  "enabled": true,
  "hosts": [
    "xxxxx"
  ],
  "metricsets": [
    "info",
    "keyspace"
  ],
  "module": "redis",
  "period": "1m",
  "timeout": "3s"
}
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  autodiscover/autodiscover.go:258    Got a meta field in the event
2020-05-21T12:44:44.104Z    DEBUG   [module]    module/wrapper.go:181   prometheus/collector will start after 9.856799733s
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  cfgfile/list.go:62  Starting reload procedure, current runners: 1
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  cfgfile/list.go:80  Start list: 1, Stop list: 0
2020-05-21T12:44:44.104Z    DEBUG   [autodiscover]  cfgfile/list.go:101 Starting runner: RunnerGroup{redis [metricsets=1], redis [metricsets=1]}
2020-05-21T12:44:44.105Z    DEBUG   [module]    module/wrapper.go:127   Starting Wrapper[name=redis, len(metricSetWrappers)=1]
2020-05-21T12:44:44.105Z    DEBUG   [module]    module/wrapper.go:127   Starting Wrapper[name=redis, len(metricSetWrappers)=1]
2020-05-21T12:44:44.105Z    DEBUG   [module]    module/wrapper.go:181   redis/info will start after 3.463596093s
2020-05-21T12:44:44.105Z    DEBUG   [module]    module/wrapper.go:181   redis/keyspace will start after 6.610549433s
2020-05-21T12:44:47.569Z    DEBUG   [module]    module/wrapper.go:189   Starting metricSetWrapper[module=redis, name=info, host=172.17.0.7:6379]
2020-05-21T12:44:50.718Z    DEBUG   [module]    module/wrapper.go:189   Starting metricSetWrapper[module=redis, name=keyspace, host=172.17.0.7:6379]
2020-05-21T12:44:53.961Z    DEBUG   [module]    module/wrapper.go:189   Starting metricSetWrapper[module=prometheus, name=collector, host=172.17.0.7:9090]

Let me know what you think folks about closing this one :).

I think it can be closed, yes. Nice fix!

Actually, this issue isn't completely fixed by this change. We will randomly pick which container name goes into the metadata, even though only one configuration is running. We need a minor change so that, if the port is not exposed, the event uses metadata that doesn't have container.name on it.
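A minimal sketch of that suggestion, assuming the metadata is a nested mapping shaped like the `meta` field in the logs above. The helper `meta_for_event` is hypothetical, and dropping the whole `container` block (rather than only `container.name`) is an assumption made here so that no container-specific field remains.

```python
import copy


def meta_for_event(meta, port_exposed):
    """Return Pod-level metadata when the container exposes no port (hypothetical helper)."""
    result = copy.deepcopy(meta)  # leave the caller's metadata untouched
    if not port_exposed:
        # Assumption: remove the container block so container.name is absent.
        result.get("kubernetes", {}).pop("container", None)
    return result


meta = {
    "kubernetes": {
        "container": {"image": "redis", "name": "redis-container"},
        "namespace": "default",
        "node": {"name": "minikube"},
        "pod": {
            "name": "two-containers-prometheus",
            "uid": "cf08c19e-c973-4f82-8461-a61e106fa0a0",
        },
    }
}

pod_level = meta_for_event(meta, port_exposed=False)
print(pod_level)
```

Pod, node and namespace fields are kept, so the single running configuration would no longer be attributed to an arbitrary container.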
