What did you do?
We are seeing resolved alerts being sent too many times.
We run a single Prometheus instance and a single Alertmanager instance (no cluster).
What did you expect to see?
We expect to receive a message from Alertmanager listing the servers that are down (alerts in firing status) and the servers that are available again (alerts in resolved status).
We are particularly interested in the moment when all servers are available again (the whole alert group in resolved status).
What did you see instead? Under which circumstances?
When one or more servers become available again, we receive several calls carrying different resolved alerts.
We should receive each resolved alert only once.
Each message also contains only one resolved alert, even when two alerts have been resolved (see the last step below).
Environment
System information:
Linux 3.10.0-957.1.3.el7.x86_64 x86_64
Alertmanager version:
0.16.1
Prometheus version:
2.7.1
Alertmanager configuration file:
global:
  resolve_timeout: 8737h
route:
  group_by: ['alertname']
  group_wait: 0s
  group_interval: 60s
  repeat_interval: 8737h
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://localhost:9991/v1/alerts/'
    send_resolved: true
Prometheus configuration file:
global:
  scrape_interval: 5s
  evaluation_interval: 20s
scrape_configs:
- job_name: 5c5d8e41c680703f3a26fb68-monitoring-1
  scrape_interval: 60s
  scrape_timeout: 10s
  metrics_path: /metrics
  static_configs:
  - targets:
    - server01
    - server02
    - server03
    - server04
    - server05
  relabel_configs:
  - source_labels:
    - __address__
    target_label: __param_target
  - source_labels:
    - __param_target
    target_label: instance
  - target_label: __address__
    replacement: 'http-scenario-exporter:9900'
rule_files:
- rules/5c5d8e41c680703f3a26fb68-monitoring-1.yml
Rules file (rules/5c5d8e41c680703f3a26fb68-monitoring-1.yml):
groups:
- name: 5c5d8e41c680703f3a26fb68-monitoring-1
  rules:
  - alert: 5c5d8e41c680703f3a26fb68-monitoring-1-0
    expr: >-
      http_scenario_ok{job='5c5d8e41c680703f3a26fb68-monitoring-1'}
      == 0
    for: 60s
    labels: {}
    annotations: {}
Here is how we reproduce the issue (note that the instanceName label is set by the exporter).
Initially server01, server02, server03, server04 and server05 are all up.
I stop server05:
{
"level": 10,
"time": "2019-02-11T13:51:25.895Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
I stop server04:
{
"level": 10,
"time": "2019-02-11T13:57:45.994Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
I restart server05:
{
"level": 10,
"time": "2019-02-11T14:01:25.998Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
I stop server03. Here I receive two calls (!): one with server04 and server03 firing, and another with server04 and server03 firing plus server05 resolved (!):
{
"level": 10,
"time": "2019-02-11T14:03:46.001Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:04:06.002Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
I restart server04. Here I receive a lot of calls (!), always with server03 firing, sometimes with server04 resolved and sometimes with server05 resolved:
{
"level": 10,
"time": "2019-02-11T14:07:46.002Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:08:06.004Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:09:06.005Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:09:26.004Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:10:26.005Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:11:46.005Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:12:06.005Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:13:06.013Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:13:26.007Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:14:26.007Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:14:46.007Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:15:46.008Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:16:06.012Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server05",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T13:51:25.888Z",
"_endsAt": "2019-02-11T14:01:25.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
{
"level": 10,
"time": "2019-02-11T14:17:06.009Z",
"pid": 1,
"hostname": "8f80b4c90c57",
"module": "DispatcherService",
"method": "dispatchAlert",
"_receiver": "web\\.hook",
"_status": "firing",
"_externalURL": "http://5743407ed2a5:9093",
"_version": "4",
"_groupKey": "{}:{alertname=\"5c5d8e41c680703f3a26fb68-monitoring-1-0\"}",
"_alerts": [{
"_status": "resolved",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server04",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations":
{
},
"_startsAt": "2019-02-11T13:57:45.888Z",
"_endsAt": "2019-02-11T14:07:45.888Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
},
{
"_status": "firing",
"_labels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"instanceName": "server03",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_annotations": {
},
"_startsAt": "2019-02-11T14:03:45.888Z",
"_endsAt": "0001-01-01T00:00:00.000Z",
"_generatorURL": "http://ddf2804e6999:9090/graph?g0.expr=http_scenario_ok%7Bjob%3D%225c5d8e41c680703f3a26fb68-monitoring-1%22%7D+%3D%3D+0&g0.tab=1"
}],
"_groupLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0"
},
"_commonLabels": {
"alertname": "5c5d8e41c680703f3a26fb68-monitoring-1-0",
"datacenter": "central",
"job": "5c5d8e41c680703f3a26fb68-monitoring-1"
},
"_commonAnnotations": {
},
"v": 1
}
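To make the pattern in the calls above easier to see, each payload can be condensed to its timestamp plus the firing and resolved instances. This is only a minimal sketch: it assumes each logged object has already been parsed into a Python dict with the field names shown in the logs (`time`, `_alerts`, `_status`, `_labels`, `instanceName`). The names `summarize` and `SAMPLES` are made up for illustration, and the samples are trimmed copies of two payloads from the log.

```python
def summarize(payload):
    """Return (time, firing instances, resolved instances) for one logged call."""
    firing, resolved = [], []
    for alert in payload["_alerts"]:
        name = alert["_labels"]["instanceName"]
        if alert["_status"] == "firing":
            firing.append(name)
        else:
            resolved.append(name)
    return payload["time"], sorted(firing), sorted(resolved)


# Trimmed copies of two consecutive payloads from the log (only the fields used here).
SAMPLES = [
    {
        "time": "2019-02-11T14:07:46.002Z",
        "_alerts": [
            {"_status": "resolved", "_labels": {"instanceName": "server04"}},
            {"_status": "firing", "_labels": {"instanceName": "server03"}},
        ],
    },
    {
        "time": "2019-02-11T14:08:06.004Z",
        "_alerts": [
            {"_status": "firing", "_labels": {"instanceName": "server03"}},
            {"_status": "resolved", "_labels": {"instanceName": "server05"}},
        ],
    },
]

if __name__ == "__main__":
    for payload in SAMPLES:
        time, firing, resolved = summarize(payload)
        print(time, "firing:", firing, "resolved:", resolved)
```

Run over the full log, a summary like this shows server03 firing in every call while the resolved entry alternates between server04 and server05.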
Hi,
It seems that lowering --rules.alert.resend-delay to 5s makes the problem go away. Keeping the resend delay at 1m and increasing group_interval also seems to solve it.
I do not understand why resolved alerts that Alertmanager has already sent are sent again later. I also do not understand why Prometheus needs to send alerts to Alertmanager repeatedly. Is it not enough to send an alert once, when the result of evaluating the rule expression changes?
Are there any recommended values for scrape_interval, evaluation_interval, resolve_timeout, repeat_interval, group_interval and --rules.alert.resend-delay?
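In the meantime, one possible receiver-side workaround is to ignore resolved alerts that have already been seen. The following is a rough sketch under stated assumptions, not something Alertmanager provides: it assumes payloads shaped like the ones logged above, treats an alert as "the same" when its labels, `_startsAt` and `_endsAt` all match, and keeps the seen keys in memory without ever expiring them. `alert_key` and `ResolvedDeduplicator` are hypothetical names.

```python
def alert_key(alert):
    """Identify an alert occurrence by its labels and start/end timestamps (an assumption)."""
    labels = tuple(sorted(alert["_labels"].items()))
    return labels, alert["_startsAt"], alert["_endsAt"]


class ResolvedDeduplicator:
    """Remembers resolved alerts so that repeats can be skipped by the receiver."""

    def __init__(self):
        # Grows without bound in this sketch; a real receiver would need expiry.
        self._seen = set()

    def new_resolved(self, payload):
        """Return only the resolved alerts in this payload that were not seen before."""
        fresh = []
        for alert in payload["_alerts"]:
            if alert["_status"] != "resolved":
                continue
            key = alert_key(alert)
            if key not in self._seen:
                self._seen.add(key)
                fresh.append(alert)
        return fresh
```

With this in place, the first call carrying server04 as resolved would yield one new resolved alert, and later calls repeating the same alert with the same timestamps would yield none.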
I'll dig into this soon but I might have a clue why it is happening.
Hello,
Any news on this topic?