Hi, I'm confused by the two parameters group_interval and repeat_interval.
Basically I have one alert that lasts for a long time, and Prometheus evaluates rules every 15s. Here is the test I tried:
Hey,
I'll write down my understanding, and then ask for @fabxc or @beorn7 to confirm.
group_interval:
As an example: A single alert rule (in Prometheus) fires with the label set {foo="bar", instance="instanceA"}. The alert is added to a new alert group, and Alertmanager sends a notification for it. Then the same alert rule fires again with a different label set, {foo="bar", instance="instanceB"}. This alert is added to the same alert group. Alertmanager won't send an additional notification for the group until the time in group_interval has passed.
repeat_interval:
If an alert fires, wait this long before sending the same alert again.
That's my understanding, but @fabxc knows better as he implemented it.
@stuartnelson3 @beorn7 Sorry for the late response and thanks for the reply.
If group_interval is the time interval between two alerts within the same group, I should get an email notification every 16 seconds when I set group_interval to 1s (15 seconds for Prometheus to evaluate rules and 1 second for the group interval). However, when I also set repeat_interval to 1m, I got the notification every minute, not every 16 seconds.
So maybe there are some rules underneath?
I'll take another try, thank you guys. @stuartnelson3 @beorn7
I tried again with group_interval = 1s and repeat_interval = 1h. I got the log message "flushing [test_alert[68d0412][active] ..." in Alertmanager every second, but I didn't get an email at this interval.
That's the correct behavior. The group interval describes how long alerts are grouped before they are sent out in batches. My guess is that you are experimenting with an example alert that is always the same. Alertmanager therefore groups those alerts for the group interval, but because they have already been sent out and the repeat interval has not elapsed yet, it will not send them out again. Only once the repeat interval has elapsed, and the alerts are still firing, will an actual notification be sent out.
@brancz That's exactly what I am experimenting with. I think I've got the idea. I will close this issue. Thanks.
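To make the interaction concrete, here is a minimal sketch in Python of the behavior described above. It is a simplified model, not Alertmanager's actual code: the function name `notification_times` is made up for illustration, group_wait is ignored, and the alert is assumed to fire continuously without ever changing.

```python
def notification_times(group_interval, repeat_interval, duration):
    """Times (in seconds) at which a notification goes out for a single,
    unchanging, continuously firing alert. Simplified, hypothetical model."""
    sent = []
    last_sent = None
    t = 0
    while t <= duration:
        # The group is flushed every group_interval seconds, but a flush
        # only results in a notification if nothing has been sent yet or
        # repeat_interval has elapsed since the last notification.
        if last_sent is None or t - last_sent >= repeat_interval:
            sent.append(t)
            last_sent = t
        t += group_interval
    return sent


# group_interval = 1s, repeat_interval = 1m, observed for 3 minutes
print(notification_times(1, 60, 180))  # [0, 60, 120, 180]
```

Under these assumptions the flush happens every second (matching the "flushing" log lines), while the email only goes out once per repeat_interval.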
Hi,
Can somebody please clarify, in this situation:
As an example: A single alert rule (in Prometheus) fires with the label set {foo="bar", instance="instanceA"}. The alert is added to a new alert group, and Alertmanager sends a notification for it. Then the same alert rule fires again with a different label set, {foo="bar", instance="instanceB"}. This alert is added to the same alert group. Alertmanager won't send an additional notification for the group until the time in group_interval has passed.
When Alertmanager sends again after group_interval has elapsed, will it notify about both alerts or only about {foo="bar", instance="instanceB"}?
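As far as I understand, a notification covers the whole group, so the second one would list both alerts. Here is a small sketch of that idea in Python. It is only a toy model under that assumption: the `flush` helper is hypothetical, and repeat_interval and resolved alerts are left out.

```python
def flush(group, last_notified):
    """Return the alerts to include in the notification for this flush.

    group: list of label dicts for the alerts currently firing in the group.
    last_notified: set of label sets included in the previous notification.
    Simplified, hypothetical model; repeat_interval is not modelled here.
    """
    current = {frozenset(labels.items()) for labels in group}
    if current == last_notified:
        return []        # group unchanged, nothing new to send
    return list(group)   # group changed, the whole group is sent


alert_a = {"foo": "bar", "instance": "instanceA"}
alert_b = {"foo": "bar", "instance": "instanceB"}

first = flush([alert_a], set())
second = flush([alert_a, alert_b], {frozenset(alert_a.items())})
print(len(first), len(second))  # 1 2
```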