Kibana: [Stack Monitoring] Alerting Phase 1

Created on 8 Aug 2019 · 8 Comments · Source: elastic/kibana

This ticket tracks the work which needs to be completed to achieve Phase 1, which is outlined in the proposal document.

To complete this phase, we need to build out the plumbing to connect the Stack Monitoring application to the Kibana Alerting Framework.

All watches need to be present and functional using the new framework:

Labels: Meta, Monitoring, enhancement

All 8 comments

Pinging @elastic/stack-monitoring

Update here.

I found a couple of blockers while taking a first stab at this and raised them here: https://github.com/elastic/kibana/issues/45571

The effort is going well here. I don't have a PR ready yet, but I hope to have it this week. (Update: Draft PR available)

Some updated notes on this effort:

  • We need to figure out how we handle the state of alerts firing - with Watcher, we write to the .monitoring-alerts-* index, but I think we can avoid an additional index by leveraging the persisted state for actions. We are blocked on this because we need a way to access this state, see https://github.com/elastic/kibana/issues/48442 (rough sketch after this list)
  • We need to figure out the right way to disable cluster alerts (watches). I've outlined some thoughts on this issue
  • I'm thinking we'll want to progressively add these into master (instead of one big merge), and if so, we should think about whether we want to disable these until they are all in, or enable at least one from the start and have it co-exist with the other watches.
  • With watcher, we require users to specify an email address to receive alerts in their kibana.yml - we can continue this trend, or we can allow them to specify it in the UI when they enable Kibana alerts, and then store it in a saved object or something.
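
To make the state idea above concrete, here's a rough sketch of an alert type leaning on the framework's persisted state instead of a `.monitoring-alerts-*` index. The executor/state contract loosely follows the alerting framework's registerType API; the state shape, the alert id, and the `fetchLicenseExpirationDays` helper are all illustrative, not the actual implementation.

```ts
// Illustrative state we'd persist per alert instead of writing index documents.
interface LicenseAlertState {
  firedAt: string | null;     // when the alert last started firing
  resolvedAt: string | null;  // when it last resolved, if ever
}

// Hypothetical helper; a real implementation would query the monitoring data.
async function fetchLicenseExpirationDays(services: any): Promise<number> {
  return 45;
}

export const licenseExpirationAlert = {
  id: 'monitoring_alert_license_expiration',
  name: 'X-Pack license expiration',
  async executor({ services, state }: { services: any; state: LicenseAlertState }) {
    const daysLeft = await fetchLicenseExpirationDays(services);
    const isFiring = daysLeft < 30;

    if (isFiring && !state.firedAt) {
      // Fire the action once and record when it happened in the alert's own state.
      services
        .alertInstanceFactory('license_expiration')
        .scheduleActions('default', { daysLeft });
      return { firedAt: new Date().toISOString(), resolvedAt: null };
    }
    if (!isFiring && state.firedAt) {
      // Resolved: the UI could read this state instead of querying an index.
      return { firedAt: null, resolvedAt: new Date().toISOString() };
    }
    return state;
  },
};
```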

Nice work @chrisronline 💪 Can't wait to see it!

We need to figure out how we handle the state of alerts firing - with Watcher, we write to the .monitoring-alerts-* index

Once "Kibana Alerting" is live are we completely deprecating/removing the current/old Alerting?

I think we might still want a new index, just in case some setups still have the old .monitoring-alerts-* with legacy documents (or for some reason we need to support both ES and Kibana alerting). We can abbreviate it with something like -kb, like we do with -mb for Metricbeat.

I'm thinking we'll want to progressively add these into master (instead of one big merge)

💯

With watcher, we require users to specify an email address to receive alerts in their kibana.yml

I prefer the Kibana UI, just because it's more user friendly and they can modify the info without restarting, but I don't mind continuing the yml trend.

Thanks for the thoughts @igoristic!

Once "Kibana Alerting" is live are we completely deprecating/removing the current/old Alerting?

I guess it depends on whether we want a slow rollout of these migrations. If so, we will be living in a world where both exist and run at the same time (not for the same alert check, but we'll have some Watcher-based cluster alerts and some Kibana alerts).

I think we might still want a new index, just in case some setups still have the old .monitoring-alerts-* with legacy documents (or for some reason we need to support both ES and Kibana alerting). We can abbreviate it with something like -kb, like we do with -mb for Metricbeat.

You don't think we can accomplish the same UI by just using the state provided by the alerting framework? I think that's really all we need, since we'll store data in there that tells us when the alert fired and whether it's been resolved yet.

I prefer the Kibana UI, just because it's more user friendly and they can modify the info without restarting, but I don't mind continuing the yml trend.

Yeah, I agree the UI route is better, but if we do a slow rollout, it might be confusing for folks who already have the kibana.yml config set - I think we need to make a call on the slow rollout, and that will help inform how to handle these other issues.
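
For the email question, here's a rough sketch of the "support both" idea: prefer an address the user saved through the UI, and fall back to the existing kibana.yml setting. The saved object type `monitoring-alert-settings` is made up for illustration, and treat the yml key shown (the one I believe the Watcher-based cluster alerts use) as illustrative too.

```ts
// Sketch: resolve the notification email from a UI-managed saved object first,
// then fall back to the legacy kibana.yml setting.
async function resolveAlertEmailAddress(
  savedObjects: { get(type: string, id: string): Promise<{ attributes: any }> },
  config: { get(key: string): string | undefined }
): Promise<string | undefined> {
  try {
    const settings = await savedObjects.get('monitoring-alert-settings', 'default');
    if (settings.attributes && settings.attributes.emailAddress) {
      return settings.attributes.emailAddress; // value entered in the Kibana UI
    }
  } catch (e) {
    // No saved object yet; fall through to the yml setting.
  }
  return config.get('xpack.monitoring.cluster_alerts.email_notifications.email_address');
}
```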

You don't think we can accomplish the same UI by just using the state provided by the alerting framework? I think that's really all we need, since we'll store data in there that tells us when the alert fired and whether it's been resolved yet.

I guess I don't know the current implementation well enough to validate my concern. My worry is that if an ES Alert is triggered, it'll be added to the index, which will then be picked up by both ES Alerts and KB Alerts, which might duplicate some actions like sending two emails, etc.

I just think a new index can help avoid issues we might not yet foresee (maybe for the same reason Metricbeat has its own -mb indices?)

This is all based on speculation, though.

I guess I don't know the current implementation well enough to validate my concern. My worry is that if an ES Alert is triggered, it'll be added to the index, which will then be picked up by both ES Alerts and KB Alerts, which might duplicate some actions like sending two emails, etc.

Ah, I see the confusion here.

Part of this work involves disabling (or blacklisting, per @cachedout's idea) the cluster alert when we enable the Kibana alert. We'd never have a situation (intentionally) where _both_ the cluster alert for xpack license expiration and the Kibana alert for xpack license expiration are running at the same time.
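
A minimal sketch of what that disabling step could look like, assuming we call the Watcher deactivate API through a callCluster-style helper like the legacy Elasticsearch plugin exposes (the example watch id is illustrative):

```ts
// Sketch: deactivate the corresponding watch when its Kibana alert is enabled,
// so the two never run for the same check at the same time.
async function deactivateClusterAlertWatch(
  callCluster: (endpoint: string, params: object) => Promise<any>,
  watchId: string // e.g. '<cluster_uuid>_xpack_license_expiration' (illustrative)
): Promise<void> {
  await callCluster('transport.request', {
    method: 'PUT',
    path: `/_watcher/watch/${encodeURIComponent(watchId)}/_deactivate`,
  });
}
```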

I'm thinking we'll want to progressively add these into master (instead of one big merge), and if so, we should think about whether we want to disable these until they are all in, or enable at least one from the start and have it co-exist with the other watches.

I think that gradually merging these and leaving them disabled until we are ready to switch the new alerting on in the application is the right thing to do. It gives us time to develop and test the alerts while minimizing the disruption for the user.
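
One possible shape for that, sketched with a made-up config flag that keeps the merged-but-not-ready alert types unregistered until we flip the switch:

```ts
// Sketch: only register the migrated Kibana alert types when an (illustrative)
// config flag is on, so partially merged work stays inert in master.
function maybeRegisterMonitoringAlerts(
  config: { get(key: string): boolean | undefined },
  alerting: { registerType(alertType: object): void },
  alertTypes: object[]
): void {
  if (!config.get('xpack.monitoring.kibana_alerts.enabled')) {
    return; // leave the Watcher-based cluster alerts in charge for now
  }
  for (const alertType of alertTypes) {
    alerting.registerType(alertType);
  }
}
```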

