Terraform-provider-azurerm: Support alerts based on Log analytics queries

Created on 29 Jul 2019  ·  24 Comments  ·  Source: terraform-providers/terraform-provider-azurerm

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Create alerts based on Log Analytics queries, as documented here: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alerts-log
The corresponding API:
https://docs.microsoft.com/en-us/rest/api/monitor/scheduledqueryrules/createorupdate
Currently it is only possible to create alerts based on Azure Monitor metrics:
https://www.terraform.io/docs/providers/azurerm/r/monitor_metric_alert.html

New or Affected Resource(s)

  • azurerm_monitor

Potential Terraform Configuration

Labels: enhancement, service/monitor

All 24 comments

This is really needed, as Azure does not expose all metrics by default.

It is possible to work around this by using a Log Analytics workspace with customised queries and creating an alert on them.

But Terraform doesn't support creating alerts based on Log Analytics queries.

Please help to prioritise this enhancement in Terraform.

I've just started working on this. Not sure yet how long it would take to get a PR together. Maybe a couple weeks. Anyone else already taking this on?

Is there any update on this?

@mcdafydd any updates on this? thanks.

Hi all,

Two months flies by. Sorry about that. I had started working on this and put it down for a bit while #4638 was getting merged. That's done and I'm ready to get back into this now. I'll commit what I have into my fork as soon as I can and try to get the PR created. I'll aim for the end of the week.

It's not completed yet, but I've made solid progress in my fork. Data source and doc is almost done. Still need to finish the resource and tests. I may not have a PR today, but should have it submitted before Thursday.

Edit: Removed reference to early working branch. Updated examples below to match the PR submission.

I could use some feedback on the approach. Right now, I've got everything under a single new resource called azurerm_monitor_scheduled_query_rules, matching the API endpoint. The resource definition is getting kinda long as it needs to support two different styles of actions with different required and optional parameters. I also decided to flatten the action, schedule, and source blocks.

I'm wondering if it would be better to break these out into two resources called azurerm_monitor_alerting_action and azurerm_monitor_log_to_metric_action.

An AlertingAction looks like this right now:

resource "azurerm_scheduled_query_rule" "example" {
  name                = format("%s-queryrule", var.prefix)
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  action_type = "Alerting"
  azns_action {
    action_group           = []
    email_subject          = "Email Header"
    custom_webhook_payload = "{}"
  }
  data_source_id = azurerm_application_insights.example.id
  description    = "Scheduled query rule Alerting Action example"
  enabled        = true
  frequency      = 5
  query          = "requests | where status_code >= 500 | summarize AggregatedValue = count() by bin(timestamp, 5m)"
  query_type     = "ResultCount"
  severity       = 1
  time_window    = 30
  trigger {
    threshold_operator = "GreaterThan"
    threshold          = 3
    metric_trigger {
      operator            = "GreaterThan"
      threshold           = 1
      metric_trigger_type = "Total"
      metric_column       = "timestamp"
    }
  }
}

and a LogToMetricAction plan looks like this right now:

resource "azurerm_scheduled_query_rule" "example3" {
  name                = format("%s-queryrule3", var.prefix)
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  action_type = "LogToMetricAction"
  criteria = [{
    metric_name = "Average_% Idle Time"
    dimensions = [{
      name     = "dimension"
      operator = "GreaterThan"
      values   = ["latency"]
    }]
  }]
  data_source_id = azurerm_application_insights.example.id
  description    = "Scheduled query rule LogToMetric example"
  enabled        = true
}

Are there some common guidelines I could follow for new resources like this?

Thanks!

@mcdafydd - will this be released in 1.39.0 ? Thanks.

:crossed_fingers: I hope so! That's not my decision to make. I will keep checking the pull request and make sure it stays mergeable and issue-free.

Still waiting for this to be released, any chances for this to be added to 1.40.0 release?

Also waiting for this, any updates?

@mcdafydd is there any way to push for review your pull request? I'm waiting on this.

This is only my second PR to terraform, so I wouldn't expect I had a lot of pull on the matter. I'm with ya though, excited to be able to create both alert actions and monitoring queries in code.

@tombuildsstuff @katbyte do you think 1.42.0 will likely be a successful version to review the PR for this feature? If there's anything I can do in the meantime, I'm of course willing to help. The only slight concern I have is that if any significant refactoring is required for #5053 before it can be approved, I could definitely use some extra lead time to make sure I can get any requests completed in time for release.

Thanks for any help!

I wonder why this keeps being constantly pushed to later versions. Especially since the PR has all checks passed and no conflicts. I guess resourcing issue? :/

The reason I am commenting is that log-analytics-query-based alerts and metrics are one of the few things that our current TF automation can't handle. So I am waiting on this keenly. :)

Thanks @katbyte for doing an initial review! I should be able to submit updates for all the issues by the end of the week.

Looking forward to getting this feature. I'm converting my resources one by one to Terraform and was about to start on alerting today, but alas, scheduled query rules do not exist yet. Thank you mcdafydd for all your work on this.

We're getting close now, @DanielFrei64. Now that we have complete Action Groups, after Scheduled Query Rules I think a good next step would be Action Rules.

Also looking forward to getting this feature as well as the action rules to tie everything together. Thanks for working on this.

2.0.0 just released, will this be included soon?

This has been released in version 2.1.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.1.0"
}
# ... other configuration ...
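
For readers arriving from the design discussion above, here is a minimal sketch of how a log query alert might be declared once on provider 2.1.0. It assumes the feature shipped as a resource named azurerm_monitor_scheduled_query_rules_alert with flattened frequency, time_window and query arguments plus action and trigger blocks; check the provider documentation before relying on any argument name. Every "example-..." name, the location, and the query are placeholders and do not come from this thread.

```hcl
provider "azurerm" {
  version = "~> 2.1.0"
  features {}
}

# Placeholder resource group; name and location are assumptions.
resource "azurerm_resource_group" "example" {
  name     = "example-monitoring-rg"
  location = "West Europe"
}

# Log Analytics workspace that the alert query runs against.
resource "azurerm_log_analytics_workspace" "example" {
  name                = "example-workspace"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}

# Action group notified when the alert fires (no receivers configured here).
resource "azurerm_monitor_action_group" "example" {
  name                = "example-action-group"
  resource_group_name = azurerm_resource_group.example.name
  short_name          = "example"
}

# Assumed resource and argument names; verify against the provider docs.
resource "azurerm_monitor_scheduled_query_rules_alert" "example" {
  name                = "example-queryrule"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  action {
    action_group  = [azurerm_monitor_action_group.example.id]
    email_subject = "Error events logged"
  }

  data_source_id = azurerm_log_analytics_workspace.example.id
  description    = "Fires when more than 3 error events are returned"
  enabled        = true

  # Illustrative query only; replace with a query that fits your data.
  query = <<-QUERY
    Event
    | where EventLevelName == "Error"
  QUERY

  severity    = 1
  frequency   = 5  # evaluate every 5 minutes
  time_window = 30 # look back over the last 30 minutes

  trigger {
    operator  = "GreaterThan"
    threshold = 3
  }
}
```

Compared with the earlier draft in this thread, this sketch uses a workspace rather than Application Insights as the data source and drops the action_type argument, on the assumption that the alerting and log-to-metric variants ended up as separate resources.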

Thanks for implementing this. Is there any documentation we can refer to for the configuration?

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!
