Terraform: 0.7.0 datadog_monitor require_full_window doesn't work

Created on 8 Aug 2016 · 23Comments · Source: hashicorp/terraform

Terraform Version

0.7.0

Affected Resource(s)

Please list the resources as a list, for example:

datadog_monitor

Terraform Configuration Files

resource "datadog_monitor" "sqs_dl_queue_change" {
    name = "DL Q Depth Change"

    type = "query alert"

    query = "change(min(last_1h),last_1h):sum:aws.sqs.approximate_number_of_messages_visible{queuename:my_dl_queue_name} by {queuename} > 0"

    thresholds {
        critical = 0
    }

    require_full_window = false

    notify_no_data = false

    renotify_interval = 60

    include_tags = true

    message = <<EOL
My Alert Message
EOL

    escalation_message = <<EOL
My Escalation Message
EOL
}

Expected Behavior

Datadog Monitor 'Require Full Window' set to 'Do Not Require'

Actual Behavior

Datadog Monitor 'Require Full Window' set to 'Require'

Steps to Reproduce

terraform apply

References

#6738

bug providedatadog

Source

eedwardsdisco

Most helpful comment

Hi @ojongerius

ok, so i just figured out what the issue is here - it's our old friend d.GetOk

You have the following code:

if attr, ok := d.GetOk("include_tags"); ok {
        o.SetIncludeTags(attr.(bool))
    }

d.GetOk is broken for default values :( So the comment across the func reads as follows:

// GetOk returns the data for the given key and whether or not the key
// has been set to a non-zero value at some point.

I have just opened #12168 that deals with it and verifies that we are good

stack72 on 22 Feb 2017

❤2

All 23 comments

Confirmed that if I go into the datadog monitor GUI and change from 'Require' to 'Do Not Require' and then run a terraform plan, datadog shows no needed changes.

If I then go into datadog monitor GUI and change back to 'Require' and save, then run a terraform plan, datadog does reflect the needed change:

~ datadog_monitor.sqs_dl_queue_change
    require_full_window: "true" => "false"

Running terraform apply at this point does satisfy the change to terraform, but the datadog monitor GUI still reflects 'Require' instead of 'Do Not Require'.

Thus possibly the API change is not causing the same change to be visible on Datadog's Monitor GUI, however, if I manually change it in the GUI, the GUI persists the visible change.

eedwardsdisco on 8 Aug 2016

I opened a support case with Datadog and an engineer confirmed that using the Datadog API correctly creates a monitor with 'Do Not Require' set in the GUI as expected.

Thus I suspect this problem is indeed on Terraform's side.

eedwardsdisco on 9 Aug 2016

Confirmed that the monitor is not actually set to 'Do Not Require' full window. I had an issue with a monitor created using terraform and had to resolve it by manually changing the setting in the GUI.

eedwardsdisco on 9 Aug 2016

@ojongerius thought's on this? AFAICR, you implemented the feature :)

stack72 on 18 Aug 2016

This is true.

I'm not sure what's going on here, and tests pass. I did notice the API behaves unexpectedly. Prodding with the Python client show that the API does not return the value for require_full_window (which defaults to True) unless it has been set before.

Not specifying a value creates the monitor, which default to settingrequire_full_window to True, but the API is silent about this. Go's default value for type bool is false which might explain this behaviour.

This is all speculation. If no one gets around this by Tuesday I have a stab at fixing it.

ojongerius on 18 Aug 2016

I've had a closer look and confirmed my suspicions. The behaviour I see agrees with what @eedwardsdisco describes. The tests pass because the default value for Boolean is false.

A solution would be to use pointers, and I think it is worthwhile to refactor the API we use to do this. I've pushed changes that fix this particular bug in https://github.com/atlassian/terraform/commit/89bd09f0bdb7d71bdc21d5c4bfaf8e05deca6707 but I suspect we can fix more latent issues by using pointers structs we use for (un)marshalling.

I've opened an issue at Datadog because I do think their behaviour is a little odd, and I'll work on getting fixes in https://github.com/zorkian/go-datadog-api

Test output for the commit:

github.com/hashicorp/terraform  datadog_monitor_full_window ✗                                                                                                                                                         5d ⚑  ⍉
▶ make testacc TEST=./builtin/providers/datadog 
==> Checking that code complies with gofmt requirements...
/Users/ojongerius/gocode/bin/stringer
go generate $(go list ./... | grep -v /terraform/vendor/)
2016/08/24 18:24:51 Generated command/internal_plugin_list.go
TF_ACC=1 go test ./builtin/providers/datadog -v  -timeout 120m
=== RUN   TestProvider
--- PASS: TestProvider (0.00s)
=== RUN   TestProvider_impl
--- PASS: TestProvider_impl (0.00s)
=== RUN   TestAccDatadogMonitor_Basic
--- PASS: TestAccDatadogMonitor_Basic (90.81s)
=== RUN   TestAccDatadogMonitor_RequireFullWindowUnSet
--- PASS: TestAccDatadogMonitor_RequireFullWindowUnSet (87.62s)
=== RUN   TestAccDatadogMonitor_Updated
--- PASS: TestAccDatadogMonitor_Updated (71.08s)
=== RUN   TestAccDatadogMonitor_TrimWhitespace
--- PASS: TestAccDatadogMonitor_TrimWhitespace (66.42s)
=== RUN   TestAccDatadogTimeboard_update
--- PASS: TestAccDatadogTimeboard_update (75.53s)
PASS
ok      github.com/hashicorp/terraform/builtin/providers/datadog        391.475s

Happy to raise a PR when I've solved this issue upstream.

ojongerius on 24 Aug 2016

Hey, DD dev here. I looked at this from our point of view and indeed there are "3 possible values".
If you explicitly set the require_full_window to True or False, then it will use this parameter explicity.
If you don't set it, then it will select the "best" default behavior which is explained here:

require_full_window a boolean indicating whether this monitor needs a full window of data before it's evaluated. We highly recommend you set this to False for sparse metrics, otherwise some evaluations will be skipped.
Default: True for "on average", "at all times" and "in total" aggregation. False otherwise.

and in our code:

if requires_full is None:
            # If not defined, fallback to our default check.
            requires_full = _requires_full_window(params...)

So we actually need 3 values: explicitly set to True or False and a default.

If we start forcing a set on require_full_window at monitor creation time, then if you ever modify the monitor to something that would have a different default, then we can't know if you explicitly set require_full_window previously and edit this for you.
If we return the value require_full_window defaulted it defaulted to, the api will not be consistent anymore with things that sync your monitors (ex: this & that). i.e: edit_monitor(get_montior(monitor_id)) will explicitly set the require_full_window to the value returned.

We mostly code in Go too and yeah their decision to not include a null value and default everything sometimes (it's super rare though) bits us back too.
Hope this helps.

Viq111 on 24 Aug 2016

Thanks @Viq111! I've opened an issue at go-datadog-api to consider updating the library. Until that work has been done we can fix this particular issue by using a fork. @stack72 thoughts?

ojongerius on 25 Aug 2016

Any workarounds coming for this? Would love to not have to go in and manually fix my monitors after every apply.

eedwardsdisco on 12 Sep 2016

Hi @eedwardsdisco sorry this has taken a while. I have a dirty workaround in https://github.com/atlassian/terraform/commit/89bd09f0bdb7d71bdc21d5c4bfaf8e05deca6707 but was hoping we could decide and push out a more structural upstream fix in https://github.com/zorkian/go-datadog-api/issues/56.

I'll see if I can get a workaround happening for your particular case while we work on a nice long term solution for the API library.

ojongerius on 14 Sep 2016

@eedwardsdisco / @stack72 I've raised https://github.com/hashicorp/terraform/pull/8836 to address this.

As stated I'd need to validate this, it's the end of the day here in OZ. If no one gets around to doing acceptance tests (which I think you have scheduled) I can verify this tomorrow.

Cheers,

ojongerius on 14 Sep 2016

Made some updates, appears you need to tell Govendor which branch you are after ;)

ojongerius on 15 Sep 2016

Thanks for working on this!

eedwardsdisco on 17 Sep 2016

@ojongerius any luck on the workaround PR getting validated? Thanks!

eedwardsdisco on 26 Sep 2016

@ojongerius Sorry to bump on this, but require_full_window is still broken and I'm curious if you had a chance to follow up with the work around?

eedwardsdisco on 21 Oct 2016

Hey @eedwardsdisco sorry I've been swamped with work.

While validating my changes I noticed it is a step in the right direction but needs a little work. GetOk currently only returns configs values if they are not default, which does not for this particular case. I have uncommitted changes that I'll validate tomorrow.

Hope to have pushed changes fixing this once and for all by the end of this week.

ojongerius on 7 Dec 2016

👍1

@stack72 / @mitchellh what's the best way to handle situations where a REST endpoint returns nil, true or false? I have some code that works in handling this the right way when creating new monitors, but I'm not sure on how to set it back to "unset" when the the user removes the option from the config ie. For example:

This config will leave it unset (when using *bool in the API)

resource "datadog_monitor" "foo" {
  type              = "query alert"
  name              = "require_full_window_not_set"
  query             = "avg(last_5m):avg:system.cpu.idle{*} > 5"
  message           =  "Test"
  // require_full_window = true
}

We can then set it with:

resource "datadog_monitor" "foo" {
  type              = "query alert"
  name              = "require_full_window_not_set"
  query             = "avg(last_5m):avg:system.cpu.idle{*} > 5"
  message           =  "Test"

 require_full_window = true // enabled
}

But if we remove it the state will record this as false

resource "datadog_monitor" "foo" {
  type              = "query alert"
  name              = "require_full_window_not_set"
  query             = "avg(last_5m):avg:system.cpu.idle{*} > 5"
  message           =  "Test"

 // require_full_window = true state will now say this is set to false, the default value for bool
}

ojongerius on 8 Dec 2016

@ojongerius I see there's a suggested fix for this waiting for some 👀 at https://github.com/zorkian/go-datadog-api/issues/56#issuecomment-246154160. Would that help move this along?

ocxo on 11 Jan 2017

Hey @stack72 do you have any suggestions? We can make the API (as we are discussing in https://github.com/zorkian/go-datadog-api/issues/56 ) support nil, true or false, but how would we do that in the provider?

ojongerius on 14 Feb 2017

@stack72 any updates?

This was a big part of the 0.7.0 release and we're now approaching 0.9.0 and it still doesn't work.

eedwardsdisco on 14 Feb 2017

@fromonesrc thanks for the heads up 😄 @eedwardsdisco sorry this has been biting you for so long..

As of https://github.com/hashicorp/terraform/commit/23103166669afe2b25c094c5b5aff152ac53bc3e we now have access to v2 of the library. This allows us to distinguish of a a value is set or not.

Including this issue we now have 3 issues (https://github.com/hashicorp/terraform/issues/10264 and its dupe https://github.com/hashicorp/terraform/issues/10881 ) open in this project, and multiple (potential) Terraform contributors have been struggling to make things work as expected while adding new features (https://github.com/zorkian/go-datadog-api/pull/87 and https://github.com/zorkian/go-datadog-api/issues/75).

This hacky code:

package main

import "gopkg.in/zorkian/go-datadog-api.v2"
import "fmt"
import "os"

func createTestMonitor(client *datadog.Client, expected string, options datadog.Options) {
    tags := []string{}
    tags = append(tags, "foo")
    tags = append(tags, "bar")

    m := &datadog.Monitor{
        Type:    datadog.String("metric alert"),
        Query:   datadog.String("avg(last_1h):avg:aws.ec2.cpu{environment:foo,host:foo} by {host} > 2"),
        Name:    datadog.String(fmt.Sprintf("include_tags and require_full_window options set to %s", expected)),
        Message: datadog.String("it's broken"),
        Tags:    tags,
        Options: &options,
    }
    m, err := client.CreateMonitor(m)
    if err != nil {
        fmt.Println("Creating a monitor failed when it shouldn't")
        return 
    }

    fmt.Printf("Monitor ID: %d ", m.GetId())
    fmt.Printf("Monitor Name: %s\n", m.GetName())
    if _, ok := m.GetOptionsOk(); ok {
        if v, ok := m.Options.GetIncludeTagsOk(); ok {
            fmt.Printf("Include Tags set to %t\n", v)
        } else {
            fmt.Println("Include Tags not set")
        }
        if v, ok := m.Options.GetRequireFullWindowOk(); ok {
            fmt.Printf("Require Full Window set to %t\n", v)
        } else {
            fmt.Println("Require Full Window not set")
        }
    }
}

func main() {
    client := datadog.NewClient(os.Getenv("DATADOG_API_KEY"), os.Getenv("DATADOG_APP_KEY"))

    o := datadog.Options{}
    o.SetIncludeTags(false)
    o.SetRequireFullWindow(false)
    createTestMonitor(client, "false", o)

    o.SetIncludeTags(true)
    o.SetRequireFullWindow(true)
    createTestMonitor(client, "false", o)

    o = datadog.Options{}
    createTestMonitor(client, "not set", o)
}

shows how using the library returns the correct values:

▶ go run test.go
Monitor ID: 1636548 Monitor Name: include_tags and require_full_window options set to false
Include Tags set to false
Require Full Window set to false
Monitor ID: 1636549 Monitor Name: include_tags and require_full_window options set to false
Include Tags set to true
Require Full Window set to true
Monitor ID: 1636550 Monitor Name: include_tags and require_full_window options set to not set
Include Tags not set
Require Full Window not set

The acceptance test in Terraform looks fine too:

=== RUN   TestAccDatadogMonitor_Updated
--- PASS: TestAccDatadogMonitor_Updated (9.62s)

However, when manually testing creation using a simple config, the following config:

resource "datadog_monitor" "foo" {
  name = "name for monitor foo"
  type = "metric alert"
  message = <<EOF
some message Notify: @hipchat-channel
EOF
  escalation_message = <<EOF
the situation has escalated @pagerduty
EOF
  query = <<EOF
avg(last_1h):avg:aws.ec2.cpu{environment:foo,host:foo} by {host} > 2
EOF
  thresholds {
    ok = "0.0"
    warning = "1.0"
    critical = "2.0"
  }

  notify_no_data = false
  renotify_interval = 60

  notify_audit = false
  timeout_h = 60
  require_full_window = false
  include_tags = false
  tags = ["foo", "bar"]
}

Results in a monitor without those values set:

{'created': '2017-02-21T04:45:50.612177+00:00',
 'created_at': 1487652350000,
 'creator': {'email': '[email protected]',
  'handle': '[email protected]',
  'id': 483389,
  'name': 'Otto Jongerius'},
 'deleted': None,
 'id': 1636483,
 'message': 'some message Notify: @hipchat-channel',
 'modified': '2017-02-21T04:58:42.979375+00:00',
 'multi': True,
 'name': 'name for monitor foo',
 'options': {'escalation_message': 'the situation has escalated @pagerduty',
  'locked': False,
  'new_host_delay': 300,
  'notify_audit': False,
  'notify_no_data': False,
  'renotify_interval': 60,
  'silenced': {},
  'thresholds': {'critical': 2.0, 'ok': 0.0, 'warning': 1.0},
  'timeout_h': 60},
 'org_id': 96999,
 'overall_state': 'No Data',
 'query': 'avg(last_1h):avg:aws.ec2.cpu{environment:foo,host:foo} by {host} > 2',
 'tags': ['foo', 'bar'],
 'type': 'metric alert'}

I'm obviously doing something wrong here, and my question to someone more intimate with Terraform is: how do we get this working as expected in a provider? Other boolean values work as expected, but I'm not sure how to beat the options include_tags and require_full_window into submission.

@stack72 are you or someone else able to figure this one out? It can't be too hard and I can't imagine I'm the first one running into similar issues.

ojongerius on 21 Feb 2017