Terraform-provider-aws: aws_msk_cluster.bootstrap_brokers has nondeterministic order

Created on 26 May 2020 · 6 Comments · Source: hashicorp/terraform-provider-aws

Affected Resource(s)

aws_msk_cluster

Terraform Configuration Files

The bootstrap_brokers attribute of aws_msk_cluster returns the brokers in nondeterministic order, forcing unnecessary configuration changes in resources that depend on this value.

I think the values returned by the AWS API should be sorted by the Terraform resource.
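
A minimal sketch of what sorting on the consumer side could look like, assuming Terraform 0.12 or later and a cluster resource named `example` (a hypothetical name). As the comments in this thread point out, this only stabilizes the order; it does not help if the API returns a different subset of brokers on each read.

```hcl
locals {
  # bootstrap_brokers is a comma-separated string of host:port pairs.
  # Split it, sort the entries lexicographically, and join them back together
  # so the order no longer depends on what the API returned.
  sorted_bootstrap_brokers = join(",", sort(split(",", aws_msk_cluster.example.bootstrap_brokers)))
}
```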

needs-triage

All 6 comments

@MateuszStefek I have also noticed this. In addition, it only returns 3 brokers. If the cluster has more than 3, you get a nondeterministic set of them each time. I tried sorting the brokers lexicographically, but that only reduced the unnecessary changes; it did not eliminate them. My guess is that the real issue lies with the AWS API, and Terraform is a victim of the poor design on the AWS side.

My colleagues and I found a way to work around this issue until it is resolved, using the following script:

provider "shell" {}

# Query the broker endpoints with the AWS CLI and emit them as JSON,
# each suffixed with the plaintext port 9092.
data "shell_script" "brokers" {
  lifecycle_commands {
    read = <<-EOF
      aws kafka list-nodes --cluster-arn '${aws_msk_cluster.msk-kafka-cluster.arn}' | jq '{ nodes: [.NodeInfoList[] | .BrokerNodeInfo.Endpoints[] | "\(.):9092" ]}'
    EOF
  }
}

locals {
  # Sort the endpoints so the order is stable across runs.
  kafka_nodes = [
    for broker in sort(jsondecode(data.shell_script.brokers.output["nodes"])):
      trimsuffix(broker, ":")
  ]
}

Hope this helps. We also found that it only returns 3 brokers, even though our cluster is configured with 4 nodes in total. That made it really difficult to get reproducible and deterministic builds.

As another workaround, I ended up creating 3 CNAMEs in Route 53 with deterministic names and having my TF code update those records on each apply. All my resources that depend on MSK then just reference those CNAMEs, which do not change. This limits the churn in my infrastructure.

We have laid out our route53 entries in terraform as well, and thus we had the same issue over there which we solved by using the new local.kafka_nodes. Did you create the CNAME entries manually, then?

@isaias-b no, made them with TF similar to:

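locals {
  # Take the first cluster's bootstrap_brokers string (or "" if no cluster exists),
  # strip the :9092 port, and split the result into a list of hostnames.
  broker_host_list = "${split(",", replace(element(concat(aws_msk_cluster.kafka_cluster.*.bootstrap_brokers, list("")), 0), ":9092", ""))}"
}

# One CNAME with a stable name per returned broker hostname, capped at 3.
resource "aws_route53_record" "broker_dns_record" {
  count   = "${min(var.broker_node_count, 3)}"
  zone_id = "${data.aws_route53_zone.hz.zone_id}"
  name    = "broker-${count.index}.${var.domain}"
  type    = "CNAME"

  records = [
    "${local.broker_host_list[count.index]}",
  ]

  ttl = 60
}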

I originally did a sort as well, albeit directly in TF rather than through a script callout. But since not only the order of the brokers is nondeterministic but also the set of 3 brokers that is returned, sorting did not really help much. With the Route 53 setup above in place, the rest of the TF can reference the CNAMEs, so the churn is limited to pretty much just updating Route 53 records on each apply, and the rest of the infrastructure is agnostic to that.
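
For illustration, downstream resources could then build their broker list from the records rather than from the cluster attribute. This is only a sketch in the same pre-0.12 syntax as the snippet above; the local name `stable_bootstrap_brokers` is hypothetical, and port 9092 is assumed from the plaintext listener used throughout this thread.

```hcl
locals {
  # Build "broker-0.<domain>:9092,broker-1.<domain>:9092,..." from the CNAME
  # records, so consumers only ever see the stable names and not the
  # changing MSK hostnames behind them.
  stable_bootstrap_brokers = "${join(",", formatlist("%s:9092", aws_route53_record.broker_dns_record.*.fqdn))}"
}
```

Anything that previously read the cluster's bootstrap_brokers attribute would reference `local.stable_bootstrap_brokers` instead, so only the Route 53 records change between applies.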

Fwiw - in the AWS Console > MSK page, when you view your cluster and click "View Client Information" the bootstrap broker list has the same symptom - nondeterministic and always outputs 3 brokers. In our configuration we have 6 across 2 subnets - but always get 3 in some random order.
