Terraform-provider-aws: Elasticache Redis Multi AZ not enabled

Created on 10 Jun 2020 · 24 Comments · Source: hashicorp/terraform-provider-aws

Hi,

I am trying to get an ElastiCache (Redis) instance up and running (cluster mode disabled), and for that I started with the first two examples from the documentation:
https://www.terraform.io/docs/providers/aws/r/elasticache_replication_group.html

However, even though everything is provisioned and working, Multi-AZ is still disabled. I have automatic failover enabled. Any idea what is wrong?

I am 90% sure that when I tried these two examples a few weeks ago, Multi-AZ was enabled. Once the cluster is up and running I can enable Multi-AZ from the AWS console, but I would of course like to manage it in Terraform.

Thank you very much!

Here is the code I am using:
~~~
resource "aws_elasticache_replication_group" "example-1" {
  automatic_failover_enabled    = true
  availability_zones            = ["eu-central-1a", "eu-central-1b"]
  replication_group_id          = "tf-rep-group-1"
  replication_group_description = "test description"
  node_type                     = "cache.t3.micro"
  number_cache_clusters         = 2
  parameter_group_name          = "default.redis5.0"
  port                          = 6379
}

resource "aws_elasticache_replication_group" "example-2" {
  automatic_failover_enabled    = true
  availability_zones            = ["eu-central-1a", "eu-central-1b"]
  replication_group_id          = "tf-rep-group-2"
  replication_group_description = "test description"
  node_type                     = "cache.t3.micro"
  number_cache_clusters         = 2
  parameter_group_name          = "default.redis5.0"
  port                          = 6379

  lifecycle {
    ignore_changes = ["number_cache_clusters"]
  }
}

resource "aws_elasticache_cluster" "replica" {
  count = 1

  cluster_id           = "tf-rep-group-1-${count.index}"
  replication_group_id = "${aws_elasticache_replication_group.example-2.id}"
}
~~~

And here are two screenshots after terraform created the resources:

[Screenshot: ElastiCache Management Console, 2020-06-10 16-18-43]

[Screenshot: ElastiCache Management Console, 2020-06-10 16-09-37]

Labels: bug, service/elasticache

All 24 comments

Hello,

I have the same problem. In my case I am trying to create a Multi-AZ auto-failover system (cluster mode), and this is my configuration:

resource "aws_elasticache_parameter_group" "redis_parameter_group" {
  name   = "${var.env}-${var.profile}-redis-parameter-group"
  family = "redis5.0" #"redis2.8"

  parameter {
    name  = "maxmemory-policy"
    value = "noeviction"
  }
}

resource "aws_elasticache_replication_group" "test_redis_pers" {
    automatic_failover_enabled    = true
    replication_group_id          = "test-persistance"
    replication_group_description = "test persisance redis"
    node_type                     = "cache.t3.micro"
    engine                        = "redis"
    engine_version                = "5.0.6"
    parameter_group_name          = "${aws_elasticache_parameter_group.redis_parameter_group.name}"
    subnet_group_name             = "${aws_elasticache_subnet_group.redis_subnet_group.name}"
    security_group_ids            = ["${aws_security_group.elastic_cache.id}"]
    port                          = 6379
    maintenance_window            = "sat:07:00-sat:08:00"
    snapshot_window               = "05:00-06:00"
    snapshot_retention_limit      = 1

    cluster_mode {
      replicas_per_node_group       = 1
      num_node_groups               = 1
    }
    lifecycle {
      ignore_changes = ["number_cache_clusters"]
    }
}

[Screenshot: redis-config]

We have the same issue also.

I ran into this yesterday.
After digging a little, it seems that AWS only recently added a 'MultiAZ' attribute to the CLI; see https://github.com/boto/botocore/commit/5fd5baa152a057c161840b3ef60a34d5177b4563#diff-be2136ac185370e57999c1dbe65ba6b5.
I'm guessing that the attribute was also only recently added to the AWS API. And I'm guessing that, prior to that, the API simply set Multi-AZ to whatever AutomaticFailoverEnabled was set to, and that afterwards it simply defaulted Multi-AZ to false. (Which is not backwards compatible, IMO, but maybe AWS doesn't see it that way.)

So, one solution could be to add a new multi_az attribute to aws_elasticache_replication_group. This would reflect the (new?) reality that multi-az is independent from automatic-failover-enabled, from AWS' point of view.

Another solution might be to begin explicitly sending MultiAZ as true when automatic_failover_enabled is true. This would preserve the old behavior for Terraform users. But I'm not sure how feasible that is.
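
To make the first option concrete, here is a minimal sketch of what an explicit attribute could look like in configuration. The attribute name `multi_az` is simply the one proposed above and is hypothetical: the provider does not accept it at the time of this comment, so this will not validate as written. The resource name and replication group ID are made up for illustration.

```hcl
# Sketch only: `multi_az` is the attribute name proposed in this comment.
# It is hypothetical and may not exist in the provider under this name.
resource "aws_elasticache_replication_group" "proposed" {
  replication_group_id          = "tf-rep-group-proposed" # hypothetical ID
  replication_group_description = "multi_az decoupled from automatic failover"
  node_type                     = "cache.t3.micro"
  number_cache_clusters         = 2
  parameter_group_name          = "default.redis5.0"
  port                          = 6379

  automatic_failover_enabled = true # promote a replica if the primary fails
  multi_az                   = true # hypothetical: would map to the API's MultiAZEnabled field
}
```

Keeping the two settings separate would mirror the API, where automatic failover and Multi-AZ now appear to be independent fields.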

For now, this workaround seems to work ok:

resource "aws_elasticache_replication_group" "test" {
  automatic_failover_enabled    = true
  engine                        = "redis"
  engine_version                = "5.0.6"
  node_type                     = "cache.t3.micro"
  number_cache_clusters         = 2
  parameter_group_name          = "default.redis5.0"
  subnet_group_name             = aws_elasticache_subnet_group.subnet_group.name
  replication_group_description = "A automatic_failover_enabled=true replication group that should also be multi-az, but is not, without help."
  replication_group_id          = "should-be-multi-az"

  provisioner "local-exec" {
    environment = {
      # Inside a provisioner, refer to the parent resource through `self`
      # rather than by its own name.
      REPLICATION_GROUP_ID = self.replication_group_id
    }
    command = <<-EOT
      aws elasticache modify-replication-group \
        --replication-group-id $REPLICATION_GROUP_ID \
        --multi-az-enabled \
        --apply-immediately
    EOT
  }

}

Note: this requires a very recent awscli version.

We have this issue too. We opened a ticket with AWS, and they said that the Go SDK has already been updated (see CreateReplicationGroupInput).
Hope you can fix it ASAP!
Thank you very much!

@acerest did AWS say that the updated Go SDK will restore the old behavior, or that it simply has a MultiAZ field that you can use?

> @acerest did AWS say that the updated Go SDK will restore the old behavior, or that it simply has a MultiAZ field that you can use?

@mattdrees Sorry for the late reply. It's just a parameter:

const (
	// MultiAZStatusEnabled is a MultiAZStatus enum value
	MultiAZStatusEnabled = "enabled"
	// MultiAZStatusDisabled is a MultiAZStatus enum value
	MultiAZStatusDisabled = "disabled"
)

I'm having this issue now, and my old CloudFormation template has the same problem if applied today.

It seems that in CFN they split the field out into an explicit "MultiAZEnabled". Without digging deeper, I think Terraform is not passing that along (or allowing you to set it). I think the AWS API has changed to have two explicit fields instead of one implicitly setting the other.

The Go SDK includes this too:
https://docs.aws.amazon.com/sdk-for-go/api/service/elasticache/

I added a draft PR for this change, but it's dependent on aws-sdk-go 1.32.7 being merged first

> I added a draft PR for this change, but it's dependent on aws-sdk-go 1.32.7 being merged first

I see that this one has been merged already. Are there any other problems?

Wouldn't there need to be a PR to the Terraform provider repo itself updating it to use that version? Just because it's been released by AWS doesn't mean Terraform automatically uses it. So it's dependent on https://github.com/terraform-providers/terraform-provider-aws/pull/13803 from what I can tell.

#13803 is merged now.

Is multi_az going to be added, or is it going to be implied by automatic_failover_enabled? The docs seem to say that automatic_failover_enabled does imply MultiAZ: https://www.terraform.io/docs/providers/aws/r/elasticache_replication_group.html#automatic_failover_enabled

> automatic_failover_enabled - (Optional) Specifies whether a read-only replica will be automatically promoted to read/write primary if the existing primary fails. If true, Multi-AZ is enabled for this replication group. If false, Multi-AZ is disabled for this replication group. Must be enabled for Redis (cluster mode enabled) replication groups. Defaults to false.

So, will there be a new PR to update the tf resource to take multi-az as an attribute?

+1 for a multi-az attribute!

+1 for multi-az

If you'd like to see this fixed, you might consider adding your "thumbs up" to @goodspellar's PR: https://github.com/terraform-providers/terraform-provider-aws/pull/13909

Running into this same problem today.

@breathingdust could I respectfully ask for your reasoning on labelling this an enhancement? From a user-experience POV, this is a bug. The resource doesn't do what the docs say it will do.

Everything @mattdrees said. This issue has been open for long enough now.

My bad @mattdrees! I've corrected the labeling.

There are lots of changes to the auto-failover/Multi-AZ parts of the AWS docs in this commit from June, but I can't quite figure out why Multi-AZ and auto-failover are now two separate options, or how they're meant to behave... I'm fairly sure that before that there was only one option, called Multi-AZ in the console but auto-failover in the API.

Right, there was only one API attribute before, and they added a second API attribute (MultiAZEnabled).

I can't find docs on the intended behavior change.
Maybe you can now have auto-failover within a single AZ? Maybe to save on network costs? :shrug:

Here's a workaround for the time being:

resource "null_resource" "nr" {
  triggers = {
    cache = aws_elasticache_replication_group.cache.id
  }
  provisioner "local-exec" {
    command = "aws elasticache modify-replication-group --replication-group-id ${aws_elasticache_replication_group.cache.id} --multi-az-enabled --apply-immediately"
  }
}

This updates the MultiAZ setting with the AWS CLI after aws_elasticache_replication_group is created.

Keep in mind that your aws_elasticache_replication_group must span multiple availability zones (set via availability_zones).
Also note that the AWS CLI version must be recent enough to support the --multi-az-enabled flag.
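
One way to check whether the workaround took effect is to ask the API for the replication group's MultiAZ status. The following is a minimal sketch under assumptions: it reuses the `aws_elasticache_replication_group.cache` and `null_resource.nr` names from the snippet above, `verify_multi_az` is a made-up resource name, and it relies on the CLI being recent enough to return the MultiAZ field.

```hcl
# Sketch only: prints the MultiAZ status reported by the API.
# `verify_multi_az` is a hypothetical name; the other references assume
# the resources from the workaround above.
resource "null_resource" "verify_multi_az" {
  triggers = {
    cache = aws_elasticache_replication_group.cache.id
  }

  # Run only after the workaround has issued modify-replication-group.
  depends_on = [null_resource.nr]

  provisioner "local-exec" {
    command = "aws elasticache describe-replication-groups --replication-group-id ${aws_elasticache_replication_group.cache.id} --query 'ReplicationGroups[0].MultiAZ' --output text"
  }
}
```

Because the modification is applied asynchronously, the status printed right after the change may not yet reflect the final state; re-running the same describe command by hand a little later is a reasonable follow-up.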
