Currently, in order to achieve blue/green deployment with worker groups (i.e. updating to a new AMI), I have to add a new worker group with the updated AMI, let them spin up, drain the old nodes so pods transition, then scale down the old worker group (set min/max/desired to 0).
This is not a terrible way of doing it but the problem is that the old ASG (and related resources) sticks around forever and there doesn't seem to be a way to clean up the old stuff without major surgery. If I change the AMI 3 times, I now have 3 worker groups - 2 inactive and scaled to 0 and one active.
Is there a better way of doing this with this module? There's a distinct possibility I'm missing some fundamental terraform concepts but this seems like a complex issue to me. My code ends up looking like this after a new worker group is fully deployed and the old is scaled down (you can see how even semi-frequent deployments would make this list long and leave a lot of trailing garbage):
map(
  "name", "k8s-worker-179fc16f",
  "ami_id", "ami-179fc16f",
  "asg_desired_capacity", "0",
  "asg_max_size", "0",
  "asg_min_size", "0",
),
map(
  "name", "k8s-worker-67a0841f",
  "ami_id", "ami-67a0841f",
  "asg_desired_capacity", "5",
  "asg_max_size", "8",
  "asg_min_size", "5",
  "instance_type", "${lookup(var.worker_sizes, "${terraform.workspace}")}",
  "key_name", "${aws_key_pair.infra-deployer.key_name}",
  "root_volume_size", "48"
)
I have seen other code around the internet that does blue/green ASGs, but those are for much simpler use cases IMO - a create_before_destroy and letting it rip would bring a K8s cluster down. I have no qualms with multiple apply steps - it's the cleanup part that I'm after.
Hey @hobbsh - totally valid points here. Let's see if we can't find a way.
Thinking about this a bit, the situation wouldn't be so bad if we had a load balancer enforcing health in tandem with create_before_destroy and a minimum healthy instance count. Alas, not in this brave new world...
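For reference, the generic pattern alluded to here looks roughly like the sketch below, written in Terraform 0.12+ syntax. It is not something this module does: the resource names, sizes, and the `ami_id`, `subnet_ids`, and `elb_name` variables are placeholders for illustration, and the approach only works when a load balancer can vouch for instance health, which is the piece missing for EKS workers.

```hcl
# Hypothetical inputs, named for illustration only.
variable "ami_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "elb_name" {
  type = string
}

resource "aws_launch_configuration" "app" {
  # name_prefix lets the replacement exist alongside the old one.
  name_prefix   = "app-"
  image_id      = var.ami_id
  instance_type = "m4.large"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "app" {
  # Tying the ASG name to the launch configuration forces a new ASG
  # whenever the AMI (and therefore the launch configuration) changes.
  name                 = "app-${aws_launch_configuration.app.name}"
  launch_configuration = aws_launch_configuration.app.name
  vpc_zone_identifier  = var.subnet_ids
  min_size             = 5
  max_size             = 8

  # Terraform waits for this many instances to pass ELB health checks
  # before it destroys the old ASG.
  load_balancers    = [var.elb_name]
  health_check_type = "ELB"
  min_elb_capacity  = 5

  lifecycle {
    create_before_destroy = true
  }
}
```

Without the ELB health gate, nothing tells Terraform when the new nodes are actually ready to take pods, which is why this is unsafe for a Kubernetes worker group as-is.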
I haven't quite put together how removing a dead ASG entry from the list ends up being disruptive, especially if you've explicitly named your groups as you show above. Does Terraform tear down everything and recreate the ones that remain?
@brandoconnor Yes. Terraform sees that the list indexes have changed and reassigns the resource names based on the new indexes, which recreates everything. It's a bit confusing to me why a one-element worker_groups list does not have [0] appended to it in the resource name; it shows up as just module.eks.aws_autoscaling_group.workers. Only after adding a second worker group does the index appear, but that may be irrelevant and something I haven't noticed in Terraform until now.
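To make the index shift concrete, here is a minimal sketch that uses `null_resource` as a stand-in for the ASG. It is written in Terraform 0.12+ syntax and is not the module's actual code; the local and resource names are only for illustration.

```hcl
locals {
  # Stand-in for the worker_groups list; only the name matters here.
  worker_groups = [
    { name = "k8s-worker-179fc16f" }, # tracked as workers[0]
    { name = "k8s-worker-67a0841f" }, # tracked as workers[1]
  ]
}

# One instance per list element, addressed by position rather than by name.
resource "null_resource" "workers" {
  count = length(local.worker_groups)

  triggers = {
    name = local.worker_groups[count.index].name
  }
}
```

Deleting the first element makes k8s-worker-67a0841f the new element 0, so Terraform plans to change workers[0] (its name no longer matches) and destroy workers[1], even though the second group was the one meant to survive.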
I have been thinking about possibly adding a flag like delete = true so all that's left of an old ASG map is map("delete", "true") but that would require reworking the count parameter on all the ASG resources. I have also done targeted destroys but then that gets messy with resources wanting to recreate since the ASG still exists in the worker_groups list, again requiring some sort of flag to tell Terraform not to recreate. Maybe more things will be possible when Terraform v0.12 is released.
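As a note on the v0.12 point: Terraform 0.12.6 added `for_each` on resources, which keys each instance by a string instead of a list position. The sketch below shows the idea with `null_resource` standing in for the ASG. It is an assumption about how such a layout could look, not how the module is written, and the variable shape is hypothetical.

```hcl
variable "worker_groups" {
  # Keyed by group name, so each group has a stable address.
  type = map(object({
    ami_id               = string
    asg_desired_capacity = number
  }))

  default = {
    "k8s-worker-179fc16f" = { ami_id = "ami-179fc16f", asg_desired_capacity = 0 }
    "k8s-worker-67a0841f" = { ami_id = "ami-67a0841f", asg_desired_capacity = 5 }
  }
}

resource "null_resource" "workers" {
  for_each = var.worker_groups

  triggers = {
    name   = each.key
    ami_id = each.value.ami_id
  }
}
```

With this layout, removing the "k8s-worker-179fc16f" key would only destroy the instance at that key; the remaining group keeps its address and is left alone.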
I dug back in the terraform history and found an example of what deleting the original ASG looks like (this was before I explicitly named with the AMI ID but iirc it did not make a difference - I've killed several worker groups by accident in staging this way):
~ module.eks.aws_autoscaling_group.workers
    launch_configuration: "staging-k8s-worker2018072300313993610000000b" => "${element(aws_launch_configuration.workers.*.id, count.index)}"

- module.eks.aws_autoscaling_group.workers[1]

-/+ module.eks.aws_launch_configuration.workers (new resource required)
    id: "staging-k8s-worker2018072300313993610000000b" => <computed> (forces new resource)
    associate_public_ip_address: "false" => "false"
    ebs_block_device.#: "0" => <computed>
    ebs_optimized: "true" => "true"
    enable_monitoring: "true" => "true"
    iam_instance_profile: "staging20180723003138205600000007" => "staging20180723003138205600000007"
    image_id: "ami-179fc16f" => "ami-c82004b0" (forces new resource)
    instance_type: "m4.large" => "m4.large"
    key_name: "infra-deployer" => "infra-deployer"
    name: "staging-k8s-worker2018072300313993610000000b" => <computed>
    name_prefix: "staging-k8s-worker" => "staging-k8s-worker"
    root_block_device.#: "1" => "1"
    root_block_device.0.delete_on_termination: "true" => "true"
    root_block_device.0.iops: "0" => "0"
    root_block_device.0.volume_size: "20" => "48" (forces new resource)
    root_block_device.0.volume_type: "gp2" => "gp2"
    security_groups.#: "2" => "2"
    security_groups.1093865381: "sg-cf611ebf" => "sg-cf611ebf"
    security_groups.3825257995: "sg-45093e3b" => "sg-45093e3b"
    user_data_base64: "REDACTED "

- module.eks.aws_launch_configuration.workers[1]
Thanks for that output... not the best of situations, but let's see what we can come up with.
Not the most practical solution but perhaps a slight improvement on what you've got - consider reusing the orphaned map left by a blanked out worker group entry such that a pair of entries are dedicated to a service in a blue-green fashion. A deployment becomes somewhat heavy in that it requires 3 distinct changes but it's at least somewhat better than leaving a long trail of deployment corpses.
State 1: Green is live; blue is offline
map(
  "name", "k8s-worker-blue",
  "ami_id", "ami-179fc16f",
  "asg_desired_capacity", "0",
  "asg_max_size", "0",
  "asg_min_size", "0",
),
map(
  "name", "k8s-worker-green",
  "ami_id", "ami-67a0841f",
  "asg_desired_capacity", "5",
  "asg_max_size", "8",
  "asg_min_size", "5",
  "instance_type", "${lookup(var.worker_sizes, "${terraform.workspace}")}",
  "key_name", "${aws_key_pair.infra-deployer.key_name}",
  "root_volume_size", "48"
)
State 2: Transitional; both blue and green are live. In a real rollout, blue's ami_id would be changed to the new AMI at this step. Once containers are balanced across both groups, monitor and verify, then drain green.
map(
  "name", "k8s-worker-blue",
  "ami_id", "ami-179fc16f",
  "asg_desired_capacity", "5",
  "asg_max_size", "8",
  "asg_min_size", "5",
  "instance_type", "${lookup(var.worker_sizes, "${terraform.workspace}")}",
  "key_name", "${aws_key_pair.infra-deployer.key_name}",
  "root_volume_size", "48"
),
map(
  "name", "k8s-worker-green",
  "ami_id", "ami-67a0841f",
  "asg_desired_capacity", "5",
  "asg_max_size", "8",
  "asg_min_size", "5",
  "instance_type", "${lookup(var.worker_sizes, "${terraform.workspace}")}",
  "key_name", "${aws_key_pair.infra-deployer.key_name}",
  "root_volume_size", "48"
)
State 3: Spin down green; blue takes all traffic
map(
  "name", "k8s-worker-blue",
  "ami_id", "ami-179fc16f",
  "asg_desired_capacity", "5",
  "asg_max_size", "8",
  "asg_min_size", "5",
  "instance_type", "${lookup(var.worker_sizes, "${terraform.workspace}")}",
  "key_name", "${aws_key_pair.infra-deployer.key_name}",
  "root_volume_size", "48"
),
map(
  "name", "k8s-worker-green",
  "ami_id", "ami-67a0841f",
  "asg_desired_capacity", "0",
  "asg_max_size", "0",
  "asg_min_size", "0",
)
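The three states above could also be driven by a single toggle rather than hand-edited maps. The following is only a sketch in Terraform 0.12+ syntax: `live_colors` and the `amis` local are hypothetical names, the capacity numbers are copied from the example, and the resulting list would still need to be passed to the module's worker_groups input (the module version discussed here may expect string values).

```hcl
variable "live_colors" {
  # Which groups should carry capacity: ["green"], ["blue", "green"], or ["blue"].
  type    = set(string)
  default = ["green"]
}

locals {
  # AMI per color, taken from the example above.
  amis = {
    blue  = "ami-179fc16f"
    green = "ami-67a0841f"
  }

  # Both entries always exist, so list positions never shift;
  # a group that is not live is scaled to zero instead of removed.
  worker_groups = [
    for color, ami in local.amis : {
      name                 = "k8s-worker-${color}"
      ami_id               = ami
      asg_desired_capacity = contains(var.live_colors, color) ? 5 : 0
      asg_max_size         = contains(var.live_colors, color) ? 8 : 0
      asg_min_size         = contains(var.live_colors, color) ? 5 : 0
    }
  ]
}

output "worker_groups" {
  value = local.worker_groups
}
```

Stepping `live_colors` through ["green"], then ["blue", "green"], then ["blue"] reproduces States 1 to 3, with a manual drain of the outgoing group before the last apply. Because a `for` expression over a map iterates in key order, blue stays at index 0 and green at index 1.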
If this seems like a sound enough pattern for executing an update on worker node clusters, it probably makes sense to add a quick blurb in the readme.
The good news is the unit of deployment in a k8s-centric system shouldn't be the worker nodes themselves so having to roll out a refreshed worker group often doesn't seem likely, though it's an eventuality given that AMIs all need updates and retirement.
@brandoconnor thanks for the thoughts! I was so caught up in having one worker group and no extra resources that I kinda glossed over the concept of alternating the worker groups (instead of creating a new one and trying to delete all old ones completely). I'll give this a shot next time I need to roll out a new AMI (probably in a few weeks).
Compared to something like kubespray, this module combined with a managed control plane allows much greater flexibility (I can create a mirror cluster in a different region in 20 minutes!), so I really appreciate the work put in here! This is probably good to close and hopefully helps other people in a similar situation.
Awww man, you're too kind @hobbsh ! This is the first I'd heard the project contrasted with kubespray. Perhaps a dedicated doc on certain operational aspects like this is warranted. I'll keep that in mind moving forward.