After running `terraform apply`, Terraform reports:
```
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement

Terraform will perform the following actions:

  ~ module.eks.aws_autoscaling_group.workers
      launch_configuration: "test-eks-cluster-02019010915335535610000000c" => "${element(aws_launch_configuration.workers.*.id, count.index)}"

-/+ module.eks.aws_launch_configuration.workers (new resource required)
      id: "test-eks-cluster-02019010915335535610000000c" => <computed> (forces new resource)
      associate_public_ip_address: "false" => "false"
      ebs_block_device.#: "0" => <computed>
      ebs_optimized: "true" => "true"
      enable_monitoring: "true" => "true"
      iam_instance_profile: "test-eks-cluster20190109153353893600000008" => "test-eks-cluster20190109153353893600000008"
      image_id: "ami-0a9006fb385703b54" => "ami-0a9006fb385703b54"
      instance_type: "t3.micro" => "t3.large" (forces new resource)
      key_name: "" => <computed>
      name: "test-eks-cluster-02019010915335535610000000c" => <computed>
      name_prefix: "test-eks-cluster-0" => "test-eks-cluster-0"
      root_block_device.#: "1" => "1"
      root_block_device.0.delete_on_termination: "true" => "true"
      root_block_device.0.iops: "0" => "0"
      root_block_device.0.volume_size: "100" => "100"
      root_block_device.0.volume_type: "gp2" => "gp2"
      security_groups.#: "1" => "1"
      security_groups.1557716484: "sg-0000ab3ece5ffdca6" => "sg-0000ab3ece5ffdca6"
      user_data_base64: "base64 data" => "base 64 data"

Plan: 1 to add, 1 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.eks.aws_launch_configuration.workers: Creating...
  associate_public_ip_address: "" => "false"
  ebs_block_device.#: "" => "<computed>"
  ebs_optimized: "" => "true"
  enable_monitoring: "" => "true"
  iam_instance_profile: "" => "test-eks-cluster20190109153353893600000008"
  image_id: "" => "ami-0a9006fb385703b54"
  instance_type: "" => "t3.large"
  key_name: "" => "<computed>"
  name: "" => "<computed>"
  name_prefix: "" => "test-eks-cluster-0"
  root_block_device.#: "" => "1"
  root_block_device.0.delete_on_termination: "" => "true"
  root_block_device.0.iops: "" => "0"
  root_block_device.0.volume_size: "" => "100"
  root_block_device.0.volume_type: "" => "gp2"
  security_groups.#: "" => "1"
  security_groups.1557716484: "" => "sg-0000ab3ece5ffdca6"
  user_data_base64: "" => "base63data"
module.eks.aws_launch_configuration.workers: Creation complete after 1s (ID: test-eks-cluster-020190109162625609700000001)
module.eks.aws_autoscaling_group.workers: Modifying... (ID: test-eks-cluster-02019010915341192810000000e)
  launch_configuration: "test-eks-cluster-02019010915335535610000000c" => "test-eks-cluster-020190109162625609700000001"
module.eks.aws_autoscaling_group.workers: Modifications complete after 0s (ID: test-eks-cluster-02019010915341192810000000e)
module.eks.aws_launch_configuration.workers.deposed: Destroying... (ID: test-eks-cluster-02019010915335535610000000c)
module.eks.aws_launch_configuration.workers.deposed: Destruction complete after 1s
```
```hcl
module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "test-eks-cluster"
  subnets      = ["subnet-1", "subnet-2"]

  tags = {
    Environment = "test"
  }

  worker_groups = [
    {
      instance_type = "t3.large" # changed from "t3.micro"
      asg_max_size  = 3
    }
  ]

  vpc_id = "vpc"
}
```
To change the EC2 instance type
Most likely no... 😁
This may be useful:
```
$ kubectl get pods --all-namespaces
NAMESPACE             NAME                                    READY   STATUS    RESTARTS   AGE
gitlab-managed-apps   install-helm                            0/1     Pending   0          1h
kube-system           aws-node-8qf8c                          1/1     Running   0          2h
kube-system           coredns-7554568866-fbv4x                1/1     Running   0          2h
kube-system           coredns-7554568866-nbfsl                1/1     Running   0          2h
kube-system           kube-proxy-v45wx                        1/1     Running   0          2h
kube-system           kubernetes-dashboard-5dd89b9875-f7wf8   0/1     Pending   0          1h
```
The pending status, I think, is because of this:
```
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  3m51s (x362 over 63m)  default-scheduler  0/1 nodes are available: 1 Insufficient pods.
```
A t3.micro is super tiny. The `Insufficient pods` event suggests the node has already hit its pod limit, so there was likely never room for that pod to begin with. Start with at least a medium.
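To check whether a node really is out of pod slots, something like the following could help. It is only a sketch: the function names `node_pod_limits` and `pods_on_node` are made up for illustration, and it assumes `kubectl` is already pointed at the cluster. The first function prints the pod limit each kubelet reports, the second counts the pods currently bound to one node.

```shell
#!/bin/sh
# Sketch: compare a node's pod limit with the number of pods scheduled on it.
# Function names are illustrative, not part of kubectl or the EKS module.
set -eu

# Print each node's allocatable pod slots, as reported by its kubelet.
node_pod_limits() {
  kubectl get nodes \
    -o custom-columns=NODE:.metadata.name,MAX_PODS:.status.allocatable.pods
}

# Count the pods currently bound to one node (node name passed as $1).
# Pending pods have no node assigned yet, so they are not counted here.
pods_on_node() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector "spec.nodeName=$1" | wc -l
}
```

After sourcing the file, run `node_pod_limits`, then `pods_on_node <node-name>` with a name from the first column. If the two numbers match, the scheduler has nowhere left to place new pods on that node.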
Also, changing the instance type of the existing ASG probably won't do what you expect. A better way is to add a second worker group with the new size, then either get rid of the first worker group or scale it to 0.
This module does not re-create the ASG when making changes to the Launch Configuration. After making such changes you then need to recycle the instances.
Whether this is desired behaviour will likely get mixed responses. It gives more control over draining and migrating pods in a k8s-friendly manner without needing to run blue/green deployments. Unfortunately, as you've found, it also means that changes to the Terraform config create drift in the running environment.
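> then either get rid of the first worker group or scale it to 0.

Removing or inserting an item anywhere but the end of an indexed resource list in TF shifts the index of everything after it, so Terraform ends up modifying or recreating resources you never meant to touch. Scaling the old group to 0 is the safer of the two.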
I don't understand. What's the actual problem here? The fact that you have pods in a Pending state?
The problem I was experiencing was that after changing the instance_type, the worker stayed on the old type. I think the pods being in a Pending state (my best guess is memory constraints) is what kept the new launch configuration from spinning up a new EC2 instance. I'd expected a new instance to spin up, the current worker to be drained, and the pods to migrate to the new one.
I just wasn't sure what the expected behavior is, or how to change the instance type in the future without downtime.
So maybe it's more a question of what the "battle-tested" best practice is for changing the instance_type.
Ah OK, I'll try to explain then!
When you change the instance type and run TF, a new Launch Configuration (LC) is created with the new type and the Autoscaling Group (ASG) is updated to use the new LC. Nothing changes for the running instances at this point: they keep running as they are. Only when the ASG creates a new instance will the new LC be used, and that instance will be of the new type.
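One way to see this for yourself is to ask the ASG which launch configuration each of its instances was started from. This is a rough sketch using the AWS CLI; the function name `asg_instance_launch_configs` is made up for illustration, and it assumes the CLI is configured for the right account and region.

```shell
#!/bin/sh
# Sketch: list the instances in an ASG with the launch configuration of each.
# The function name is illustrative; the ASG name is passed as $1.
set -eu

asg_instance_launch_configs() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].Instances[].[InstanceId,LaunchConfigurationName]' \
    --output text
}
```

For the ASG in this thread that would be `asg_instance_launch_configs test-eks-cluster-02019010915341192810000000e`. Instances launched before the change should not show the new LC name, and since the old LC has been destroyed they may show `None` instead.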
If you want to replace all instances in the cluster with the new type, then you need to scale up the number of instances in the ASG, drain the old, smaller instances, and then terminate them.
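Those three steps could look roughly like this for a single node. Treat it as a checklist sketch rather than a finished tool: `run` and `roll_node` are hypothetical helpers, all four arguments are placeholders you supply, and with the default `DRY_RUN=1` it only prints the commands. The wait for the new node to become Ready is left as a comment, so for a real roll it is safer to run the printed commands one at a time.

```shell
#!/bin/sh
# Sketch of a manual node roll. DRY_RUN=1 (the default) only prints the commands.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Usage: roll_node ASG_NAME NODE_NAME INSTANCE_ID NEW_CAPACITY
roll_node() {
  # 1. Raise the desired capacity so the ASG launches an instance from the new LC.
  #    NEW_CAPACITY must not exceed the ASG's max size.
  run aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$1" --desired-capacity "$4"

  # (Wait here until the new node is Ready, e.g. by watching `kubectl get nodes`.)

  # 2. Cordon the old node and evict its pods; DaemonSet pods are left in place.
  run kubectl drain "$2" --ignore-daemonsets

  # 3. Terminate the old instance and lower the desired capacity again.
  run aws autoscaling terminate-instance-in-auto-scaling-group \
    --instance-id "$3" --should-decrement-desired-capacity
}
```

After sourcing the file, `roll_node <asg-name> <node-name> <instance-id> 2` prints the three commands for moving a one-node group onto the new type. Pods that use local `emptyDir` storage may need an extra flag on `kubectl drain`, depending on the kubectl version.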
If you have pending pods that can't run because there is not enough CPU, then you could also run the cluster-autoscaler. This will increase the number of instances in the cluster to accommodate new pods.
Thanks!! Got it. This helps a lot. :)
Great!
I will close this as it's not really an issue with this module or even with Terraform.