Terraform-aws-eks: EKS Node Groups not showing in AWS EKS Console when using Spot Instance Example

Created on 13 Apr 2020 · 9 Comments · Source: terraform-aws-modules/terraform-aws-eks

I'm submitting a...

[X] bug report
[ ] feature request
[ ] support request - read the FAQ first!
[ ] kudos, thank you, warm fuzzy

What is the current behavior?

When I apply the EKS Spot Instance example, no EKS Node Groups appear in my AWS EKS Console.
NOTE: Although the Node Group is not listed in the AWS Console, the nodes have joined the EKS cluster and show up in the output of kubectl get nodes.

If this is a bug, how to reproduce? Please include a code sample if relevant.

Just apply the EKS Spot Instance example and check the EKS Console to confirm that the Spot Instance Node Groups aren't listed.

What's the expected behavior?

I was expecting Spot Instances Node Groups to be listed in AWS Console just like any other Worker Node Group.

Affected module version:
OS: Windows
Terraform version: Terraform v0.12.24

Any other relevant info

ruiengana

👍4

Most helpful comment

I think we can say EKS does support Managed Spot Instances (EKSCTL have it) and I can create one via AWS Console.

To manually create a Spot Managed Group, do this:
1 - Open your EKS Cluster config in AWS Console.
2 - Create a Node Group. Just accept defaults, we will edit this later on.
3 - Go to EC2 and find Auto Scaling Group linked to Node Group, and edit it.
4 - In edit options, change Fleet Composition from "Adhere to the launch template" to "Combine purchase options and instances". Adapt rest of ASG to match your Spot Instance config.

Would be great to manage this via terraform :)
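
For reference, the "Combine purchase options and instances" setting in step 4 corresponds to a mixed instances policy on the Auto Scaling Group. Below is a minimal sketch of that setting in Terraform for a self-managed ASG, not the ASG owned by an EKS managed node group. The resource names, AMI ID, subnet IDs, and instance types are hypothetical placeholders. The launch template omits the user data and IAM instance profile that a real worker node needs to join the cluster.

```hcl
# Hypothetical launch template; a real EKS worker also needs an
# EKS-optimized AMI, bootstrap user data, and an IAM instance profile.
resource "aws_launch_template" "spot_workers" {
  name_prefix   = "spot-workers-"
  image_id      = "ami-00000000000000000" # placeholder AMI ID
  instance_type = "m5.large"
}

resource "aws_autoscaling_group" "spot_workers" {
  name_prefix         = "spot-workers-"
  min_size            = 1
  max_size            = 5
  desired_capacity    = 2
  vpc_zone_identifier = ["subnet-aaaa1111", "subnet-bbbb2222"] # placeholder subnets

  # Equivalent of "Combine purchase options and instances" in the console.
  mixed_instances_policy {
    instances_distribution {
      on_demand_base_capacity                  = 0
      on_demand_percentage_above_base_capacity = 0 # 0 means all Spot
      spot_allocation_strategy                 = "lowest-price"
      spot_instance_pools                      = 2
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.spot_workers.id
        version            = "$Latest"
      }

      # Instance types the ASG may choose from.
      override {
        instance_type = "m5.large"
      }
      override {
        instance_type = "m5a.large"
      }
    }
  }
}
```

An ASG declared this way is listed under EC2 > Auto Scaling Groups, not as a Node Group in the EKS Console.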

ruiengana on 13 Apr 2020

👍2

All 9 comments

You are using "worker_groups_launch_template", which does not create EKS managed node groups. It uses the classic approach of launching Kubernetes worker nodes on EKS through Auto Scaling Groups (in this case with a launch template that sets the configuration, including the script that actually adds the nodes to the cluster). If you want to see those, go to EC2 > Auto Scaling Groups; they will be there.

Managed node groups (the new ones, the ones you will see in EKS console) are created using this example: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/managed_node_groups/main.tf

I have no knowledge of how to relate spot instances to managed node groups; maybe it can't be done, or maybe someone else can help.

Update: See https://github.com/aws/containers-roadmap/issues/583; it is not possible. If you want spot, use the classic approach.
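
To make the distinction concrete, here is a rough sketch that declares both kinds of group side by side. It assumes the key names used in the module's examples around version 11.x, which may differ in other releases. The cluster name, VPC ID, subnet IDs, instance types, and group names are hypothetical placeholders.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "11.1.0"

  # Hypothetical placeholder values; substitute your own.
  cluster_name    = "example-cluster"
  cluster_version = "1.16"
  vpc_id          = "vpc-0123456789abcdef0"
  subnets         = ["subnet-aaaa1111", "subnet-bbbb2222"]

  # Classic approach: a self-managed Auto Scaling Group with a launch template.
  # Supports Spot, but appears under EC2 > Auto Scaling Groups only.
  worker_groups_launch_template = [
    {
      name                    = "spot-1"
      override_instance_types = ["m5.large", "m5a.large"]
      spot_instance_pools     = 2
      asg_max_size            = 2
      asg_desired_capacity    = 2
      kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot"
    },
  ]

  # EKS managed node group: listed in the EKS Console,
  # but on-demand only as of this thread.
  node_groups = {
    on_demand = {
      desired_capacity = 1
      max_capacity     = 2
      min_capacity     = 1
      instance_type    = "m5.large"
    }
  }
}
```

With a configuration like this, only the on_demand group would be expected to show up as a Node Group in the EKS Console, while the nodes of both groups appear in kubectl get nodes.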

jaimehrubiks on 13 Apr 2020

I think we can say EKS does support Managed Spot Instances (EKSCTL have it) and I can create one via AWS Console.

Would be great to manage this via terraform :)

ruiengana on 13 Apr 2020

👍2

This is similar to other existing problems that Managed Node Groups have. For example, we can't use managed node groups because they don't propagate tags to their ASGs. Then of course I could edit the ASG and add the tags, and then remove the existing instances so that new instances do in fact inherit those tags.

jaimehrubiks on 13 Apr 2020

I think we can say EKS does support Managed Spot Instances (EKSCTL have it) and I can create one via AWS Console.

Would be great to manage this via terraform :)

Unfortunately, until the EKS Node Group API natively supports spot and it gets implemented in the terraform provider, there isn't much we can do. Some enterprising folks could probably do some horror using a local-exec provisioner block and calls to the awscli. It wouldn't really be a terraform-native or recommended approach.

Hacking something up in the console or via direct API calls against the "managed" ASG isn't Node Groups supporting a feature. What happens when the service decides to replace the ASG as part of an upgrade or feature roll out in the future? Personally, I wouldn't dare interact with the ASGs directly.

eksctl has a lot more freedom as it's written in a full programming language. I haven't used it before and have just taken a look through their documentation. I think your claim that they support spot via the EKS Node Group is incorrect. You have to be careful as their nodeGroup configuration block refers to directly created ASGs which can easily support spot, like this module's worker_groups. Their managedNodeGroup feature makes no mention of spot as it's not natively supported by the API.

Managed Node Groups still have some way to go before being widely usable.

dpiddockcmp on 14 Apr 2020

👍1

I am also facing a similar issue. All pods remain in pending state unless I comment out the spot instance lines. Am I missing anything?

module "eks" {
  source                    = "terraform-aws-modules/eks/aws"
  version                   = "11.1.0"
  cluster_name              = var.cluster_name
  cluster_version           = "1.16"
  subnets                   = module.vpc.private_subnets
  vpc_id                    = module.vpc.vpc_id

  worker_groups = [
    {
      name = "spot-1"
      # spot_price          = "0.199"
      instance_type = "m4.xlarge"
      asg_max_size  = 1
      # kubelet_extra_args  = "--node-labels=kubernetes.io/lifecycle=spot"
      suspended_processes = ["AZRebalance"]
    },
    {
      name = "spot-2"
      # spot_price          = "0.20"
      instance_type = "m4.xlarge"
      asg_max_size  = 1
      # kubelet_extra_args  = "--node-labels=kubernetes.io/lifecycle=spot"
      suspended_processes = ["AZRebalance"]
    }
  ]
}

vikas027 on 2 May 2020

The latest bootstrap script doesn't support "/" in labels. Try using simply lifecycle=spot in the node labels.

rverma-jm on 10 May 2020

👍1

@vikas027 From EKS 1.16, kubelet doesn't support some labels. Please see https://github.com/terraform-aws-modules/terraform-aws-eks/issues/856#issuecomment-623155710 and docs for spot instances.

kubelet_extra_args = "--node-labels=node.kubernetes.io/lifecycle=spot"

barryib on 11 May 2020

👍1

Following up on what @barryib said, here is the kubelet error that prevented the node from joining:

May 11 16:47:36 ip-10-0-1-157 kubelet: --node-labels in the 'kubernetes.io' namespace must begin with an allowed prefix (kubelet.kubernetes.io, node.kubernetes.io) or be in the specifically allowed set (beta.kubernetes.io/arch, beta.kubernetes.io/instance-type, beta.kubernetes.io/os, failure-domain.beta.kubernetes.io/region, failure-domain.beta.kubernetes.io/zone, failure-domain.kubernetes.io/region, failure-domain.kubernetes.io/zone, kubernetes.io/arch, kubernetes.io/hostname, kubernetes.io/instance-type, kubernetes.io/os)

It affects other areas as well. For instance, I was using --node-labels=kubernetes.io/lifecycle=normal to ensure some pods only ran on non-spot instances, and that was also preventing the worker group from coming up. Changing the prefix to node.kubernetes.io 'fixed' the issue.
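
Putting the label fix together, a worker group configuration along these lines should let both on-demand and spot nodes register on EKS 1.16 and later. This is only a sketch: the group names, the spot price, and the cluster, VPC, and subnet values are illustrative placeholders, and the keys assume the module's 11.x inputs.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "11.1.0"

  # Hypothetical placeholder values; substitute your own.
  cluster_name    = "example-cluster"
  cluster_version = "1.16"
  vpc_id          = "vpc-0123456789abcdef0"
  subnets         = ["subnet-aaaa1111", "subnet-bbbb2222"]

  worker_groups = [
    {
      # On-demand group: the label uses the allowed node.kubernetes.io prefix.
      name               = "on-demand-1"
      instance_type      = "m4.xlarge"
      asg_max_size       = 1
      kubelet_extra_args = "--node-labels=node.kubernetes.io/lifecycle=normal"
    },
    {
      # Spot group: same prefix, different value.
      name                = "spot-1"
      spot_price          = "0.20" # illustrative maximum price
      instance_type       = "m4.xlarge"
      asg_max_size        = 1
      kubelet_extra_args  = "--node-labels=node.kubernetes.io/lifecycle=spot"
      suspended_processes = ["AZRebalance"]
    },
  ]
}
```

Pods that must avoid spot nodes can then select on node.kubernetes.io/lifecycle=normal in place of the rejected kubernetes.io/lifecycle label.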