Autoscaler: EKS CA doesn't see nodeSelector labels

Created on 14 Mar 2019  ·  20 Comments  ·  Source: kubernetes/autoscaler

Hi,
I am currently using the following:
EKS 1.11
Cluster-Autoscaler: 1.13.2 (also tried 1.3.8 and 1.12.0)

When I try to schedule a new pod with a node selector, it doesn't trigger a scale-up, and CA reports a GeneralPredicates predicate mismatch for the ASG. This happens when I use a custom label of my own with the pods (e.g. app=web). However, if I use a system node selector like "beta.kubernetes.io/instance-type=m5.large", it works just fine. I have added the appropriate tags to the ASG according to the documentation and granted the correct IAM permissions. Any ideas on what else to try?
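For readers following along, here is a minimal Python sketch of what a node selector check boils down to: every key/value pair in the pod's nodeSelector has to be present in the labels of the candidate node. This is a toy model, not the autoscaler's actual code; the function name and the sample label set are assumptions made for illustration.

```python
from __future__ import annotations


def matches_node_selector(node_selector: dict[str, str], node_labels: dict[str, str]) -> bool:
    """Return True when every selector key/value pair is present in the node labels."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


# Hypothetical label set the autoscaler knows for an empty node group:
# the instance type is known, the custom "app" label is not.
template_labels = {"beta.kubernetes.io/instance-type": "m5.large"}

print(matches_node_selector({"beta.kubernetes.io/instance-type": "m5.large"}, template_labels))  # True
print(matches_node_selector({"app": "web"}, template_labels))  # False
```

If the custom label is missing from whatever label set the autoscaler works with, the selector cannot match, which is consistent with the mismatch described above.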

area/provider/aws cluster-autoscaler

All 20 comments

I think I fixed the issue since my cluster now scales properly. Not sure if this is expected behavior:

I created an ASG with initial "desired running = 0" to keep the ASG from spawning a machine every time I run my build scripts. CA doesn't scale up when I try to use custom nodeSelector labels.

I manually scale the ASG to >=1 instance, or begin with desired running = 1, and the node gets registered with the cluster. After a while of inactivity, the node gets scaled down. If I now try to use my custom nodeSelector labels, the correct scale-up behavior is observed.

Is this because CA needs to "register" an ASG node before it can observe its node labels?

UPDATE: If I delete the cluster-autoscaler pod that has "seen" the nodes, the new pod will not correctly scale the nodes according to the node labels.

cc @Jeffwan

/assign @Jeffwan

@frankgu968
If you run an EKS cluster and set desired running = 0, the CA pod will be evicted. Users need to manage node groups themselves, and scaling down to 0 is not supported on EKS.

As for the custom tag: the node labels were first created by your provisioning tool (eksctl, awscli, etc.), so CA has no control over them. If you want to get rid of the mismatch issue, you can add the label when you create clusters. eksctl and awscli both support this.

@Jeffwan ah! I should clarify my configurations:
I have 2 ASGs:
ASG 1: always has machine running and runs the CA pods
ASG 2: this is the one with min:0 max:8

When I created ASG 2, I also created the appropriate ASG tags with the "k8s.io/cluster_autoscaler/node_template/label/app" and tags for auto-discovery. However, the custom node labels, such as "app=something" don't seem to be picked up by the CA until 1 node from this ASG has run at least once; but the k8s generated labels like instance-type are recognized just fine.

Can you also please clarify what you mean by "add label when you create clusters"?

@frankgu968 I see. This is a scale up from 0 case. Is the right permission like DescribeLaunchTemplate or DescribeLaunchConfiguration given for

Please also share scale up logs and I can help reproduce the issue later today.

Yep, I have all the proper IAM permissions. I don't have the logs right now but can upload them tomorrow. They basically loop through all the node groups and then claim that General Predicate failed. The scale-up then fails with the message "pod didn't trigger scale up".

When I created ASG 2, I also created the appropriate ASG tags with the "k8s.io/cluster_autoscaler/node_template/label/app" and

Hi @frankgu968, your label is incorrect. Hyphens should be used instead of underscores.

It should be
k8s.io/cluster-autoscaler/node-template/label/app : web

I double-checked that scale from 0 works as expected on v1.3.8. However, AWS does not propagate ASG tags to nodes as Kubernetes labels, so a newly created node doesn't have this label, which leads to the predicate failure. I will fix this issue.

If you use the tag for another purpose, like a GPU label rather than node affinity in scheduling, that will work.
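To make the hyphen/underscore point concrete, here is a small Python sketch of prefix-based label extraction from ASG tags. The prefix string comes from this thread; the function name and the matching logic are assumptions for illustration and are not taken from the autoscaler source.

```python
from __future__ import annotations

# Tag prefix as given in this thread (hyphens, not underscores).
NODE_TEMPLATE_LABEL_PREFIX = "k8s.io/cluster-autoscaler/node-template/label/"


def labels_from_asg_tags(tags: dict[str, str]) -> dict[str, str]:
    """Collect node labels from ASG tags whose key starts with the label prefix."""
    return {
        key[len(NODE_TEMPLATE_LABEL_PREFIX):]: value
        for key, value in tags.items()
        # Skip tags that do not carry the prefix or have an empty label name.
        if key.startswith(NODE_TEMPLATE_LABEL_PREFIX) and len(key) > len(NODE_TEMPLATE_LABEL_PREFIX)
    }


wrong_tags = {"k8s.io/cluster_autoscaler/node_template/label/app": "web"}
right_tags = {"k8s.io/cluster-autoscaler/node-template/label/app": "web"}

print(labels_from_asg_tags(wrong_tags))  # {}
print(labels_from_asg_tags(right_tags))  # {'app': 'web'}
```

With the underscore spelling the tag is simply ignored, so the node group looks as if it had no custom labels at all.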

@Jeffwan got it! Has this changed from previous releases? I had copied the label directly from the documentation before... For now I have gotten around the label propagation problem by specifying it directly in the ASG kubelet arguments.

@frankgu968 That would work. Glad you figured it out. If you use EKS, you can pass it through bootStrapArguments:

--kubelet-extra-args --node-labels=app=web
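If you manage several labels, a helper like the following can render that argument string for you. This is only a sketch: the function name is hypothetical, and it assumes kubelet's --node-labels flag takes comma-separated key=value pairs.

```python
from __future__ import annotations

import shlex


def bootstrap_arguments(labels: dict[str, str]) -> str:
    """Render node labels as a --kubelet-extra-args value for the bootstrap arguments."""
    if not labels:
        raise ValueError("at least one label is required")
    node_labels = ",".join(f"{key}={value}" for key, value in sorted(labels.items()))
    # shlex.quote only adds quotes when the value contains shell-special characters.
    return "--kubelet-extra-args " + shlex.quote(f"--node-labels={node_labels}")


print(bootstrap_arguments({"app": "web"}))  # --kubelet-extra-args --node-labels=app=web
```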

I had a similar problem. Manually scaled my ASG up to 1 and then the cluster autoscaler worked from there.

Running on EKS
Kubernetes version 1.12.7
Cluster autoscaler 1.12.5

@nicholasgcoles I solved it by applying the correct label to my ASG as @Jeffwan had suggested earlier in this thread

I got a ping from other folks in the community. I don't see any issues on v1.12.6. (I always try the latest patch version of each minor version.)

A few things you need to take care of:

  1. Label the nodes when you create the node group, either by passing --kubelet-extra-args --node-labels=app=web in CloudFormation or by using eksctl like this:
eksctl create nodegroup \
--cluster <cluster_name> \
--name <new_node_group_scale_from_0> \
--node-type m4.xlarge \
--node-labels="foo=bar" \
--nodes 0 \
--nodes-min 0 \
--nodes-max 4

  2. Make the new node group auto-discoverable by tagging the ASG:
aws autoscaling \
    create-or-update-tags \
    --tags \
    ResourceId=$ASG_NAME,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true \
    ResourceId=$ASG_NAME,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/$NAME,Value=true,PropagateAtLaunch=true
  3. Make sure your ASG has the label tag. This is what allows CA to scale this node group from 0.
aws autoscaling \
    create-or-update-tags \
    --tags \
    ResourceId=$ASG_NAME,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/foo,Value=bar,PropagateAtLaunch=true
  4. Create workloads using the labels.
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
      nodeSelector:
        foo: bar
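For node groups with more than one label, the label tag arguments from the steps above can be generated rather than typed by hand. The Python sketch below only builds the strings passed after --tags; it does not call AWS. The function name and the example ASG name are hypothetical.

```python
from __future__ import annotations

# Tag prefix used in the steps above.
LABEL_TAG_PREFIX = "k8s.io/cluster-autoscaler/node-template/label/"


def label_tag_specs(asg_name: str, labels: dict[str, str]) -> list[str]:
    """Build one create-or-update-tags argument per node label, sorted by label name."""
    return [
        f"ResourceId={asg_name},ResourceType=auto-scaling-group,"
        f"Key={LABEL_TAG_PREFIX}{key},Value={value},PropagateAtLaunch=true"
        for key, value in sorted(labels.items())
    ]


# "my-asg" stands in for $ASG_NAME.
for spec in label_tag_specs("my-asg", {"foo": "bar"}):
    print(spec)
```

Keeping the nodegroup labels and the ASG tags generated from the same dictionary is one way to avoid the two drifting apart.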

I will add detailed steps to the documentation. Closing this issue. Feel free to reopen if you still have problems with these versions.

/close

@Jeffwan: Closing this issue.

In your example, I take it that $ASG_NAME is the Auto Scaling group name from AWS --> EC2 --> Auto Scaling --> Auto Scaling Groups.

What is the $NAME in step number 2? I suppose it is the EKS cluster name.

  2. Make new Node Group auto-discovered by tagging ASG
aws autoscaling \
    create-or-update-tags \
    --tags \
    ResourceId=$ASG_NAME,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true \
    ResourceId=$ASG_NAME,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/$NAME,Value=true,PropagateAtLaunch=true

@prashant-wipro You are right, that's the cluster name.

Hi, we're running into the issue that frankgu968 describes.

We have a cluster running on EKS, provisioned using Terraform. We have several ASGs where the desired size is 0. Terraform/EKS do not allow you to do this directly, so we created them and then manually changed the desired and min sizes to 0. It was working fine: all of the correct labels and tags are set, and it is able to scale up from 0, _when the ASGs are registered_.

However, the custom node labels, such as "app=something" don't seem to be picked up by the CA until 1 node from this ASG has run at least once; but the k8s generated labels like instance-type are recognized just fine.

This bit is the problem. I believe it was working until the CA pod restarted. Now, none of the ASGs with 0 instances are registered, and the CA does not register them until I manually scale them up to 1. I can still see them in the cluster-autoscaler-status ConfigMap, but registered is false.

      Name:        eks-4cb9bb80-6fc2-d92d-554b-4c81839e1430
      Health:      Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=12))
                   LastProbeTime:      0001-01-01 00:00:00 +0000 UTC
                   LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
      ScaleUp:     NoActivity (ready=0 cloudProviderTarget=0)
                   LastProbeTime:      0001-01-01 00:00:00 +0000 UTC
                   LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
      ScaleDown:   NoCandidates (candidates=0)
                   LastProbeTime:      2020-08-05 07:27:57.051259035 +0000 UTC m=+1654.948542645
                   LastTransitionTime: 2020-08-05 07:00:53.825759831 +0000 UTC m=+31.723043437
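To spot such node groups without reading the status by eye, the counters on the Health line can be pulled out with a few lines of Python. This is a rough sketch that assumes the line keeps the key=value layout shown above; the function name is made up for illustration.

```python
from __future__ import annotations

import re

# Matches every counter of the form name=number on a status line.
COUNTER_PATTERN = re.compile(r"(\w+)=(\d+)")


def parse_health_counters(line: str) -> dict[str, int]:
    """Extract all key=value counters from a Health line of the status ConfigMap."""
    return {name: int(value) for name, value in COUNTER_PATTERN.findall(line)}


health_line = (
    "Health:      Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 "
    "registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=12))"
)

counters = parse_health_counters(health_line)
# A group with no registered nodes and a target of 0 is the case described here.
print(counters["registered"] == 0 and counters["cloudProviderTarget"] == 0)  # True
```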

The CA pod doesn't restart often, because it's in an ASG which always has at least 1 instance, but I'd rather not have to manually scale them up every time the CA pod happens to restart. Is there perhaps a workaround where I could permanently register the ASGs manually? The ASGs will change less often than the CA pod restarts.

Reproduced on EKS 1.17:
Pod can't be scheduled on , predicate failed: GeneralPredicates predicate mismatch, reason: node(s) didn't match node selector

Root cause:
My cluster autoscaler (version 1.17.4) needs a manual desired=1 in the AWS console after every (re)start. ONCE!
After that, CA can scale down to zero and scale up from zero.
Why does it need a MANUAL scale-up ONCE?
