Keda: Integration with the Cluster Autoscaler

Created on 20 Feb 2020 · 15 Comments · Source: kedacore/keda

Today KEDA can indirectly influence cluster autoscaling: when it causes the HPA to schedule more pods than the cluster can fit, the cluster autoscaler kicks in and adds nodes. A common ask is to have KEDA poke the cluster autoscaler even earlier, the same way it pokes the HPA before CPU and memory thresholds are hit. I'm not sure what integrations make this possible today, but it would be a nice feature to be able to set some cluster threshold, or to have some other way to describe driving cluster scale in addition to HPA scale.

feature-request help wanted needs-discussion

jeffhollan

👍7

All 15 comments

Certainly an interesting scenario if you ask me!

Would you use the same component or split them? It may be good to have separation between app & cluster autoscaling so people can pick which component they are interested in.

tomkerkhove on 20 Feb 2020

👍2

@melmaliacone know you were interested in looking in some more "kubernetes deep" features - this may be a good one. Also @jaypipes mentioned at the SIG-Runtime meeting he'd be interested to help collaborate as well 👍

jeffhollan on 20 Feb 2020

Awesome! Would propose for this one to draft a design spec on how it would work and what it would look like or is that overkill? Actually liked that with the introduction of the auth spec.

We could move those to design-proposals/ or so.

tomkerkhove on 20 Feb 2020

👍1

Hi @jeffhollan, team,
In AKS, to indirectly trigger the Cluster Autoscaler from an HPA custom metric, we are using low-priority pods as a buffer to overprovision nodes. By choosing how many pods to buffer, we can configure how fast to scale.
More details: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-can-i-configure-overprovisioning-with-cluster-autoscaler
Helm: https://hub.helm.sh/charts/stable/cluster-overprovisioner
Hieu

hieumoscow on 23 Mar 2020

👍1

That is correct, we don't have to do anything on our end since it's based on cluster resources, which will implicitly be impacted by the HPAs.

tomkerkhove on 23 Mar 2020

Hi @tomkerkhove,
What I meant is that we had to do this manually in AKS for a few customers who wanted the cluster autoscaler to kick in earlier instead of waiting for CPU & memory limits to be hit. It is not built in by default. Thus, within KEDA, we could implement a solution where KEDA adjusts the buffer value for low-priority pods to control how fast the Cluster Autoscaler provisions new nodes to cope with the events.
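
A minimal sketch of what "adjusting the buffer" could look like, assuming the buffer is a deployment of low-priority placeholder pods and that the event backlog is known. The function name, parameters, and numbers are hypothetical; nothing here is an existing KEDA or Cluster Autoscaler API.

```python
import math


def buffer_replicas(pending_events: int, events_per_worker: int,
                    worker_cpu_m: int, placeholder_cpu_m: int,
                    max_replicas: int) -> int:
    """Placeholder pods needed to reserve CPU for the workers a backlog implies.

    CPU values are in millicores. All parameters are illustrative assumptions.
    """
    if pending_events <= 0:
        return 0
    # Workers the HPA would eventually want for this backlog.
    workers = math.ceil(pending_events / events_per_worker)
    # CPU those workers will request once they are scheduled.
    needed_cpu_m = workers * worker_cpu_m
    # Placeholder pods that reserve at least that much CPU, capped.
    replicas = math.ceil(needed_cpu_m / placeholder_cpu_m)
    return min(replicas, max_replicas)


print(buffer_replicas(100, 5, 500, 1000, 50))   # 20 workers * 0.5 core -> 10
print(buffer_replicas(1000, 5, 500, 1000, 50))  # capped at 50
```

The cap matters here: without it, a backlog that never drains would keep growing the buffer, and with it the nodes the Cluster Autoscaler adds.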

hieumoscow on 24 Mar 2020

I'm not sure if that's actually our responsibility to do that as we solely give application autoscaling and the rest is part of cluster autoscaling.

We could provide another component, but I'm not really sure what it would do then? Tell the CA to scale out by changing the buffer?

tomkerkhove on 24 Mar 2020

I think this will depend on how sensitive the application is to a scaling delay. For KEDA or any other event-based framework, the scaling bottleneck is most likely to be the CA.

One option, like you said, is to have another component that tells the CA how fast to scale via the buffer.
Alternatively, I see a potential fit with an incubator project, the Cluster Proportional Autoscaler (CPA). It is based on the same overprovisioner concept, but we can define the buffer size to be proportional to the cluster size (e.g. 10% of cluster cores). Thus, the more events KEDA detects and the more it tells the HPA to scale out, the bigger the buffer CPA maintains, which speeds up CA scale-out. I could look into doing a POC here.
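
To illustrate the "10% of cluster cores" idea, here is a rough sketch of a proportional buffer calculation. It is not CPA's actual code or configuration format; the function and its parameters are made up for this example, and the placeholder pod size is an assumption.

```python
def proportional_buffer(cluster_cores: int, buffer_percent: int,
                        placeholder_cpu_m: int, min_replicas: int = 1) -> int:
    """Placeholder replicas that reserve roughly buffer_percent of cluster CPU.

    placeholder_cpu_m is the CPU request of one placeholder pod in millicores.
    """
    cluster_cpu_m = cluster_cores * 1000
    buffer_cpu_m = cluster_cpu_m * buffer_percent // 100
    # Ceiling division without floats.
    replicas = -(-buffer_cpu_m // placeholder_cpu_m)
    return max(replicas, min_replicas)


# The buffer grows with the cluster, so a larger cluster keeps more headroom.
for cores in (16, 64, 256):
    print(cores, proportional_buffer(cores, 10, 1000))
```

Because the buffer tracks cluster size rather than the event backlog, it only reacts after the HPA has already grown the cluster, which is worth keeping in mind when evaluating a POC.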

hieumoscow on 25 Mar 2020

I think what you are looking for is Virtual Nodes to overflow until CA catches up but will evaluate.

What do you think @jeffhollan @zroubalik?

tomkerkhove on 25 Mar 2020

Several people have expressed interest in the potential for some use of KEDA to solve cluster autoscaling challenges. So that we don't try to solve all possible scenarios, I started a document to gather use cases to focus this discussion around.

craiglpeters on 1 Jun 2020

Thanks - I've added a section with alternatives such as Virtual Nodes on AKS which solves exactly this scenario.

Personally I'm not sure yet where KEDA can help and if we should do it, or if we should bring this to CA team - But if we can help, why not!

tomkerkhove on 2 Jun 2020

👍1

Just to be clear - It's not that I don't think it's a good idea but merely making sure we are fixing gaps and not reinventing the wheel!

tomkerkhove on 2 Jun 2020

I really like the idea of event-driven cluster autoscaling, but I'm not sure exactly how this might work. In fact, I'm not even sure I understand the e2e journey of architecting a pod-scaled workflow with Keda.

Let's take, for example, an SQS Queue with some length. Keda is configured to scale on a threshold of 5 messages.

How many nodes should the autoscaler scale up?
Is 1 node -> 5 messages, the threshold? What if there are 1,000 messages?
Can messages be of a different weight?
Who is responsible for dequeuing messages out of the queue?

I think I'm probably missing something fundamental about Keda. My understanding is that for pods, Keda can scale up a new pod based off of a queue threshold, and then the pod is responsible for draining the queue. For nodes, this doesn't make as much sense to me as there's no agent to pop a message off of the queue. It also isn't clear to me what mechanism stops the cluster from infinitely scaling if the queue doesn't drain.

ellistarn on 17 Jul 2020

> I think I'm probably missing something fundamental about Keda. My understanding is that for pods, Keda can scale up a new pod based off of a queue threshold, and then the pod is responsible for draining the queue.

Your understanding is correct.
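
As a rough illustration of that pod-level flow for the SQS example (a sketch, not KEDA's actual code): assuming the queue length reaches the HPA as an external metric with an average-value target of 5 messages per pod, the HPA wants roughly ceil(queue length / 5) pods, clamped to the configured minimum and maximum. The replica limits and the pods-per-node figure below are assumptions for the example, and the HPA tolerance band is ignored.

```python
import math


def desired_pods(queue_length: int, threshold: int,
                 min_replicas: int, max_replicas: int) -> int:
    """Approximate HPA result for an average-value target, ignoring tolerance."""
    wanted = math.ceil(queue_length / threshold)
    return max(min_replicas, min(wanted, max_replicas))


def nodes_needed(pods: int, pods_per_node: int) -> int:
    """Nodes required if every node fits pods_per_node worker pods."""
    return math.ceil(pods / pods_per_node)


pods = desired_pods(queue_length=1000, threshold=5,
                    min_replicas=0, max_replicas=100)
print(pods, nodes_needed(pods, pods_per_node=10))  # 100 10
```

In this sketch the maximum replica count is what bounds the scale-out if the queue never drains; any node-level integration would inherit that bound through the number of pods it has to place.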

We are looking at how we can help the CA scale in response to the spikes we are seeing, but whether that would make sense is still under investigation. Personally, I'm not yet convinced this is something we can add enough value to.

tomkerkhove on 17 Jul 2020

Some more reasons to have integrations with the Cluster Autoscaler: https://stackoverflow.com/questions/63495899/using-multiple-autoscaling-mechanisms-to-autoscale-a-k8s-cluster