Keda: Provide support for explicitly stating workloads to scale to zero.

Created on 22 Jul 2020 · 23 Comments · Source: kedacore/keda

Provide support for explicitly stating workloads to scale to zero without the option of scaling up.

This can be useful for manually scaling-to-zero instances because:

  • You want to do maintenance
  • Your cluster is suffering from resource starvation and you want to remove non-mission-critical workloads

Why not delete the deployment? Glad you asked! Because we don't want to touch the applications themselves, but merely remove the instances they are running from an operational perspective. Once everything is good to go, we can enable them to scale again.

Suggestion

Introduce a new CRD, for example ManualScaleToZero, which targets a given deployment/workload and provides a description why it's scaled to 0 for now.

If scaled objects/jobs are configured, they are ignored in favor of the new CRD.
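
As a rough illustration of the suggestion, such a resource might look like the sketch below. Nothing here exists in KEDA: the kind comes from the proposal above, while the API group/version, the workload name myapp, and every field under spec (targetRef, description) are hypothetical names chosen only to show the idea.

```yaml
# Hypothetical sketch only: no such CRD exists in KEDA.
apiVersion: keda.k8s.io/v1alpha1   # assumed: same group/version KEDA used for ScaledObject at the time
kind: ManualScaleToZero            # name taken from the suggestion above
metadata:
  name: myapp-maintenance          # placeholder name
spec:
  # Workload to hold at zero replicas (hypothetical field names).
  targetRef:
    kind: Deployment
    name: myapp
  # Human-readable reason, as asked for in the suggestion.
  description: "Scaled to zero during cluster maintenance"
```

Under this sketch, deleting the resource would presumably hand control back to any scaled object or job configured for the same workload.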

feature-request help wanted needs-discussion

All 23 comments

I like the idea with a separate CRD especially because of the description field and to leave the ScaledObject untouched will fit better in my gitops use case with argocd autosync.

I also support this idea and what @flecno said. However, I would like to see a more generic CRD where you can not only scale to zero, but also enforce any value you like. Maybe call it ManualScale or ScaledObjectOverride.

This would help us in certain rare situations where we have to process more, independent of the triggers.

The goal with this separate CRD is to override the ScaledObject.

With the manual approach, what would be the scenario where you cannot just scale the deployment itself let's say?

Scaling to zero or scaling to n, in both cases you enforce a value of fixed pods you require, where the autoscaling is not supposed to interfere.

Using kubectl scale <type> <name> --replicas=<n> doesn't work with zero, nor with any other value. Right after scaling to another value, the HPA controller kicks in and restores the old value.

Example ScaledObject:

apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  labels:
    deploymentName: myapp
  name: myapp
  ...
spec:
  cooldownPeriod: 600
  maxReplicaCount: 8
  minReplicaCount: 0
  pollingInterval: 30
  scaleType: deployment
  triggers:
  ...

The current scale is 1. Trying to scale to 3 with kubectl scale deployment myapp --replicas=3 starts 2 additional pods, but they are terminated immediately:

0s   Normal    ScalingReplicaSet    deployment/myapp                  Scaled up replica set myapp-5cbd86475b to 3
0s   Normal    SuccessfulCreate     replicaset/myapp-5cbd86475b       Created pod: myapp-5cbd86475b-wxjbj
0s   Normal    SuccessfulCreate     replicaset/myapp-5cbd86475b       Created pod: myapp-5cbd86475b-257k4
0s   Normal    SuccessfulRescale    horizontalpodautoscaler/myapp     New size: 1; reason: Current number of replicas above Spec.MaxReplicas
0s   Normal    ScalingReplicaSet    deployment/myapp                  Scaled down replica set myapp-5cbd86475b to 1
0s   Normal    SuccessfulDelete     replicaset/myapp-5cbd86475b       Deleted pod: myapp-5cbd86475b-257k4
0s   Normal    SuccessfulDelete     replicaset/myapp-5cbd86475b       Deleted pod: myapp-5cbd86475b-wxjbj

Yet the requested 3 replicas are not above maxReplicas in the spec of the KEDA-generated HPA:

spec:
  maxReplicas: 8
  minReplicas: 1

So all I wanted to suggest was that a more generic override with any value would be appreciated (not only zero).

Being able to suspend scaling during operations without removing or touching object state would be ideal. With our HPA setup, we could scale down the target object to 0, without touching any of the min/max parameters of the HPA, and allow the HPA to kick back in afterwards by scaling the target back up to 1.
With the current keda approach, we need to keep track of those min/max values while suspending scaling.

Agreed, would you prefer a new CRD then for that or what would you expect from KEDA?

@tomkerkhove a CRD would work for sure! It feels a bit elaborate compared to just setting an annotation/label on one of the involved objects, but it also leaves room for much more functionality I guess?

I am more inclined towards an annotation/label based solution. It is cleaner and we don't introduce a new set of resources.

Out of curiosity, what benefits do you see behind the CRD based solution?

Oh, I was just checking what the expectations were. Either is fine for me, but it would be good if this got surfaced somehow when we do kubectl get so, so maybe we add it as a field instead?

Exposing it in the get would be nice for sure. At first glance I thought the Active field was an indication of this feature, but it turned out to serve a different purpose.
I guess the main question is: do you just want to be able to "suspend" keda scaling for a while and take manual control (with a HPA or just scaling the target object), or do you want to explicitly set a replica count through this new field/CRD. I personally don't really see the need for this last option, as it's just replicating native functionality.

> do you want to explicitly set a replica count through this new field/CRD. I personally don't really see the need for this last option, as it's just replicating native functionality.

This is already possible today by aligning min/max replica count

> I guess the main question is: do you just want to be able to "suspend" keda scaling for a while and take manual control (with a HPA or just scaling the target object)

This is my current thinking actually to have a State field which has Autoscaling & Paused or so.
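
The workaround mentioned in this comment, aligning the min and max replica count, could look like the fragment below for a non-zero count. It is only a sketch of the relevant part of a ScaledObject spec; the value 3 is an arbitrary example, and the rest of the object is assumed to stay as it was.

```yaml
# Fragment of a ScaledObject spec; all other fields stay unchanged.
spec:
  # With both bounds equal, the autoscaler has no room to move,
  # so the workload is held at this replica count.
  minReplicaCount: 3
  maxReplicaCount: 3
```

The drawback raised earlier in the thread still applies: the original min/max values have to be remembered and restored by hand once autoscaling should resume.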

Ideally, it would be something where granular permission could be given in the event someone on-call can temporarily disable w/o touching real object w/ any git re-run / etc. But yeah. big +1 to this request.

@zroubalik OK for you if we commit to this for our roadmap?

@tomkerkhove yeah :)

Guys, would implementing something like this handle this case?
https://github.com/kedacore/keda/issues/1500
If you can define "Maintenance" in scaler terms, things will work automatically.

Shall we aim for 2.3 for this @jeffhollan @zroubalik ?

> This is my current thinking actually to have a State field which has Autoscaling & Paused or so.

As prior art, CronJob objects have a boolean field for paused.
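
For reference, the CronJob field in question is spec.suspend. A minimal manifest using it is sketched below; the name, schedule, and container are placeholders, and batch/v1 assumes a reasonably recent cluster (older clusters served CronJob from batch/v1beta1).

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report            # placeholder name
spec:
  schedule: "0 2 * * *"           # placeholder schedule
  suspend: true                   # no new Jobs are started while this is true
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: busybox
              command: ["sh", "-c", "echo report"]
```

Flipping suspend back to false resumes scheduling without any other change to the object, which is the property being asked for here. An equivalent field on a ScaledObject would be new; none is defined at this point in the thread.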

That's fine by me to have that or something similar.

Fine by me :) I am still not sure whether we should just explicitly scale to 0 (then I am up for using the same property as CronJob does) or whether we should scale to a specific number of replicas (this would require a different property).

Based on what I've heard and seen, I think we should consider letting it explicitly scale to X and "disable/pause" autoscaling without removing the ScaledObject.

Then you can do maintenance jobs or so and just put it back to autoscaling when done.

One feature that would be nice to have is a duration for the scaling override. It can be all too easy to scale things to 0 when something happens, and then somebody forgets to scale it back up. Being able to say "scale to 0, then return to normal in 4 hours" would solve that problem nicely.

@derekperkins That really doesn't work great in a convergent system :-/ At best you can do "disregard this config after timestamp X", but it's an inherently non-convergent behavior, so the edge cases get gnarly.
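
To make the "disregard this config after timestamp X" variant concrete, a sketch follows. It is purely hypothetical: neither the kind nor any of the fields exist in KEDA, and ignoreAfter is an invented name. The point it illustrates is that an absolute timestamp reads the same on every reconcile, whereas a relative duration such as "4 hours" would need a start time recorded somewhere.

```yaml
# Hypothetical sketch only: neither this kind nor these fields exist in KEDA.
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObjectOverride         # one of the names floated earlier in the thread
metadata:
  name: myapp-override             # placeholder name
spec:
  targetRef:                       # hypothetical field
    kind: Deployment
    name: myapp
  replicas: 0                      # hypothetical field: replica count to enforce
  # Hypothetical field: absolute expiry instead of a relative duration.
  # After this instant a controller would ignore the override.
  ignoreAfter: "2020-07-22T18:00:00Z"
```

Even in this form, the edge cases mentioned above remain, for example what should happen to an expired override that is never deleted.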

+1 on this feature, would help us a lot!
