Keda: cooldownPeriod parameter not working as expected

Created on 9 Apr 2020 · 18 Comments · Source: kedacore/keda

Hello! I am using KEDA with the RabbitMQ scaler, but cooldownPeriod does not seem to work as expected. Even though I configure cooldownPeriod: 30, Pods are reduced to minReplicaCount only after the default 300s.

Expected Behavior

Pods should be reduced to minReplicaCount after cooldownPeriod

Actual Behavior

Pods are reduced to minReplicaCount after default 300s

cooldownPeriod seems to be the only parameter that doesn't work. pollingInterval, maxReplicaCount, and minReplicaCount work correctly.

Specifications

KEDA Version: Master branch, Commit ef7e4e9e1753e7038d65a0afd30146293139ec15
Kubernetes Version: 1.15.5 (Docker Desktop on macOS) and 1.15.10 (Azure AKS)
Scaler(s): RabbitMQ

bug scaler-rabbit-mq

Source

marcocello

Most helpful comment

@marcocello you can use all the features; it depends on whether you want to scale your deployments from 0 (i.e. minReplicaCount = 0), from 1, or from another number (minReplicaCount = X).

KEDA manages 0 <-> 1 and HPA 1 <-> N scaling, but you don't have to care about the underlying HPA, feeding the metrics or other mechanism, it is all done for you by KEDA.

You can use this ugly hack if you want to scale your deployment from 1 but still want to modify the cooldown parameter: run two identical deployments of the same app, the first statically set to 1 replica and the second scaled by KEDA with minReplicaCount = 0. But I understand that's not optimal 🤷‍♂️

Thanks for raising this issue, I will take a look at how we can improve the cooldown scenario!
(@tomkerkhove I'll investigate whether we can safely affect this)

zroubalik on 14 Apr 2020

❤1 👍1
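
The two-deployment workaround described above could look roughly like the following sketch. The names compute-baseline and compute-burst, the labels, and the container image are hypothetical placeholders; the ScaledObject fields are copied from the configuration posted later in this thread, with only the target, minReplicaCount, and maxReplicaCount changed.

```yaml
# Always-on baseline: one replica, not managed by KEDA.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: compute-baseline                        # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: compute-baseline
  template:
    metadata:
      labels:
        app: compute-baseline
    spec:
      containers:
      - name: compute
        image: example.invalid/compute:latest   # placeholder image
---
# Burst copy of the same app: starts at 0 and is scaled by KEDA.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: compute-burst                           # hypothetical name
spec:
  replicas: 0
  selector:
    matchLabels:
      app: compute-burst
  template:
    metadata:
      labels:
        app: compute-burst
    spec:
      containers:
      - name: compute
        image: example.invalid/compute:latest   # placeholder image
---
# ScaledObject (v1 API, as used in this thread) targeting only the burst copy.
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: compute-burst-scaledobject              # hypothetical name
  labels:
    deploymentName: compute-burst
spec:
  scaleTargetRef:
    deploymentName: compute-burst
  pollingInterval: 1
  cooldownPeriod: 20    # applies because KEDA itself handles the final step to 0
  minReplicaCount: 0
  maxReplicaCount: 9    # 9 burst replicas + 1 baseline = 10 in total
  triggers:
  - type: rabbitmq
    metadata:
      queueName: compute
      host: RabbitMqHost
      queueLength: '5'
```

With this layout one replica always stays up, and the burst copy drops to 0 once the trigger has been inactive for cooldownPeriod, at the cost of maintaining two Deployments.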

All 18 comments

Sorry to hear, we'll get this fixed.

Just for our local repro, would you mind pasting your ScaledObject config please?

tomkerkhove on 10 Apr 2020

Hello Tom,

this is my ScaledObject config:

apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: compute-scaledobject
  labels:
    deploymentName: compute-deployment
spec:
  scaleTargetRef:
    deploymentName: compute-deployment
  pollingInterval: 1   # Optional. Default: 30 seconds
  cooldownPeriod: 20   # Optional. Default: 300 seconds
  maxReplicaCount: 10  # Optional. Default: 100
  minReplicaCount: 1
  triggers:
  - type: rabbitmq
    metadata:
      queueName: compute
      host: RabbitMqHost
      queueLength: '5'

These are the logs from keda-operator pod:

{"level":"info","ts":1586506360.6137688,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506360.6139557,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506360.6140635,"logger":"controller_scaledobject","msg":"Creating a new HPA","Request.Namespace":"default","Request.Name":"compute-scaledobject","HPA.Namespace":"default","HPA.Name":"keda-hpa-compute-deployment"}
{"level":"info","ts":1586506360.6502018,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506360.6503305,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506376.1512656,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506376.1515474,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506498.892882,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506498.8929482,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506513.9334903,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506513.9336584,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506514.0644085,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506514.064469,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506529.3715875,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506529.3757153,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}

Above is the last log entry after the last scale-up of the compute component.

The logs below relate to the scale-down to 1 replica. As you can see, this happens more or less after 300s (1586506804.0495086 - 1586506529.3757153 ≈ 275s).

{"level":"info","ts":1586506804.0495086,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506804.0499203,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506819.0569682,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"compute-scaledobject"}
{"level":"info","ts":1586506819.0571373,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"compute-scaledobject"}

marcocello on 10 Apr 2020

Thanks!

tomkerkhove on 10 Apr 2020

@marcocello could you please paste the logs here with debug log level enabled?
https://github.com/kedacore/keda#keda-operator-logging

zroubalik on 10 Apr 2020

@zroubalik, thanks for your help.

Attached you can find the logs for keda-operator and keda-metrics-apiserver.

Here is the list of events in the keda-operator logs that I think could be useful for you:

at "ts":1586520723.1727092 ScaledObject was able to connect to the rabbitmq instance
at "ts":1586520906.380907 the scaling started
at "ts":1586520938.6841013 the last "scaling up" event
at "ts":1586521214.3048892 the scaling down event

keda-operator.log

keda-metrics-apiserver.log

marcocello on 10 Apr 2020

I tried the sample here https://github.com/kedacore/sample-go-rabbitmq and it works, cooldownPeriod works as expected.

marcocello on 10 Apr 2020

@marcocello if you set minReplicaCount = 0, the cooldown period on your deployment will work, because it is handled by KEDA. The problem is that scaling 1 <-> N is handled by the Kubernetes HPA, and there are very limited ways to influence HPA scaling from KEDA; HPA itself has no option like cooldownPeriod. You can modify a similar setting at the cluster level, but that would affect all HPAs in your cluster.

@tomkerkhove we should document that probably, wdyt?

zroubalik on 14 Apr 2020
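
For reference, the cluster-level setting mentioned above is the kube-controller-manager flag --horizontal-pod-autoscaler-downscale-stabilization (default 5m0s), which is named later in this thread. A minimal sketch of where it might be set, assuming a kubeadm-style cluster that runs the controller manager as a static Pod; the 30s value is only an illustration, and managed offerings such as AKS may not expose this flag at all:

```yaml
# Fragment of a kube-controller-manager static Pod manifest. On kubeadm-style
# clusters it usually lives at /etc/kubernetes/manifests/kube-controller-manager.yaml.
# This is not a complete manifest: image, volumes and other flags are omitted.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    # Illustrative value; the default is 5m0s. Applies to every HPA in the cluster.
    - --horizontal-pod-autoscaler-downscale-stabilization=30s
    # ...remaining flags unchanged...
```

Because the flag is cluster-wide, lowering it shortens the scale-down delay for every workload that uses an HPA, not only the one scaled by KEDA.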

Yes, we should do that indeed. So cooldown is mainly for 0 <-> 1, but shouldn't we manage the HPA cooldown then as well as part of KEDA?

tomkerkhove on 14 Apr 2020

Many thanks @zroubalik. Now it works!

Can I still use KEDA functionality with minReplicaCount != 1, or does this not work well with HPA?

marcocello on 14 Apr 2020

@marcocello you can use all the features; it depends on whether you want to scale your deployments from 0 (i.e. minReplicaCount = 0), from 1, or from another number (minReplicaCount = X).

KEDA manages 0 <-> 1 and HPA 1 <-> N scaling, but you don't have to care about the underlying HPA, feeding the metrics or other mechanism, it is all done for you by KEDA.

Thanks for raising this issue, I will take a look at how we can improve the cooldown scenario!
(@tomkerkhove I'll investigate whether we can safely affect this)

zroubalik on 14 Apr 2020

❤1 👍1

@marcocello I tried to reproduce the issue, and HPA scaled down the Deployment almost immediately. I can't reproduce the behavior you are describing; I haven't seen that long a timeout. So there's nothing we can do about it from the KEDA side.

If your issues were solved by minReplicaCount = 0, let me know and we can close this issue.

zroubalik on 15 Apr 2020

Hello @zroubalik, to wrap up:

with minReplicaCount = 0, cooldownPeriod works
with minReplicaCount != 0, scale down is managed by HPA

Please close the issue. Thanks to all for the help!

marcocello on 15 Apr 2020

👍1

@zroubalik Hi, I would like to discuss the rationale behind this issue, as this seems like a strange limitation to me. If I understand your comments correctly:

KEDA manages 0 <-> 1 and Kubernetes HPA manages 1 <-> N scaling
so the cooldownPeriod option only works when we set minReplicaCount: 0

But in my experience, if I set up KEDA with, for example:

minReplicaCount: 0
maxReplicaCount: 10
cooldownPeriod: 20

Then I see that KEDA is able to scale down from 10 -> 0 straight after the 20s period, whereas, reading your comments, I was expecting it to first go 10 -> 1 after 300s (the default value for the HPA --horizontal-pod-autoscaler-downscale-stabilization flag), and then 1 -> 0 after 20s (the cooldownPeriod value).

So my question is how come that KEDA is able to downscale from N to 0 (first N -> 1, then 1 -> 0) using the cooldownPeriod but is not able to simply downscale from N to 1 using the same cooldownPeriod? I guess there may be technical constraints from Kubernetes that limit KEDA, but here it seems like KEDA is able to do something more complex (downscale from N to 0 using cooldownPeriod) but is not able to do something more simple (downscale from N to 1 using cooldownPeriod). Is there something I am missing?

RemiGaudin on 29 May 2020

Hi @RemiGaudin

* KEDA manages 0 <-> 1 and Kubernetes HPA manages 1 <-> N scaling

that's correct and that's exactly the answer to your next question :)

So my question is how come that KEDA is able to downscale from N to 0 (first N -> 1, then 1 -> 0) using the cooldownPeriod but is not able to simply downscale from N to 1 using the same cooldownPeriod?

The cooldownPeriod setting is taken into account only when scaling is handled by the KEDA operator, so:

If we set minReplicaCount = 1, KEDA forwards metrics directly to Metrics Server and they are then consumed by HPA, which handles scaling up and down. KEDA doesn't handle the scaling (it just processes the metrics), therefore we cannot affect the cooldown.
If we set minReplicaCount = 0, KEDA checks the metrics and the "activity" on the referenced trigger (e.g. Kafka). If there's some load (i.e. the trigger is active), KEDA scales the deployment from 0 to 1 and then HPA takes over. KEDA processes and forwards metrics the same way as described above, and it also keeps checking the "activity" on the referenced trigger. So, for example, if the deployment is scaled to 10 replicas and the trigger becomes inactive (e.g. the Kafka topic is empty), KEDA starts counting the cooldownPeriod. If the trigger is still inactive after the cooldownPeriod passes, KEDA forces the deployment to scale to 0 no matter how many replicas there currently are (that's why you don't see the 300s + 20s period). The cluster-wide --horizontal-pod-autoscaler-downscale-stabilization setting applies only to scaling handled by HPA (e.g. scaling from 10 -> 9 etc.).

FYI, for upcoming KEDA v2 release, we have added a possibility to tweak the HPA scaling behavior from ScaledObject a little bit, by implementing the standard config options: https://github.com/kedacore/keda/pull/805

zroubalik on 29 May 2020
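
As a hedged illustration of the KEDA v2 option mentioned above: in the KEDA v2 documentation, the HPA scaling behavior is exposed under spec.advanced.horizontalPodAutoscalerConfig.behavior and follows the standard Kubernetes HPA behavior schema, which requires Kubernetes 1.18 or later. The sketch below adapts the ScaledObject from this thread to the v2 API group; the 30-second window and the policy values are illustrative assumptions, and the trigger field names should be checked against the scaler documentation for the installed version.

```yaml
apiVersion: keda.sh/v1alpha1          # KEDA v2 API group
kind: ScaledObject
metadata:
  name: compute-scaledobject
spec:
  scaleTargetRef:
    name: compute-deployment          # v2 uses "name" instead of "deploymentName"
  pollingInterval: 1
  cooldownPeriod: 20                  # per this thread, only governs the final step to 0
  minReplicaCount: 1
  maxReplicaCount: 10
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:                       # passed through to the HPA that KEDA creates
        scaleDown:
          stabilizationWindowSeconds: 30   # illustrative; the HPA default is 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
  triggers:
  - type: rabbitmq
    metadata:
      queueName: compute
      # Assumption: v2 renamed the env-var reference "host" to "hostFromEnv";
      # verify against the RabbitMQ scaler docs for your KEDA version.
      hostFromEnv: RabbitMqHost
      queueLength: '5'
```

Unlike the cluster-wide controller-manager flag, this setting is scoped to the single HPA that KEDA manages for this ScaledObject, so N -> 1 scale-down can be shortened without affecting other workloads.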

@zroubalik Thanks for your explanation, but it leads to another question: why does KEDA check the trigger "activity" only if minReplicaCount = 0 and not when minReplicaCount = 1?

We could imagine that KEDA checks the trigger activity whatever the minReplicaCount value, so that when the trigger is still inactive after the cooldownPeriod, KEDA scales the deployment down to 1 (instead of zero). The cooldownPeriod parameter would then work in every scenario, which would be more straightforward and intuitive than tweaking the HPA options. Is there a technical constraint that prevents that?

RemiGaudin on 29 May 2020

@RemiGaudin I am afraid that HPA would scale the deployment back to its target replicas in that case (when the deployment is scaled to 0, HPA ignores it, so we can do that).

zroubalik on 29 May 2020

@zroubalik OK, I get it now. The piece I was missing is that HPA refrains from scaling the deployment back up only when the replica count is 0. Thanks for your explanations.

RemiGaudin on 29 May 2020

Glad to help!

zroubalik on 29 May 2020
