Scaling to and from zero is not working with the Prometheus scaler after upgrading KEDA from 1.3.0 to 1.4.0.
KEDA should scale the deployment up from zero when the Prometheus metric increases from zero, and scale it down to zero when the metric decreases to zero.
KEDA scales the deployment from zero to one replica even when the Prometheus metric is zero, and never scales it down to zero.
Set minReplicaCount=0 and use the Prometheus scaler with a query that returns constant zero.
With the 1.3.0 images:
image:
  keda: docker.io/kedacore/keda:1.3.0
  metricsAdapter: docker.io/kedacore/keda-metrics-adapter:1.3.0
KEDA scales the deployment to zero and up again from zero when the metric becomes positive, as expected.
Could you please rerun 1.4.0 with debug log level on KEDA operator? https://github.com/kedacore/keda#setting-log-levels (or set it in the chart).
And paste here your ScaledObject please.
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaling
  labels:
    deploymentName: worker
spec:
  scaleTargetRef:
    deploymentName: worker
  pollingInterval: 30
  cooldownPeriod: 60
  minReplicaCount: 0
  maxReplicaCount: 2
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local:9090
      metricName: queue_length
      threshold: '120'
      query: avg_over_time(worker_queue_length[5m])
I don't see anything suspicious in keda-logs.txt, except the last line
I0423 05:18:21.328041 1 wrap.go:47] GET /openapi/v2: (3.086429ms) 404 [ 172.16.0.23:33094]
KEDA scales replica to 1 from 0 even though the Prometheus metric is zero:
{"level":"info","ts":1587618899.730634,"logger":"scalehandler","msg":"Successfully updated deployment","ScaledObject.Namespace":"staging","ScaledObject.Name":"prometheus-scaling","ScaledObject.ScaleType":"deployment","Deployment.Namespace":"staging","Deployment.Name":"worker","Original Replicas Count":0,"New Replicas Count":1}
It seems like this regression was brought by this change: https://github.com/kedacore/keda/pull/695/files#diff-a63ae5a2f6036b9f3bc750d5fe46437cR105
@droessmj why did you make this particular change, i.e. set the scaler to active even for value 0?
@zroubalik That commit updated a zero result to be non-error inducing. Based on the behavior described above I'm assuming since it's now not throwing errors, the scaler ensures 1 replica is up regardless. The "fix" I introduced just returns zero as a valid metric. If we need a special case where zero is non-error inducing but not a valid metric that can be introduced, but as-is the referenced commit just allows this code to now run for zero results:
metric := external_metrics.ExternalMetricValue{
	MetricName: metricName,
	Value:      *resource.NewQuantity(int64(val), resource.DecimalSI),
	Timestamp:  metav1.Now(),
}
@droessmj I see, but the scaler marks itself as Active even when the result is zero. So my only concern is the change in the isActive() function: return val > -1, nil. At first sight, this should be reverted to return val > 0, nil. WDYT? Your change forces the scaler to be Active all the time, so scaling 0<->1 doesn't work.
I believe you're correct. Reverting the isActive check while retaining the other part should resolve this.
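For anyone following along, here is a minimal sketch of why the two comparisons behave differently at zero. It is not the actual KEDA scaler code: the function names isActiveRegressed, isActiveFixed and replicasAtZeroBoundary are hypothetical, and the 0<->1 decision is simplified to the single case discussed in this issue (minReplicaCount: 0).

```go
package main

// isActiveRegressed mirrors the comparison quoted above (val > -1).
// A zero metric still counts as active.
func isActiveRegressed(val float64) bool {
	return val > -1
}

// isActiveFixed mirrors the proposed revert (val > 0).
// A zero metric is inactive, so scale-to-zero is possible again.
func isActiveFixed(val float64) bool {
	return val > 0
}

// replicasAtZeroBoundary is a hypothetical, simplified stand-in for the
// 0<->1 decision: an active scaler with minReplicaCount 0 needs at least
// one replica; otherwise the deployment stays at minReplicaCount.
func replicasAtZeroBoundary(active bool, minReplicaCount int) int {
	if active && minReplicaCount == 0 {
		return 1
	}
	return minReplicaCount
}
```

Under these assumptions, replicasAtZeroBoundary(isActiveRegressed(0), 0) yields 1, which matches the "Original Replicas Count":0,"New Replicas Count":1 log line above, while replicasAtZeroBoundary(isActiveFixed(0), 0) yields 0.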
@hmoravec are you able to retest the change, please, if I send you a link to a dev image later today?
It would be great if anybody with a Prometheus instance could check that the fix helped; just replace the images for the KEDA Operator and KEDA Metrics Server. Thanks!
docker.io/zroubalik/keda:promFix
docker.io/zroubalik/keda-metrics-adapter:promFix
@zroubalik Sure, I'll test it. Btw, are automated tests planned? :-)
We are always open for PRs ;)
But you are right, this shouldn't have slipped through. Sorry about this.
@zroubalik Working, thanks! It scaled down to zero when the metric became zero and scaled up when the metric became positive.
@hmoravec great, thanks. Yeah, we do have some tests, but they don't cover everything, so this is an area we definitely need to improve.