Hello,
I am having trouble understanding the meaning of the variable triggers.metadata.queueLength, especially in the case of RabbitMQ.
I am also having trouble scaling my deployment above 8 replicas. I have autoscaling enabled on the whole cluster, yet my deployment stagnates at 8 replicas even when I have 200+ messages in my queue.
I am asking because I wonder whether this problem is related to the queueLength variable.
Thanks
Hi TAnas0,
Can you please provide us with the ScaledObject yaml?
The queueLength is the number of messages in the RabbitMQ queue that each pod is expected to handle; the HPA uses it to decide when to scale out more pods.
For example, if queueLength is set to 50 and you have 100 messages in the queue, you should see 2 pods.
Regarding what you said about autoscaling being enabled in your cluster: KEDA doesn't have anything to do with node autoscaling. It uses Kubernetes Horizontal Pod Autoscalers to scale out pods, not nodes.
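To make the queueLength arithmetic above concrete, here is a minimal sketch in Python. It is a simplified model, not KEDA's or the HPA's actual code: the function name replicas_for_queue is hypothetical, and the real HPA also applies a tolerance and rate limits that are ignored here.

```python
import math


def replicas_for_queue(messages: int, queue_length: int, max_replicas: int = 100) -> int:
    """Pods needed so that each pod handles at most queue_length messages.

    Simplified illustration only: the result is capped at max_replicas,
    mirroring maxReplicaCount in the ScaledObject.
    """
    if queue_length <= 0:
        raise ValueError("queue_length must be positive")
    return min(max_replicas, math.ceil(messages / queue_length))


# The example from the comment above: queueLength 50, 100 messages.
print(replicas_for_queue(100, 50))  # 2
```

Under this model, a queueLength of 1 with 200+ messages would ask for the full maxReplicaCount, so a deployment that stalls well below that is being held back by something other than queueLength itself.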
Thanks @yaron2.
That's what I suspected the queueLength to be. Thank you for clearing that up.
I set it to 1 but my deployment still does not scale above 4 replicas for some reason.
The ScaledObject YAML is as follows; I am using it as part of a Helm chart:
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "worker-helm.fullname" . }}
  namespace: default
  labels:
    deploymentName: {{ include "worker-helm.fullname" . }}
spec:
  scaleTargetRef:
    deploymentName: {{ include "worker-helm.fullname" . }}
  pollingInterval: 5 # Optional. Default: 30 seconds
  cooldownPeriod: 30 # Optional. Default: 300 seconds
  maxReplicaCount: 100 # Optional. Default: 100
  triggers:
  - type: rabbitmq
    metadata:
      queueName: audits
      host: 'amqp://user:<password>@<rabbitmq-service>.default.svc.cluster.local:5672'
      queueLength: '1'
I believe the configuration is right, and I suspected that there might be something wrong with my cluster, so I retried on a fresh cluster. But I still have the same issue.
As additional information, in case it helps, here is the output of the command kubectl get hpa, captured twice a few minutes apart:
NAME                      REFERENCE                                         TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
keda-hpa-my-worker-helm   Deployment/prototype-scanner-worker-worker-helm   <unknown>/80%   1         100       0          3h49m
NAME                      REFERENCE                                         TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
keda-hpa-my-worker-helm   Deployment/prototype-scanner-worker-worker-helm   232%/80%        1         100       7          3h53m
With ~150 messages in the queue, this is the result of the command kubectl get deployment:
NAME                       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
prototype-my-worker-helm   30        30        30           30          3h53m
This time it seems that it scaled in steps: from 1 to 7, then to 24, and then it stagnated at 30 (I set the max replica count to 100). This is much better and might do the job for me. Consider this issue closed.
The question is, what is KEDA's strategy for scaling? Why doesn't it directly scale to its target? And what might be the reason why it stops at a certain level?
I might be getting off-topic, but I am kind of a noob at Kubernetes and considering using this in production :smile:
@TAnas0 did this end up working for you? Just wanted to understand why you closed. The scaling logic is the same as the HPA autoscaling algorithm, because the HPA is what drives scaling from 1->n and n->1.
I believe he closed because it's working as expected.
Why doesn't it directly scale to its target?
The Kubernetes HPA polls KEDA on an interval, and then begins autoscaling for the metrics it was given for that moment in time.
and what might be the reason why it stops at a certain level?
It stops when the threshold specified is not crossed with the latest values provided - meaning you have enough workers to keep the number of items below or equal to the threshold.
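As a rough illustration of the two answers above, here is a sketch of the replica formula given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), together with its default 10% tolerance. The function name and the sample numbers are assumptions for illustration (232 and 80 are taken from the TARGETS column posted earlier in this thread); the real controller also applies rate limits and stabilization that are left out here.

```python
import math

TOLERANCE = 0.1  # documented HPA default; a cluster may configure a different value


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, max_replicas: int = 100) -> int:
    """One HPA sync, simplified: scale by the ratio of current to target metric."""
    ratio = current_value / target_value
    # Within tolerance of the target: the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= TOLERANCE:
        return current_replicas
    return min(max_replicas, max(1, math.ceil(current_replicas * ratio)))


# Metric at 232 against a target of 80 with 7 replicas: scale up.
print(desired_replicas(7, 232, 80))   # 21
# Metric close to the target: no further scaling, even below max_replicas.
print(desired_replicas(30, 84, 80))   # 30
```

Because each sync only sees the metric at that moment and the replica count at that moment, the deployment tends to move in steps, and it settles wherever the measured value is close enough to the target rather than at maxReplicaCount.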
I closed this issue because I now understand the queueLength variable (the original question), and because my deployment is scaling higher than before (it stopped at 8, now at 30, which works for me).
But I am not sure I understand:
It stops when the threshold specified is not crossed with the latest values provided - meaning you have enough workers to keep the number of items below or equal to the threshold.
Which threshold? Where did I define it? I believed it should only stagnate at the maxReplicaCount (100).
Were you guys able to reproduce the behavior?