Keda: STAN scaler does not scale from 0

Created on 18 Dec 2019 · 21Comments · Source: kedacore/keda

A clear and concise description of what the bug is.

Expected Behavior

When there messages in the queue and the HPA is scaled down to 0, the scaler should create at least one instance of a pod to create the subscription with STAN.

Actual Behavior

When the loop runs, The logic checks if the pending count of messages is greater than 0 and that the queue names match. It never takes into account if the subscription does not exist.

Steps to Reproduce the Problem

Create a new ScaledObject that can scale to 0

apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: test-scaledobject
  labels:
    deploymentName: test
spec:
  pollingInterval: 10   # Optional. Default: 30 seconds
  cooldownPeriod: 30   # Optional. Default: 300 seconds
  minReplicaCount: 0   # Optional. Default: 0
  maxReplicaCount: 5  # Optional. Default: 100  
  scaleTargetRef:
    deploymentName: test
  triggers:
  - type: stan
    metadata:
      natsServerMonitoringEndpoint: "stan.default:8222"
      queueGroup: "place_indexer"
      durableName: "ImDurable"
      subject: myQueue
      lagThreshold: "100"

cURL the STAN channel endpoint. It will return a response like:

{
  "name": "myQueue",
  "msgs": 500,
  "bytes": 0,
  "first_seq": 0,
  "last_seq": 0,
  "subscriptions": []

Keda will ignore the msgs field because there are no active subscriptions.
If minReplicaCount is set to 1, scaling works normally.

Specifications

Version: 1.0.0
Platform: Linux
Scaler(s): STAN

bug

Source

cwoolum

All 21 comments

Thanks @cwoolum for spotting this one. Its strange that I didn't see this one, sure tested it will min replicas 0. Anyway thanks!

balchua on 21 Dec 2019

I think i know how i missed this one. My initial deployment has a replica of 1. So atleast there is a registered subscriber. Once you get past that it can scale from 0 afterwards.

balchua on 24 Dec 2019

If it scales to zero though, the subscription will go away if you are draining and closing the connection. Even if you don't drain and close, the subscription will drop off after a short period(5 minutes?) And that prevents it from scaling back up.

cwoolum on 24 Dec 2019

I was testing it with this program https://github.com/balchua/gonuts. Subscriber seems to stick around.

balchua on 24 Dec 2019

https://github.com/balchua/gonuts/blob/c0ce3d6448fc1dafeaf5695136d008a04a696873/sub/main.go#L150

You are calling sc.Close() but not sc.Unsubscribe() when your subscriber app ends.

If your service scales up to 20 instances, you'll have 20 subscribers in STAN. Now say you scale back down to 1, each of those instance need to unsubscribe so that STAN doesn't keep trying to deliver messages to them, right?

I could be thinking about this the wrong way though.

cwoolum on 24 Dec 2019

Thanks for the explanation.👍 I remember setting the maxinflight to 1 so it round robins it to different subscribers. If the original subscriber goes down, i think stan will try to deliver it to the same subscriber until certain period of time. But i maybe wrong here too.

balchua on 25 Dec 2019

I do use MaxInFlight but set it to 50 since each one of my pods can handle that number of concurrent requests.

I saw that one of the newer versions of STAN/NATS does have a keepalive that it sends periodically to make sure a subscriber is still listening but I'm not sure of the relationship between a connection being open and a subscription existing for it. I'm going to open up an issue in the STAN repo and see if the team can provide some best practices on it.

cwoolum on 25 Dec 2019

👍1

I was able to find the docs for this. It looks like I was doing it wrong. For durable subscriptions, you don't want to Unsubscribe but only Close the connection.

The main difference between a queue subscription and a durable one, is that when the last member leaves the group, the state of the group will be maintained by the server. Later, when a member rejoins the group, the delivery will resume.
Note: For a durable queue subscription, the last member to  unsubscribe  (not simply close) causes the group to be removed from the server.

I'll try removing the Unsubscribe call from my code and see how that affects it.

cwoolum on 25 Dec 2019

@cwoolum thanks for checking the docs. Did it work for your when you removed the Unsubscribe?

balchua on 26 Dec 2019

I see the subscriptions are still there but I'm not sure if the messages are getting delivered successfully now. I'm seeing a large number of pending_count on the pods that were taken down as part of the scale down.

{
  "name": "place.inserted",
  "msgs": 653382,
  "bytes": 358537710,
  "first_seq": 1,
  "last_seq": 653382,
  "subscriptions": [
    {
      "client_id": "xxxplaceinserter-5595885df6-v57vq",
      "inbox": "_INBOX.AQ77TV2TOF1DVJUUW109X8",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD3HS",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653379,
      "pending_count": 0,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-zf6js",
      "inbox": "_INBOX.ZQCLS4F06A4AMM1V78KYZ8",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD3Qt",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653382,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-tjwzt",
      "inbox": "_INBOX.SD4GQBUAJY4L8SPRIUYNGH",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD4Dx",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653382,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-2cd2v",
      "inbox": "_INBOX.XAERJTARSEMGR9N3ZUHZ1J",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD44W",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653382,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-mv9fr",
      "inbox": "_INBOX.7PWDZ4ZLPWOA311VJKKC6V",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD3yr",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653376,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-xzz2s",
      "inbox": "_INBOX.DSA21NRUGOLG1DBW8CIP52",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD40k",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653378,
      "pending_count": 110,
      "is_stalled": false
    }
  ]
}

cwoolum on 27 Dec 2019

After the scale down occurred, the number of subscribers dropped to 1. I'm not sure the time it took for that to happen though.

I opened an issue to ask the best practices for this.

https://github.com/nats-io/nats-streaming-server/issues/1001

cwoolum on 27 Dec 2019

👍1

As far as i know stan will direct the pending messages to available subscribers. But how long before it takes that decision.

balchua on 27 Dec 2019

I'm going to keep testing to make sure none of the messages are getting stuck. I keep running into the error Operation cannot be fulfilled on scaledobjects.keda.k8s.io \"mytest-scaledobject\": the object has been modified; please apply your changes to the latest version and try again. I've tried to delete both the HPA and ScaledObject multiple times but it keeps occurring. Any idea?

cwoolum on 31 Dec 2019

When this error shows up does it stop scaling? Care to show what kubectl get hpa looks like? And also your scaledobject yaml.
Thanks!

balchua on 1 Jan 2020

👍1

@cwoolum could you please decribe steps leading to this error + more details on ScaledObject and HPA. thanks

zroubalik on 2 Jan 2020

Okay, the modified issue was due to an old CPU based HPA conflicting with the Keda Scaler but I was able to recreate the issue with the pod not scaling from 0. Here are the requested items.

apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  creationTimestamp: '2020-01-03T06:52:29Z'
  finalizers:
    - finalizer.keda.k8s.io
  generation: 3
  labels:
    deploymentName: turnouttplaceindexer
  name: turnouttplaceindexer-scaledobject
  namespace: api
spec:
  cooldownPeriod: 30
  maxReplicaCount: 5
  minReplicaCount: 0
  pollingInterval: 10
  scaleTargetRef:
    deploymentName: turnouttplaceindexer
  triggers:
    - metadata:
        durableName: UpdatedDurable
        lagThreshold: '100'
        natsServerMonitoringEndpoint: 'stan.default:8222'
        queueGroup: place_indexer
        subject: place.updated
      type: stan
status:
  externalMetricNames:
    - lagThreshold
  lastActiveTime: '2020-01-03T15:56:05Z'

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: keda-hpa-turnouttplaceindexer
  namespace: api
  selfLink: >-
    /apis/autoscaling/v2beta1/namespaces/api/horizontalpodautoscalers/keda-hpa-turnouttplaceindexer
  uid: 8a0bd442-13df-43db-9897-0b96ffae7e4f
  resourceVersion: '25270090'
  creationTimestamp: '2020-01-03T15:52:05Z'
  ownerReferences:
    - apiVersion: keda.k8s.io/v1alpha1
      kind: ScaledObject
      name: turnouttplaceindexer-scaledobject
      uid: 8f9f9d7f-022a-4f95-ba70-c30cb155337c
      controller: true
      blockOwnerDeletion: true
spec:
  scaleTargetRef:
    kind: Deployment
    name: turnouttplaceindexer
    apiVersion: apps/v1
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: External
      external:
        metricName: lagThreshold
        metricSelector:
          matchLabels:
            deploymentName: turnouttplaceindexer
        targetAverageValue: '100'
status:
  currentReplicas: 0
  desiredReplicas: 0
  currentMetrics: null
  conditions:
    - type: AbleToScale
      status: 'True'
      lastTransitionTime: '2020-01-03T15:52:20Z'
      reason: SucceededGetScale
      message: the HPA controller was able to get the target's current scale
    - type: ScalingActive
      status: 'False'
      lastTransitionTime: '2020-01-03T15:57:12Z'
      reason: ScalingDisabled
      message: scaling is disabled since the replica count of the target is zero
    - type: ScalingLimited
      status: 'True'
      lastTransitionTime: '2020-01-03T15:55:40Z'
      reason: TooFewReplicas
      message: >-
        the desired replica count is increasing faster than the maximum scale
        rate

Here is the response from STAN:

{
  "name": "place.updated",
  "msgs": 1000000,
  "bytes": 131655679,
  "first_seq": 1752831,
  "last_seq": 2752830,
  "subscriptions": [
    {
      "client_id": "turnouttplaceindexer-68c66f7545-dhwqc",
      "inbox": "_INBOX.6Z7TV1D0B84HNLMDPIM8IW",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfuiaw",
      "queue_name": "UpdatedDurable:place_indexer",
      "is_durable": true,
      "is_offline": true,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 2751730,
      "pending_count": 0,
      "is_stalled": false
    }
  ]
}

And here are the logs from KEDA

2020-01-03T15:54:24.630417316Z {"level":"info","ts":1578066864.6303,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
2020-01-03T15:54:24.730667858Z {"level":"info","ts":1578066864.7304275,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"scaledobject-controller"}
2020-01-03T15:54:24.830825399Z {"level":"info","ts":1578066864.8306682,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"scaledobject-controller","worker count":1}
2020-01-03T15:54:25.156227482Z {"level":"info","ts":1578066865.1561325,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:54:25.156241982Z {"level":"info","ts":1578066865.1561651,"logger":"controller_scaledobject","msg":"Detecting ScaleType from ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:54:25.156247382Z {"level":"info","ts":1578066865.1561697,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:54:25.181735746Z {"level":"info","ts":1578066865.1815946,"logger":"controller_scaledobject","msg":"Updated HPA according to ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject","HPA.Namespace":"api","HPA.Name":"keda-hpa-turnouttplaceindexer"}

2020-01-03T15:57:10.682897812Z {"level":"info","ts":1578067030.6824272,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:57:10.684467522Z {"level":"info","ts":1578067030.682496,"logger":"controller_scaledobject","msg":"Detecting ScaleType from ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:57:10.684494823Z {"level":"info","ts":1578067030.6825025,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:57:10.695899496Z {"level":"info","ts":1578067030.6957793,"logger":"controller_scaledobject","msg":"Updated HPA according to ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject","HPA.Namespace":"api","HPA.Name":"keda-hpa-turnouttplaceindexer"}
2020-01-03T15:57:10.718530941Z {"level":"info","ts":1578067030.7182388,"logger":"scalehandler","msg":"Successfully scaled deployment to 0 replicas","ScaledObject.Namespace":"api","ScaledObject.Name":"turnouttplaceindexer-scaledobject","ScaledObject.ScaleType":"deployment","Deployment.Namespace":"api","Deployment.Name":"turnouttplaceindexer"}

There are messages pending for this subscription but Keda doesn't seem to be picking them up. The pods do call close on the NATS and STAN connections but do not Unsubscribe.

cwoolum on 3 Jan 2020

@cwoolum i think i know what the issue is, the stan scaler actually uses the channels' msg attribute and subtract that using the subscriber's last_sent. Unfortunately last_sent is a sequence number.
Looking at the STAN response you provided, msg=1000000 and last_sent=2751730 the resulting pending lag will be negative.
I think the better approach is to use the channel information's last_seq instead of msg.
Wdyt?