Keda: KEDA Operator reaching OOM

Created on 3 Feb 2021 · 8 Comments · Source: kedacore/keda

A KEDA operator deployment is suffering from a memory leak, leading to the operator being restarted.

Expected Behavior

The KEDA operator should run with stable memory usage. Memory usage and connections should remain stable over time.

Actual Behavior

The KEDA operator pod's memory usage increases over time until the pod reaches its memory limit and is OOM-killed.

Steps to Reproduce the Problem

  1. Deploy KEDA with chart version 2.1.1
  2. Define scaledobjects.keda.sh Kubernetes objects

Logs from the KEDA operator (object names removed)

2021-02-03T15:19:41.838Z    INFO    controller-runtime.metrics  metrics server is starting to listen    {"addr": ":8080"}
2021-02-03T15:19:41.840Z    INFO    controllers.ScaledObject    Running on Kubernetes 1.18  {"version": "v1.18.15"}
2021-02-03T15:19:41.840Z    INFO    setup   Starting manager
2021-02-03T15:19:41.840Z    INFO    setup   KEDA Version: 2.1.0
2021-02-03T15:19:41.840Z    INFO    setup   Git Commit: 4866ce69c4897df532b43390bafe4477275bf65a
2021-02-03T15:19:41.840Z    INFO    setup   Go Version: go1.15.6
2021-02-03T15:19:41.840Z    INFO    setup   Go OS/Arch: linux/amd64
I0203 15:19:41.840358       1 leaderelection.go:243] attempting to acquire leader lease keda/operator.keda.sh...
2021-02-03T15:19:41.840Z    INFO    controller-runtime.manager  starting metrics server {"path": "/metrics"}
I0203 15:19:59.239226       1 leaderelection.go:253] successfully acquired lease keda/operator.keda.sh
2021-02-03T15:19:59.239Z    INFO    controller  Starting EventSource    {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "source": "kind source: /, Kind="}
2021-02-03T15:19:59.239Z    INFO    controller  Starting EventSource    {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-02-03T15:19:59.339Z    INFO    controller  Starting Controller {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob"}
2021-02-03T15:19:59.339Z    INFO    controller  Starting workers    {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "worker count": 1}
2021-02-03T15:19:59.339Z    INFO    controller  Starting EventSource    {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-02-03T15:19:59.440Z    INFO    controller  Starting Controller {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject"}
2021-02-03T15:19:59.440Z    INFO    controller  Starting workers    {"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "worker count": 1}
2021-02-03T15:19:59.440Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:19:59.956Z    INFO    controllers.ScaledObject    Initializing Scaling logic according to ScaledObject Specification  {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:19:59.963Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": "enrichment-ingest"}
2021-02-03T15:20:01.478Z    INFO    controllers.ScaledObject    Initializing Scaling logic according to ScaledObject Specification  {"ScaledObject.Namespace": "default", "ScaledObject.Name": "enrichment-ingest"}
2021-02-03T15:20:01.485Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:01.506Z    INFO    controllers.ScaledObject    Initializing Scaling logic according to ScaledObject Specification  {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:01.512Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:01.552Z    INFO    controllers.ScaledObject    Initializing Scaling logic according to ScaledObject Specification  {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:12.157Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:12.258Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:12.314Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:27.205Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:27.280Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:27.306Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:42.251Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:42.315Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:42.340Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:57.298Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:57.346Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:57.371Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:12.347Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:12.440Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:12.460Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:27.811Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:27.839Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:27.865Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:42.857Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:42.889Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:42.916Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:58.667Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:58.697Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:58.778Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:13.718Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:13.747Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:13.805Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:29.539Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:29.571Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:29.596Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:44.586Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:44.615Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:44.639Z    INFO    controllers.ScaledObject    Reconciling ScaledObject        

Specifications

  • KEDA Version: 2.1.0
  • KEDA Chart Version: 2.1.1
  • Platform & Version: OS: Ubuntu 20.04.1 LTS, Kernel: 5.4.0-1035-aws, Container runtime: docker://19.3.11
  • Kubernetes Version: Kubernetes 1.18.15 - KOPS (1.18.3)
  • Scaler(s): Kafka Scaler

Additional Metrics

Pod performance

[Image: Screen Shot 2021-02-03 at 18 08 42]

bug

All 8 comments

From a quick look at the logs, something is quite strange: on many lines ScaledObject.Name is empty:

2021-02-03T15:20:01.485Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}

How many ScaledObjects do you have deployed? And what does this return:

kubectl get so

@zroubalik Hey!
As noted in the description, I removed the ScaledObject.Name values from the logs myself.
There are 4 active scaledobjects.keda.sh in our cluster.

@avivgold098 sorry, I missed that note 🤦

Could you please enable debug log level?
https://github.com/kedacore/keda/blob/main/BUILD.md#setting-log-levels

@avivgold098 by chance, is there something that could be modifying the scaledobjects? We shouldn't see that much reconciliation happening on the scaledobjects.

Have you run that setup with KEDA 2.0?
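
One way to put a number on "that much reconciliation" is to count the "Reconciling ScaledObject" lines per minute in the operator log. The following is only a minimal Python sketch, assuming the log format shown in the issue description; the function name reconciles_per_minute and the sample list are hypothetical helpers for illustration, not part of KEDA. The sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the ISO timestamp at the start of a zap-style log line and
# captures it truncated to the minute, e.g. "2021-02-03T15:20".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\.\d+Z\s")


def reconciles_per_minute(lines):
    """Count 'Reconciling ScaledObject' log lines, grouped by minute."""
    counts = Counter()
    for line in lines:
        if "Reconciling ScaledObject" not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # lines without a leading ISO timestamp are skipped
            counts[match.group(1)] += 1
    return dict(counts)


# A few lines taken from the operator log in the issue description.
sample = [
    '2021-02-03T15:20:01.552Z    INFO    controllers.ScaledObject    Initializing Scaling logic according to ScaledObject Specification  {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}',
    '2021-02-03T15:20:12.157Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}',
    '2021-02-03T15:20:12.258Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}',
    '2021-02-03T15:20:12.314Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}',
    '2021-02-03T15:20:27.205Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}',
    '2021-02-03T15:21:12.347Z    INFO    controllers.ScaledObject    Reconciling ScaledObject    {"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}',
    'I0203 15:19:59.239226       1 leaderelection.go:253] successfully acquired lease keda/operator.keda.sh',
]

if __name__ == "__main__":
    for minute, count in sorted(reconciles_per_minute(sample).items()):
        print(minute, count)
```

Run against a saved copy of the full operator log (for example, lines read from a file), the per-minute counts make it easier to compare reconciliation frequency between KEDA versions or before and after a configuration change.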

@zroubalik Nothing is modifying the scaledobjects.

We experienced the same behavior with the Chart: v2.0.1 | App: v2.0.0 setup and thought the upgrade would fix the issue.

I have a repro for the issue and I'm looking at it. The frequent reconciliation is odd, but shouldn't cause a memory leak. I had to artificially cause it and can see the memory building up.
