Keda: Namespaces stuck in Terminating state

Created on 8 Oct 2020  ·  4 Comments  ·  Source: kedacore/keda

Expected Behavior

Namespaces should delete successfully

Actual Behavior

Namespaces stuck in Terminating state

Steps to Reproduce the Problem

  1. Install KEDA on the cluster

Specifications

  • KEDA Version: 2.0.0-rc
  • Platform & Version: GKE
  • Kubernetes Version: 1.17.9-gke.1504
  • Scaler(s): prometheus

  • Addons: certmanager, argocd, nginx-ingress

I installed KEDA on our staging cluster yesterday, and it is causing problems with namespaces: they get stuck in the Terminating state.

review-2709-xxx                       Terminating   18h
review-2787-xxx                       Terminating   25h
review-2817-xxx                       Terminating   22h
review-keda-test                      Active        14h
review-master                         Terminating   14h

ScaledObjects are defined only in the review-keda-test namespace.

Of course, I can force-delete the namespaces with kubectl patch, but that is not a solution.

There are errors in the keda-operator pod:

keda-operator-5d7bd795cb-6psbc keda-operator {"level":"error","ts":1602134398.0005147,"logger":"controllers.ScaledObject","msg":"Target resource doesn't exist","ScaledObject.Namespace":"review-master","ScaledObject.Name":"prometheus-xxx-sidekiq-scale","resource":"apps/v1.Deployment","name":"xxx-sidekiq","error":"deployments.apps \"xxx-sidekiq\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128\ngithub.com/kedacore/keda/controllers.(*ScaledObjectReconciler).checkTargetResourceIsScalable\n\t/workspace/controllers/scaledobject_controller.go:241\ngithub.com/kedacore/keda/controllers.(*ScaledObjectReconciler).reconcileScaledObject\n\t/workspace/controllers/scaledobject_controller.go:173\ngithub.com/kedacore/keda/controllers.(*ScaledObjectReconciler).Reconcile\n\t/workspace/controllers/scaledobject_controller.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90"}
keda-operator-5d7bd795cb-6psbc keda-operator {"level":"error","ts":1602134398.0009277,"logger":"controllers.ScaledObject","msg":"ScaledObject doesn't have correct scaleTargetRef specification","ScaledObject.Namespace":"review-master","ScaledObject.Name":"prometheus-xxx-sidekiq-scale","error":"deployments.apps \"xxx-sidekiq\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128\ngithub.com/kedacore/keda/controllers.(*ScaledObjectReconciler).Reconcile\n\t/workspace/controllers/scaledobject_controller.go:146\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90"}
bug

All 4 comments

What KEDA-related objects are in those namespaces? If KEDA is still installed on the cluster and you remove ScaledObjects from the cluster, KEDA will remove the finalizers on those objects and there shouldn't be any problem with deleting the namespaces.
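
One way to answer the question above is to list which objects still carry finalizers, grouped by namespace. The following is only a minimal sketch: the helper name `finalizers_by_namespace` is hypothetical, and it assumes you pass it the parsed JSON from something like `kubectl get scaledobjects --all-namespaces -o json`.

```python
from collections import defaultdict


def finalizers_by_namespace(object_list: dict) -> dict:
    """Map namespace -> {object name: finalizers} for objects that still have finalizers.

    `object_list` is assumed to be a Kubernetes List object (a dict with an
    "items" key), as printed by `kubectl get <resource> -o json`.
    """
    result = defaultdict(dict)
    for item in object_list.get("items", []):
        meta = item.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        if finalizers:
            # Only objects with at least one finalizer can block deletion.
            result[meta.get("namespace", "")][meta.get("name", "")] = list(finalizers)
    return dict(result)
```

If the result contains no entries for the stuck namespaces, finalizers on ScaledObjects are probably not what is blocking the deletion.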

I have KEDA in the cluster. I installed it for testing and created ScaledObjects in only one namespace, review-keda-test.
Then I started experiencing problems with other namespaces (without ScaledObjects): they get stuck in the Terminating state.
Before installing KEDA I had no problems like that.

I stumbled upon a similar issue when trying to delete KEDA. In my case it was caused by a leftover APIService. To check for this, you can do the following:

  1. Run kubectl get ns review-master -o yaml and check status.conditions for causes. In my case this was:
  - lastTransitionTime: "2020-10-11T09:23:33Z"
    message: 'Discovery failed for some groups, 1 failing: unable to retrieve the
      complete list of server APIs: external.metrics.k8s.io/v1beta1: the server is
      currently unable to handle the request'
    reason: DiscoveryFailed
    status: "True"
    type: NamespaceDeletionDiscoveryFailure

The issue is similar to: https://github.com/prometheus-operator/kube-prometheus/issues/275#issuecomment-545305515

  2. Check for dangling API services with kubectl get APIService. The output would be similar to:
...
v1beta1.external.metrics.k8s.io        keda/keda-metrics-apiserver   False (ServiceNotFound)   3h51m
...
  3. After I deleted the failed APIService with kubectl delete APIService v1beta1.external.metrics.k8s.io, the other namespaces were deleted successfully.
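
The two checks above can be sketched as a small script. This is only an illustration under stated assumptions: the helper names (`failing_conditions`, `unavailable_apiservices`, `kubectl_json`) are hypothetical, and the functions expect the JSON that `kubectl get ... -o json` prints rather than the YAML shown above.

```python
import json
import subprocess


def failing_conditions(namespace: dict) -> list:
    """Return (type, reason, message) for each namespace condition whose status is "True".

    For a Terminating namespace, a condition such as
    NamespaceDeletionDiscoveryFailure with status "True" points at what is
    blocking the deletion.
    """
    conditions = namespace.get("status", {}).get("conditions", [])
    return [
        (c.get("type"), c.get("reason"), c.get("message"))
        for c in conditions
        if c.get("status") == "True"
    ]


def unavailable_apiservices(apiservice_list: dict) -> list:
    """Return the names of APIServices whose Available condition is not "True"."""
    names = []
    for item in apiservice_list.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        available = any(
            c.get("type") == "Available" and c.get("status") == "True"
            for c in conditions
        )
        if not available:
            names.append(item["metadata"]["name"])
    return names


def kubectl_json(*args: str) -> dict:
    """Run kubectl with -o json and parse its output (needs kubectl and cluster access)."""
    completed = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(completed.stdout)
```

With cluster access, `failing_conditions(kubectl_json("get", "ns", "review-master"))` would list the blocking conditions, and `unavailable_apiservices(kubectl_json("get", "apiservice"))` would list candidates such as v1beta1.external.metrics.k8s.io. Whether to delete an unavailable APIService or fix the service behind it remains a judgment call.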

I ran into this issue on GKE as well. If you're running a VPC-native cluster, you'll need to add a firewall rule that allows the masters to talk to KEDA's APIService (port 6443 by default).
