Cert-manager: cainjector failing frequently

Created on 14 Nov 2019 · 22 Comments · Source: jetstack/cert-manager

Describe the bug:
cainjector 0.11.0 fails frequently with the following messages:

I1114 12:05:33.699752       1 controller.go:242] cert-manager/controller-runtime/controller "level"=1 "msg"="Successfully Reconciled"  "controller"="apiservice" "request"={"Namespace":"","Name":"v1.networking.k8s.io"}
I1114 12:05:33.699805       1 controller.go:242] cert-manager/controller-runtime/controller "level"=1 "msg"="Successfully Reconciled"  "controller"="apiservice" "request"={"Namespace":"","Name":"v1alpha1.kubeapps.com"}
I1114 12:05:33.699814       1 controller.go:242] cert-manager/controller-runtime/controller "level"=1 "msg"="Successfully Reconciled"  "controller"="customresourcedefinition" "request"={"Namespace":"","Name":"issuers.cert-manager.io"}
I1114 12:05:33.699916       1 controller.go:242] cert-manager/controller-runtime/controller "level"=1 "msg"="Successfully Reconciled"  "controller"="apiservice" "request"={"Namespace":"","Name":"v1beta1.metrics.k8s.io"}
I1114 12:05:33.700003       1 controller.go:242] cert-manager/controller-runtime/controller "level"=1 "msg"="Successfully Reconciled"  "controller"="apiservice" "request"={"Namespace":"","Name":"v1.apiextensions.k8s.io"}
E1114 12:25:09.735275       1 leaderelection.go:365] Failed to update lock: etcdserver: request timed out
I1114 12:25:10.923019       1 leaderelection.go:287] failed to renew lease kube-system/cert-manager-cainjector-leader-election: failed to tryAcquireOrRenew context deadline exceeded
F1114 12:25:10.923143       1 start.go:127] error running manager: leader election lost
$ kubectl -n cert-manager get pod cert-manager-cainjector-576978ffc8-mtg7b -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/podIP: 10.2.0.178/32
  creationTimestamp: "2019-11-07T11:02:50Z"
  generateName: cert-manager-cainjector-576978ffc8-
  labels:
    app: cainjector
    app.kubernetes.io/instance: cert-manager
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: cainjector
    helm.sh/chart: cainjector-v0.11.0
    pod-template-hash: 576978ffc8
  name: cert-manager-cainjector-576978ffc8-mtg7b
  namespace: cert-manager
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: cert-manager-cainjector-576978ffc8
    uid: 4f7bbfc9-dfda-4264-bd06-b722bd88d21b
  resourceVersion: "6158194385"
  selfLink: /api/v1/namespaces/cert-manager/pods/cert-manager-cainjector-576978ffc8-mtg7b
  uid: 052447f0-59af-4a93-bfef-c09c8727610e
spec:
  containers:
  - args:
    - --v=2
    - --leader-election-namespace=kube-system
    env:
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: quay.io/jetstack/cert-manager-cainjector:v0.11.0
    imagePullPolicy: Always
    name: cainjector
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: cert-manager-cainjector-token-vbfw7
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: kubernetes-internal-node0
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: cert-manager-cainjector
  serviceAccountName: cert-manager-cainjector
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: cert-manager-cainjector-token-vbfw7
    secret:
      defaultMode: 420
      secretName: cert-manager-cainjector-token-vbfw7
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-11-07T11:02:50Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2019-11-14T12:25:14Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2019-11-14T12:25:14Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2019-11-07T11:02:50Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://5b2521cd7d0fe3c6a5516aaf90e20a5589e47ef8c04134e51974b724ea7aaff3
    image: quay.io/jetstack/cert-manager-cainjector:v0.11.0
    imageID: docker-pullable://quay.io/jetstack/cert-manager-cainjector@sha256:cf77d14d1c825190a38ac6b593f591998e3b34464f626f24479eb3e21dd589b3
    lastState:
      terminated:
        containerID: docker://dcba9c63711b87cd1ce348577546ee9d2cde557b0c993866997fb8f24028c9bc
        exitCode: 255
        finishedAt: "2019-11-14T12:25:10Z"
        reason: Error
        startedAt: "2019-11-14T12:05:16Z"
    name: cainjector
    ready: true
    restartCount: 323
    started: true
    state:
      running:
        startedAt: "2019-11-14T12:25:13Z"
  hostIP: 51.83.15.9
  phase: Running
  podIP: 10.2.0.178
  podIPs:
  - ip: 10.2.0.178
  qosClass: BestEffort
  startTime: "2019-11-07T11:02:50Z"



$ kubectl get pods -A|grep cert-manager
cert-manager           cert-manager-55c44f98f-dghd7                       1/1     Running   0          7d1h
cert-manager           cert-manager-cainjector-576978ffc8-mtg7b           1/1     Running   323        7d1h
cert-manager           cert-manager-webhook-c67fbc858-gbn8w               1/1     Running   1          7d1h



$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.3", GitCommit:"b3cbbae08ec52a7fc73d334838e18d17e8512749", GitTreeState:"clean", BuildDate:"2019-11-13T11:23:11Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud-provider/provisioner (e.g. GKE, kops AWS, etc):
    OVH Managed Kubernetes on public cloud (https://www.ovh.com/world/public-cloud/kubernetes/)
  • cert-manager version (e.g. v0.4.0):
    0.11.0
  • Install method (e.g. helm or static manifests):
    manifests (CustomResourceDefinition, namespace) + helm

/kind bug

area/cainjector kind/bug priority/awaiting-more-evidence

All 22 comments

I also noticed some other error messages in the cainjector container logs.

I1114 13:25:13.862928       1 start.go:82] starting ca-injector v0.11.0 (revision b030b7eb4)
I1114 13:25:14.482306       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.482783       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
I1114 13:25:14.483209       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.483953       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.484164       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
I1114 13:25:14.484271       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
E1114 13:25:14.484308       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_depth" "queue"="mutatingwebhookconfiguration"
E1114 13:25:14.484405       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_adds_total" "queue"="mutatingwebhookconfiguration"
E1114 13:25:14.484450       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_queue_duration_seconds" "queue"="mutatingwebhookconfiguration"
E1114 13:25:14.484485       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_work_duration_seconds" "queue"="mutatingwebhookconfiguration"
E1114 13:25:14.484533       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_unfinished_work_seconds" "queue"="mutatingwebhookconfiguration"
E1114 13:25:14.484568       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_longest_running_processor_seconds" "queue"="mutatingwebhookconfiguration"
E1114 13:25:14.484613       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_retries_total" "queue"="mutatingwebhookconfiguration"
I1114 13:25:14.484659       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.484709       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
E1114 13:25:14.484943       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_depth" "queue"="validatingwebhookconfiguration"
I1114 13:25:14.484979       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"service":null,"groupPriorityMinimum":0,"versionPriority":0},"status":{}}}
E1114 13:25:14.484981       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_adds_total" "queue"="validatingwebhookconfiguration"
E1114 13:25:14.485160       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_queue_duration_seconds" "queue"="validatingwebhookconfiguration"
I1114 13:25:14.485109       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
E1114 13:25:14.485196       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_work_duration_seconds" "queue"="validatingwebhookconfiguration"
E1114 13:25:14.485240       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_unfinished_work_seconds" "queue"="validatingwebhookconfiguration"
E1114 13:25:14.485277       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_longest_running_processor_seconds" "queue"="validatingwebhookconfiguration"
E1114 13:25:14.485318       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_retries_total" "queue"="validatingwebhookconfiguration"
I1114 13:25:14.485359       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.485352       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.485395       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
E1114 13:25:14.485564       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_depth" "queue"="apiservice"
E1114 13:25:14.485601       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_adds_total" "queue"="apiservice"
E1114 13:25:14.485646       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_queue_duration_seconds" "queue"="apiservice"
E1114 13:25:14.485686       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_work_duration_seconds" "queue"="apiservice"
E1114 13:25:14.485727       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_unfinished_work_seconds" "queue"="apiservice"
E1114 13:25:14.485892       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_longest_running_processor_seconds" "queue"="apiservice"
E1114 13:25:14.486002       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_retries_total" "queue"="apiservice"
I1114 13:25:14.486115       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"service":null,"groupPriorityMinimum":0,"versionPriority":0},"status":{}}}
I1114 13:25:14.486186       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
E1114 13:25:14.486536       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_depth" "queue"="customresourcedefinition"
E1114 13:25:14.486607       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_adds_total" "queue"="customresourcedefinition"
E1114 13:25:14.486698       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_queue_duration_seconds" "queue"="customresourcedefinition"
E1114 13:25:14.486773       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_work_duration_seconds" "queue"="customresourcedefinition"
E1114 13:25:14.486862       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_unfinished_work_seconds" "queue"="customresourcedefinition"
E1114 13:25:14.486934       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_longest_running_processor_seconds" "queue"="customresourcedefinition"
E1114 13:25:14.487017       1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted"  "name"="workqueue_retries_total" "queue"="customresourcedefinition"
I1114 13:25:14.487578       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"group":"","names":{"plural":"","kind":""},"scope":""},"status":{"conditions":null,"acceptedNames":{"plural":"","kind":""},"storedVersions":null}}}
I1114 13:25:14.487606       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"group":"","names":{"plural":"","kind":""},"scope":""},"status":{"conditions":null,"acceptedNames":{"plural":"","kind":""},"storedVersions":null}}}
I1114 13:25:14.487663       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.487688       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
I1114 13:25:14.487739       1 leaderelection.go:241] attempting to acquire leader lease  kube-system/cert-manager-cainjector-leader-election-core...
I1114 13:25:14.487746       1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource"  "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I1114 13:25:14.487844       1 leaderelection.go:241] attempting to acquire leader lease  kube-system/cert-manager-cainjector-leader-election...
E1114 13:25:14.603789       1 indexers.go:93] cert-manager/secret-for-certificate-mapper "msg"="unable to fetch certificate that owns the secret" "error"="Certificate.cert-manager.io \"quickstart-example-tls\" not found" "certificate"={"Namespace":"default","Name":"quickstart-example-tls"} "secret"={"Namespace":"default","Name":"quickstart-example-tls"} 
E1114 13:25:14.603802       1 indexers.go:93] cert-manager/secret-for-certificate-mapper "msg"="unable to fetch certificate that owns the secret" "error"="Certificate.cert-manager.io \"quickstart-example-tls\" not found" "certificate"={"Namespace":"default","Name":"quickstart-example-tls"} "secret"={"Namespace":"default","Name":"quickstart-example-tls"} 
E1114 13:25:14.603832       1 indexers.go:93] cert-manager/secret-for-certificate-mapper "msg"="unable to fetch certificate that owns the secret" "error"="Certificate.cert-manager.io \"quickstart-example-tls\" not found" "certificate"={"Namespace":"default","Name":"quickstart-example-tls"} "secret"={"Namespace":"default","Name":"quickstart-example-tls"} 
E1114 13:25:14.703879       1 indexers.go:93] cert-manager/secret-for-certificate-mapper "msg"="unable to fetch certificate that owns the secret" "error"="Certificate.cert-manager.io \"quickstart-example-tls\" not found" "certificate"={"Namespace":"default","Name":"quickstart-example-tls"} "secret"={"Namespace":"default","Name":"quickstart-example-tls"} 
I1114 13:25:30.267601       1 leaderelection.go:251] successfully acquired lease kube-system/cert-manager-cainjector-leader-election-core

I'd like to resolve this but have no idea why it's failing.
Anybody?

same issue on my cluster

Have the same on OVH K8S as well. I tried to disable the webhook using --set webhook.enabled=false during helm install and I get exactly the same kind of logs.

I have the same problem. I tried updating to the latest cert-manager version and Kubernetes 1.14, and nothing helped. My setup is a bare-metal cluster on CentOS.

We've seen this issue as well on cert-manager 0.11 and 0.12. We've deployed via Helm onto GKE 1.14. cainjector crashes and restarts every few days with log lines very similar to the above.

Note that the logs below are newest to oldest.


I0108 01:45:49.901682 1 leaderelection.go:251] successfully acquired lease kube-system/cert-manager-cainjector-leader-election
I0108 01:45:33.722802 1 leaderelection.go:241] attempting to acquire leader lease kube-system/cert-manager-cainjector-leader-election...
I0108 01:45:33.722736 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0108 01:45:33.722692 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
I0108 01:45:33.722617 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"group":"","names":{"plural":"","kind":""},"scope":""},"status":{"conditions":null,"acceptedNames":{"plural":"","kind":""},"storedVersions":null}}}
7x      
E0108 01:45:33.722535 1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted" "name"="workqueue_retries_total" "queue"="customresourcedefinition"
I0108 01:45:33.721954 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0108 01:45:33.721886 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
I0108 01:45:33.721793 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"service":null,"groupPriorityMinimum":0,"versionPriority":0},"status":{}}}
7x      
E0108 01:45:33.721744 1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted" "name"="workqueue_retries_total" "queue"="apiservice"
I0108 01:45:33.721187 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0108 01:45:33.721144 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
I0108 01:45:33.721074 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
7x      
E0108 01:45:33.721010 1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted" "name"="workqueue_retries_total" "queue"="validatingwebhookconfiguration"
I0108 01:45:33.720349 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0108 01:45:33.720096 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"secretName":"","issuerRef":{"name":""}},"status":{}}}
I0108 01:45:33.719828 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="mutatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
7x      
E0108 01:45:33.719750 1 workqueue.go:37] cert-manager/controller-runtime/metrics "msg"="failed to register metric" "error"="duplicate metrics collector registration attempted" "name"="workqueue_retries_total" "queue"="mutatingwebhookconfiguration"
I0108 01:45:33.719091 1 leaderelection.go:241] attempting to acquire leader lease kube-system/cert-manager-cainjector-leader-election-core...
I0108 01:45:33.718880 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0108 01:45:33.718650 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="customresourcedefinition" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"group":"","names":{"plural":"","kind":""},"scope":""},"status":{"conditions":null,"acceptedNames":{"plural":"","kind":""},"storedVersions":null}}}
I0108 01:45:33.717613 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0108 01:45:33.717492 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="apiservice" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"service":null,"groupPriorityMinimum":0,"versionPriority":0},"status":{}}}
4x      
I0108 01:45:33.716981 1 controller.go:121] cert-manager/controller-runtime/controller "level"=0 "msg"="Starting EventSource" "controller"="validatingwebhookconfiguration" "source"={"Type":{"metadata":{"creationTimestamp":null}}}
I0108 01:45:33.052123 1 start.go:82] starting ca-injector v0.12.0 (revision 0e384f5d0)
F0108 01:45:32.184322 1 start.go:127] error running manager: leader election lost
I0108 01:45:32.184262 1 recorder.go:52] cert-manager/controller-runtime/manager/events "level"=1 "msg"="Normal" "message"="cert-manager-cainjector-594fd9cc45-m45rd_847f12d3-4512-43b7-9fc9-dc979b1b79a2 stopped leading" "object"={"kind":"ConfigMap","namespace":"kube-system","name":"cert-manager-cainjector-leader-election","uid":"20ffd830-f686-11e9-91a6-42010a800159","apiVersion":"v1","resourceVersion":"48157750"} "reason"="LeaderElection"
I0108 01:45:32.184121 1 leaderelection.go:287] failed to renew lease kube-system/cert-manager-cainjector-leader-election: failed to tryAcquireOrRenew context deadline exceeded
2x      
I0108 01:45:28.963618 1 controller.go:242] cert-manager/controller-runtime/controller "level"=1 "msg"="Successfully Reconciled" "controller"="apiservice" "request"={"Namespace":"","Name":"v1beta1.metrics.k8s.io"}
I0107 19:16:24.503744 1 controller.go:242] cert-manager/controller-runtime/controller "level"=1 "msg"="Successfully Reconciled" "controller"="mutatingwebhookconfiguration" "request"={"Namespace":"","Name":"cert-manager-webhook"}

Any workaround for this?

Getting a lot of "failed to register metric" on 0.13 also. kops-provisioned cluster, so no metrics server

Add me to the list.
Kubernetes 1.15.7.
Cert-manager 0.13.0

Same
Kubernetes (on GKE) 1.15.9-gke.8
Cert-manager 0.13.1

K8S 1.17
quay.io/jetstack/cert-manager-cainjector:v0.12.0

+1 cert-manager 0.13.1 on GKE

Facing the same issue
v0.12
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T21:03:42Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.7-gke.32", GitCommit:"4ce968103257586ffc7adc1b3fb93f640875c2fc", GitTreeState:"clean", BuildDate:"2020-03-06T01:30:29Z", GoVersion:"go1.12.12b4", Compiler:"gc", Platform:"linux/amd64"}

Any luck?

Same issue on cert-manager 0.13.0 on k8s v1.15, after node draining

We saw the same issue. When we set the --leader-elect flag to false, the error no longer occurs. We are running cainjector with a replica count of 1. The relevant part of the values file for the Helm chart looks as follows:

cainjector:
  enabled: true
  replicaCount: 1

  extraArgs:
    - --leader-elect=false

  image:
    repository: quay.io/jetstack/cert-manager-cainjector
    tag: v0.14.2
    pullPolicy: IfNotPresent
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T23:35:15Z", GoVersion:"go1.14.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T20:55:23Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Maybe this helps someone. Does someone have an idea how to solve this for a higher replica count?

Thanks for sharing this workaround. While it works, what happens in the case of split brain or network connectivity issues at the kubelet control plane level? Will this be handled properly with --leader-elect=false, or will it just keep trying to store its state on the kubelet it could last contact?

I've read through your logs and have been digging into the leader election code.

The first thing I would try is upgrading to the latest version of cert-manager (0.14.2 at the time of writing).
This version uses the latest version of controller-runtime, which contains various fixes around leader election.
It also uses the latest version of Kubernetes client-go, which is where the leader-election library lives, and I can see that there have been some bug fixes in that code.

@vkukk I also see that your original error logs were during lease renewal, and I have found that the default RenewDeadline is 10s.

    // RenewDeadline is the duration that the acting master will retry
    // refreshing leadership before giving up. Default is 10 seconds.
    RenewDeadline *time.Duration

That should be plenty of time to complete the lease renewal, but perhaps your etcd server, API server, or API server network is overloaded and running slowly for some reason.

Unfortunately, the RenewDeadline is not currently configurable, but if you continue to have problems, we could consider adding a new CLI option to cainjector so that this parameter can be tweaked.

/area cainjector

FWIW I got this on cert-manager using the .yaml file from the release page (with 0.14.2). It seems to be fixed by adding --leader-elect=false (I also just have one replica).

adding more evidence

Same issue on cert-manager v0.10.0 on k8s 1.16.2.

Any update here? Same issue with cert-manager helm chart 1.0.4! Is there any drawback from passing --leader-elect=false?

Same issue with a Helm installation of v1.1.0 on AKS with AAD Pod Identity v1.7.1.
