cert-manager is crashing when applying new certificate

Created on 31 Oct 2019  ·  11 Comments  ·  Source: jetstack/cert-manager

Sometimes, when applying a new certificate, I get the following error, and cert-manager cannot recover or even restart:

cert-manager-f7f8bf74d-q7bqm cert-manager E1030 15:09:16.626127       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
cert-manager-f7f8bf74d-q7bqm cert-manager goroutine 208 [running]:
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/runtime.logPanic(0x18b8dc0, 0x2b738c0)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/runtime/runtime.go:48 +0x82
cert-manager-f7f8bf74d-q7bqm cert-manager panic(0x18b8dc0, 0x2b738c0)
cert-manager-f7f8bf74d-q7bqm cert-manager   GOROOT/src/runtime/panic.go:679 +0x1b2
cert-manager-f7f8bf74d-q7bqm cert-manager github.com/jetstack/cert-manager/pkg/controller/clusterissuers.(*controller).issuersForSecret(0xc000392780, 0xc002e98780, 0x1ee6240, 0xc002e98780, 0x1eb9860, 0xc001966fc0, 0x170e)
cert-manager-f7f8bf74d-q7bqm cert-manager   pkg/controller/clusterissuers/checks.go:42 +0x254
cert-manager-f7f8bf74d-q7bqm cert-manager github.com/jetstack/cert-manager/pkg/controller/clusterissuers.(*controller).secretDeleted(0xc000392780, 0x1ae7760, 0xc002e98780)
cert-manager-f7f8bf74d-q7bqm cert-manager   pkg/controller/clusterissuers/controller.go:113 +0xe1
cert-manager-f7f8bf74d-q7bqm cert-manager github.com/jetstack/cert-manager/pkg/controller.(*BlockingEventHandler).OnAdd(0xc00000ebd0, 0x1ae7760, 0xc002e98780)
cert-manager-f7f8bf74d-q7bqm cert-manager   pkg/controller/util.go:131 +0x3d
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0x0, 0x1afdd67, 0x1)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_client_go/tools/cache/shared_informer.go:658 +0x218
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc0015d6dd8, 0x0, 0xc0005c75f8)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/wait/wait.go:292 +0x51
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/client-go/tools/cache.(*processorListener).run.func1()
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_client_go/tools/cache/shared_informer.go:652 +0x79
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0005c7740)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/wait/wait.go:152 +0x5e
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0015d6f40, 0xdf8475800, 0x0, 0x42d801, 0xc0015c25a0)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/wait/wait.go:153 +0xf8
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.Until(...)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/wait/wait.go:88
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/client-go/tools/cache.(*processorListener).run(0xc000392880)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_client_go/tools/cache/shared_informer.go:650 +0x9b
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc000121170, 0xc00150fd90)
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/wait/wait.go:71 +0x59
cert-manager-f7f8bf74d-q7bqm cert-manager created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
cert-manager-f7f8bf74d-q7bqm cert-manager   external/io_k8s_apimachinery/pkg/util/wait/wait.go:69 +0x62
cert-manager-f7f8bf74d-q7bqm cert-manager panic: runtime error: invalid memory address or nil pointer dereference [recovered]
cert-manager-f7f8bf74d-q7bqm cert-manager   panic: runtime error: invalid memory address or nil pointer dereference
cert-manager-f7f8bf74d-q7bqm cert-manager [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x16d9944]

Environment details:

  • Kubernetes version 1.14
  • cert-manager version (e.g. v0.11.0):

/kind bug

After I recreated the Vault issuer I'm using everything is back to normal.

Is there any way I can be sure what caused the error?

kind/bug

All 11 comments

Thanks for bringing this up. We are not checking for a nil pointer before dereferencing it.
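
For readers unfamiliar with this class of bug, here is a minimal sketch of the kind of guard that is missing. The types and the `referencesSecret` function are hypothetical, simplified stand-ins and are not cert-manager's actual API or the code changed by the fix. The sketch assumes an issuer config in which each auth method is an optional pointer, so a ClusterIssuer that only sets `appRole` leaves the other pointer nil.

```go
package main

import "fmt"

// SecretRef, AppRole, VaultAuth and VaultIssuer are hypothetical types used
// only for illustration; they are not cert-manager's real API types.
type SecretRef struct {
	Name string
	Key  string
}

type AppRole struct {
	Path      string
	RoleID    string
	SecretRef SecretRef
}

type VaultAuth struct {
	TokenRef *SecretRef // nil when token auth is not configured
	AppRole  *AppRole   // nil when AppRole auth is not configured
}

type VaultIssuer struct {
	Auth VaultAuth
}

// referencesSecret reports whether the issuer config points at the named
// secret. Each optional pointer is checked before it is dereferenced, so an
// issuer that configures only one auth method cannot trigger a nil pointer
// dereference.
func referencesSecret(v *VaultIssuer, secretName string) bool {
	if v == nil {
		return false
	}
	if v.Auth.TokenRef != nil && v.Auth.TokenRef.Name == secretName {
		return true
	}
	if v.Auth.AppRole != nil && v.Auth.AppRole.SecretRef.Name == secretName {
		return true
	}
	return false
}

func main() {
	// Only AppRole is set, as in the manifests in this thread; TokenRef stays nil.
	issuer := &VaultIssuer{Auth: VaultAuth{AppRole: &AppRole{
		Path:      "approle",
		SecretRef: SecretRef{Name: "approle-secret", Key: "secretId"},
	}}}
	fmt.Println(referencesSecret(issuer, "approle-secret"))
	fmt.Println(referencesSecret(issuer, "other-secret"))
	fmt.Println(referencesSecret(nil, "approle-secret"))
}
```

Without the `!= nil` checks, reading `v.Auth.TokenRef.Name` on an AppRole-only issuer would panic with the same "invalid memory address or nil pointer dereference" message shown in the trace above.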

/assign
/milestone v0.12

Can you provide more context to your setup? I'm trying to replicate the bug locally.

I had an Issuer and a ClusterIssuer deployed (both had VaultVerified status):

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: vault-staging-clusterissuer
  namespace: cert-manager
spec:
  vault:
    path: pki_int/sign/ivan-test-dot-com
    server: https://vault.kubernetes.ivan-test.com
    auth:
      appRole:
        path: approle
        roleId: "<role_id>"
        secretRef:
          name: approle-secret
          key: secretId
---
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: vault-staging-issuer
  namespace: cert-manager
spec:
  vault:
    path: pki_int/sign/ivan-test-dot-com
    server: https://vault.kubernetes.ivan-test.com
    auth:
      appRole:
        path: approle
        roleId: "<role_id>"
        secretRef:
          name: approle-secret
          key: secretId

and a certificate like this one:

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: cert-ivan-test
  namespace: cert-manager
spec:
  secretName: cert-ivan-test-tls
  issuerRef:
    name: vault-staging-issuer
  commonName: blah.ivan-test.com
  dnsNames:
  - blah.ivan-test.com

I'm using an old Vault version, 0.10.1.

Thanks. Can you also paste the commands you ran?

@KadenLNelson I was just applying the manifest file with a certificate. Another important thing to mention is that the error was not constant: sometimes it appears, sometimes it does not.

I'm experiencing the same problem, on Vault 1.1.3, with an identical stack trace on failure.

In my case, I _do not_ have any Issuers, only a ClusterIssuer (just in case that helps rule out some potential causes).

Quick follow-up.

I swapped over to using an Issuer to create the certificate instead of a ClusterIssuer and the problem went away, so it's definitely not a configuration problem with the PKI role in Vault.

I might have to just create Issuers and secrets in every namespace where I need them as a workaround for now.

https://github.com/jetstack/cert-manager/pull/2316 It looks like this pull request fixes the issue. Can you check whether you can still reproduce it on the canary tag?

@KadenLNelson I had to install it by templating out the Helm chart and re-applying the YAML until the webhook went up, but I can confirm that the problem persists.

I'm attempting to create a certificate in the consul namespace, and I'm seeing this error in the CertificateRequest, specifically:

apiVersion: cert-manager.io/v1alpha2
kind: CertificateRequest
metadata:
  annotations:
    cert-manager.io/certificate-name: test-cert
    cert-manager.io/private-key-secret-name: test-cert
  generation: 1
  name: test-cert-2088831866
  namespace: consul
  ownerReferences:
  - apiVersion: cert-manager.io/v1alpha2
    blockOwnerDeletion: true
    controller: true
    kind: Certificate
    name: test-cert
    uid: d18e1864-04a6-11ea-b546-123b4b7971c0
  resourceVersion: "18207952"
  selfLink: /apis/cert-manager.io/v1alpha2/namespaces/consul/certificaterequests/test-cert-2088831866
spec:
  csr: <redacted>
  issuerRef:
    kind: ClusterIssuer
    name: vault-issuer
status:
  conditions:
  - lastTransitionTime: "2019-11-11T17:15:13Z"
    message: 'Required secret resource not found: secret "cert-manager-vault-approle"
      not found'
    reason: Pending
    status: "False"
    type: Ready

Here's the certificate definition.

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: test-cert
  namespace: consul
spec:
  secretName: test-cert
  issuerRef:
    name: vault-issuer
    kind: ClusterIssuer
  commonName: test.service.consul
  dnsNames:
    - test.default.svc.cluster.local

Is there a recent change to the RBAC rules that I need to apply to give the cluster-scoped issuer access to the secret in cert-manager?

At this point, the crashing isn't happening any more, so rather than hijack this issue, I'm going to open a separate one for what I'm running into.

Fixed by #2316
