Sometimes when applying a new certificate I get the following error, and cert-manager cannot recover or even restart:
cert-manager-f7f8bf74d-q7bqm cert-manager E1030 15:09:16.626127 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
cert-manager-f7f8bf74d-q7bqm cert-manager goroutine 208 [running]:
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/runtime.logPanic(0x18b8dc0, 0x2b738c0)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/runtime/runtime.go:48 +0x82
cert-manager-f7f8bf74d-q7bqm cert-manager panic(0x18b8dc0, 0x2b738c0)
cert-manager-f7f8bf74d-q7bqm cert-manager GOROOT/src/runtime/panic.go:679 +0x1b2
cert-manager-f7f8bf74d-q7bqm cert-manager github.com/jetstack/cert-manager/pkg/controller/clusterissuers.(*controller).issuersForSecret(0xc000392780, 0xc002e98780, 0x1ee6240, 0xc002e98780, 0x1eb9860, 0xc001966fc0, 0x170e)
cert-manager-f7f8bf74d-q7bqm cert-manager pkg/controller/clusterissuers/checks.go:42 +0x254
cert-manager-f7f8bf74d-q7bqm cert-manager github.com/jetstack/cert-manager/pkg/controller/clusterissuers.(*controller).secretDeleted(0xc000392780, 0x1ae7760, 0xc002e98780)
cert-manager-f7f8bf74d-q7bqm cert-manager pkg/controller/clusterissuers/controller.go:113 +0xe1
cert-manager-f7f8bf74d-q7bqm cert-manager github.com/jetstack/cert-manager/pkg/controller.(*BlockingEventHandler).OnAdd(0xc00000ebd0, 0x1ae7760, 0xc002e98780)
cert-manager-f7f8bf74d-q7bqm cert-manager pkg/controller/util.go:131 +0x3d
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0x0, 0x1afdd67, 0x1)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_client_go/tools/cache/shared_informer.go:658 +0x218
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc0015d6dd8, 0x0, 0xc0005c75f8)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/wait/wait.go:292 +0x51
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/client-go/tools/cache.(*processorListener).run.func1()
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_client_go/tools/cache/shared_informer.go:652 +0x79
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0005c7740)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/wait/wait.go:152 +0x5e
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0015d6f40, 0xdf8475800, 0x0, 0x42d801, 0xc0015c25a0)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/wait/wait.go:153 +0xf8
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.Until(...)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/wait/wait.go:88
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/client-go/tools/cache.(*processorListener).run(0xc000392880)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_client_go/tools/cache/shared_informer.go:650 +0x9b
cert-manager-f7f8bf74d-q7bqm cert-manager k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc000121170, 0xc00150fd90)
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/wait/wait.go:71 +0x59
cert-manager-f7f8bf74d-q7bqm cert-manager created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
cert-manager-f7f8bf74d-q7bqm cert-manager external/io_k8s_apimachinery/pkg/util/wait/wait.go:69 +0x62
cert-manager-f7f8bf74d-q7bqm cert-manager panic: runtime error: invalid memory address or nil pointer dereference [recovered]
cert-manager-f7f8bf74d-q7bqm cert-manager panic: runtime error: invalid memory address or nil pointer dereference
cert-manager-f7f8bf74d-q7bqm cert-manager [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x16d9944]
Environment details:
/kind bug
After I recreated the Vault issuer I'm using, everything went back to normal.
Is there any way I can be sure what caused the error?
Thanks for bringing this up. We are not checking for a nil pointer.
/assign
/milestone v0.12
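For readers unfamiliar with this class of bug, here is a minimal sketch of the kind of nil guard the comment above alludes to. It is not cert-manager's code and not the actual fix: the types `Secret`, `VaultAuth`, and `ClusterIssuer` and the function `issuersReferencing` are hypothetical stand-ins, and the thread does not say which pointer was nil. The sketch only assumes that an event handler may receive a nil object or an issuer without a Vault configuration, and shows how checking before dereferencing avoids the panic.

```go
package main

import "fmt"

// Secret, VaultAuth and ClusterIssuer are hypothetical, simplified types
// used only for illustration; they are not cert-manager's API types.
type Secret struct {
	Name string
}

type VaultAuth struct {
	SecretName string
}

type ClusterIssuer struct {
	Name  string
	Vault *VaultAuth // nil when the issuer has no Vault configuration
}

// issuersReferencing returns the names of issuers whose Vault auth refers to
// the given secret. Every pointer is checked before it is dereferenced, so a
// nil secret, a nil issuer, or an issuer without Vault config cannot panic.
func issuersReferencing(secret *Secret, issuers []*ClusterIssuer) []string {
	var names []string
	if secret == nil {
		return names
	}
	for _, iss := range issuers {
		if iss == nil || iss.Vault == nil {
			continue // skip instead of dereferencing a nil pointer
		}
		if iss.Vault.SecretName == secret.Name {
			names = append(names, iss.Name)
		}
	}
	return names
}

func main() {
	issuers := []*ClusterIssuer{
		{Name: "vault-issuer", Vault: &VaultAuth{SecretName: "approle-secret"}},
		{Name: "other-issuer"}, // Vault is nil
		nil,
	}
	fmt.Println(issuersReferencing(&Secret{Name: "approle-secret"}, issuers))
	fmt.Println(issuersReferencing(nil, issuers))
}
```

Without the `iss == nil || iss.Vault == nil` check, the second and third entries would trigger the same "invalid memory address or nil pointer dereference" runtime error seen in the trace.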
Can you provide more context about your setup? I'm trying to replicate the bug locally.
I had an Issuer and a ClusterIssuer deployed (both had the VaultVerified status):
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: vault-staging-clusterissuer
  namespace: cert-manager
spec:
  vault:
    path: pki_int/sign/ivan-test-dot-com
    server: https://vault.kubernetes.ivan-test.com
    auth:
      appRole:
        path: approle
        roleId: "<role_id>"
        secretRef:
          name: approle-secret
          key: secretId

apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: vault-staging-issuer
  namespace: cert-manager
spec:
  vault:
    path: pki_int/sign/ivan-test-dot-com
    server: https://vault.kubernetes.ivan-test.com
    auth:
      appRole:
        path: approle
        roleId: "<role_id>"
        secretRef:
          name: approle-secret
          key: secretId
and a certificate like this one:
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: cert-ivan-test
  namespace: cert-manager
spec:
  secretName: cert-ivan-test-tls
  issuerRef:
    name: vault-staging-issuer
  commonName: blah.ivan-test.com
  dnsNames:
  - blah.ivan-test.com
I'm using an old Vault version, 0.10.1.
Thanks, can you paste your commands here as well?
@KadenLNelson I was just applying the manifest file with a certificate. Another important thing to mention is that the error was not constant: sometimes it appears, sometimes it does not.
I'm experiencing the same problem, on Vault 1.1.3, with an identical stack trace on failure.
In my case, I _do not_ have any Issuers, only a ClusterIssuer (just in case that helps rule out some potential causes).
Quick follow-up.
I swapped over to using an Issuer to create the certificate instead of a ClusterIssuer and the problem went away, so it's definitely not a configuration problem with the PKI role in Vault.
I might have to just create Issuers and secrets in every namespace where I need them as a workaround for now.
It looks like pull request https://github.com/jetstack/cert-manager/pull/2316 fixes the issue. Can you check whether you can still replicate this on the canary tag?
@KadenLNelson I had to install it by templating out the Helm chart and re-applying the YAML until the webhook went up, but I can confirm that the problem persists.
I'm attempting to create a certificate in the consul namespace, and I'm seeing this error in the CertificateRequest, specifically:
apiVersion: cert-manager.io/v1alpha2
kind: CertificateRequest
metadata:
  annotations:
    cert-manager.io/certificate-name: test-cert
    cert-manager.io/private-key-secret-name: test-cert
  generation: 1
  name: test-cert-2088831866
  namespace: consul
  ownerReferences:
  - apiVersion: cert-manager.io/v1alpha2
    blockOwnerDeletion: true
    controller: true
    kind: Certificate
    name: test-cert
    uid: d18e1864-04a6-11ea-b546-123b4b7971c0
  resourceVersion: "18207952"
  selfLink: /apis/cert-manager.io/v1alpha2/namespaces/consul/certificaterequests/test-cert-2088831866
spec:
  csr: <redacted>
  issuerRef:
    kind: ClusterIssuer
    name: vault-issuer
status:
  conditions:
  - lastTransitionTime: "2019-11-11T17:15:13Z"
    message: 'Required secret resource not found: secret "cert-manager-vault-approle"
      not found'
    reason: Pending
    status: "False"
    type: Ready
Here's the certificate definition:
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: test-cert
  namespace: consul
spec:
  secretName: test-cert
  issuerRef:
    name: vault-issuer
    kind: ClusterIssuer
  commonName: test.service.consul
  dnsNames:
  - test.default.svc.cluster.local
Is there a recent change to the RBAC rules that I need to apply to give the cluster-scoped issuer access to the secret in cert-manager?
At this point, the crashing isn't happening any more, so rather than hijack this issue, I'm going to open a separate one for what I'm running into.
Fixed by #2316