Cert-manager: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://rancher-cert-cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "cert-manager-webhook-ca")

Created on 1 Oct 2020 · 19 comments · Source: jetstack/cert-manager

Describe the bug:

I am using cert-manager to generate the certificate for Rancher, and I am deploying both with their Helm charts (cert-manager version 0.16.1 and Rancher version 2.4.8). Cert-manager deploys successfully, but while deploying Rancher I am facing issues generating the certificate.

Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://rancher-cert-cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "cert-manager-webhook-ca")

Expected Behavior:
It should deploy successfully.

Environment:
Kubernetes : "v1.15.11-eks-af3caf"
kubectl : v1.18.6
Install method: helm + kustomize (using Argo-CD)

There are no changes to the default values of either chart.

All 19 comments

Is this a fresh install of cert-manager?

/triage support

Hi @meyskens
Yes, it is a fresh installation of cert-manager.

Have you taken a look at https://cert-manager.io/docs/concepts/webhook/#known-problems-and-solutions?

Hi @meyskens
First, I am not using any custom CNI. Second, the URL you provided is for cert-manager version 1.x, but I am using cert-manager 0.16.1.
I checked the v0.16 docs for compatibility issues but didn't find anything regarding EKS clusters.
Also, all ports are allowed within the cluster. I tried changing "webhook.securePort" to a different port, but still no luck.

The documentation for that is only present in v1.0.0, as the workaround wasn't in 0.16.
Have you looked into the caBundle part, since you are getting a certificate validation error?
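
A rough way to verify the injection (just a sketch; the webhook configuration and secret names below are assumed from the release name used in this thread, and the ca.crt key name may differ by version):

# CA bundle the API server uses when calling the webhook (empty output means nothing was injected)
kubectl get mutatingwebhookconfiguration rancher-cert-cert-manager-webhook \
  -o jsonpath='{.webhooks[0].clientConfig.caBundle}' | head -c 64; echo

# CA stored in the webhook's serving-certificate secret; the two base64 values should normally match
kubectl get secret rancher-cert-cert-manager-webhook-ca -n cert-manager \
  -o jsonpath='{.data.ca\.crt}' | head -c 64; echo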

I checked the CA bundle on my side and everything seems fine. Could you please suggest everything that needs to be checked? I don't want to miss anything.
Also, while deploying cert-manager I got some warnings:

CustomResourceDefinition/certificaterequests.cert-manager.io is part of a different application: cert-manager
CustomResourceDefinition/certificates.cert-manager.io is part of a different application: cert-manager
CustomResourceDefinition/challenges.acme.cert-manager.io is part of a different application: cert-manager
CustomResourceDefinition/clusterissuers.cert-manager.io is part of a different application: cert-manager
CustomResourceDefinition/issuers.cert-manager.io is part of a different application: cert-manager
CustomResourceDefinition/orders.acme.cert-manager.io is part of a different application: cert-manager

Hi @meyskens
This issue got resolved for me, so I am closing it.

I get pissed off by people who say the issue is resolved without explaining what was done to solve it.

Hi @alexsorkin
Apologies for not mentioning the reason. Let me explain the scenario.
Since cert-manager was not working for me initially, I tried deploying it in a different namespace (rancher-certs). It didn't work there either, so I removed it from that namespace, but unfortunately the ValidatingWebhookConfiguration and MutatingWebhookConfiguration were not removed at that time.
I then deployed cert-manager again in the cert-manager namespace, and it also created both of those objects.
That is what caused the issue.

I debugged the whole process and found the duplicate objects. Once I deleted the old ones, the issue went away (see the cleanup sketch below).
I also installed the required CRDs with Helm afterwards (by setting installCRDs: true) rather than installing them from plain YAMLs.
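
For anyone hitting the same leftover-objects problem, a minimal cleanup sketch (the names below are examples; webhook configurations are cluster-scoped, so they are not removed when you delete the namespace):

# List every webhook configuration that belongs to cert-manager
kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations | grep cert-manager

# Delete the stale ones left over from the removed release
kubectl delete validatingwebhookconfiguration <old-release>-cert-manager-webhook
kubectl delete mutatingwebhookconfiguration <old-release>-cert-manager-webhook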

But now I am facing a new issue:

Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://rancher-cert-cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority

I re-checked everything but still haven't found a solution. Please let me know if you are familiar with this issue.

Hi, the problem eventually heals itself... You should simply wait ~20 seconds after deploying cert-manager before creating the Issuer, to let the cainjector inject the CA certificate into the webhook configuration.
It's not really a bug, but this behaviour should be documented in the cert-manager "Getting Started" documentation.
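
If you would rather script the wait than sleep for a fixed time, a rough sketch (assuming the resource names from this thread):

# Wait for the webhook deployment to become Available
kubectl wait --for=condition=Available deployment/rancher-cert-cert-manager-webhook \
  -n cert-manager --timeout=120s

# Then poll until the cainjector has populated the caBundle on the webhook configuration
until kubectl get mutatingwebhookconfiguration rancher-cert-cert-manager-webhook \
    -o jsonpath='{.webhooks[0].clientConfig.caBundle}' | grep -q .; do
  sleep 2
done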

Hi @alexsorkin
I tried this as well. I waited for 10 minutes after the cert-manager deployment, but I still hit the same issue.

Is the cainjector running? See #3338 (comment)

Hi @meyskens
The cainjector is also running.
Everything is healthy and in the Running state.

[ankit@rancher ~]$ kubectl get all -n cert-manager
NAME                                                        READY   STATUS    RESTARTS   AGE
pod/rancher-cert-cert-manager-5786f46d5b-lsm6s              1/1     Running   0          12h
pod/rancher-cert-cert-manager-cainjector-6894f9cbcf-jqg46   1/1     Running   0          12h
pod/rancher-cert-cert-manager-webhook-86df8cf76-z8f22       1/1     Running   0          12h

NAME                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/rancher-cert-cert-manager           ClusterIP   10.100.4.67     <none>        9402/TCP   2d4h
service/rancher-cert-cert-manager-webhook   ClusterIP   10.100.85.152   <none>        443/TCP    2d4h

NAME                                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/rancher-cert-cert-manager              1/1     1            1           2d4h
deployment.apps/rancher-cert-cert-manager-cainjector   1/1     1            1           2d4h
deployment.apps/rancher-cert-cert-manager-webhook      1/1     1            1           2d4h

NAME                                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/rancher-cert-cert-manager-5786f46d5b              1         1         1       2d4h
replicaset.apps/rancher-cert-cert-manager-cainjector-6894f9cbcf   1         1         1       2d4h
replicaset.apps/rancher-cert-cert-manager-webhook-86df8cf76       1         1         1       2d4h
[ankit@rancher ~]$ kubectl get secret -n cert-manager
NAME                                               TYPE                                  DATA   AGE
default-token-9v8sh                                kubernetes.io/service-account-token   3      2d4h
istio.default                                      istio.io/key-and-cert                 3      2d4h
rancher-cert-cert-manager-cainjector-token-5tgj5   kubernetes.io/service-account-token   3      2d4h
rancher-cert-cert-manager-token-48bd2              kubernetes.io/service-account-token   3      2d4h
rancher-cert-cert-manager-webhook-ca               Opaque                                3      2d4h
rancher-cert-cert-manager-webhook-token-hbw8c      kubernetes.io/service-account-token   3      2d4h

Hi @meyskens
I debugged a bit more, monitored the cainjector pod logs, and found the issue below.

E1014 10:27:24.861262       1 leaderelection.go:320] error retrieving resource lock kube-system/cert-manager-cainjector-leader-election-core: configmaps "cert-manager-cainjector-leader-election-core" is forbidden: User "system:serviceaccount:cert-manager:rancher-cert-cert-manager-cainjector" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
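
A quick way to confirm this is an RBAC gap, using the service account from the log line above (the ClusterRole name here is assumed from the release name):

# Should print "yes" once the cainjector's role covers the leader-election configmap
kubectl auth can-i get configmaps -n kube-system \
  --as=system:serviceaccount:cert-manager:rancher-cert-cert-manager-cainjector

# Inspect the rules actually granted to the cainjector
kubectl describe clusterrole rancher-cert-cert-manager-cainjector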

It seems to be Rancher-specific, possibly a missing PSP role.

Hi @meyskens
The issue is resolved now. The problem was with the ClusterRole that cert-manager creates for the cainjector service account (cert-manager:rancher-cert-cert-manager-cainjector). That service account also needs access to configmaps, but the configmaps rule is missing from the ClusterRole definition in the cert-manager Helm chart.
I edited the ClusterRole manually and the issue went away.
This needs to be fixed in the Helm chart itself.
Thanks for your help and support :)

I'm also seeing this using v1.0.3. I'm not familiar with the codebase, but do we need to patch clusterrole.rbac.authorization.k8s.io/v1/cert-manager-cainjector to include access to configmaps?

I just ran into this issue, and I patched the ClusterRole with the following rule to access configmaps:

  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
      - create
      - update

Note that if you're using Kustomize, you'll have to provide all the required permissions because the patch will replace the rules in the ClusterRole, instead of appending the new rule to the list. Here's a gist of the patch I'm using with Kustomize.
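
If you only need to fix the live object, an alternative that appends rather than replaces is a JSON patch (a sketch; the role name assumes a release where the ClusterRole is named cert-manager-cainjector, as in the comment above):

# "add" with path /rules/- appends a new rule to the existing list instead of replacing it
kubectl patch clusterrole cert-manager-cainjector --type=json -p='[
  {"op": "add", "path": "/rules/-", "value": {
    "apiGroups": [""],
    "resources": ["configmaps"],
    "verbs": ["get", "create", "update"]
  }}
]'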

Hi @johanbrandhorst
Yes, it is necessary to patch the ClusterRole, because the cainjector needs access to configmaps. To make it work smoothly, you have to add the configmaps rule to the ClusterRole.
