Cert-manager: Webhook seems not ready

Created on 9 Jul 2019 · 13 Comments · Source: jetstack/cert-manager

Describe the bug:
I'm trying to automate the installation of cert-manager with Terraform, but I get an error when deploying the first ClusterIssuer:

Error from server (InternalError): error when creating "cluster-issuer-prod.yaml": Internal error occurred: failed calling webhook "clusterissuers.admission.certmanager.k8s.io": the server is currently unable to handle the request

I'm not sure it's cert-manager related (maybe it's more Kubernetes related?).

So I repeated the same process with a small bash script to reproduce it without Terraform.
Here is the base script, followed by its output:

kubectl create ns cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
kubectl apply -f crds/
helm upgrade --install cert-manager --namespace cert-manager jetstack/cert-manager --wait -f values.yaml
kubectl create secret generic azuredns-config   --from-literal=CLIENT_SECRET="MYSECRET" -n cert-manager
kubectl get pod -n cert-manager
kubectl apply -f cluster-issuer-prod.yaml

# kubectl create ns cert-manager
namespace/cert-manager created
# kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
namespace/cert-manager labeled
# kubectl apply -f crds/
customresourcedefinition.apiextensions.k8s.io/certificates.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/challenges.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/issuers.certmanager.k8s.io created
customresourcedefinition.apiextensions.k8s.io/orders.certmanager.k8s.io created
# helm upgrade --install cert-manager --namespace cert-manager jetstack/cert-manager --wait -f values.yaml
Release "cert-manager" does not exist. Installing it now.
NAME:   cert-manager
LAST DEPLOYED: Mon Jul  8 12:55:29 2019
NAMESPACE: cert-manager
STATUS: DEPLOYED

RESOURCES:
==> v1/ClusterRole
NAME                                    AGE
cert-manager-edit                       42s
cert-manager-view                       42s
cert-manager-webhook:webhook-requester  42s

==> v1/Pod(related)
NAME                                     READY  STATUS   RESTARTS  AGE
cert-manager-6f4b9bdbf-4ckt8             1/1    Running  0         41s
cert-manager-cainjector-776644c94-2htht  1/1    Running  0         42s
cert-manager-webhook-58945c4ff7-9qfbh    1/1    Running  0         41s

==> v1/Service
NAME                  TYPE       CLUSTER-IP       EXTERNAL-IP  PORT(S)  AGE
cert-manager-webhook  ClusterIP  192.168.217.161  <none>       443/TCP  42s

==> v1/ServiceAccount
NAME                     SECRETS  AGE
cert-manager             1        42s
cert-manager-cainjector  1        42s
cert-manager-webhook     1        42s

==> v1alpha1/Certificate
NAME                              AGE
cert-manager-webhook-ca           41s
cert-manager-webhook-webhook-tls  41s

==> v1alpha1/Issuer
NAME                           AGE
cert-manager-webhook-ca        41s
cert-manager-webhook-selfsign  41s

==> v1beta1/APIService
NAME                                  AGE
v1beta1.admission.certmanager.k8s.io  41s

==> v1beta1/ClusterRole
NAME                     AGE
cert-manager             42s
cert-manager-cainjector  42s

==> v1beta1/ClusterRoleBinding
NAME                                 AGE
cert-manager                         42s
cert-manager-cainjector              42s
cert-manager-webhook:auth-delegator  42s

==> v1beta1/Deployment
NAME                     READY  UP-TO-DATE  AVAILABLE  AGE
cert-manager             1/1    1           1          42s
cert-manager-cainjector  1/1    1           1          42s
cert-manager-webhook     1/1    1           1          42s

==> v1beta1/RoleBinding
NAME                                                AGE
cert-manager-webhook:webhook-authentication-reader  42s

==> v1beta1/ValidatingWebhookConfiguration
NAME                  AGE
cert-manager-webhook  41s


NOTES:
cert-manager has been deployed successfully!

In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).

More information on the different types of issuers and how to configure them
can be found in our documentation:

https://docs.cert-manager.io/en/latest/reference/issuers.html

For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:

https://docs.cert-manager.io/en/latest/reference/ingress-shim.html
# kubectl create secret generic azuredns-config   --from-literal=CLIENT_SECRET="MYSECRET" -n cert-manager
secret/azuredns-config created
# kubectl get pod -n cert-manager
NAME                                      READY   STATUS    RESTARTS   AGE
cert-manager-6f4b9bdbf-4ckt8              1/1     Running   0          42s
cert-manager-cainjector-776644c94-2htht   1/1     Running   0          43s
cert-manager-webhook-58945c4ff7-9qfbh     1/1     Running   0          42s
# kubectl apply -f cluster-issuer-prod.yaml
Error from server (InternalError): error when creating "cluster-issuer-prod.yaml": Internal error occurred: failed calling webhook "clusterissuers.admission.certmanager.k8s.io": the server is currently unable to handle the request

Expected behaviour:
Getting no error.

Steps to reproduce the bug:
Deploy cert-manager and then, right after (in the same bash script), a ClusterIssuer (or an Issuer) resource.

Anything else we need to know?:

Environment details:

Kubernetes version: v1.14.0
Cloud-provider/provisioner: AKS
cert-manager version: v0.8.1
Install method: helm

/kind bug

kind/bug lifecycle/rotten

titilambert

👍7

All 13 comments

I have also run into this. The workaround I ended up with was this:

kubectl wait --for=condition=Available --timeout=300s \
    apiservice v1beta1.admission.certmanager.k8s.io

This command waits for the API service to be available before continuing. I add this just after installing cert-manager, before trying to create any resources depending on its functionality.

Unfortunately I don't think the helm --wait flag knows anything about API services so it cannot check this condition. The documentation only states that it waits for Pods, PVCs and Services.
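A minimal sketch of how this wait might be wrapped for reuse in an install script. The function name `wait_for_apiservice` and the `KUBECTL` override variable are hypothetical helpers introduced here for illustration; the APIService name is passed as an argument because it differs between cert-manager versions.

```shell
#!/usr/bin/env bash
# KUBECTL is an assumed override hook (handy for dry runs); defaults to kubectl on PATH.
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_apiservice NAME [TIMEOUT]
# Blocks until the APIService NAME reports condition Available,
# or fails once TIMEOUT (default 300s) has elapsed.
wait_for_apiservice() {
  local name="$1"
  local timeout="${2:-300s}"
  "$KUBECTL" wait --for=condition=Available --timeout="$timeout" apiservice "$name"
}

# Example, assuming the v0.8.x APIService name from this issue:
#   wait_for_apiservice v1beta1.admission.certmanager.k8s.io \
#     && kubectl apply -f cluster-issuer-prod.yaml
```

Chaining the apply with `&&` means the ClusterIssuer is only created if the wait succeeded, so a timeout surfaces as a script failure rather than as the webhook error.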

lentzi90 on 14 Aug 2019

👍1

I ran into this too and used @lentzi90's workaround with cert-manager 0.10, with a slight adjustment to the resource name:

kubectl wait --for=condition=Available --timeout=300s apiservice v1beta1.webhook.certmanager.k8s.io

Thanks!

wallrj on 17 Sep 2019

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale

retest-bot on 16 Dec 2019

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale

retest-bot on 15 Jan 2020

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

retest-bot on 14 Feb 2020

@retest-bot: Closing this issue.

jetstack-bot on 14 Feb 2020

With the same conditions (cert-manager 0.13.1) getting this error:

$ kubectl wait --for=condition=Available deployment/cert-manager-webhook -n cert-manager
deployment.extensions/cert-manager-webhook condition met

$ kubectl wait --for=condition=Ready pod/cert-manager-webhook-84954f5587-4k8jg -n cert-manager
pod/cert-manager-webhook-84954f5587-4k8jg condition met

$ kubectl apply -f /path/to/20-cluster-issuer-pr.yaml 
Error from server (InternalError): error when creating "/path/to/20-cluster-issuer-pr.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority

galamiram on 5 Mar 2020

👍4 😕2

I'm having the same error running 0.13.0 and installing via the helm chart. Did you find a resolution to this, @galamiram? It's very strange.
Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: x509: certificate signed by unknown authority

mccrackend on 2 Apr 2020

My workaround is to add a synthetic wait of a minute using a script that wraps these three commands...

galamiram on 2 Apr 2020

Thanks for the reply - so simply waiting a minute after deployment of the chart, before deploying issuers or certs, was enough for you to not get the x509 error? I've given it quite some time post-deploy and observed all the webhook tls secrets and pods come up & pass the wait for condition checks. I then apply my issuer and am still getting the x509 error.

mccrackend on 2 Apr 2020

It works for me, though I'm not sure of the specific reason why it fails. Maybe it's worth opening a new bug for this in particular.

galamiram on 2 Apr 2020

Actually, if you are upgrading cert-manager using helm, make sure that you remove the old secrets from the namespace where the previous cert-manager was installed. The secrets that cause the "unknown authority" error are mostly named like "cert-manager-webhook-ca" and "cert-manager-webhook-webhook-tls".
So, to upgrade to a newer cert-manager release such as v0.15.x, you need to remove these two secrets first.
After that, there are no issues when creating prod_issuer.
Thanks.
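
The cleanup described above could be sketched roughly as follows. The function name `delete_stale_webhook_secrets` and the `KUBECTL` override variable are hypothetical; the secret names are the two mentioned in this comment and may differ depending on the Helm release name, so check them with `kubectl get secrets` first.

```shell
#!/usr/bin/env bash
# KUBECTL is an assumed override hook; defaults to kubectl on PATH.
KUBECTL="${KUBECTL:-kubectl}"

# delete_stale_webhook_secrets [NAMESPACE]
# Removes the webhook CA and TLS secrets left behind by a previous install.
# --ignore-not-found keeps the command from failing on a fresh cluster.
delete_stale_webhook_secrets() {
  local ns="${1:-cert-manager}"
  "$KUBECTL" delete secret \
    cert-manager-webhook-ca \
    cert-manager-webhook-webhook-tls \
    --namespace "$ns" --ignore-not-found
}

# Example: run before `helm upgrade`, assuming the default namespace:
#   delete_stale_webhook_secrets cert-manager
```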

prabhatnagpal on 22 May 2020

Just ran into this on 1.0.0-beta.1. Very annoying. I'm also adding a sleep 60 after installing the cert-manager chart, before creating issuers.

This is for a fresh install. I've also tried removing and re-installing. When doing this, I've checked to make sure there aren't any secrets that need cleaning up and there aren't any.
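
As a possible alternative to a fixed sleep, the apply step could be retried until the webhook accepts it. This is only a sketch, not something confirmed in this thread: `retry` is a hypothetical helper, and the attempt count and delay in the example are arbitrary.

```shell
#!/usr/bin/env bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0, sleeping DELAY_SECONDS between attempts.
# Returns 1 if COMMAND still fails after MAX_ATTEMPTS tries.
retry() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Example (assumes kubectl is configured for the target cluster):
#   retry 12 5 kubectl apply -f cluster-issuer-prod.yaml
```

Unlike `sleep 60`, this returns as soon as the apply succeeds and fails loudly if the webhook never becomes usable. It would not help where the x509 error persists indefinitely, as reported earlier in the thread.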