Describe the bug:
Upon re-installing cert-manager and trying to verify the install, the admission API is failing with the following description:
kubectl describe APIService v1beta1.admission.certmanager.k8s.io
Name: v1beta1.admission.certmanager.k8s.io
Namespace:
Labels: app=webhook
chart=webhook-v0.6.4
heritage=Tiller
release=cert-manager
Annotations: <none>
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2019-03-01T10:08:13Z
Resource Version: 13956808
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.admission.certmanager.k8s.io
UID: ecc47923-3c09-11e9-bae6-6e4899a3d5f0
Spec:
Ca Bundle: LS0tLS1<removed for brevity>LS0tCg==
Group: admission.certmanager.k8s.io
Group Priority Minimum: 1000
Service:
Name: cert-manager-webhook
Namespace: cert-manager
Version: v1beta1
Version Priority: 15
Status:
Conditions:
Last Transition Time: 2019-03-01T10:08:13Z
Message: no response from https://10.0.233.160:443: Get https://10.0.233.160:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
This manifests when trying to apply the test-resources.yaml for verifying the install, with the following output:
kubectl apply -f test-resources.yaml
namespace "cert-manager-test" created
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "issuers.admission.certmanager.k8s.io": the server is currently unable to handle the request
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "certificates.admission.certmanager.k8s.io": the server is currently unable to handle the request
Expected behaviour:
Test Resources should be created successfully with no errors.
Steps to reproduce the bug:
Note: I have removed all other items from my cluster and, following the install of the CRDs, created the namespace, labelled the namespace, then tried the install via helm using the following commands:
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/v0.6.2/deploy/manifests/00-crds.yaml
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
helm install --name cert-manager --namespace cert-manager --version v0.6.6 stable/cert-manager
Anything else we need to know?:
I have previously installed cert-manager successfully on this cluster. I was then trying to get the nginx-ingress working but got into a bit of a mess. So I deleted all resources created (via helm), and tidied up any orphaned objects - so I could start from scratch again. However, I'm now running into this issue.
The only similar issue I've seen is this https://github.com/helm/charts/issues/10869. But I'm unsure what the resolution to this is.
All other objects appear to have been created and started successfully. I haven't been able to see any other error messages having gone through the logs for the different pods.
Environment details:
/kind bug
Going on gut feel alone (and wandering in the dark a little!) I reckon this is the crux of the issue:
no response from https://10.0.233.160:443: Get https://10.0.233.160:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
(Taken from the Status section of the "describe APIService v1beta1.admission.certmanager.k8s.io" command above)
"10.0.233.160" is the IP of the "cert-manager-webhook" Service. When I check the log of the underlying Pod attached to this service I can't see any errors as such, but do see this line...
Serving securely on [::]:6443
... which ties in closely with the mention of the firewall issue (that was blocking the master from communicating with the nodes on port 6443) in the other issue I referenced above. The question from me now is: how do I check that the Pod is not being blocked by such a firewall rule within AKS? Anyone got any hints here?
_Azure Support Ticket has been raised to query this._
Initial response from the Azure Support team confirms that there isn't any firewall blocking comms between the master and node. At this stage, we've used port forwarding via kubectl to try and connect with the cert-manager-webhook pod / service but we've not had any response, so early indication is that there seems to be something not working with the webhook service.
Is there a recommended way to check whether the webhook service specifically is running correctly? @munnerz
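One way to check the webhook directly, along the lines of the port forwarding we attempted, is to forward a local port to the Service and probe it over TLS. This is a sketch, not an official procedure; the local port is arbitrary, and the exact response body depends on the cert-manager version:

```shell
# Forward a local port to the webhook Service, then probe it.
# Any TLS response (even an HTTP error) proves the pod itself is serving;
# no response at all points at the pod, while a pod that answers here but
# fails the APIService check points at master-to-node networking instead.
kubectl -n cert-manager port-forward svc/cert-manager-webhook 8443:443 &
sleep 2
curl -kv --max-time 5 https://127.0.0.1:8443/
kill %1
```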
I'm hitting the same problem on a GKE private cluster. I've attempted to allow maximal access on port 6443, but I'm hitting the same issue (the test fails with "failed calling admission webhook") and I get the same error from kubectl describe APIService v1beta1.admission.certmanager.k8s.io:
Status:
Conditions:
Last Transition Time: 2019-03-07T20:29:58Z
Message: no response from https://10.149.2.15:6443: Get https://10.149.2.15:6443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
I've given up on getting the webhook to work for now, and am sticking with cert-manager-no-webhook.yaml, but I'd love a resolution to this issue.
@igor47 - I've actually parked cert-manager over this for the time being. One item that could be worth a test (based on some things I've run into on the journey) would be to clear out the routing rules in kube-proxy-xxxxx in case those are affecting access to the webhook.
In an unrelated issue, the easiest way I achieved this was simply deleting the kube-proxy-xxxxx pod (which gets automatically recreated by the kube proxy daemon) in the kube-system namespace. You could give that a whirl and see if a re-install of cert-manager then fixes it?
Just my two cents worth and purely based on trial and error on my part. Would love to see some input from the cert-manager team on this though!
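The kube-proxy restart suggested above can be sketched as follows. The label selector is the upstream default and may differ per distribution, and the pod name is a placeholder:

```shell
# Find the kube-proxy pod on the affected node, then delete it; the DaemonSet
# recreates it and it rebuilds its iptables/IPVS rules from scratch.
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
kubectl -n kube-system delete pod kube-proxy-xxxxx   # substitute the real pod name
```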
@woodwardmatt using cert-manager without the webhook actually works fine -- just don't submit invalid resources! Curious what you're using instead of cert-manager to get SSL certs on k8s. Did you go back to buying them and copying them into k8s secrets by hand?
I had the same issue, and I'm a bit afraid this is due to my network policies in place.
Here's my workflow to migrate from the setup with the webhook to the one without - NO WARRANTY!
kubectl get -o yaml \
--all-namespaces \
issuer,clusterissuer,certificates,orders,challenges > cert-manager-backup.yaml
# Delete old stuff ! - WATCH OUT YOU DELETE THE NAMESPACE AND ALL YOUR CUSTOM SECRETS E.G. FOR YOUR CLUSTER ISSUER
kubectl delete -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/cert-manager.yaml
kubectl create -f your-custom-secrets-in-the-cert-manager-namespace-e-g-aws-creds.yaml
# Now deploy new setup
curl https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/cert-manager-no-webhook.yaml > cert-manager-no-webhook.yml
kubectl apply -f cert-manager-no-webhook.yml
# Recreate missing things through backup
kubectl create -f cert-manager-backup.yaml
@igor47 - That's good to know! I'm at the Proof of Concept stage at the moment, so I've temporarily created some test SSL certificates and created corresponding secrets manually at this point. I'd like the automated renewals, but have side-lined this for the time being.
@tsupertramp thanks for sharing! I'll have to give 0.7 a whirl when I'm back on to SSL management :)
What's the impact of losing the webhook?
I am seeing this same issue with v0.7 as well, from installing via the manifests.
I'm also getting this issue, the webhook pod never comes up:
MountVolume.SetUp failed for volume "certs" : secrets "cert-manager-webhook-webhook-tls" not found
The secret doesn't exist.
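For the missing-secret variant, it can help to confirm whether the serving secret exists and what the supporting components logged. Resource names below are taken from the report above and from this thread's describe output (the webhook pods carry app=webhook); which component creates the TLS secret varies by cert-manager version, so this is a sketch:

```shell
# Does the webhook's TLS secret exist at all?
kubectl -n cert-manager get secret cert-manager-webhook-webhook-tls

# The mount failure shows up as an event on the webhook pod:
kubectl -n cert-manager describe pod -l app=webhook

# Check the cainjector, which participates in wiring up the webhook's CA material:
kubectl -n cert-manager logs deploy/cert-manager-cainjector
```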
We are also seeing the same issue.
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "issuers.admission.certmanager.k8s.io": the server is currently unable to handle the request
We are running kops on a private topology w/ Calico as the networking component.
Seeing this on kube-apiserver
E0416 14:53:10.189577 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Get https://100.71.249.58:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Environment details:
Kubernetes version (e.g. v1.10.2): v1.11.7
Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): kops AWS
cert-manager version (e.g. v0.4.0): 0.7.0
Install method (e.g. helm or static manifests): static manifests
Ditto, on GKE, freshly minted cluster.
We're hitting the same issue in GKE. Is there any staff follow-up to this issue?
What's the impact of losing the webhook?
From the docs:
Doing this may expose your cluster to mis-configuration problems that in some cases could cause cert-manager to stop working altogether (i.e. if invalid types are set for fields on cert-manager resources).
Anyone able to elaborate on what this means?
There's a note about this issue in cert-manager docs now: https://cert-manager.readthedocs.io/en/latest/getting-started/webhook.html#running-on-private-gke-clusters
Still doesn't quite explain how to fix it, but it's a start.
I had a ValidatingWebhookConfiguration that was not deleted; because of this there was an error.
I do not use the cert-manager webhook.
I'm also getting this on bare metal, and I'm scratching my head as to what to do about it. In case the details are useful:
I have a functioning set of pods:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cert-manager-68cfd787b6-h2bz6 1/1 Running 0 13h 10.42.2.115 node-int-worker-01 <none> <none>
cert-manager-cainjector-5975fd64c5-6gm98 1/1 Running 0 13h 10.42.2.114 node-int-worker-01 <none> <none>
cert-manager-webhook-5c7f95fd44-84cz4 1/1 Running 0 2m26s 10.42.2.117 node-int-worker-01 <none> <none>
But when I try to apply my Issuer yaml:
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
name: letsencrypt-staging
namespace: default
spec:
acme:
server: https://acme-staging-v02.api.letsencrypt.org/directory
email: [email protected]
# Name of a secret used to store the ACME account private key
privateKeySecretRef:
name: letsencrypt-staging
# ACME DNS-01 provider configurations
dns01:
# Here we define a list of DNS-01 providers that can solve DNS challenges
providers:
- name: cloudflare-dns
cloudflare:
email: [email protected]
# A secretKeyRef to a cloudflare api key
apiKeySecretRef:
name: cloudflare-api-key
key: api-key.txt
I get:
Error from server (InternalError): error when creating "/tmp/tmp7_0rbu2s/lets-encrypt-issuer.yaml": Internal error occurred: failed calling webhook "issuers.admission.certmanager.k8s.io":
the server is currently unable to handle the request
I'd love pointers on how to debug further. This is as far as I've gotten:
Somewhere along my google searching I came across "kubectl get apiservice" which let me see the following:
NAME SERVICE AVAILABLE AGE
v1. Local True 2d
v1.apps Local True 2d
v1.authentication.k8s.io Local True 2d
v1.authorization.k8s.io Local True 2d
v1.autoscaling Local True 2d
v1.batch Local True 2d
v1.crd.projectcalico.org Local True 2d
v1.monitoring.coreos.com Local True 14h
v1.networking.k8s.io Local True 2d
v1.rbac.authorization.k8s.io Local True 2d
v1.storage.k8s.io Local True 2d
v1alpha1.certmanager.k8s.io Local True 13h
v1beta1.admission.certmanager.k8s.io cert-manager/cert-manager-webhook False (FailedDiscoveryCheck) 13h
v1beta1.admissionregistration.k8s.io Local True 2d
v1beta1.apiextensions.k8s.io Local True 2d
v1beta1.apps Local True 2d
v1beta1.authentication.k8s.io Local True 2d
v1beta1.authorization.k8s.io Local True 2d
v1beta1.batch Local True 2d
v1beta1.certificates.k8s.io Local True 2d
v1beta1.coordination.k8s.io Local True 2d
v1beta1.events.k8s.io Local True 2d
v1beta1.extensions Local True 2d
v1beta1.metrics.k8s.io kube-system/metrics-server False (FailedDiscoveryCheck) 2d
v1beta1.policy Local True 2d
v1beta1.rbac.authorization.k8s.io Local True 2d
v1beta1.scheduling.k8s.io Local True 2d
v1beta1.storage.k8s.io Local True 2d
v1beta2.apps Local True 2d
v2beta1.autoscaling Local True 2d
v2beta2.autoscaling Local True 2d
v3.cluster.cattle.io Local True 2d
Notably, the "v1beta1.admission.certmanager.k8s.io" service seems to be failing its availability checks. Looking into it I see:
Name: v1beta1.admission.certmanager.k8s.io
Namespace:
Labels: app=webhook
chart=webhook-v0.8.1
heritage=Tiller
release=cert-manager
Annotations: certmanager.k8s.io/inject-ca-from: cert-manager/cert-manager-webhook-webhook-tls
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{"certmanager.k8s.io/inject-ca-from":"cert-ma...
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2019-06-22T07:10:56Z
Resource Version: 108892
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.admission.certmanager.k8s.io
UID: e10e43e3-94bc-11e9-a957-0244a03303e1
Spec:
Ca Bundle: long-ca-string-here
Group: admission.certmanager.k8s.io
Group Priority Minimum: 1000
Service:
Name: cert-manager-webhook
Namespace: cert-manager
Version: v1beta1
Version Priority: 15
Status:
Conditions:
Last Transition Time: 2019-06-22T07:10:56Z
Message: no response from https://10.43.216.179:443: Get https://10.43.216.179:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while
awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
The message of being unable to connect to https://10.43.216.179 looks suspicious, so I look into my services:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cert-manager-webhook ClusterIP 10.43.216.179 <none> 443/TCP 13h
And they seem fine? Describing the svc shows it using selectors that match the pod itself, so all of that seems to be running.
I'm not sure if it's a connection issue; I'm also unsure which node the API server's check originates from, or how to debug connectivity issues from there.
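One way to split the problem is to test the webhook Service from inside the cluster, using the netshoot image mentioned later in this thread (the Service DNS name below assumes the cert-manager namespace from the output above):

```shell
# Probe the webhook Service from a throwaway in-cluster pod; netshoot ships curl.
# If this answers but the APIService check still fails, the problem is the
# path from the API server (master) to the node, not the webhook itself.
kubectl run -it --rm netdebug --image=nicolaka/netshoot --restart=Never -- \
  curl -kv --max-time 5 https://cert-manager-webhook.cert-manager.svc:443/
```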
If it helps, this whole cluster is a bunch of VMs brought up by vagrant. The vagrantfile looks like this:
# -*- mode: ruby -*-
# vim: ft=ruby ts=2 sw=2 sts=2 noexpandtab
Vagrant.configure("2") do |config|
["int","ext"].each_with_index do |cluster, index_cluster|
["control", "etcd", "worker"].each_with_index do |role, index_role|
(1..1).each_with_index do |num, index_num|
box_name = "node-#{cluster}-#{role}-#{num.to_s.rjust(2,'0')}"
config.vm.define box_name do |box|
l_mac_address="0E000000#{index_cluster}#{index_role}#{index_num}1"
box.vm.box = "ubuntu/bionic64"
box.vm.hostname = box_name
box.disksize.size = '20GB'
box.vm.network "public_network",
use_dhcp_assigned_default_route: true,
bridge: "eno2",
mac: l_mac_address
box.vm.provider "virtualbox" do |vb|
vb.name = box_name
if "worker" == role then
vb.cpus = "8"
vb.memory = "8192"
else
vb.cpus = "2"
vb.memory = "4096"
end
end
box.vm.provision :shell, :path => "bootstrap.sh"
box.vm.provision "ansible" do |ansible|
ansible.playbook = "playbook.yml"
ansible.compatibility_mode = "2.0"
end
end
end
end
end
end
Apologies for the deluge of information, but I'm hoping someone else has run into this.
I'm also experiencing all of the issues listed in this thread.
Commands that I ran:
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
helm install --name cert-manager --namespace cert-manager --version v0.8.1 jetstack/cert-manager
Output from the kube-apiserver:
I0624 17:14:56.867048 1 controller.go:608] quota admission added evaluator for: certificates.certmanager.k8s.io
I0624 17:14:56.900181 1 controller.go:608] quota admission added evaluator for: issuers.certmanager.k8s.io
I0624 17:14:59.674043 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:00.493680 1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[X-Content-Type-Options:[nosniff] Content-Type:[text/plain; charset=utf-8]]
I0624 17:15:00.493691 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
I0624 17:15:06.565081 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:06.565268 1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0624 17:15:06.565291 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
I0624 17:15:10.182483 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:10.199673 1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: OpenAPI spec does not exists
I0624 17:15:10.199697 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
Output from k get apiservices.apiregistration.k8s.io shows the following:
NAME SERVICE AVAILABLE AGE
v1alpha1.certmanager.k8s.io Local True 9m
v1beta1.admission.certmanager.k8s.io cert-manager/cert-manager-webhook True 8m
v1beta1.certificates.k8s.io Local True 16m
This was performed on a brand new, fresh k8s cluster running on Ubuntu on bare metal using RKE to set up the cluster.
Kubernetes version: 1.13.5
Helm version: 2.13.0
cert-manager version: 0.8.1
We ended up having to punt on cert-manager for now because of this issue. We are going to deploy a self-signed cert for the Nginx ingress for now and reevaluate when cert-manager resolves these issues.
I ended up fixing my issue. Basically vagrant was creating a default route for the vagrant enp0s3, and I had to remove it. In doing so I was able to get all this working, and it fixed some other issues with my rke cluster.
The relevant ansible playbook tasks were:
handlers:
- name: restart netplan
become: yes
shell: netplan apply
tasks:
- name: Lower gateway metric
copy:
src: 50-vagrant.yaml
dest: /etc/netplan/
notify:
- restart netplan
- name: Look for dhcp4-overrides
become: yes
shell: "grep \"^ dhcp4-overrides: { use-routes: false }$\" /etc/netplan/50-cloud-init.yaml || true"
register: test_dhcp4_overrides
- name: Override default gateway
become: yes
lineinfile:
dest: /etc/netplan/50-cloud-init.yaml
line: " dhcp4-overrides: { use-routes: false }"
when: test_dhcp4_overrides.stdout == ""
notify:
- restart netplan
And 50-vagrant.yaml is a file I created that looks like:
network:
version: 2
renderer: networkd
ethernets:
enp0s8:
dhcp4: true
dhcp4-overrides:
route-metric: 50
This was for an Ubuntu bionic box. In short, I lowered the route metric on the bridged interface (enp0s8) via 50-vagrant.yaml, and disabled the DHCP-provided routes on the NAT interface in 50-cloud-init.yaml.
My suggestion to anyone experiencing this issue is to see what the default route on your nodes is (route -n, look for 'default'). Make sure it's the right gateway/device and adjust accordingly - my solution should only work for people similarly afflicted and using netplan.
Good luck!
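The default-route check above can be done like this (interface names are from the vagrant setup in this thread and will differ elsewhere):

```shell
# Show which gateway/interface owns the default route; on a vagrant box the
# NAT interface (enp0s3 here) often wins over the bridged NIC (enp0s8),
# which misroutes traffic the cluster expects on the bridged network.
ip route show default
route -n | awk '$1 == "0.0.0.0"'
```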
For GKE private cluster here's a simple solution
I am facing the same exact issue with version 0.9.1 as well. Any update on this issue?
I came here because I got mail with ACTION REQUIRED because Lets Encrypt does only support 0.8.0 cert-manager instances (current jetstack/cert-manager version) and onwards in a couple of weeks. Air is getting thin. Experiencing the same issues on 0.9.1.
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.9/deploy/manifests/00-crds.yaml
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation="true"
helm upgrade -i \
--namespace cert-manager \
--set ingressShim.defaultIssuerName=letsencrypt \
--set ingressShim.defaultIssuerKind=ClusterIssuer \
--set webhook.enabled=false \
cert-manager \
jetstack/cert-manager
That just doesn't work. I thought that was a stable release.
Guys, the biggest problem is that you are installing it with the webhook disabled (webhook.enabled=false). If you do that you cannot use ClusterIssuers because the kube APIService is not there.
So what I did is
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.9/deploy/manifests/00-crds.yaml
kubectl create ns cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation="true"
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v0.9.0 --set ingressShim.defaultIssuerName=letsencrypt --set ingressShim.defaultIssuerKind=ClusterIssuer
After that, creating a ClusterIssuer works and I can see that certificates are created automatically.
For debugging the network part, I used https://github.com/nicolaka/netshoot and everything seems fine in that area. I'm 1 hop away from that host and the port is open, tried also from net=host and I got the same results - IP is accessible and port is open.
I think the Service needs to be exposed (i.e. given an external IP). Still doing some investigations.
LE: according to what @nrobert13 suggested on issue helm/charts#15809, I just added a firewall rule in GKE opening 6443, and the webhook passes validation now.
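For GKE private clusters, that firewall rule amounts to letting the master CIDR reach the nodes on 6443. A hedged sketch; the cluster name, region, network, CIDR, and node tag below are all placeholders you must replace with your cluster's values:

```shell
# Look up the master's CIDR block for your private cluster:
gcloud container clusters describe my-cluster --region my-region \
  --format 'value(privateClusterConfig.masterIpv4CidrBlock)'

# Allow that range to reach the nodes on the webhook's serving port:
gcloud compute firewall-rules create allow-master-to-cert-manager-webhook \
  --network my-network \
  --direction INGRESS \
  --source-ranges 172.16.0.0/28 \
  --target-tags my-node-tag \
  --allow tcp:6443
```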
I needed to update due to the ACTION REQUIRED mails, as @AmazingTurtle mentioned. The only way I could make it work was by disabling webhooks. I have no idea what the problem could be. I am using Istio, so I thought it might be that, but I haven't figured out why, if that is the case.
For me it worked to uninstall and reinstall the cert-manager helm chart many times, and then all of a sudden it worked. This is mysterious.
We were having the same issue using flux helm operator. To share some insights from the past, upgrading helm charts throughout "major" updates/version bumps never really worked. Usually we just delete (--purge) the release before doing this kind of bigger leaps.
So apparently one of our clusters got rid of v1beta1.admission.certmanager.k8s.io apiservice by itself with deletion of the helm release. The other ones got stuck with the aforementioned "failed calling admission webhook".
Coming from v.0.6.X it seems that v0.10 now has v1beta1.webhook.certmanager.k8s.io instead of v1beta1.admission.certmanager.k8s.io.
TL;DR: Just tried deleting the helm release multiple times, but the "old" apiservice didn't get removed. So I went on cleaning up with kubectl delete apiservice v1beta1.admission.certmanager.k8s.io. Everything's gucci.
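The stale-APIService cleanup above can be sketched as two commands; only delete an APIService you are sure is left over from an old release, since removing a live one breaks that API group:

```shell
# List APIServices that are failing discovery (AVAILABLE shows False):
kubectl get apiservice | grep False

# Delete the stale one the old chart left behind:
kubectl delete apiservice v1beta1.admission.certmanager.k8s.io
```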
It also worked for me when I allowed port 6443 in a firewall rule for my private GKE cluster.
The way to troubleshoot is: install the custom resources, then install cert-manager following all the standard steps,
then check for v1beta1.admission.certmanager.k8s.io cert-manager/cert-manager-webhook False (FailedDiscoveryCheck) using kubectl get apiservice,
then describe it to find out the blocked port.
But can anyone tell me how good is to expose port 6443 on a private GKE cluster :-)
But can anyone tell me how good is to expose port 6443 on a private GKE cluster :-)
it's fine because you're just opening it between master <-> nodes
After running this command it worked, so it looks like an access role issue: kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
Please use this command for test purposes, as it grants anyone access to perform any action on the cluster. THIS IS NOT A FIX
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
i'm not a k8s expert, but this command looks like it grants cluster admin permissions to the system anonymous account, which doesn't sound like a good idea to me. @sasiedu maybe add a disclaimer to your post?
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
i'm not a k8s expert, but this command looks like it grants cluster admin permissions to the system anonymous account, which doesn't sound like a good idea to me. @sasiedu maybe add a disclaimer to your post?
@igor47 Sorry, I didn't add a disclaimer. I was suggesting what might be the problem; I didn't ask anyone to use the command. I haven't figured out the actual permissions needed. Thanks for alerting me.
I see this too on a virgin Kops cluster (v1.13.0). It fixes itself after several minutes but makes automating the creation of a cluster error-prone.
this seems similar to #2109
I'm going to close this as the original issue was created quite a while ago, and lots has changed since then. If there are more specific issues that can be extracted from here, please open a new issue to track that and we can attempt to resolve 😄
Not sure why this is closed; it seems to be an expired certificate issue with the webhook:
ERROR: logging before flag.Parse: I0922 02:36:19.506819 1 request.go:874] Response Body: {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"extension-apiserver-authentication","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication"
and the API log shows:
E0922 02:29:33.714261 1 controller.go:114] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: Error trying to reach service: 'x509: certificate has expired or is not yet valid', Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
@igoratencompass certmanager.k8s.io has been deprecated since version v0.10, which is 7 versions ago. I recommend upgrading your installation.