Describe the bug:
Upon re-installing cert-manager and trying to verify the install, the admission API is failing with the following description:
kubectl describe APIService v1beta1.admission.certmanager.k8s.io
Name: v1beta1.admission.certmanager.k8s.io
Namespace:
Labels: app=webhook
chart=webhook-v0.6.4
heritage=Tiller
release=cert-manager
Annotations: <none>
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2019-03-01T10:08:13Z
Resource Version: 13956808
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.admission.certmanager.k8s.io
UID: ecc47923-3c09-11e9-bae6-6e4899a3d5f0
Spec:
Ca Bundle: LS0tLS1<removed for brevity>LS0tCg==
Group: admission.certmanager.k8s.io
Group Priority Minimum: 1000
Service:
Name: cert-manager-webhook
Namespace: cert-manager
Version: v1beta1
Version Priority: 15
Status:
Conditions:
Last Transition Time: 2019-03-01T10:08:13Z
Message: no response from https://10.0.233.160:443: Get https://10.0.233.160:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
This manifests when trying to apply the test-resources.yaml for verifying the install, with the following output:
kubectl apply -f test-resources.yaml
namespace "cert-manager-test" created
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "issuers.admission.certmanager.k8s.io": the server is currently unable to handle the request
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "certificates.admission.certmanager.k8s.io": the server is currently unable to handle the request
Expected behaviour:
Test Resources should be created successfully with no errors.
Steps to reproduce the bug:
Note: I have removed all other items from my cluster and, following the install of the CRDs, created the namespace, labelled the namespace, then tried the install via helm using the following commands:
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/v0.6.2/deploy/manifests/00-crds.yaml
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
helm install --name cert-manager --namespace cert-manager --version v0.6.6 stable/cert-manager
Anything else we need to know?:
I have previously installed cert-manager successfully on this cluster. I was then trying to get the nginx-ingress working but got into a bit of a mess. So I deleted all resources created (via helm), and tidied up any orphaned objects - so I could start from scratch again. However, I'm now running into this issue.
The only similar issue I've seen is this https://github.com/helm/charts/issues/10869. But I'm unsure what the resolution to this is.
All other objects appear to have been created and started successfully. I haven't been able to see any other error messages having gone through the logs for the different pods.
Environment details:
/kind bug
Going on gut feel alone (and wandering in the dark a little!) I reckon this is the crux of the issue:
no response from https://10.0.233.160:443: Get https://10.0.233.160:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
(Taken from the Status section of the "describe APIService v1beta1.admission.certmanager.k8s.io" command above)
"10.0.233.160" is the IP of the "cert-manager-webhook" Service. When I check the log of the underlying Pod attached to this service I can't see any errors as such, but do see this line...
Serving securely on [::]:6443
... which ties in closely with the mention of the firewall issue (that was blocking the master from communicating with the nodes on port 6443) in the other issue I referenced above. The question from me now is: how do I check that the Pod is not being blocked by such a firewall rule within AKS? Anyone got any hints here?
_Azure Support Ticket has been raised to query this._
Initial response from the Azure Support team confirms that there isn't any firewall blocking comms between the master and node. At this stage, we've used port forwarding via kubectl to try and connect with the cert-manager-webhook pod / service but we've not had any response, so early indication is that there seems to be something not working with the webhook service.
Is there a recommended way to check whether the webhook service specifically is running correctly? @munnerz
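One way to check the webhook directly, along the lines of the port forwarding we attempted, is to forward a local port to the Service and probe it over TLS. This is a sketch, not an official procedure; the local port is arbitrary, and the exact response body depends on the cert-manager version:

```shell
# Forward a local port to the webhook Service, then probe it.
# Any TLS response (even an HTTP error) proves the pod itself is serving;
# no response at all points at the pod, while a pod that answers here but
# fails the APIService check points at master-to-node networking instead.
kubectl -n cert-manager port-forward svc/cert-manager-webhook 8443:443 &
sleep 2
curl -kv --max-time 5 https://127.0.0.1:8443/
kill %1
```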
I'm hitting the same problem on a GKE private cluster. I've attempted to allow maximal access on port 6443, but I'm hitting the same issue (the test fails with "failed calling admission webhook") and I get the same error from kubectl describe APIService v1beta1.admission.certmanager.k8s.io:
Status:
Conditions:
Last Transition Time: 2019-03-07T20:29:58Z
Message: no response from https://10.149.2.15:6443: Get https://10.149.2.15:6443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
I've given up on getting the webhook to work for now, and am sticking with cert-manager-no-webhook.yaml, but I'd love a resolution to this issue.
@igor47 - I've actually parked cert-manager over this for the time being. One item that could be worth a test (based on some things I've run into on the journey) would be to clear out the routing rules in kube-proxy-xxxxx in case those are affecting access to the webhook.
In an unrelated issue, the easiest way I achieved this was simply deleting the kube-proxy-xxxxx pod (which gets automatically recreated by the kube proxy daemon) in the kube-system namespace. You could give that a whirl and see if a re-install of cert-manager then fixes it?
Just my two cents worth and purely based on trial and error on my part. Would love to see some input from the cert-manager team on this though!
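The kube-proxy restart suggested above can be sketched as follows. The label selector is the upstream default and may differ per distribution, and the pod name is a placeholder:

```shell
# Find the kube-proxy pod on the affected node, then delete it; the DaemonSet
# recreates it and it rebuilds its iptables/IPVS rules from scratch.
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
kubectl -n kube-system delete pod kube-proxy-xxxxx   # substitute the real pod name
```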
@woodwardmatt using cert-manager without the webhook actually works fine -- just don't submit invalid resources! Curious what you're using instead of cert-manager to get SSL certs on k8s. Did you go back to buying them and copying them into k8s secrets by hand?
I had the same issue, and I'm a bit afraid this is due to my network policies in place.
Here's my workflow to migrate from the setup with the webhook to the one without - NO WARRANTY!
kubectl get -o yaml \
--all-namespaces \
issuer,clusterissuer,certificates,orders,challenges > cert-manager-backup.yaml
# Delete old stuff ! - WATCH OUT YOU DELETE THE NAMESPACE AND ALL YOUR CUSTOM SECRETS E.G. FOR YOUR CLUSTER ISSUER
kubectl delete -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/cert-manager.yaml
kubectl create -f your-custom-secrets-in-the-cert-manager-namespace-e-g-aws-creds.yaml
# Now deploy new setup
curl https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/cert-manager-no-webhook.yaml > cert-manager-no-webhook.yml
kubectl apply -f cert-manager-no-webhook.yml
# Recreate missing things through backup
kubectl create -f cert-manager-backup.yaml
@igor47 - That's good to know! I'm at the Proof of Concept stage at the moment, so I've temporarily created some test SSL certificates and created corresponding secrets manually at this point. I'd like the automated renewals, but have side-lined this for the time being.
@tsupertramp thanks for sharing! I'll have to give 0.7 a whirl when I'm back on to SSL management :)
What's the impact of losing the webhook?
I am seeing this same issue with v0.7 as well, from installing via the manifests.
I'm also getting this issue, the webhook pod never comes up:
MountVolume.SetUp failed for volume "certs" : secrets "cert-manager-webhook-webhook-tls" not found
The secret doesn't exist.
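For the missing-secret variant, it can help to confirm whether the serving secret exists and what the supporting components logged. Resource names below are taken from the report above and from this thread's describe output (the webhook pods carry app=webhook); which component creates the TLS secret varies by cert-manager version, so this is a sketch:

```shell
# Does the webhook's TLS secret exist at all?
kubectl -n cert-manager get secret cert-manager-webhook-webhook-tls

# The mount failure shows up as an event on the webhook pod:
kubectl -n cert-manager describe pod -l app=webhook

# Check the cainjector, which participates in wiring up the webhook's CA material:
kubectl -n cert-manager logs deploy/cert-manager-cainjector
```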
We are also seeing the same issue.
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling admission webhook "issuers.admission.certmanager.k8s.io": the server is currently unable to handle the request
We are running kops on a private topology w/ Calico as the networking component.
Seeing this on kube-apiserver
E0416 14:53:10.189577 1 available_controller.go:311] v1beta1.admission.certmanager.k8s.io failed with: Get https://100.71.249.58:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Environment details:
Kubernetes version (e.g. v1.10.2): v1.11.7
Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): kops AWS
cert-manager version (e.g. v0.4.0): 0.7.0
Install method (e.g. helm or static manifests): static manifests
Ditto, on GKE, freshly minted cluster.
We're hitting the same issue in GKE. Is there any staff follow-up to this issue?
What's the impact of losing the webhook?
From the docs:
Doing this may expose your cluster to mis-configuration problems that in some cases could cause cert-manager to stop working altogether (i.e. if invalid types are set for fields on cert-manager resources).
Anyone able to elaborate on what this means?
There's a note about this issue in cert-manager docs now: https://cert-manager.readthedocs.io/en/latest/getting-started/webhook.html#running-on-private-gke-clusters
Still doesn't quite explain how to fix it, but it's a start.
I had a ValidatingWebhookConfiguration that was not deleted; because of this there was an error.
I do not use the cert-manager webhook.
I'm also getting this on bare metal, and I'm scratching my head as to what to do about it. In case the details are useful:
I have a functioning set of pods:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cert-manager-68cfd787b6-h2bz6 1/1 Running 0 13h 10.42.2.115 node-int-worker-01 <none> <none>
cert-manager-cainjector-5975fd64c5-6gm98 1/1 Running 0 13h 10.42.2.114 node-int-worker-01 <none> <none>
cert-manager-webhook-5c7f95fd44-84cz4 1/1 Running 0 2m26s 10.42.2.117 node-int-worker-01 <none> <none>
But when I try to apply my Issuer yaml:
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
name: letsencrypt-staging
namespace: default
spec:
acme:
server: https://acme-staging-v02.api.letsencrypt.org/directory
email: [email protected]
# Name of a secret used to store the ACME account private key
privateKeySecretRef:
name: letsencrypt-staging
# ACME DNS-01 provider configurations
dns01:
# Here we define a list of DNS-01 providers that can solve DNS challenges
providers:
- name: cloudflare-dns
cloudflare:
email: [email protected]
# A secretKeyRef to a cloudflare api key
apiKeySecretRef:
name: cloudflare-api-key
key: api-key.txt
I get:
Error from server (InternalError): error when creating "/tmp/tmp7_0rbu2s/lets-encrypt-issuer.yaml": Internal error occurred: failed calling webhook "issuers.admission.certmanager.k8s.io":
the server is currently unable to handle the request
I'd love pointers on how to debug further. This is as far as I've gotten:
Somewhere along my google searching I came across "kubectl get apiservice" which let me see the following:
NAME SERVICE AVAILABLE AGE
v1. Local True 2d
v1.apps Local True 2d
v1.authentication.k8s.io Local True 2d
v1.authorization.k8s.io Local True 2d
v1.autoscaling Local True 2d
v1.batch Local True 2d
v1.crd.projectcalico.org Local True 2d
v1.monitoring.coreos.com Local True 14h
v1.networking.k8s.io Local True 2d
v1.rbac.authorization.k8s.io Local True 2d
v1.storage.k8s.io Local True 2d
v1alpha1.certmanager.k8s.io Local True 13h
v1beta1.admission.certmanager.k8s.io cert-manager/cert-manager-webhook False (FailedDiscoveryCheck) 13h
v1beta1.admissionregistration.k8s.io Local True 2d
v1beta1.apiextensions.k8s.io Local True 2d
v1beta1.apps Local True 2d
v1beta1.authentication.k8s.io Local True 2d
v1beta1.authorization.k8s.io Local True 2d
v1beta1.batch Local True 2d
v1beta1.certificates.k8s.io Local True 2d
v1beta1.coordination.k8s.io Local True 2d
v1beta1.events.k8s.io Local True 2d
v1beta1.extensions Local True 2d
v1beta1.metrics.k8s.io kube-system/metrics-server False (FailedDiscoveryCheck) 2d
v1beta1.policy Local True 2d
v1beta1.rbac.authorization.k8s.io Local True 2d
v1beta1.scheduling.k8s.io Local True 2d
v1beta1.storage.k8s.io Local True 2d
v1beta2.apps Local True 2d
v2beta1.autoscaling Local True 2d
v2beta2.autoscaling Local True 2d
v3.cluster.cattle.io Local True 2d
Notably, the "v1beta1.admission.certmanager.k8s.io" service seems to be failing its availability checks. Looking into it I see:
Name: v1beta1.admission.certmanager.k8s.io
Namespace:
Labels: app=webhook
chart=webhook-v0.8.1
heritage=Tiller
release=cert-manager
Annotations: certmanager.k8s.io/inject-ca-from: cert-manager/cert-manager-webhook-webhook-tls
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{"certmanager.k8s.io/inject-ca-from":"cert-ma...
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2019-06-22T07:10:56Z
Resource Version: 108892
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.admission.certmanager.k8s.io
UID: e10e43e3-94bc-11e9-a957-0244a03303e1
Spec:
Ca Bundle: long-ca-string-here
Group: admission.certmanager.k8s.io
Group Priority Minimum: 1000
Service:
Name: cert-manager-webhook
Namespace: cert-manager
Version: v1beta1
Version Priority: 15
Status:
Conditions:
Last Transition Time: 2019-06-22T07:10:56Z
Message: no response from https://10.43.216.179:443: Get https://10.43.216.179:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while
awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
The message of being unable to connect to https://10.43.216.179 looks suspicious, so I look into my services:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cert-manager-webhook ClusterIP 10.43.216.179 <none> 443/TCP 13h
And they seem fine? Describing the svc shows it using selectors that match the pod itself, so all of that seems to be running.
I'm not sure if it's a connection issue; I'm also unsure which node the API server's check originates from, or how to debug connectivity issues from there.
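One way to split the problem is to test the webhook Service from inside the cluster, using the netshoot image mentioned later in this thread (the Service DNS name below assumes the cert-manager namespace from the output above):

```shell
# Probe the webhook Service from a throwaway in-cluster pod; netshoot ships curl.
# If this answers but the APIService check still fails, the problem is the
# path from the API server (master) to the node, not the webhook itself.
kubectl run -it --rm netdebug --image=nicolaka/netshoot --restart=Never -- \
  curl -kv --max-time 5 https://cert-manager-webhook.cert-manager.svc:443/
```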
If it helps, this whole cluster is a bunch of VMs brought up by vagrant. The vagrantfile looks like this:
# -*- mode: ruby -*-
# vim: ft=ruby ts=2 sw=2 sts=2 noexpandtab
Vagrant.configure("2") do |config|
["int","ext"].each_with_index do |cluster, index_cluster|
["control", "etcd", "worker"].each_with_index do |role, index_role|
(1..1).each_with_index do |num, index_num|
box_name = "node-#{cluster}-#{role}-#{num.to_s.rjust(2,'0')}"
config.vm.define box_name do |box|
l_mac_address="0E000000#{index_cluster}#{index_role}#{index_num}1"
box.vm.box = "ubuntu/bionic64"
box.vm.hostname = box_name
box.disksize.size = '20GB'
box.vm.network "public_network",
use_dhcp_assigned_default_route: true,
bridge: "eno2",
mac: l_mac_address
box.vm.provider "virtualbox" do |vb|
vb.name = box_name
if "worker" == role then
vb.cpus = "8"
vb.memory = "8192"
else
vb.cpus = "2"
vb.memory = "4096"
end
end
box.vm.provision :shell, :path => "bootstrap.sh"
box.vm.provision "ansible" do |ansible|
ansible.playbook = "playbook.yml"
ansible.compatibility_mode = "2.0"
end
end
end
end
end
end
Apologies for the deluge of information, but I'm hoping someone else has run into this.
I'm also experiencing all of the issues listed in this thread.
Commands that I ran:
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
helm install --name cert-manager --namespace cert-manager --version v0.8.1 jetstack/cert-manager
Output from the kube-apiserver:
I0624 17:14:56.867048 1 controller.go:608] quota admission added evaluator for: certificates.certmanager.k8s.io
I0624 17:14:56.900181 1 controller.go:608] quota admission added evaluator for: issuers.certmanager.k8s.io
I0624 17:14:59.674043 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:00.493680 1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[X-Content-Type-Options:[nosniff] Content-Type:[text/plain; charset=utf-8]]
I0624 17:15:00.493691 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
I0624 17:15:06.565081 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:06.565268 1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0624 17:15:06.565291 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
I0624 17:15:10.182483 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.admission.certmanager.k8s.io
E0624 17:15:10.199673 1 controller.go:111] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: OpenAPI spec does not exists
I0624 17:15:10.199697 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.admission.certmanager.k8s.io: Rate Limited Requeue.
Output from k get apiservices.apiregistration.k8s.io shows the following:
NAME SERVICE AVAILABLE AGE
v1alpha1.certmanager.k8s.io Local True 9m
v1beta1.admission.certmanager.k8s.io cert-manager/cert-manager-webhook True 8m
v1beta1.certificates.k8s.io Local True 16m
This was performed on a brand new, fresh k8s cluster running on Ubuntu on bare metal using RKE to set up the cluster.
Kubernetes version: 1.13.5
Helm version: 2.13.0
cert-manager version: 0.8.1
We ended up having to punt on cert-manager for now because of this issue. We are going to deploy a self-signed cert for the Nginx ingress for now and reevaluate when cert-manager resolves these issues.
I ended up fixing my issue. Basically vagrant was creating a default route for the vagrant enp0s3, and I had to remove it. In doing so I was able to get all this working, and it fixed some other issues with my rke cluster.
The relevant ansible playbook tasks were:
handlers:
- name: restart netplan
become: yes
shell: netplan apply
tasks:
- name: Lower gateway metric
copy:
src: 50-vagrant.yaml
dest: /etc/netplan/
notify:
- restart netplan
- name: Look for dhcp4-overrides
become: yes
shell: "grep \"^ dhcp4-overrides: { use-routes: false }$\" /etc/netplan/50-cloud-init.yaml || true"
register: test_dhcp4_overrides
- name: Override default gateway
become: yes
lineinfile:
dest: /etc/netplan/50-cloud-init.yaml
line: " dhcp4-overrides: { use-routes: false }"
when: test_dhcp4_overrides.stdout == ""
notify:
- restart netplan
And 50-vagrant.yaml is a file I created that looks like:
network:
version: 2
renderer: networkd
ethernets:
enp0s8:
dhcp4: true
dhcp4-overrides:
route-metric: 50
This was for an Ubuntu bionic box. In short, I lowered the route metric on the bridged interface (enp0s8) via 50-vagrant.yaml, and disabled the DHCP-provided routes on the NAT interface in 50-cloud-init.yaml.
My suggestion to anyone experiencing this issue is to see what the default route on your nodes is (route -n, look for 'default'). Make sure it's the right gateway/device and adjust accordingly - my solution should only work for people similarly afflicted and using netplan.
Good luck!
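The default-route check above can be done like this (interface names are from the vagrant setup in this thread and will differ elsewhere):

```shell
# Show which gateway/interface owns the default route; on a vagrant box the
# NAT interface (enp0s3 here) often wins over the bridged NIC (enp0s8),
# which misroutes traffic the cluster expects on the bridged network.
ip route show default
route -n | awk '$1 == "0.0.0.0"'
```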
For GKE private cluster here's a simple solution
I am facing the same exact issue with version 0.9.1 as well. Any update on this issue?
I came here because I got mail with ACTION REQUIRED because Lets Encrypt does only support 0.8.0 cert-manager instances (current jetstack/cert-manager version) and onwards in a couple of weeks. Air is getting thin. Experiencing the same issues on 0.9.1.
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.9/deploy/manifests/00-crds.yaml
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation="true"
helm upgrade -i \
--namespace cert-manager \
--set ingressShim.defaultIssuerName=letsencrypt \
--set ingressShim.defaultIssuerKind=ClusterIssuer \
--set webhook.enabled=false \
cert-manager \
jetstack/cert-manager
That just doesn't work. I thought that was a stable release.
Guys, the biggest problem is that you are installing it with the webhook disabled (webhook.enabled=false). If you do that you cannot use ClusterIssuers because the kube APIService is not there.
So what I did is
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.9/deploy/manifests/00-crds.yaml
kubectl create ns cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation="true"
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v0.9.0 --set ingressShim.defaultIssuerName=letsencrypt --set ingressShim.defaultIssuerKind=ClusterIssuer
After that, creating a ClusterIssuer works and I can see that certificates are created automatically.
For debugging the network part, I used https://github.com/nicolaka/netshoot and everything seems fine in that area. I'm 1 hop away from that host and the port is open, tried also from net=host and I got the same results - IP is accessible and port is open.
I think the Service needs to be exposed (i.e. given an external IP). Still doing some investigations.
LE: according to what @nrobert13 suggested on issue helm/charts#15809, I just added a firewall rule in GKE opening 6443, and the webhook passes validation now.
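For GKE private clusters, that firewall rule amounts to letting the master CIDR reach the nodes on 6443. A hedged sketch; the cluster name, region, network, CIDR, and node tag below are all placeholders you must replace with your cluster's values:

```shell
# Look up the master's CIDR block for your private cluster:
gcloud container clusters describe my-cluster --region my-region \
  --format 'value(privateClusterConfig.masterIpv4CidrBlock)'

# Allow that range to reach the nodes on the webhook's serving port:
gcloud compute firewall-rules create allow-master-to-cert-manager-webhook \
  --network my-network \
  --direction INGRESS \
  --source-ranges 172.16.0.0/28 \
  --target-tags my-node-tag \
  --allow tcp:6443
```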
I needed to update due to the ACTION REQUIRED mails, as @AmazingTurtle mentioned. The only way I could make it work was by disabling webhooks. I have no idea what the problem could be. I am using Istio, so I thought it might be that, but I haven't figured out why, if that is the case.
For me it worked to uninstall and reinstall the cert-manager helm chart many times, and then all of a sudden it worked. This is mysterious.
We were having the same issue using flux helm operator. To share some insights from the past, upgrading helm charts throughout "major" updates/version bumps never really worked. Usually we just delete (--purge) the release before doing this kind of bigger leaps.
So apparently one of our clusters got rid of v1beta1.admission.certmanager.k8s.io apiservice by itself with deletion of the helm release. The other ones got stuck with the aforementioned "failed calling admission webhook".
Coming from v.0.6.X it seems that v0.10 now has v1beta1.webhook.certmanager.k8s.io instead of v1beta1.admission.certmanager.k8s.io.
TL;DR: Just tried deleting the helm release multiple times, but the "old" apiservice didn't get removed. So I went on cleaning up with kubectl delete apiservice v1beta1.admission.certmanager.k8s.io. Everything's gucci.
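The stale-APIService cleanup above can be sketched as two commands; only delete an APIService you are sure is left over from an old release, since removing a live one breaks that API group:

```shell
# List APIServices that are failing discovery (AVAILABLE shows False):
kubectl get apiservice | grep False

# Delete the stale one the old chart left behind:
kubectl delete apiservice v1beta1.admission.certmanager.k8s.io
```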
It also worked for me when I allowed port 6443 in a firewall rule for my private GKE cluster.
The way to troubleshoot is: install the custom resources, then install cert-manager following all the standard steps,
then check for v1beta1.admission.certmanager.k8s.io cert-manager/cert-manager-webhook False (FailedDiscoveryCheck) using kubectl get apiservice,
then describe it to find out the blocked port.
But can anyone tell me how good is to expose port 6443 on a private GKE cluster :-)
But can anyone tell me how good is to expose port 6443 on a private GKE cluster :-)
it's fine because you're just opening it between master <-> nodes
After running this command it worked, so it looks like an access role issue: kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
Please use this command for test purposes, as it grants anyone access to perform any action on the cluster. THIS IS NOT A FIX
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
i'm not a k8s expert, but this command looks like it grants cluster admin permissions to the system anonymous account, which doesn't sound like a good idea to me. @sasiedu maybe add a disclaimer to your post?
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
i'm not a k8s expert, but this command looks like it grants cluster admin permissions to the system anonymous account, which doesn't sound like a good idea to me. @sasiedu maybe add a disclaimer to your post?
@igor47 Sorry, I didn't add a disclaimer. I was suggesting what might be the problem; I didn't ask anyone to use the command. I haven't figured out the actual permissions needed. Thanks for alerting me.
I see this too on a virgin Kops cluster (v1.13.0). It fixes itself after several minutes but makes automating the creation of a cluster error-prone.
this seems similar to #2109
I'm going to close this as the original issue was created quite a while ago, and lots has changed since then. If there are more specific issues that can be extracted from here, please open a new issue to track that and we can attempt to resolve 😄
Not sure why this is closed; it seems to be an expired certificate issue with the webhook:
ERROR: logging before flag.Parse: I0922 02:36:19.506819 1 request.go:874] Response Body: {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"extension-apiserver-authentication","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication"
and the API log shows:
E0922 02:29:33.714261 1 controller.go:114] loading OpenAPI spec for "v1beta1.admission.certmanager.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: Error trying to reach service: 'x509: certificate has expired or is not yet valid', Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
@igoratencompass certmanager.k8s.io has been deprecated since version v0.10, which is 7 versions ago. I recommend upgrading your installation.