Cluster-api: clusterctl init fails when existing cert-manager runs 1.0+

Created on 21 Oct 2020 · 38 comments · Source: kubernetes-sigs/cluster-api

What steps did you take and what happened:

  1. Already have an existing cluster where you want to install a cluster-api management cluster.
  2. Already have cert-manager installed in said cluster:

    $ kubectl api-resources | grep certmanager.k8s.io
    certificaterequests               cr,crs             certmanager.k8s.io             true         CertificateRequest
    certificates                      cert,certs         certmanager.k8s.io             true         Certificate
    challenges                                           certmanager.k8s.io             true         Challenge
    clusterissuers                                       certmanager.k8s.io             false        ClusterIssuer
    issuers                                              certmanager.k8s.io             true         Issuer
    orders                                               certmanager.k8s.io             true         Order
    
  3. Already have cert-manager version 1.x+

    $ kubectl explain certificaterequests
    KIND:     CertificateRequest
    VERSION:  cert-manager.io/v1
    
    DESCRIPTION:
        <empty>
    
  4. Run clusterctl version 0.3.10

    $ asdf current clusterctl
    clusterctl      0.3.10          ~/.tool-versions
    
    $ asdf exec clusterctl version
    clusterctl version: &version.Info{Major:"0", Minor:"3", GitVersion:"v0.3.10", GitCommit:"af6630920560ca0e12179897b96d6ea8bd830b63", GitTreeState:"clean", BuildDate:"2020-10-01T14:30:28Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
    
  5. Try to install CAPI:

    $ asdf exec clusterctl init --core cluster-api:v0.3.10 --bootstrap kubeadm:v0.3.10 --control-plane kubeadm:v0.3.10
    Fetching providers
    Installing cert-manager Version="v0.16.1"
    Error: action failed after 10 attempts: failed to update cert-manager component apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition, /certificaterequests.cert-manager.io: CustomResourceDefinition.apiextensions.k8s.io "certificaterequests.cert-manager.io" is invalid: status.storedVersions[0]: Invalid value: "v1": must appear in spec.versions
    

What did you expect to happen:

Having one or more of these options:

  1. cert-manager not being installed at all (as its CRDs and controllers already exist in this cluster).
  2. Being able to opt out of installing cert-manager, as it's already installed (as suggested in #3837).
  3. The cert-manager version embedded in cluster-api being upgraded to 1.0+ (as discussed in #3781).

Environment:

  • Cluster-api version: 0.3.10
  • Minikube/KIND version: N/A (using GKE)
  • Kubernetes version: (use kubectl version): Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.15", GitCommit:"2adc8d7091e89b6e3ca8d048140618ec89b39369", GitTreeState:"clean", BuildDate:"2020-09-02T11:40:00Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"} Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.15-gke.500", GitCommit:"f7db507aabec3b78cba0c27c616f4974213db6fd", GitTreeState:"clean", BuildDate:"2020-09-21T09:20:41Z", GoVersion:"go1.13.15b4", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release):

/kind bug
/area clusterctl


Most helpful comment

Moving forward, in v1alpha4, I think we should allow users to opt out of clusterctl managing cert-manager.

I'm referring to #3837, with a slight variation: opt-in (implicit) and opt-out (explicit) should apply to both init and upgrade

All 38 comments

@munnerz it seems our mechanism to detect cert-manager being installed in the cluster is broken with the v1 series

I'm confused - why are the CRDs from v0.10 and below installed here? certmanager.k8s.io hasn't been used for more than a year now and as far as I know, was never present in cluster-api either.

Are there some steps here we can use to reproduce the issue?

Already have an existing cluster where you want to install a cluster-api management cluster.

I think this step needs expanding out - what state is this existing cluster in? It sounds like it has a very outdated version of cert-manager installed as well as a newer, more up-to-date version

Already have an existing cluster where you want to install a cluster-api management cluster.

I think this step needs expanding out - what state is this existing cluster in? It sounds like it has a very outdated version of cert-manager installed as well as a newer, more up-to-date version

You're absolutely right, I extracted the wrong (old) CRDs — here are the new(er) ones:

$ kubectl api-resources | grep cert-manager.io
challenges                                           acme.cert-manager.io           true         Challenge
orders                                               acme.cert-manager.io           true         Order
certificaterequests               cr,crs             cert-manager.io                true         CertificateRequest
certificates                      cert,certs         cert-manager.io                true         Certificate
clusterissuers                                       cert-manager.io                false        ClusterIssuer
issuers                                              cert-manager.io                true         Issuer
$ kubectl explain certificaterequests.cert-manager.io
KIND:     CertificateRequest
VERSION:  cert-manager.io/v1

DESCRIPTION:
     <empty>

For cross-referencing, it seems https://github.com/kubernetes-sigs/cluster-api/issues/3781 also talks about upgrading to/supporting cert-manager version v1+.

Regarding...

I'm confused - why are the CRDs from v0.10 and below installed here? certmanager.k8s.io hasn't been used for more than a year now and as far as I know, was never present in cluster-api either.

Are there some steps here we can use to reproduce the issue?

&

[...] what state is this existing cluster in? It sounds like it has a very outdated version of cert-manager installed as well as a newer more up to date version

...see my updated kubectl commands above.

Just reiterating that while my cluster might have both older and newer CRDs installed, the actual error I'm getting from clusterctl init is on newer cert-manager.io (and not older certmanager.k8s.io) CRDs:

Error: action failed after 10 attempts: failed to update cert-manager component apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition, /certificaterequests.cert-manager.io: CustomResourceDefinition.apiextensions.k8s.io "certificaterequests.cert-manager.io" is invalid: status.storedVersions[0]: Invalid value: "v1": must appear in spec.versions

I will try to reproduce later, but I can't say when. Maybe @wfernandes can help here

/milestone Next

Just reiterating that while my cluster might have both older and newer CRDs installed

Can you expand a little on what you mean by that? CRDs are globally unique.

CustomResourceDefinition.apiextensions.k8s.io "certificaterequests.cert-manager.io" is invalid: status.storedVersions[0]: Invalid value: "v1": must appear in spec.versions

What version of Kubernetes are you running as management cluster?

CustomResourceDefinition.apiextensions.k8s.io "certificaterequests.cert-manager.io" is invalid: status.storedVersions[0]: Invalid value: "v1": must appear in spec.versions

What version of Kubernetes are you running as management cluster?

As per kubectl version (mentioned in the description) above:

Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.15-gke.500", GitCommit:"f7db507aabec3b78cba0c27c616f4974213db6fd", GitTreeState:"clean", BuildDate:"2020-09-21T09:20:41Z", GoVersion:"go1.13.15b4", Compiler:"gc", Platform:"linux/amd64"}

So I'm having trouble trying to reproduce the issue after a quick test.

  1. kind create cluster
    $ kind version
    kind v0.9.0 go1.15.2 darwin/amd64
  2. kubectl apply -f cert-manager-v1.0.3.yaml
    I pulled this yaml down from the 1.0.3 release of cert-manager.
    $ kubectl version
    Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T21:51:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
    Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-14T07:30:52Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
  3. ./clusterctl-darwin-amd64 init
$ ./clusterctl-darwin-amd64 version
clusterctl version: &version.Info{Major:"0", Minor:"3", GitVersion:"v0.3.10", GitCommit:"af6630920560ca0e12179897b96d6ea8bd830b63", GitTreeState:"clean", BuildDate:"2020-10-01T14:30:28Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
Using configuration File="/Users/wfernandes/.cluster-api/clusterctl.yaml"
$ ./clusterctl-darwin-amd64 init
Using configuration File="/Users/wfernandes/.cluster-api/clusterctl.yaml"
Installing the clusterctl inventory CRD
...
Creating Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Certificate="selfsigned-cert" Namespace="cert-manager-test"
Deleting Namespace="cert-manager-test"
Deleting Issuer="test-selfsigned" Namespace="cert-manager-test"
Deleting Certificate="selfsigned-cert" Namespace="cert-manager-test"
Skipping installing cert-manager as it is already installed
Installing Provider="cluster-api" Version="v0.3.10" TargetNamespace="capi-system"

The way clusterctl verifies if cert-manager is installed is that it tries to create a test certificate of version cert-manager.io/v1alpha2. So in this case, we were able to verify that cert-manager is installed.
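The test objects clusterctl creates look roughly like this (a sketch reconstructed from the log lines above and the asset name cmd/clusterctl/config/assets/cert-manager-test-resources.yaml; field values such as dnsNames and secretName are illustrative, not taken from the actual manifest):

```yaml
# Sketch of clusterctl's cert-manager smoke-test objects (illustrative values).
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  dnsNames:
    - example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
```

If creating these objects succeeds, clusterctl assumes cert-manager (including its webhooks) is healthy and skips the installation.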

Just reiterating that while my cluster might have both older and newer CRDs installed

Can you expand a little on what you mean by that? CRDs are globally unique.

Indeed, but not across renames. 😊 Cert-manager CRDs used to be from certmanager.k8s.io but are nowadays from cert-manager.io.

Since my desired mgmt cluster is old enough to have survived multiple versions of cert-manager, I have CRDs for both cert-manager.io and certmanager.k8s.io in that cluster. Search this issue/webpage for the two occurrences of kubectl api-resources and you can compare the different sets of CRDs I currently have deployed there.


$ ./clusterctl-darwin-amd64 init
Using configuration File="/Users/wfernandes/.cluster-api/clusterctl.yaml"
Installing the clusterctl inventory CRD
...
Creating Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Certificate="selfsigned-cert" Namespace="cert-manager-test"
Deleting Namespace="cert-manager-test"
Deleting Issuer="test-selfsigned" Namespace="cert-manager-test"
Deleting Certificate="selfsigned-cert" Namespace="cert-manager-test"
Skipping installing cert-manager as it is already installed
Installing Provider="cluster-api" Version="v0.3.10" TargetNamespace="capi-system"

In your example, I don't see output equivalent to my failed command's attempt to install cert-manager. Is that hidden behind the ... in your output? Just to verify that the step my init fails on works the same for you (and that we have equivalent setups). I'm not by my computer now, so I can't try your method of reproducing at the moment. 😅

So if I have a cluster with cert-manager v0.10.0 installed, then clusterctl init fails for me with the following error:
Error: action failed after 10 attempts: failed to update cert-manager component /v1, Kind=Service, cert-manager/cert-manager: Service "cert-manager" is invalid: spec.clusterIP: Invalid value: "": field is immutable

However, it seems like you have a cluster with both cert-managers installed. Is that correct?
Could you share the following output from the commands below from your cluster? Thanks.

$ kubectl api-resources | grep cert
challenges                                     acme.cert-manager.io            true         Challenge
orders                                         acme.cert-manager.io            true         Order
certificaterequests               cr,crs       cert-manager.io                 true         CertificateRequest
certificates                      cert,certs   cert-manager.io                 true         Certificate
clusterissuers                                 cert-manager.io                 false        ClusterIssuer
issuers                                        cert-manager.io                 true         Issuer
certificatesigningrequests        csr          certificates.k8s.io             false        CertificateSigningRequest

$ kubectl get crds | grep cert
certificaterequests.cert-manager.io                  2020-10-21T17:05:48Z
certificates.cert-manager.io                         2020-10-21T17:05:48Z
challenges.acme.cert-manager.io                      2020-10-21T17:05:48Z
clusterissuers.cert-manager.io                       2020-10-21T17:05:49Z
issuers.cert-manager.io                              2020-10-21T17:05:49Z
orders.acme.cert-manager.io                          2020-10-21T17:05:49Z

Since my desired mgmt cluster is old enough to have survived multiple versions of cert-manager, I have CRDs for both cert-manager.io and certmanager.k8s.io in that cluster.

This might be an issue and could potentially cause conflicts, although I'll let @munnerz chime in :)

However, it seems like you have a cluster with both cert-managers installed. Is that correct?
Could you share the following output from the commands below from your cluster? Thanks.

Here's the output:

$ kubectl api-resources | grep cert
challenges                                           acme.cert-manager.io           true         Challenge
orders                                               acme.cert-manager.io           true         Order
certificaterequests               cr,crs             cert-manager.io                true         CertificateRequest
certificates                      cert,certs         cert-manager.io                true         Certificate
clusterissuers                                       cert-manager.io                false        ClusterIssuer
issuers                                              cert-manager.io                true         Issuer
certificatesigningrequests        csr                certificates.k8s.io            false        CertificateSigningRequest
certificaterequests               cr,crs             certmanager.k8s.io             true         CertificateRequest
certificates                      cert,certs         certmanager.k8s.io             true         Certificate
challenges                                           certmanager.k8s.io             true         Challenge
clusterissuers                                       certmanager.k8s.io             false        ClusterIssuer
issuers                                              certmanager.k8s.io             true         Issuer
orders                                               certmanager.k8s.io             true         Order
managedcertificates               mcrt               networking.gke.io              true         ManagedCertificate

&

$ kubectl get crds | grep cert
certificaterequests.cert-manager.io            2020-10-02T06:16:11Z
certificaterequests.certmanager.k8s.io         2020-09-24T09:19:50Z
certificates.cert-manager.io                   2020-10-02T06:16:11Z
certificates.certmanager.k8s.io                2020-09-24T09:19:50Z
challenges.acme.cert-manager.io                2020-10-02T06:16:11Z
challenges.certmanager.k8s.io                  2020-09-24T09:19:50Z
clusterissuers.cert-manager.io                 2020-10-02T06:16:11Z
clusterissuers.certmanager.k8s.io              2020-09-24T09:19:50Z
issuers.cert-manager.io                        2020-10-02T06:16:11Z
issuers.certmanager.k8s.io                     2020-09-24T09:19:50Z
managedcertificates.networking.gke.io          2020-04-30T12:39:02Z
orders.acme.cert-manager.io                    2020-10-02T06:16:11Z
orders.certmanager.k8s.io                      2020-09-24T09:19:50Z

While the CRDs may be there for/from two versions, our cluster doesn't have cert-manager pods/controllers/operators running using both versions:

$ kubectl get pods -A | grep cert
cluster-cert-manager       cluster-cert-manager-57ddb5798-l9j44                              1/1     Running     0          7d
cluster-cert-manager       cluster-cert-manager-cainjector-6547768c64-cdchl                  1/1     Running     0          7d
cluster-cert-manager       cluster-cert-manager-webhook-599f8fc879-b4qkq                     1/1     Running     0          7d

I'm evaluating if/how we should clean up old/orphaned resources/CRDs from older versions of cert-manager in our own cluster, and might try something like this; we'll see where we end up:

$ for crd in $(kubectl api-resources -o name | grep certmanager.k8s.io); do \
   for resource in $(kubectl get -A -o name $crd); do \
      echo "kubectl delete -A $resource"; \
   done && \
   echo "kubectl delete crd $crd"; \
done
kubectl delete crd certificaterequests.certmanager.k8s.io
kubectl delete -A certificate.certmanager.k8s.io/argocd-secret
kubectl delete -A certificate.certmanager.k8s.io/alertmanager-tls
kubectl delete -A certificate.certmanager.k8s.io/grafana-tls
kubectl delete -A certificate.certmanager.k8s.io/prometheus-tls
kubectl delete crd certificates.certmanager.k8s.io
kubectl delete crd challenges.certmanager.k8s.io
kubectl delete crd clusterissuers.certmanager.k8s.io
kubectl delete crd issuers.certmanager.k8s.io
kubectl delete crd orders.certmanager.k8s.io

So if I have a cluster with cert-manager v0.10.0 installed, then clusterctl init fails for me with the following error:
Error: action failed after 10 attempts: failed to update cert-manager component /v1, Kind=Service, cert-manager/cert-manager: Service "cert-manager" is invalid: spec.clusterIP: Invalid value: "": field is immutable

Maybe it helps if I clarify which version of (the current/newer) cert-manager I am running (v1.0.2):

$ cat cluster-cert-manager/Chart.yaml
apiVersion: v2
name: cluster-cert-manager
description: Resources for cluster wide certification management.
version: 0.1.0
home: https://github.com/jetstack/cert-manager/tree/master/deploy/charts/cert-manager
appVersion: 1.0.2
dependencies:
- name: cert-manager
  version: 1.0.2
  # helm repo add jetstack https://charts.jetstack.io
  repository: "@jetstack"

i.e. https://artifacthub.io/packages/helm/jetstack/cert-manager/1.0.2

So if I have a cluster with cert-manager v0.10.0 installed, then clusterctl init fails for me with the following error:
Error: action failed after 10 attempts: failed to update cert-manager component /v1, Kind=Service, cert-manager/cert-manager: Service "cert-manager" is invalid: spec.clusterIP: Invalid value: "": field is immutable

Judging by the fact that you're able to reproduce this with cert-manager v0.10.0, maybe:

  1. I should focus on cleaning that out from my desired CAPI management cluster.
  2. This issue could be renamed/superseded to focus on the issue with clusterctl init on a cluster running cert-manager v0.10.0?

    • ...if that is something you deem worth supporting/handling (such as failing gracefully or succeeding in upgrading cert-manager)?

I see now that my error...

Error: action failed after 10 attempts: failed to update cert-manager component apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition, /certificaterequests.cert-manager.io: CustomResourceDefinition.apiextensions.k8s.io "certificaterequests.cert-manager.io" is invalid: status.storedVersions[0]: Invalid value: "v1": must appear in spec.versions

...isn't the same as @wfernandes got on v0.10.0:

Error: action failed after 10 attempts: failed to update cert-manager component /v1, Kind=Service, cert-manager/cert-manager: Service "cert-manager" is invalid: spec.clusterIP: Invalid value: "": field is immutable

@MPV The error I got is a little different from your original error.
failed to update cert-manager component apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition, /certificaterequests.cert-manager.io: CustomResourceDefinition.apiextensions.k8s.io "certificaterequests.cert-manager.io" is invalid: status.storedVersions[0]: Invalid value: "v1": must appear in spec.versions

The fact that it says failed to update cert-manager means that it found a certificaterequests.cert-manager.io CRD, and what we can see is that you have the v1 version of that CRD installed.

However, based on the existing logic, our code tries to update it to match the version of cert-manager embedded in clusterctl, which happens to be v0.16.1.
https://github.com/kubernetes-sigs/cluster-api/blob/28c941aeb7912411bcb195af47c99bc936f8fe0a/cmd/clusterctl/client/cluster/cert_manager.go#L405-L423

See the full embedded manifest here.

This cert-manager v0.16.1 has only v1alpha2, v1alpha3, and v1beta1 in the spec.versions field of the certificaterequests.cert-manager.io CRD.
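The apiserver rule behind that error can be sketched as follows: every entry in status.storedVersions must also appear among the names in spec.versions, so applying the embedded v0.16.1 CRD (which no longer serves v1) over a live CRD that has already stored v1 objects is rejected. A minimal shell illustration with made-up values:

```shell
# Illustration of the CRD validation rule behind the error
# (values are made up, not read from a live cluster).
spec_versions="v1alpha2 v1alpha3 v1beta1"  # versions in the embedded v0.16.1 CRD
stored="v1"                                # status.storedVersions on the live v1.0.x CRD
case " $spec_versions " in
  *" $stored "*) echo "ok: \"$stored\" appears in spec.versions" ;;
  *)             echo "invalid: \"$stored\" must appear in spec.versions" ;;
esac
```

With these values the check prints the "invalid" branch, mirroring the status.storedVersions[0] error from clusterctl init.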

I'm still investigating why I didn't get this error before when I was working with the v1.0.3 release manifest.

I see in your previous comment that the helm chart was used to install cert-manager v1.0.2, so I'll try and see if I can install that and reproduce your error.

However, it may be worthwhile to update our cert-manager update logic so that if we detect a cert-manager newer than the one embedded in clusterctl, we just log that and skip the update. But there may be other edge cases we need to understand before moving forward.

Just to provide an update. I was able to reproduce the original error.

Thanks @MPV for the context via slack. I'll update this thread with details once I can confirm the root cause.
I'm pretty close with narrowing it down. 🙂

Steps for reproduction:

  1. Create a kind cluster (doesn't matter what version; I'm using the latest kind): kind create cluster
  2. Then create cert-manager using the helm charts:

     kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml
     helm repo add jetstack https://charts.jetstack.io
     kubectl create ns cert-manager
     helm install certy-mccert --namespace cert-manager jetstack/cert-manager

  3. Use clusterctl v0.3.10 to run clusterctl init -v5.
... <remove unnecessary logs to reduce verbosity>
Creating Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Installing cert-manager Version="v0.16.1"
Updating Namespace="cert-manager"
... <remove unnecessary logs to reduce verbosity>
Error: action failed after 10 attempts: failed to update cert-manager component apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition, /certificaterequests.cert-manager.io: CustomResourceDefinition.apiextensions.k8s.io "certificaterequests.cert-manager.io" is invalid: status.storedVersions[0]: Invalid value: "v1": must appear in spec.versions

OK, so now we can reproduce this error with the helm chart, but if we pull down the release manifest directly from the cert-manager releases for v1.0.2 and do a clusterctl init -v5, we don't see this error.

So what's going on???

Notice that in the logs above, we have Creating Issuer... and then immediately Installing cert-manager. Huh! We also create a Certificate as part of the cert-manager test objects, but we don't see that here.

So after some digging around, we have this in our existing code.
https://github.com/kubernetes-sigs/cluster-api/blob/5ac19dc6a5f78f98282f13d5159dcb2d91e4d89f/cmd/clusterctl/client/cluster/cert_manager.go#L128-L139

From line 132 above, we seem to be ignoring the error from cm.waitForAPIReady. So I modified clusterctl to print out any error if available and ran clusterctl-modified init -v5. ANNNDDD.....

Yup, there was an error from there.

...
Updating Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Error: failed to create cert-manager component cert-manager.io/v1alpha2, Kind=Issuer, cert-manager-test/test-selfsigned: conversion webhook for cert-manager.io/v1alpha2, Kind=Issuer failed: Post "https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s": service "cert-manager-webhook" not found
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.(*certManagerClient).createObj
...

So it seems that it can't find the cert-manager-webhook service.

Output when cert-manager installed with helm chart for v1.0.2

$ kubectl get services -n cert-manager
NAME                                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
certy-mccert-cert-manager           ClusterIP   10.111.148.195   <none>        9402/TCP   14m
certy-mccert-cert-manager-webhook   ClusterIP   10.96.103.210    <none>        443/TCP    14m

Output when cert-manager installed with direct v1.0.2 manifest

 $ kubectl get services -n cert-manager
NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
cert-manager           ClusterIP   10.98.53.182   <none>        9402/TCP   9s
cert-manager-webhook   ClusterIP   10.111.3.49    <none>        443/TCP    9s

I confirmed this by manually applying the test objects on the cluster with the helm/chart cert-manager installed.

 $ kubectl apply -f cmd/clusterctl/config/assets/cert-manager-test-resources.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
namespace/cert-manager-test configured
Error from server: error when creating "cmd/clusterctl/config/assets/cert-manager-test-resources.yaml": conversion webhook for cert-manager.io/v1alpha2, Kind=Issuer failed: Post "https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s": service "cert-manager-webhook" not found
Error from server: error when creating "cmd/clusterctl/config/assets/cert-manager-test-resources.yaml": conversion webhook for cert-manager.io/v1alpha2, Kind=Certificate failed: Post "https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s": service "cert-manager-webhook" not found

@munnerz Do you perhaps have any context on the error above? Again this error is against a cluster which has cert-manager that was installed by following the steps on the helm/chart website - https://artifacthub.io/packages/helm/jetstack/cert-manager/1.0.2#installing-the-chart

Is it because we don't have a ConvertV1Alpha2 here?: https://github.com/jetstack/cert-manager/blob/v1.0.2/pkg/webhook/handlers/conversion.go

Also I didn't see anything obvious in the controller logs.

Figured it out. As per the instructions on https://artifacthub.io/packages/helm/jetstack/cert-manager/1.0.2#installing-the-chart, we have to apply the crds from the original release manifests as published by cert-manager

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml

And in those CRDs we have the following:

spec:
  conversion:
    strategy: Webhook
    webhook:
      clientConfig:
        service:
          name: cert-manager-webhook
          namespace: cert-manager
          path: /convert
      conversionReviewVersions:
      - v1
      - v1beta1

@MPV Please verify if your above CRDs have the following conversion webhook configuration.

You may have to update your CRDs because there is an inconsistency in the documentation on that website. That is, you may need to generate your CRD files with the appropriate helm chart values.
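One way to verify this (a hypothetical spot-check, assuming kubectl access to the affected management cluster) is to print the conversion strategy of the live CRD:

```shell
# Hypothetical spot-check against the affected management cluster:
# print the conversion strategy of the live cert-manager CRD.
kubectl get crd certificaterequests.cert-manager.io \
  -o jsonpath='{.spec.conversion.strategy}'
# Webhook-enabled CRDs report "Webhook"; the legacy (<1.15) manifests,
# which lack the webhook stanza, default to "None".
```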

I've now looked into what source I've used for installing my CRDs. Turns out it's like this:

Kubernetes 1.16+

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.3/cert-manager.yaml

Kubernetes <1.16

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.3/cert-manager-legacy.yaml

-- https://cert-manager.io/docs/installation/kubernetes/

...and since our cluster was only recently upgraded to 1.16, we have the "legacy" CRDs mentioned above installed.

@wfernandes great work!
@MPV is it possible to close the issue, now that the root of the problem has been identified as a version of cert-manager which does not deploy conversion webhooks?

I've now looked into what source I've used for installing my CRDs. Turns out it's like this:

Kubernetes 1.16+

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.3/cert-manager.yaml

Kubernetes <1.16

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.3/cert-manager-legacy.yaml
-- cert-manager.io/docs/installation/kubernetes

...and since our cluster was only recently upgraded to 1.16, we have the "legacy" CRDs mentioned above installed.

The above wasn't entirely correct. It turns out that these instructions were the ones we had followed (and installed the <1.15 version):

Kubernetes 1.15+

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml

Kubernetes <1.15

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager-legacy.crds.yaml

...as per https://artifacthub.io/packages/helm/jetstack/cert-manager/1.0.2

So what @wfernandes suggested here isn't the issue I'm having:

Figured it out. As per the instructions on artifacthub.io/packages/helm/jetstack/cert-manager/1.0.2#installing-the-chart, we have to apply the crds from the original release manifests as published by cert-manager

$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml

And in those CRDs we have the following:

spec:
  conversion:
    strategy: Webhook
    webhook:
      clientConfig:
        service:
          name: cert-manager-webhook
          namespace: cert-manager
          path: /convert
      conversionReviewVersions:
      - v1
      - v1beta1

@MPV Please verify if your above CRDs have the following conversion webhook configuration.

You may have to update your CRDs because there is an inconsistency in the documentation on that website. That is you may need to generate your CRD files with the appropriate helm chart values.

If I got this correct this is now solved
feel free to re-open if I'm missing something
/close

@fabriziopandini: Closing this issue.

In response to this:

If I got this correct this is now solved
feel free to re-open if I'm missing something
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fabriziopandini Sorry that I was unclear. This isn't solved yet.

Sorry that I haven't had the time to try either/all of these things yet:

  1. Remove old cert-manager CRDs & CRs

    • ...and try clusterctl init again

  2. Upgrade/change cert-manager CRDs from "1.0.3 legacy" to "1.0.3 non-legacy"

    • ...and try clusterctl init again

  3. Upgrade cert-manager (including CRDs) from 1.0.3 to a newer version, in the hope that it doesn't contain/introduce the issue that @wfernandes mentioned around "default vs custom namespaces for cert-manager"

    • ...and try clusterctl init again

Sorry about the delay. Depending on internal priorities, we might at best try in the next few days/weeks. Will update here with the results. Without this being solved we won't be able to get metrics/telemetry from Cluster-API, which is something we're aiming for (but other things around getting CAPI clusters working also need to be worked on here first).

No problem, sorry if I missed something in my quick pass
/reopen

@fabriziopandini: Reopened this issue.

In response to this:

No problem, sorry if I missed something in my quick pass
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@MPV So you are using the CRD manifest specified for k8s <1.15 which doesn't seem to have conversion webhooks.

https://github.com/kubernetes-sigs/cluster-api/blob/5ac19dc6a5f78f98282f13d5159dcb2d91e4d89f/cmd/clusterctl/client/cluster/cert_manager.go#L128-L139

The only thing I can suggest is printing out the error from line 132 above. That is, change the clusterctl code and do a make clusterctl. Then you can run clusterctl init -v5 against your cluster.

        // Skip re-installing cert-manager if the API is already available
-       if err := cm.waitForAPIReady(ctx, false); err == nil {
+       err := cm.waitForAPIReady(ctx, false)
+       if err == nil {
                log.Info("Skipping installing cert-manager as it is already installed")
                return nil
        }
+       if err != nil {
+               log.V(5).Info("YUP GOT AN ERROR", "ERR", err.Error())
+       }

This will provide us with more information regarding what may be happening during clusterctl init -v5 _in your cluster_. But it does feel like you may still need to update your CRDs.
You may need to do some upgrade steps, but I'm not sure 🤷

What about moving this to the cert-manager issue tracker, given that, as far as I understand, the problems here seem more related to different cert-manager deployments?
WRT clusterctl, what we are doing in clusterctl is exactly what is described in https://cert-manager.io/docs/installation/kubernetes/#verifying-the-installation, so as soon as you get an installation passing this procedure, clusterctl should use it if its version is newer than the one already installed.

Moving forward, in v1alpha4, I think we should allow users to opt out of clusterctl managing cert-manager.

@fabriziopandini Which issue are you referring to: https://github.com/kubernetes-sigs/cluster-api/issues/3781 or #3837?

Moving forward, in v1alpha4, I think we should allow users to opt out of clusterctl managing cert-manager.

I'm referring to #3837, with a slight variation: opt-in (implicit) and opt-out (explicit) should apply to both init and upgrade
