Describe the bug:
With cert-manager installed, deleting any other namespace gets stuck in the Terminating state.
Expected behaviour:
Namespace deletion should not be affected by installing cert-manager.
Steps to reproduce the bug:
kubectl apply -f cert-manager-github.txt
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: test-foo
EOF
kubectl delete ns test-foo
kubectl delete -f cert-manager-github.txt
Anything else we need to know?:
With cert-manager installed, kubectl api-resources can no longer list all APIs:
$ kubectl api-resources
error: unable to retrieve the complete list of server APIs: admission.certmanager.k8s.io/v1beta1: the server is currently unable to handle the request
This issue is similar to https://github.com/jetstack/cert-manager/issues/1355, but here the cause is not an improper uninstall of cert-manager.
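One way to see which aggregated API is failing is to list the APIService objects and filter on the AVAILABLE column. The following is a minimal sketch that is not part of the original report: the helper name unavailable_apiservices is made up for illustration, and it assumes the default kubectl table layout (NAME, SERVICE, AVAILABLE, AGE).

```shell
# Hypothetical helper: print the APIService objects that the API server
# currently reports as unavailable.
# Assumes the default table columns: NAME SERVICE AVAILABLE AGE.
unavailable_apiservices() {
  kubectl get apiservices --no-headers |
    awk '$3 ~ /^False/ { print $1 }'
}

# Usage (against a live cluster):
#   unavailable_apiservices
```

If an entry for the admission.certmanager.k8s.io group shows up, kubectl describe apiservice <name> should give the reason the API server cannot reach it.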
Environment details:
/kind bug
+1
Found a workaround for the time being:
https://nasermirzaei89.net/2019/01/27/delete-namespace-stuck-at-terminating-state/
OK, I think renaming the namespace from cert-manager to platform-cert-manager caused the problem for me. I replaced "Namespace: cert-manager" with "Namespace: platform-cert-manager" in this file: cert-manager-working.txt (and, of course, for the Namespace resource itself).
I just redeployed cert-manager with cert-manager-working.txt after deleting all the resources I had created with the platform-cert-manager namespace. I no longer have problems with namespace deletion.
When uninstalling cert-manager, it's important that you delete all resources created by the installation manifest.
Certain resources (notably the ValidatingWebhookConfiguration resource) are not namespaced, so when you delete the cert-manager namespace the validating webhook still exists. Because the actual webhook pod runs in the cert-manager namespace, however, the webhook is no longer accessible. Consequently the Kubernetes garbage collector is unable to query the webhook, and so it blocks deletion of namespace resources, as it cannot ensure that the namespace being deleted no longer contains resources.
The TL;DR is that you should uninstall cert-manager using kubectl delete -f cert-manager.yaml (where cert-manager.yaml is the file that you originally ran kubectl apply -f with).
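To check whether cluster-scoped objects were left behind after the namespace was removed, something like the sketch below may help. It is only an illustration: the helper name certmanager_cluster_leftovers is hypothetical, and matching on the object name is an assumption that holds only if the leftover objects actually mention cert-manager or certmanager in their names.

```shell
# Hypothetical helper: list cluster-scoped objects whose names mention
# cert-manager / certmanager. Objects like these are not removed when only
# the cert-manager namespace is deleted.
certmanager_cluster_leftovers() {
  kubectl get \
    validatingwebhookconfigurations,apiservices,customresourcedefinitions,clusterroles,clusterrolebindings \
    -o name |
    grep -i -E 'cert-?manager'
}

# Usage (against a live cluster):
#   certmanager_cluster_leftovers
```

The function exits non-zero when nothing matches, which makes it usable as a quick check after kubectl delete -f.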
Hope that helps! I'm going to close this issue, as it's not really a bug.
It'd be great to have an "uninstalling cert-manager" guide as part of our docs in future, as I think this would help with issues like this 😄
/close
@munnerz: Closing this issue.
@munnerz This is one of at least two issues I've just read where the answer appears to be "delete cert-manager properly"; however, this issue occurs when you haven't deleted cert-manager at all. I'm seeing it on a cluster where cert-manager is installed, as I want it to be, but its presence appears to prevent any namespace deletion.
Running the delete command with kubectl delete namespace mynamespace -v=7 shows me this:
I0405 13:58:19.691921 16373 round_trippers.go:383] GET https://<my-api-endpoint>/apis/admission.certmanager.k8s.io/v1beta1?timeout=32s
I0405 13:58:19.691962 16373 round_trippers.go:390] Request Headers:
I0405 13:58:19.691994 16373 round_trippers.go:393] Accept: application/json, */*
I0405 13:58:19.692019 16373 round_trippers.go:393] User-Agent: kubectl/v1.11.7 (linux/amd64) kubernetes/65ecaf0
I0405 13:58:19.702770 16373 round_trippers.go:408] Response Status: 503 Service Unavailable in 10 milliseconds
I0405 13:58:19.707140 16373 request.go:1144] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
Solution:
My particular case of this issue came about because I am using private GKE clusters. The GCP firewall rule that is created automatically for private clusters (which lets the hidden master nodes talk to the private worker nodes) had not been modified to let the masters reach the cert-manager webhook on port 6443. This section of the docs refers to it, though providing an example is still listed as a TODO: https://docs.cert-manager.io/en/latest/getting-started/webhook.html#running-on-private-gke-clusters. Finding the right config took a little doing when I first started using cert-manager.
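For illustration, an extra firewall rule of the kind described above could be built roughly as follows. This is a sketch under stated assumptions, not the configuration actually used: the rule name, network name, master CIDR and node tag are all placeholders that have to be replaced with the values of the cluster in question. Only port 6443 comes from the description above.

```shell
# Placeholder values: every one of these is an assumption and must be replaced.
NETWORK="my-vpc"                 # VPC network of the cluster
MASTER_CIDR="172.16.0.0/28"      # master IPv4 CIDR block of the private cluster
NODE_TAG="gke-my-cluster-node"   # network tag carried by the worker nodes

# Build (but do not run) the firewall-rule command so it can be reviewed first.
webhook_firewall_cmd() {
  printf 'gcloud compute firewall-rules create %s --network %s --direction INGRESS --action ALLOW --rules tcp:6443 --source-ranges %s --target-tags %s\n' \
    "allow-master-to-cert-manager-webhook" "$NETWORK" "$MASTER_CIDR" "$NODE_TAG"
}

webhook_firewall_cmd
# To apply it after review: eval "$(webhook_firewall_cmd)"
```

Printing the command instead of running it keeps the sketch free of side effects; the master CIDR and node tag can be read from the cluster description and from the firewall rule GKE created automatically.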
@munnerz I have this problem while using Helm (helm delete --purge cert-manager).
Just to clarify, shouldn't it remove all resources just as a manual installation would?
It'd be great to have an "uninstalling cert-manager" guide as part of our docs in future, as I think this would help with issues like this 😄
@munnerz waiting for it! 👍
It took me a long time googling for it until I found your comment:
The TL;DR is that you should uninstall cert-manager using kubectl delete -f cert-manager.yaml (where cert-manager.yaml is the file that you originally ran kubectl apply -f with).
and so I understood that what I needed was this simple command:
kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/v0.8.1/cert-manager.yaml
😉
While this ticket is closed, isn't it worth considering that cert-manager breaks the ability to delete namespaces in general? We regularly wipe namespaces for development and testing, but resources ultimately persist unless we expressly delete them and/or delete all cert-manager resources first. Is this expected behavior?
For future travelers, the only workaround I've found is:
for i in $(kubectl api-resources --namespaced 2>&1 | awk '{ print $1 }' | grep -v error | tail -n +2 | grep -v persistent); do kubectl delete "$i" --all; done
I want to restate what @UrbanWizardry wrote: for me the problem isn't uninstalling cert-manager, but that resources created by cert-manager in another namespace cannot be deleted, so they hold up the deletion of the namespace at Terminating.
From trying to delete the resources individually, I get errors like:
$ kubectl delete -n development certificate.cert-manager.io/ingress-tls
Error from server: conversion webhook for &{map[apiVersion:cert-manager.io/v1alpha2 kind:Certificate metadata:map[creationTimestamp:2020-04-09T17:23:46Z generation:1 name:ingress-tls namespace:development ownerReferences:[map[apiVersion:extensions/v1beta1 blockOwnerDeletion:true controller:true kind:Ingress name:ingress uid:42957329-135a-44e4-8823-e104a94d9e74]] uid:897ca221-698a-4ea0-94b9-e092932ac467] spec:map[dnsNames:[<snip>] issuerRef:map[group:cert-manager.io kind:ClusterIssuer name:letsencrypt-production] secretName:ingress-tls] status:map[conditions:[map[lastTransitionTime:2020-04-09T17:23:46Z message:Certificate does not exist reason:NotFound status:False type:Ready]]]]} failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found
For anyone wandering in here, I managed to solve this by updating my CRDs as noted here. In my case the problem was that I had installed cert-manager in a namespace other than the default cert-manager, which is baked into the CRDs.
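The error quoted earlier posts to cert-manager-webhook.cert-manager.svc, which suggests the CRD's conversion webhook still points at the cert-manager namespace. A quick way to confirm which namespace a CRD references might look like the sketch below. The helper name crd_webhook_namespace is hypothetical, and both field paths are queried because the location of the webhook client config differs between the apiextensions.k8s.io/v1beta1 and v1 CRD schemas.

```shell
# Hypothetical helper: print the namespace that a CRD's conversion webhook
# points at. Both field paths are tried because the location differs between
# apiextensions.k8s.io/v1beta1 and v1; a missing field prints nothing.
crd_webhook_namespace() {
  kubectl get crd "$1" \
    -o jsonpath='{.spec.conversion.webhookClientConfig.service.namespace}{.spec.conversion.webhook.clientConfig.service.namespace}'
}

# Usage (against a live cluster):
#   crd_webhook_namespace certificates.cert-manager.io
```

If the printed namespace differs from the one cert-manager actually runs in, the CRDs need to be regenerated or updated for that namespace.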