Cert-manager: Deleting a certificate resource does not delete its secret

Created on 7 Feb 2018  路  13Comments  路  Source: jetstack/cert-manager

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
When you delete a certificate resource, the secret containing the actual certificate will not be deleted.

What you expected to happen:
I think it's safe to assume that deleting a certificate resource means the user does not want to keep the secret around, or wants to force a new certificate to be requested. For instance, when switching a ClusterIssuer from letsencrypt-staging to letsencrypt-prod.

How to reproduce it (as minimally and precisely as possible):

  1. Make sure you have a valid Issuer.
  2. Create a certificate resource or let the ingress-shim create it.
  3. Run kubectl get secrets -o yaml to check if the secret has been created
  4. Delete the certificate
  5. Run kubectl get secrets -o yaml; the secret persists
  6. Recreate the certificate, or wait until the ingress-shim recreates it
  7. Run kubectl get secrets -o yaml and compare its contents; it hasn't been overwritten.

Environment:

  • Kubernetes version (use kubectl version): 1.8.6
  • Cloud provider or hardware configuration**: AWS
  • Install tools: kops
kinfeature lifecyclrotten

Most helpful comment

The problem seems to still be present, can we reopen this issue?

All 13 comments

I'm been a big user of kube-cert-manager which deletes Secrets when the Certificate or Ingress that triggered its creation is deleted. I would say though this is a feature, and not a bug. It is a desirable feature and I agree it would nice to add to cert-manager.

It has some wrinkles to implement. When the user deletes a Certificate, all that cert-manager has is the name of the deleted Certificate. So how does it know which Secret to delete? kcm relied on a small database to track that, cert-manager would need to annotate or label Secrets with the name of the Certificate used to create it. Then when a Certificate is deleted, check all Secrets to see if any were created by the deleted Certificate.

From experience, when an Ingress is deleted, it is not always desirable to delete the associated Certificates and Secrets. At least not in a hurry. Often the Ingress is going to be recreated a few seconds later.

@munnerz I would suggest for cert-manager that Secrets have a certmanager.k8s.io/certificate-name: annotation for the Certificate that created it, and a certmanager.k8s.io/delete-with-certificate: true annotation by default. And that Certificate resources have an optonal cascadeDeleteSecrets: true/false field that determines whether the created Secret get true or false for the certmanager.k8s.io/delete-with-certificate: true.

For ingress-shim the Ingress would have a certmanager.k8s.io/cascade-delete-secrets: true/false annotation that would determine if certificates created by it would have true or false for cascadeDeleteSecrets.

The cert-manager should use use deletes on Certificates as well as a (not very frequent) reconciliation timer to trigger a reconciliation of Secrets, checking each Secret. If the Secret has delete-with-certificate: true then check if the named Certificate exists, if it does not, then delete the Secret.

This could be achieved through the use of Kubernetes garbage collection more easily, by simply setting the OwnerReference of the secret to the Certificate resource.

I can see the desire for such a feature, but I can also see why someone may not want it. I think we could make this a field on a Certificate resource to toggle the behaviour, perhaps with a cluster wide default.

Defaulting is something we've touched on before, and I'm still not certain the best way we should handle it. With the new MutatingWebhookConfiguration resource type, we could potentially allow users to do this out-of-tree instead of requiring us to keep adding fields for every default/toggle under the sun...

/kind feature

Also see #87

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

The problem seems to still be present, can we reopen this issue?

Apparently this has been fixed by #819

Yay! Looks like is has been fixed, but you have to manually enabled the fix by adding the enable-certificate-owner-ref option in the extraArgs setting. (So yeah, not obvious nor documented AFAIK!) This enables clean-up of orphaned Secrets.

I personally think this should be the default behavior as I commented above you year. Let me make two more arguments for that:

The Secret is derivative, generated data, created when you create a Certificate. If I create a Certificate resource, then delete a Certificate resource, the system should be in the same state as when I started. There shouldn't be garbage left behind.

An ACME Secret without a controlling Certificate is just garbage. It won't be renewed because the Certicate is gone. If you are worried about causing outages if someone deletes a Certificate by mistake @munnerz, then consider that if you don't delete the Secret right away, then all you have achieved is setting an outage 'time bomb' for the user in 1-90 days 馃槃

@whereisaaron it'd be best to open a new issue to track this, as this being closed will not be very discoverable! 馃槃

So yeah, not obvious nor documented AFAIK!

Not as far as I'm aware 馃槵

Cheers @munnerz - I don't actually want to resurface the argument, just leaving it here for posterity 馃憤
It would be a breaking change now we're not alpha. I'm happy I have the option, and now I know how to enable it. Once I have tested it a bit I'll look to patch the chart info or may propose an named chart option for it.

Updating the chart would be great as I'm not really a fan of having to --set "extraArgs={--enable-certificate-owner-ref=true}"

But I can confirm that the fix is working so far.

Was this page helpful?
0 / 5 - 0 ratings