I think this is a side effect of upgrading from 0.5 to 0.7, but we ended up with some null transition times, which led to validation failures:
map[string]interface {}{
"lastTransitionTime":interface {}(nil),
"message":"Order validated",
"reason":"OrderValidated",
"status":"False",
"type":"ValidateFailed"
}
That was in the conditions array. Because there is now a validator on lastTransitionTime, kubectl apply fails with: status.conditions.lastTransitionTime in body must be of type string: "null". I've been manually clearing them, but this should probably get cleaned up automatically.
/kind bug
We're having the same issue after an upgrade from 0.5.2 to 0.7.2.
It only seems to be affecting a single certificate in our environment, so a workaround might be enough.
@coderanger, do you mind elaborating on what you mean by this?
I've been manually clearing them
Are you just doing kubectl delete certificate and letting cert-manager request a new one?
Hm, this is an awkward one. That said, I wonder how the lastTransitionTime got set to nil here? I'm pretty sure we don't ever explicitly not set the transition time 😄
We migrated from v0.6.2 to v0.7.0 in our environments today, and two out of three had the same problem. Manually dropping the OrderValidated condition entry from the certificate fixed it. The working environment did not have the entry in the array in the first place. Since that is our newest environment, it may be that this condition was carried over through earlier versions and v0.7.0 is the first one to have problems with it.
What is the workaround for this problem, specifically? This is impacting our production clusters. @munnerz @cbeneke @coderanger -- Can someone please post the literal steps to solving this?
We worked around it by running kubectl delete on the certificate resource. cert-manager immediately restores it; the underlying data is stored in a secret.
FWIW, this issue did not actually prevent cert-manager from successfully re-issuing a cert with letsencrypt in our case. It successfully renewed a cert despite the error persisting and lastTransitionTime being nil.
After deleting the certificate resource, cert-manager correctly populates this value in the new cert metadata.
When I looked into the backup, the null items were from cases where something had failed. I removed them like this: kubectl delete -f backup.yaml, delete the offending rows from backup.yaml, then kubectl create -f backup.yaml.
Sadly I didn't keep my failing backup, but this was one of them
- lastTransitionTime: null
  message: Order validated
  reason: OrderValidated
  status: "False"
  type: ValidateFailed
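To find such rows in a backup before re-creating it, a plain grep is probably enough. A minimal sketch, assuming the backup is a YAML file (backup.yaml, as in the comment above) and that the bad entries are written literally as "lastTransitionTime: null"; the function name find_null_transitions is made up for illustration:

```shell
# Hypothetical helper: print the line numbers in a YAML file where
# lastTransitionTime is literally null. Only matches the spelling "null",
# not other YAML null forms such as "~" or an empty value.
find_null_transitions() {
  grep -nE 'lastTransitionTime:[[:space:]]*null[[:space:]]*$' "$1"
}

# Usage:
#   find_null_transitions backup.yaml
```

Each hit marks the first row of a condition entry; the message, reason, status and type rows that follow it belong to the same entry and need to go as well.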
@ceralena what do you mean with:
After deleting the certificate resource, cert-manager correctly populates this value in the new cert metadata.
I deleted the certificate and it creates a new one, but the error persists.
@johan-smits Did you delete the kubernetes certificate object or the generated secret? For me the issue was also solved by editing the certificate:
$ kubectl edit certificate <cert-name>
and deleting the status lines, which included the lastTransitionTime: null object:
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata: {...}
spec: {...}
status:
  conditions:
  <--- Delete from here --->
  - lastTransitionTime: null
    message: Order validated
    reason: OrderValidated
    status: "False"
    type: ValidateFailed
  <--- Delete until here --->
  - lastTransitionTime: {...}
    message: {...}
    reason: {...}
    status: {...}
    type: {...}
  notAfter: {...}
@munnerz After updating to version 0.7.1 that contains #1576 I still have this issue: status.conditions.lastTransitionTime in body must be of type string: "null"
Pod details:
Container ID: docker://3fbe6570b878484fad2a38b71bf1af1cc357d701f1443230015a369ca1ad0c18
Image: quay.io/jetstack/cert-manager-controller:v0.7.1
Image ID: docker-pullable://quay.io/jetstack/cert-manager-controller@sha256:2d893311cc28ec656bb5e902698fe856532c9465636b583c15daf346baace1c6
I've opened #1628 which should put this issue to bed 😄 It will feature in the upcoming v0.8 release. It may also be worthwhile backporting this to v0.7, although testing of v0.8 has been successful so far, so it may not be worth the hassle.
Had the same problem upgrading from v0.5.2 to v0.7.2.
We had too many Certificate resources to manually remove the status.conditions entries causing problems so I decided to just patch the status.conditions array with an empty array on all certificates and let cert-manager re-sync and update conditions. This _seems_ to work fine:
kubectl get certificate --all-namespaces --no-headers | xargs -L1 bash -c 'kubectl patch -n $0 certificate $1 --type=json -p="[{\"op\":\"replace\",\"path\":\"/status/conditions\",\"value\":[]}]"'
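For anyone who wants to see what would be patched before touching the cluster, the same idea can be written as a small loop. This is only a sketch: the clear_conditions function and the KUBECTL override are names made up for illustration, and it assumes the first two columns of kubectl get certificate --all-namespaces --no-headers are namespace and name, as in the one-liner above.

```shell
# KUBECTL is a hypothetical override; set KUBECTL=echo to print the
# commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# JSON patch that replaces status.conditions with an empty array,
# the same patch as in the one-liner above.
PATCH='[{"op":"replace","path":"/status/conditions","value":[]}]'

clear_conditions() {
  # Reads "<namespace> <name> ..." lines on stdin; extra columns are ignored.
  while read -r ns name _; do
    # Skip blank or incomplete lines.
    [ -n "$ns" ] && [ -n "$name" ] || continue
    "$KUBECTL" patch -n "$ns" certificate "$name" --type=json -p="$PATCH"
  done
}

# Usage (assumes kubectl points at the affected cluster):
#   kubectl get certificate --all-namespaces --no-headers | clear_conditions
```

Running it once with KUBECTL=echo shows exactly which Certificate resources would be patched; as with the one-liner, cert-manager is then expected to re-sync and repopulate the conditions.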
Thanks so much @gonstr