What happened:
I'm running a CRD controller. On deployment of the CRD, the controller creates a set of Kubernetes objects: Services, a StatefulSet, a Role, a RoleBinding, etc. The operator also sets an ownerReference (pointing at the CRD) with ownerReference.blockOwnerDeletion=true on those objects.
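For context, the owner reference is attached roughly as in the minimal sketch below; this is not the actual controller code, and the package, group/version, kind, and helper names are illustrative assumptions.

```go
// Minimal sketch (assumed, not the actual controller code) of attaching an
// owner reference with blockOwnerDeletion=true to a dependent Service.
package controller

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

func boolPtr(b bool) *bool { return &b }

// addOwnerRef marks svc as a dependent of the owning custom resource so the
// garbage collector can cascade deletes; BlockOwnerDeletion=true is what lets
// foreground deletion of the owner wait for this dependent.
func addOwnerRef(svc *corev1.Service, ownerName string, ownerUID types.UID) {
	svc.OwnerReferences = append(svc.OwnerReferences, metav1.OwnerReference{
		APIVersion:         "example.com/v1", // hypothetical CRD group/version
		Kind:               "Topology",       // hypothetical kind
		Name:               ownerName,
		UID:                ownerUID,
		Controller:         boolPtr(true),
		BlockOwnerDeletion: boolPtr(true),
	})
}
```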
Now, when I delete the CRD with the foregroundDeletion policy, the deletion hangs.
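The delete itself is issued with foreground propagation, roughly as in this hedged sketch (the dynamic client usage, GroupVersionResource, namespace, and object name are assumptions for illustration, not the real code):

```go
// Minimal sketch (assumed, not the actual code) of deleting the custom
// resource with foreground cascading deletion via the dynamic client.
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Hypothetical group/version/resource and object name for the CR.
	gvr := schema.GroupVersionResource{Group: "example.com", Version: "v1", Resource: "topologies"}

	// Foreground propagation: the API server adds the foregroundDeletion
	// finalizer to the owner, and the garbage collector is supposed to delete
	// dependents with blockOwnerDeletion=true before the owner disappears.
	policy := metav1.DeletePropagationForeground
	if err := client.Resource(gvr).Namespace("demo").Delete(context.TODO(),
		"topology-es", metav1.DeleteOptions{PropagationPolicy: &policy}); err != nil {
		panic(err)
	}
}
```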
I checked the dependent objects: the deletionTimestamp and the foregroundDeletion finalizer are set, but somehow the garbage collector isn't cleaning them up.
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2020-07-28T04:22:13Z"
    deletionGracePeriodSeconds: 0
    deletionTimestamp: "2020-07-28T04:23:21Z"
    finalizers:
    - foregroundDeletion
What you expected to happen:
The created Services, StatefulSet, Role, RoleBinding, etc. should be deleted first, and once the garbage collector has removed all of them, the CRD itself should be removed.
Anything else we need to know?:
Also, when I described the Services, I saw warnings like the ones below:
Warning ClusterIPNotAllocated 90s (x3 over 19m) ipallocator-repair-controller Cluster IP 10.102.62.89 is not allocated; repairing
Warning FailedToUpdateEndpointSlices 18m endpoint-slice-controller Error updating Endpoint Slices for Service demo/topology-es-master: Error updating topology-es-master-xwwdt EndpointSlice for Service demo/topology-es-master: endpointslices.discovery.k8s.io "topology-es-master-xwwdt" not found
Environment:
$ kind version
kind v0.8.1 go1.14.2 linux/amd64
$ kubectl version --short
Client Version: v1.18.3
Server Version: v1.18.2
This appears to be https://github.com/kubernetes/kubernetes/issues/87603.
We're not doing anything special for clusterIP allocation or garbage collector settings, so I'm pretty sure this is purely an upstream Kubernetes bug you've found.
@BenTheElder Any idea how to work around this? I can delete with the backgroundDeletion policy, but I also need to make sure that the dependent objects are removed.
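For reference, the fallback I have in mind looks roughly like the sketch below: delete with background propagation, then poll until a known dependent is gone. The function, GroupVersionResource, and client parameters are illustrative assumptions, not working code from the operator.

```go
// Hedged sketch (assumed, not verified against this bug) of the fallback:
// delete the owner with background propagation, then poll until a known
// dependent Service has actually been garbage-collected.
package workaround

import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/kubernetes"
)

// deleteAndWait deletes the custom resource with background propagation and
// blocks until the named dependent Service is gone (or the timeout expires).
// The GroupVersionResource below is a hypothetical placeholder.
func deleteAndWait(dyn dynamic.Interface, cs kubernetes.Interface, ns, crName, svcName string) error {
	gvr := schema.GroupVersionResource{Group: "example.com", Version: "v1", Resource: "topologies"}

	policy := metav1.DeletePropagationBackground
	if err := dyn.Resource(gvr).Namespace(ns).Delete(context.TODO(), crName,
		metav1.DeleteOptions{PropagationPolicy: &policy}); err != nil && !apierrors.IsNotFound(err) {
		return err
	}

	// Poll until the garbage collector has removed the dependent Service.
	return wait.PollImmediate(2*time.Second, 5*time.Minute, func() (bool, error) {
		_, err := cs.CoreV1().Services(ns).Get(context.TODO(), svcName, metav1.GetOptions{})
		if apierrors.IsNotFound(err) {
			return true, nil
		}
		return false, err
	})
}
```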
I don't think there's a good workaround, there's a fix in progress upstream. I've commented there.
I am seeing the same problem with our own CRDs (not Service!). Everything works in Minikube, CRC, and OpenShift. But in Kind the finalizer foregroundDeletion never gets deleted and the resource is stuck.
@ctron I need a little more information than that; is this on the same Kubernetes version?
kind's Kubernetes behaviors are generally upstream kubernetes source + kubeadm for configuration (we deviate as little as possible from defaults)
It seems to work on Minikube (1.17.3) and OpenShift (1.18.3), but fails on Kind (0.8.1 -> kindest/node:v1.18.2).
Let me know if you need more information.
Both kind and minikube use kubeadm under the hood, so I'm curious what the difference is here.
please try a matching minikube version (k8s = v1.18.2):
https://github.com/kubernetes/minikube/releases/tag/v1.10.1 (release: "Bump Default Kubernetes version v1.18.2 and update newest")
@neolit123 Unfortunately that isn't possible due to: https://github.com/kubernetes/minikube/issues/8414
I'd appreciate it if this could be reproduced with a raw kubeadm setup too.
It looks like I can select the Kubernetes version with Minikube using --kubernetes-version=v1.18.2 … I will try that.
So I can confirm that using Minikube with 1.18.2 shows the same problem. Looks like this is a regression in Kubernetes.
You might try --image=kindest/node:v1.17.5@sha256:ab3f9e6ec5ad8840eeb1f76c89bb7948c77bbf76bcebe1a8b59790b8ae9a283a for kind v0.8.1 in the meantime.
https://github.com/kubernetes-sigs/kind/releases/tag/v0.8.0#New-Features
Just tested with Kubernetes 1.18.6, same issue
Since this is reproduced in minikube, and the original issue is tracked in https://github.com/kubernetes/kubernetes/issues/87603, I'm going to close this in the KIND tracker.
If a fix is identified upstream and a release is cut, we'll be sure to provide a pre-built image with it.
In the meantime we do provide other pre-built images, and it's possible (though currently a bit of work) to build your own images at fairly arbitrary versions.
Btw … switching back to 1.17.x with Kind works as well.
Excellent. If you can identify the kubernetes bug please file an issue with the kubernetes/kubernetes tracker so we can get it fixed upstream.
Or kubernetes/kubeadm if it turns out to be some kubeadm setting.