Cluster-api: Unable to delete a cluster when infrastructureRef is defined incorrectly

Created on 16 Oct 2019 · 24 comments · Source: kubernetes-sigs/cluster-api

What steps did you take and what happened:
Creating a cluster definition with an incorrect infrastructureSpec results in a cluster resource that can't be deleted; the namespace where it exists exhibits the same behaviour.

Example Cluster:

```yaml
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Cluster
metadata:
  name: blank
  namespace: blank
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
    apiServerPort: 6433
  infrastructureRef:
    apiVersion: blank.cluster.k8s.io/v1alpha1
    kind: blankCluster
    name: blankTest
    namespace: blank
```

Applying this will create a new cluster resource in the namespace blank as expected:

```
k create namespace blank; k create -f ./blank.yaml
```

What did you expect to happen:

That deleting this erroneous cluster resource, or its namespace, would clean it up from the cluster. However, at this point it hangs indefinitely (even with force):

```
k get cluster -n blank
NAME    PHASE
blank   provisioning
k delete cluster blank -n blank
cluster.cluster.x-k8s.io "blank" deleted
<hang>
```

Anything else you would like to add:

As pointed out by @detiber, editing the resource and removing the infrastructureRef allows it to be removed as expected:

```
k edit cluster blank -n blank
cluster.cluster.x-k8s.io/blank edited
k delete cluster blank -n blank
Error from server (NotFound): clusters.cluster.x-k8s.io "blank" not found
```
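For anyone hitting this, the same workaround can be applied non-interactively. This is an untested sketch using the resource and namespace names from the example above; it assumes the controller re-reconciles after the patch and removes its finalizer:

```shell
# Drop the bad infrastructureRef with a JSON patch instead of `k edit`,
# then delete the Cluster.
kubectl patch cluster blank -n blank --type=json \
  -p='[{"op": "remove", "path": "/spec/infrastructureRef"}]'
kubectl delete cluster blank -n blank
```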

Environment:

  • Cluster-api version: 0.2.5
  • Minikube/KIND version: N/A (vanilla deployment on VMs)
  • Kubernetes version: (use kubectl version): 1.14.1
  • OS (e.g. from /etc/os-release): Ubuntu 18.04

/kind bug

kind/bug priority/awaiting-more-evidence

All 24 comments

/priority important-soon
/milestone v0.3.0

@thebsdbox I am facing the same issue. Will work on it.

@prankul88 Are you still working on this issue? Feel free to /assign and /lifecycle active. Thanks!

@wfernandes Yes I am working on it.

/assign
/lifecycle active

@prankul88 Thanks for reaching out. This is what I managed to find regarding this issue.

Result

I could reproduce your behavior in an environment with only CAPI components installed, that is, no infrastructure provider components. The issue was not reproducible in an environment where the appropriate infrastructure provider components were installed as well. See the 2nd experiment below.

The cluster controller performs a "foreground deletion", that is, it tries to delete the cluster's dependents before deleting the cluster object itself. In this case, the dependents are objects with owner references and the infrastructureRef.
However, we get the following error (note the `no matches for kind "AWSCluster"`):

```
E1119 21:50:06.916110 1 controller.go:218] controller-runtime/controller "msg"="Reconciler error" "error"="failed to get infrastructure.cluster.x-k8s.io/v1alpha3/AWSCluster \"test-aws-not-here\" for Cluster foobar/test: no matches for kind \"AWSCluster\" in version \"infrastructure.cluster.x-k8s.io/v1alpha3\"" "controller"="cluster" "request"={"Namespace":"foobar","Name":"test"}
```

from line 260 below:

https://github.com/kubernetes-sigs/cluster-api/blob/31f119ce49479505c6bf9201316e4f9ef6a0361d/controllers/cluster_controller.go#L255-L262

because the AWSCluster CRD does not exist in this environment. Because we get this error, the cluster object deletion doesn't follow through and the kubectl command hangs.
However, in an environment where AWS provider components are installed, it falls into the apierrors.IsNotFound(err) case during which reconcileDelete returns nil, after which the cluster object is successfully deleted.
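A quick way to tell which of these two branches you will hit is to ask the apiserver whether the referenced Kind is registered at all. This is a hedged sketch against a live cluster, using the names from the experiment below:

```shell
# If the CRD is missing entirely (the hanging case), discovery lists nothing:
kubectl api-resources --api-group=infrastructure.cluster.x-k8s.io

# If the kind is registered but the named instance is absent, the controller
# hits the IsNotFound branch and deletion proceeds:
kubectl get awscluster test-aws-not-here -n foobar
```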

_IMO, the controller is behaving as expected. This issue is only reproducible when the environment is incorrectly set up._

@detiber or @ncdc can verify my findings. I'm happy to explain further if this doesn't make sense.

Experiments

1. Environment with ONLY CAPI components installed.

I could reproduce your behavior in an environment with only CAPI components installed, that is, no infrastructure provider components.

Environment:
cluster-api GIT REF: 31f119ce4

Steps to test:

  1. kind create cluster --name=onlycapi
  2. export KUBECONFIG="$(kind get kubeconfig-path --name="onlycapi")"
  3. make release-manifests
  4. kubectl apply -f out/cluster-api-components.yaml
  5. kubectl create ns foobar

  6. Pipe the following manifest to `kubectl apply -f -`:

    ```yaml
    apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    metadata:
      name: test
      namespace: foobar
    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
            - 192.168.0.0/16
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AWSCluster
        name: test-aws-not-here
        namespace: foobar
    ```

    Saw the following in the CAPI controller logs:

    ```
    E1119 21:46:49.476864 1 controller.go:218] controller-runtime/controller "msg"="Reconciler error" "error"="no matches for kind \"AWSCluster\" in version \"infrastructure.cluster.x-k8s.io/v1alpha3\"" "controller"="cluster" "request"={"Namespace":"foobar","Name":"test"}
    ```
  7. kubectl delete cluster test -n foobar
    The above command hangs. Below is the log line output by the CAPI controller.

    ```
    E1119 21:50:06.916110 1 controller.go:218] controller-runtime/controller "msg"="Reconciler error" "error"="failed to get infrastructure.cluster.x-k8s.io/v1alpha3/AWSCluster \"test-aws-not-here\" for Cluster foobar/test: no matches for kind \"AWSCluster\" in version \"infrastructure.cluster.x-k8s.io/v1alpha3\"" "controller"="cluster" "request"={"Namespace":"foobar","Name":"test"}
    ```

2. Environment with CAPI and CAPA components installed.

Environment:
cluster-api-provider-aws git ref: 40a6a24f

Steps to Test

  1. Create a kind cluster
  2. kubectl create ns foobar
  3. Deploy CAPA provider components along with CAPI components
  4. Apply the cluster above and verify the object has been created. Notice there are errors in the CAPI and CAPA controllers because the infra ref is incorrect or does not exist.
  5. kubectl delete cluster test -n foobar
    Here the command does NOT hang and the expected behavior occurs.

@wfernandes Hi, thank you for the inputs.
I will try to give a clear description of what I have been doing to reproduce the situation.

I had set up with both CAPI and CAPO components installed.

  1. kind create cluster --name=clusterapi-test
  2. kubectl create -f https://github.com/kubernetes-sigs/cluster-api/releases/download/v0.2.7/cluster-api-components.yaml
  3. kubectl create -f https://github.com/kubernetes-sigs/cluster-api-provider-openstack/releases/download/v0.2.0/infrastructure-components.yaml

Case 1: Created a cluster object with an incorrect InfrastructureRef (errors in the CAPI controller due to that)

```yaml
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Cluster
metadata:
  name: capi-quickstart
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:
    apiVersion: blank
```

Now running `k delete cluster capi-quickstart` does get stuck in the "deleting" phase.


**Case 2:** Created cluster object (yaml below) - `k apply -f cluster.yaml` where OpenStackCluster.Spec is incorrect (errors in the CAPO controller)

```yaml
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Cluster
metadata:
  name: capi-quickstart
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"] # CIDR block used by Calico.
    serviceDomain: "cluster.local"
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: OpenStackCluster
    name: capi-quickstart
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: OpenStackCluster
metadata:
  name: capi-quickstart
spec:
  cloudName: ${OPENSTACK_CLOUD}
  cloudsSecret:
    name: cloud-config
  nodeCidr: ${NODE_CIDR}
  externalNetworkId: ${OPENSTACK_EXTERNAL_NETWORK_ID}
  disablePortSecurity: true
  disableServerTags: true
```

```
k get clusters
NAME              PHASE
capi-quickstart   provisioning
```

Now when I try deleting the cluster, I expect it to first delete all the objects with its ownerReferences and the InfrastructureRef (i.e. the OpenStackCluster object must be deleted) and then delete itself. But the command `k delete cluster capi-quickstart` gets stuck at the "deleting" phase.
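When a delete hangs like this, inspecting the infra object's metadata usually shows the deletionTimestamp set while the finalizer is still present, which is why the object never leaves etcd. A hedged sketch against a live cluster, using the names from the manifest above:

```shell
# A non-empty deletionTimestamp plus a remaining finalizer means the
# provider controller has not finished (or cannot finish) its cleanup.
kubectl get openstackcluster capi-quickstart \
  -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
```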

@prankul88 Thanks for the clarification of your steps of reproduction. I believe that I was able to reproduce the behavior with CAPI and CAPA provider components and have a better understanding of what's going on.

Experiment

  1. After setting up an environment with CAPI/CAPA components, I applied the yaml below to create your Case 2.
    ```yaml
    apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    metadata:
      name: wff-test
      namespace: default
    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
            - 192.168.0.0/16
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AWSCluster
        name: wff-test
        namespace: default
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSCluster
    metadata:
      name: wff-test
      namespace: default
    spec:
      region: us-east-2
      sshKeyName: SUPER-DUPER-BAD-SSH-KEY
    ```
  2. Now when we issue a DELETE to a Cluster object (kubectl delete cluster wff-test) , the capi-controller-manager eventually issues a DELETE request to delete the Infrastructure reference.
    https://github.com/kubernetes-sigs/cluster-api/blob/08b29593e201a02a9805b8f839e5e22b789b0ff3/controllers/cluster_controller.go#L266-L270
  3. The Infrastructure provider controller such as capa-controller-manager gets that request and begins to reconcileDelete.

    • If the deletion is successful, the finalizer on the AWSCluster object is removed, which then allows for that object to be deleted from etcd. Once that happens, the capi-controller-manager will find that the InfraRef is gone and promptly remove the finalizer on the Cluster object. Then k8s will remove the Cluster object from etcd. When that happens the kubectl command is returned.

    • If the deletion is unsuccessful, the finalizer on the AWSCluster object is kept on the object which keeps it in etcd. Because the AWSCluster object is around, the Cluster object doesn't get cleaned up.

      The removal of the finalizer behavior is the same for the OpenStackCluster controller.

      https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/f993fb3706a8a01956442cd512cce9a979e1dc69/controllers/openstackcluster_controller.go#L273-L276

  4. kubectl sets the PropagationPolicy to Background by default. After it issues a DELETE request on the Cluster object, it does a GET to the Cluster object to verify if it is removed from etcd. Since the Cluster object remains in etcd, kubectl hangs for this case.
    kubectl delete cluster wff-test -v9 provides more info explaining this point.

Result

So the easiest way to get around this particular issue is one of the following:

  1. kubectl delete cluster wff-test --wait=false which does not wait for finalizers and returns immediately. However, the objects will still remain in etcd.
  2. Delete the finalizer from the AWSCluster object (kubectl edit awscluster wff-test) and manually verify the environment does not have any resources left around.
  3. Or you can edit the infrastructure ref on the Cluster object like @detiber suggested initially.
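Workaround 2 can also be done without an interactive editor. This is an untested sketch; clearing finalizers forcibly should only be done after manually verifying the environment has no resources left around, as noted above:

```shell
# Clear the finalizers on the AWSCluster so it (and then the Cluster)
# can be removed from etcd.
kubectl patch awscluster wff-test --type=merge \
  -p '{"metadata":{"finalizers":null}}'
```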

Ideally, there would be a way to propagate the error from the capa-controller-manager to the capi-controller-manager to kubectl. If there is an error in deleting the AWSCluster object, we could add FailureReason and FailureMessage fields on AWSClusterStatus. Then in the capi-controller-manager we could check the AWSClusterStatus before issuing a DELETE to see if there is an error. This error could be tracked as an event on the Cluster object, or as a metric if necessary.

However, I'm not sure how to propagate the error to kubectl. 🤷‍♂️

@prankul88 As always, let me know if anything doesn't make sense and I'll be happy to explain some more.
I suggest trying to get more input from the community to better understand what the expected behavior should be.

@wfernandes Really appreciate the details you have mentioned. Will wait for more inputs before moving forward.

Hello,

I raised the issue mainly because it is certainly confusing behaviour for end-users who don't know where to look, or even that they will need to start manually editing various object specs. It's more of a UX issue if the end-user can't be notified that the delete operation is failing due to a misaligned reference, I suppose.

@thebsdbox I completely agree with you! The end-user must be informed of the possible failure.

I think there are a few different scenarios here:

  1. Cluster infraRef points to a Kind (CRD) that does not exist
  2. Cluster infraRef points to a resource that does not exist (CRD is there, but specific instance is not)
  3. Cluster reconciler encounters some sort of other error (permissions, downtime, whatever) when trying to retrieve the infrastructure resource
  4. Cluster reconciler can't proceed with deletion because the infrastructure resource is "stuck" deleting

Does that sound accurate? Am I missing anything?

@prankul88 I just wanted to check-in to see if you were still planning on working on this and if there was anything you needed to move this forward?

@prankul88 checking in again - any updates here?

@joonas @ncdc I am planning to take this forward. Need a little knowledge on how to propagate the error to kubectl

  1. Cluster infraRef points to a Kind (CRD) that does not exist

This does not create the cluster object. So no issues there.

> This does not create the cluster object. So no issues there.

I'm not sure what you mean. The original report states "Creating a cluster definition with the incorrect infrastructureSpec results in a cluster resource that can't be deleted...". Could you please clarify?

> Need a little knowledge on how to propagate the error to kubectl

Let's take things one step at a time. Which scenario are you trying to address, and where is the error getting "lost"?

@ncdc I will try to explain the scenarios which you had mentioned above in your comment.

  1. Cluster infraRef points to a Kind (CRD) that does not exist

This is valid (sorry for the last comment, I confused it with something else). It is stuck at the "deleting" phase as expected. The delete command tries to delete the infrastructure reference and then the cluster itself eventually. Since the infraRef points to a Kind that does not exist, it is stuck in the "deleting" phase.

  1. Cluster infraRef points to a resource that does not exist (CRD is there, but specific instance is not)

This does not delete the InfraRef object with the command `k delete cluster <name>`, but I don't think this issue is targeting this scenario.

  1. Cluster reconciler encounters some sort of other error (permissions, downtime, whatever) when trying to retrieve the infrastructure resource
  2. Cluster reconciler can't proceed with deletion because the infrastructure resource is "stuck" deleting

These are more likely to be covered.

Please let me know if something is missing. Thanks

Thanks. Could you pick one scenario and describe in detail what you'd like the user experience to be?

  1. kubectl delete ...
  2. ???
  3. etc

@ncdc Sure. Let's take case1 where Cluster infraRef points to a Kind (CRD) that does not exist

I apply this file:

```yaml
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Cluster
metadata:
  name: capi-quickstart
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
    kind: blank
    name: capi-quickstar
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: DockerCluster
metadata:
  name: capi-quickstart
```

Currently the output of the command `k delete cluster capi-quickstart` is something like

```
cluster.cluster.x-k8s.io "capi-quickstart" deleted
```

and the command does not actually complete. Rather, when I do `k get clusters`, the cluster is still in the "deleting" phase.

```
NAME              PHASE
capi-quickstart   deleting
```

My expected behaviour after the objects are created is something like:

  1. k delete cluster capi-quickstart
  2. kubectl must log why the cluster can't be deleted, e.g. `Cannot delete object: no matches for kind 'blank' in infrastructureRef`
  3. The command must exit and not hang in any case.

@prankul88 I do not believe we'll be able to implement what you've described. When you issue kubectl delete, it has a --wait flag that defaults to true. If a resource has finalizers, the apiserver sets the deletion timestamp on the resource, and the --wait=true flag causes kubectl to wait for the resource's finalizers to be removed, and for the resource ultimately to be removed from etcd. If there is still a finalizer on the resource, which is what happens in case 1, there is nothing kubectl can do in its current form to give you any additional information as to what is going on. If you ctrl-c the kubectl delete call, the resource still has its deletion timestamp set, and the apiserver is still waiting for all the finalizers to be removed. This is the standard behavior for all Kubernetes resources, both built-in types and custom resources, and there is no way to alter the behavior of either the apiserver or kubectl without making changes to Kubernetes.

I think it may be sufficient to modify ClusterReconciler.reconcileDelete() to have it skip over 404 not found errors here:

https://github.com/kubernetes-sigs/cluster-api/blob/065eb539766dede097e206a7b549b5902d15f14a/controllers/cluster_controller.go#L256

Bumping this from the milestone for now, I think we should open smaller issues targeting what @ncdc listed in https://github.com/kubernetes-sigs/cluster-api/issues/1566#issuecomment-568034735 and address these separately in v0.3.x

/milestone Next

/priority awaiting-more-evidence

Are there any action items that we can take from here, or is this good to close?

Closing this due to inactivity, please feel free to reopen

/close

@vincepri: Closing this issue.

In response to this:

Closing this due to inactivity, please feel free to reopen

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
