Currently, deletion of Linkerd from a k8s cluster is done via the linkerd install --ignore-cluster command, piped to kubectl delete. This has worked so far, but after the work carried out in #3600, some questions emerged. Namely, the previous command relies on the fact that we can generate a YAML file that is identical to the one used by the installation phase. In certain scenarios, when we cannot ignore the cluster and we need to fetch some data from it to generate the file (e.g. using --external-issuer=true and fetching the trust anchors from a secret), the standard approach simply does not work.
I think what we need to have is a generic linkerd delete/uninstall/any-other-suitable-name command that will simply generate a yaml file that can be used to delete the linkerd resources from the k8s cluster. This command should work even if there is no connection to the cluster.
Should it be a text-only command that outputs the minimum YAML needed to kubectl delete and get rid of Linkerd, or should it actually talk to the cluster and perform the whole deletion?
If it's just a YAML command, does it make sense to have a separate Helm chart for it?
Are there any edge cases we need to think about (e.g. having an --external-issuer flag that leaves the namespace alone once #3600 is merged)?
cc @grampelberg
> Should it actually be a text-only command that outputs the minimum YAML needed to do kubectl delete and get rid of Linkerd, or should it actually talk to the cluster and perform the whole deletion?
So far, we don't modify the cluster at all. I think it makes sense to do that still. Maybe linkerd delete should output just the component names instead of all the YAML?
> If it's just a YAML command, does it make sense to have a separate Helm chart for it?
Hmm, with helm, it'd just be helm delete and that would clean everything up, right?
@zaharidichev I would like to work on this issue.
@supra08 Thanks for your interest. It looks to me like there are more questions than answers in this issue so far. Any thoughts on how you would go about solving this? The goal is to provide a command to delete all "live" Linkerd namespace-scoped and cluster-scoped resources, regardless of the Linkerd version.
For additional context, currently, we are using either linkerd install --ignore-cluster|k delete -f - or k delete [all,clusterrole,clusterrolebinding,...] -l linkerd.io/control-plane-ns to achieve the deletion.
Is it possible to accomplish a full delete of Linkerd by using kubectl delete with a label selector, or by using kubectl apply --prune with a label selector? Is there a label that we have on all Linkerd resources?
@ihcsim should we also delete running proxies?
@grampelberg Good question. I don't see why someone would want to purge the control plane, but keep the proxies running in the data plane. Sounds like this command can first uninject all the proxies, then delete the control plane. If we don't want this command to modify a live cluster, it will need permissions to select pods from all namespaces.
FWIW, an operator that will automatically uninject pods, when it receives a Delete Linkerd namespace event feels more correct than doing it with a CLI.
@ihcsim assuming folks use auto-injection, we could just delete the injector first and have folks clean up the proxies, then output the yaml to remove the control plane.
FWIW, this is all way more complicated than having a command which outputs enough to clean up the control plane. We should keep this issue scoped to that for now.
@grampelberg So to summarize, if we choose to do this, we will limit it to just the control plane. And the new command will:

1. select all Linkerd resources by the linkerd.io/control-plane-ns label (see https://github.com/linkerd/linkerd2/pull/2971). This ensures that we aren't coupled to the CLI version, where the output of linkerd install might be different from the live resources
2. pipe the result to kubectl delete

@adleong kubectl delete requires us to specify all the kinds. (See https://github.com/linkerd/linkerd2/issues/3622#issuecomment-588500514.) The all kind only returns namespace-scoped resources.
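As a rough sketch of that flow (assuming the linkerd.io/control-plane-ns label; the kind list here is illustrative, not exhaustive), the command could assemble the kubectl invocation and print it for review rather than executing it:

```shell
#!/bin/sh
# Sketch of the label-selector cleanup discussed above. The kind list is
# illustrative, not exhaustive: kubectl delete needs cluster-scoped kinds
# spelled out, since "all" only returns namespace-scoped resources.
build_delete_cmd() {
  kinds="all,clusterrole,clusterrolebinding"
  selector="linkerd.io/control-plane-ns"   # label applied to Linkerd resources
  printf 'kubectl delete %s -l %s\n' "$kinds" "$selector"
}

# Print the command instead of running it, so it can be reviewed first.
build_delete_cmd
```

Printing instead of deleting keeps the command itself cluster-agnostic; the user decides when to pipe the result to a shell.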
@ihcsim we should either:

1) look up install options from the configmap, output the identical install and let users pipe to delete, or
2) pick the important things we need to clean up (CRD, namespace, APIService), output only those in the minimal possible sense (just names and kinds) and let users pipe that to delete.
From a complexity perspective, option 1 feels like the most robust, but I definitely don't understand the details there.
I had been into the docs to get a better understanding of the issue. From what I infer, option 1 as suggested by @grampelberg looks like a better option.
But in what aspects is it different from linkerd install | kubectl delete ...? The config YAML would be the same as the stdout of linkerd install. We are mainly aiming for a better linkerd command to achieve this. Am I correct about this?
@ihcsim
> look up install options from the configmap, output the identical install and let users pipe to delete.
@grampelberg Are there any install options that will alter the number of generated resource kinds? As @supra08 pointed out, won't that just be the same as linkerd install | k delete -f - (other than there is less YAML in the output)? Also, this still requires the CLI to be the same version as the control plane. IMO, a command like linkerd delete should be able to clean up all Linkerd resources, regardless of the control plane's version.
@ihcsim sure, especially once we introduce addons.
I am a bit unsure about the resources being referred to. Do they include the proxies? And by addons, are we referring to this?
@grampelberg @ihcsim
@supra08 The resources refer to everything defined in the linkerd install output. It includes the control plane components, each has its own Linkerd proxy sidecar container.
> And by addons are we referring to this?
Yes, there are some WIP PRs on addons.
Do you feel like you have enough information to get started? If not, don't hesitate to post your questions here.
Yeah, thanks for the info.
I guess I can begin working on it!
@supra08 are you still working on this? else I'd like to work on this
Yeah! just need 2 days time for this.
Okay great! :)
@supra08 Can I start work on this?
Yeah sorry, I cannot continue currently, you can start working on it. 👍
Hi, is anyone currently looking at this issue? If not I'd like to pick it up.
@christyjacob4
Edit: I see that nobody is actively working on the issue, so I'd like to try my hand at it. Apologies in advance if I stepped on anyone's toes :)
I have read through all of the comments and have a further clarification to ask before I go on.
> 1) look up install options from the configmap, output the identical install and let users pipe to delete.
> 2) pick the important things we need to clean up (CRD, namespace, APIService), output only those in the minimal possible sense (just names and kinds) and let users pipe that to delete.
These are the two options that have been suggested. The first one would be a bit cleaner in my opinion, but aside from the comments raised before: if we get the install flags from the configmap and output the identical version of it to be deleted, won't we miss out on resources manually added later on?
For example, on a GKE cluster (not public):

```yaml
install: |
  {"cliVersion":"stable-2.7.0","flags":[]}
```
This would give the identical output of linkerd install; if additional RBAC has been applied to the cluster (e.g. random-user-tap-admin), then that will not be removed. Is this something we care about, or would the user have to clean up additional roles/deployments/services themselves?
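If the configmap route were chosen, reading the recorded install options back out could look roughly like this. This is only a sketch: the JSON string is hardcoded to match the entry above, whereas a real implementation would fetch it from the cluster and use a proper JSON parser.

```shell
#!/bin/sh
# Hypothetical sketch: extract the recorded CLI version from the install
# entry shown above, assuming that exact JSON shape. In practice the
# string would come from the cluster (something along the lines of
# `kubectl -n linkerd get configmap linkerd-config -o ...`).
install_json='{"cliVersion":"stable-2.7.0","flags":[]}'

# Pull out cliVersion with POSIX sed (illustrative only; a real
# implementation would use a JSON parser).
version=$(printf '%s' "$install_json" \
  | sed -n 's/.*"cliVersion":"\([^"]*\)".*/\1/p')
echo "$version"
```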
@grampelberg @ihcsim
Thank you in advance for your time, hope the above makes sense.
Forgive me for jumping in late into this issue.
From my understanding, it is sufficient to gather the cluster-scoped resources (APIService, ClusterRoles, etc.) and the namespace, and delete those. It's not necessary to regenerate all the namespace-scoped resources (Deployments, Services, etc.), as deleting the namespace will take care of deleting everything under it. This also includes the Secrets, Issuer and Certificate that were created outside linkerd install (which was the initial problem raised in this issue).
Like it was mentioned above, something like:
```sh
kubectl get APIService,ClusterRoleBinding,CustomResourceDefinition,MutatingWebhookConfiguration,ValidatingWebhookConfiguration,PodSecurityPolicy,Namespace \
  -l linkerd.io/control-plane-ns=linkerd -oyaml
```
So this new linkerd uninstall command can be nothing more than a wrapper around this kubectl command. Am I missing something?
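A minimal wrapper along those lines might be nothing more than the following sketch (the script name is hypothetical; the kind list is copied from the comment above). It assembles and prints the pipeline so users can inspect what will be deleted before running it:

```shell
#!/bin/sh
# Hypothetical linkerd-uninstall sketch: a thin wrapper that prints the
# kubectl pipeline from the comment above instead of executing it, so
# the user can review it first.
KINDS="APIService,ClusterRoleBinding,CustomResourceDefinition,\
MutatingWebhookConfiguration,ValidatingWebhookConfiguration,\
PodSecurityPolicy,Namespace"
SELECTOR="linkerd.io/control-plane-ns=linkerd"

printf 'kubectl get %s -l %s -oyaml | kubectl delete -f -\n' \
  "$KINDS" "$SELECTOR"
```

The downside mentioned below still applies: the kind list is hardcoded in one place and would need updating if Linkerd ever adds a new cluster-scoped kind.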
I really like @alpeb's suggestion. The only downside that I can think of is that we need to hardcode the list of cluster-scoped kinds, and this could theoretically change at some point. But I think this is outweighed by the benefit that this command will still work even if the list of installed resources changes between versions, as it often does (such as when we add new RBAC, for example).
Hi, I've read the new comments and thought I'd weigh in with my own opinion.
The way I have implemented it seems a bit heavy and unnecessarily complex, so I welcome the idea of simplifying it as @alpeb suggested. I am thinking of using the k8s clientset to get a list of all of the resources and output it to stdout, so that it can be piped to kubectl delete. I will also get rid of the subcommands (e.g. linkerd uninstall config), since they won't be necessary and I am not sure they would add any value; I originally added them to be consistent with the rest of the CLI.
If this sounds good, I'll go ahead and change the PR, otherwise, let me know if you have further suggestions to make to the approach.
@adleong @alpeb @grampelberg @ihcsim
@Matei207 this sounds great to me! thanks for taking this on!