Non-zero priority for linkerd pods
Some clusters have eager processes that will try to keep the nodes busy in all situations. When that happens, linkerd pods (currently installed with priority 0) can easily become unresponsive, endangering the behaviour of meshed pods/services (which can easily be all of them).
What do you want to happen? Add any considered drawbacks.
Default to max, with an override possible for other use cases I couldn't think of.
Is there another way to solve this problem that isn't as good a solution?
@grampelberg mentioned kustomize as a potential solution, in which case it might deserve a documentation sample (see the sketch below).
linkerd upgrade --priorityclass=mypriorityclass
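A minimal kustomize sketch of the overlay approach @grampelberg suggested, assuming a reasonably recent kustomize (the inline `patches` form) and that the rendered output of `linkerd install` is checked in as `linkerd.yaml`; the file layout and the `linkerd-critical` class name are illustrative, not anything linkerd ships today:

```yaml
# kustomization.yaml
resources:
  - linkerd.yaml   # output of `linkerd install > linkerd.yaml`

# Add a priorityClassName to every Deployment in the linkerd namespace.
patches:
  - target:
      kind: Deployment
      namespace: linkerd
    patch: |-
      - op: add
        path: /spec/template/spec/priorityClassName
        value: linkerd-critical
```

Applied with `kubectl apply -k .`, this patches each control-plane Deployment without forking the install manifests.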
If you can, explain how users will be able to use this. Maybe some sample CLI output?
N/A
A couple of questions come to mind:
--ha only flag? system-cluster-critical sounds valuable.

I spent some time reading the docs, these are my 2 cents on this, @grampelberg:

--ha can have a good priority default. We can use system-cluster-critical, but I see that core k8s components like the api-server, scheduler, and controller-manager use it. Are we more important than them? But I don't think we will see places where the cluster has to make decisions between core components and linkerd. It could be a problem in a case where the cluster is small but controller-replicas was given a high number, which is rare. This is a viable option.
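For reference, the built-in classes resolve to fixed values (system-cluster-critical is 2000000000, system-node-critical is 2000001000), and a pod opts in via priorityClassName. A minimal sketch against a hypothetical control-plane Deployment; only the priorityClassName line is the point here:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: linkerd-controller   # illustrative name
  namespace: linkerd
spec:
  template:
    spec:
      priorityClassName: system-cluster-critical  # resolves to priority 2000000000
```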
Good example on cluster autoscaling with pod priority. Looks like it doesn't do anything differently. Any specific questions here?
I think it does respect the taint settings: when the taint is NoExecute it will not place the pod if there are no nodes with lower-priority pods to preempt, and when it is PreferNoSchedule it tries, and may choose that tainted node if it can't find a better, lower-priority pod to preempt elsewhere.
@Esardes I just put some docs together around this, does linkerd/website#384 work for you?
@grampelberg this is perfect, thank you!
It might be worth changing the doc from priorityClassName: system-cluster-critical to a user-created class, as I am not sure we can actually use it for our pods (its value is set at 2 billion, and I understood that as cluster users/admins we are limited to 1 billion).
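A rough sketch of that user-created alternative (the linkerd-critical name is illustrative): user-defined PriorityClass values are capped at 1000000000, which is why system-cluster-critical's 2000000000 is out of reach for our own pods.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: linkerd-critical
value: 1000000000        # maximum allowed for user-defined classes
globalDefault: false
description: "High priority for the Linkerd control plane."
```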
@Esardes that's great feedback, I wanted those docs to be more of an example than anything else. I still think this is a valid feature request and should go into --ha or something by default with the correct value.