Non-zero priority for linkerd pods
Some clusters have eager processes that will try to keep the nodes busy in all situations. When that happens, linkerd pods (currently installed with priority 0) can easily become unresponsive, endangering the behaviour of meshed pods/services (which can easily be all of them).
What do you want to happen? Add any considered drawbacks.
Default to max, with an override possible for other use cases I couldn't think of.
Is there another way to solve this problem that isn't as good a solution?
@grampelberg mentioned kustomize as a potential solution, in which case it might deserve a documentation sample (see the sketch below).
linkerd upgrade --priorityclass=mypriorityclass
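A minimal kustomize sketch of the overlay approach @grampelberg suggested, assuming a reasonably recent kustomize (the inline `patches` form) and that the rendered output of `linkerd install` is checked in as `linkerd.yaml`; the file layout and the `linkerd-critical` class name are illustrative, not anything linkerd ships today:

```yaml
# kustomization.yaml
resources:
  - linkerd.yaml   # output of `linkerd install > linkerd.yaml`

# Add a priorityClassName to every Deployment in the linkerd namespace.
patches:
  - target:
      kind: Deployment
      namespace: linkerd
    patch: |-
      - op: add
        path: /spec/template/spec/priorityClassName
        value: linkerd-critical
```

Applied with `kubectl apply -k .`, this patches each control-plane Deployment without forking the install manifests.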
If you can, explain how users will be able to use this. Maybe some sample CLI output?
N/A
A couple of questions come to mind:
--ha only flag? system-cluster-critical sounds valuable.

I spent some time reading the docs, these are my 2 cents on this, @grampelberg:

--ha can have a good priority default. We can use system-cluster-critical, but I see that core k8s components like the api-server, scheduler, and controller-manager use it. Are we more important than them? But I don't think we will see places where the cluster has to make decisions between core components and linkerd. It could be a problem in a case where the cluster is small but controller-replicas was given a high number, which is rare. This is a viable option.
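For reference, the built-in classes resolve to fixed values (system-cluster-critical is 2000000000, system-node-critical is 2000001000), and a pod opts in via priorityClassName. A minimal sketch against a hypothetical control-plane Deployment; only the priorityClassName line is the point here:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: linkerd-controller   # illustrative name
  namespace: linkerd
spec:
  template:
    spec:
      priorityClassName: system-cluster-critical  # resolves to priority 2000000000
```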
Good example on cluster autoscaling with pod priority. Looks like it doesn't do anything differently. Any specific questions here?
I think it does respect the taint settings: when the taint is NoExecute it will not place the pod if there are no nodes with lower-priority pods to preempt, and when it is PreferNoSchedule it tries, and may choose that tainted node if it can't find a better, lower-priority pod to preempt elsewhere.
@Esardes I just put some docs together around this, does linkerd/website#384 work for you?
@grampelberg this is perfect, thank you!
It might be worth changing the doc from priorityClassName: system-cluster-critical to a user-created class, as I am not sure we can actually use it for our pods (its value is set at 2 billion, and I understood that as cluster users/admins we are limited to 1 billion).
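A rough sketch of that user-created alternative (the linkerd-critical name is illustrative): user-defined PriorityClass values are capped at 1000000000, which is why system-cluster-critical's 2000000000 is out of reach for our own pods.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: linkerd-critical
value: 1000000000        # maximum allowed for user-defined classes
globalDefault: false
description: "High priority for the Linkerd control plane."
```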
@Esardes that's great feedback, I wanted those docs to be more of an example than anything else. I still think this is a valid feature request and should go into --ha or something by default with the correct value.