Community: devel/sig-scheduling: Document the scheduling queues and events

Created on 10 Feb 2021 · 9 Comments · Source: kubernetes/community

Refers to #5237.

Document the different scheduling queues (active, backoff, unschedulable) and how pods move from one to another.
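To make the request concrete, here is a minimal toy model of the three queues and the transitions between them. It is only a sketch, not kube-scheduler code: the names `toyQueues`, `Add`, `MarkUnschedulable`, `OnClusterEvent` and `OnBackoffExpired` are hypothetical, and the transition rules are an assumed simplification (unschedulable pods wait until a cluster event, then back off, then return to the active queue). The real scheduler has more cases than this.

```go
package main

import "fmt"

// queueName identifies one of the three queues named in the issue.
type queueName string

const (
	activeQ        queueName = "activeQ"
	backoffQ       queueName = "backoffQ"
	unschedulableQ queueName = "unschedulableQ"
)

// toyQueues is a hypothetical model that only records which queue
// each pod currently sits in. It does not order or schedule pods.
type toyQueues struct {
	where map[string]queueName
}

func newToyQueues() *toyQueues {
	return &toyQueues{where: make(map[string]queueName)}
}

// Add places a newly created pod in activeQ.
func (q *toyQueues) Add(pod string) { q.where[pod] = activeQ }

// MarkUnschedulable parks a pod whose scheduling attempt failed.
func (q *toyQueues) MarkUnschedulable(pod string) { q.where[pod] = unschedulableQ }

// OnClusterEvent models an event such as "a new node is added":
// in this simplified model every parked pod moves to backoffQ.
func (q *toyQueues) OnClusterEvent() {
	for pod, name := range q.where {
		if name == unschedulableQ {
			q.where[pod] = backoffQ
		}
	}
}

// OnBackoffExpired moves a pod from backoffQ to activeQ once its
// backoff period is over. Pods in other queues are left alone.
func (q *toyQueues) OnBackoffExpired(pod string) {
	if q.where[pod] == backoffQ {
		q.where[pod] = activeQ
	}
}

func main() {
	q := newToyQueues()
	q.Add("pod-a")
	fmt.Println("created:         ", q.where["pod-a"])
	q.MarkUnschedulable("pod-a")
	fmt.Println("attempt failed:  ", q.where["pod-a"])
	q.OnClusterEvent()
	fmt.Println("cluster event:   ", q.where["pod-a"])
	q.OnBackoffExpired("pod-a")
	fmt.Println("backoff expired: ", q.where["pod-a"])
}
```

The model deliberately leaves out priorities, periodic flushing and moveRequestCycle; the document requested here would need to cover those.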

/sig scheduling

area/developer-guide, help wanted, sig/scheduling

All 9 comments

cc @Huang-Wei

/area developer-guide

/help

This is a good chance to learn scheduling queuing design - the hard way :) I can be the shepherd.

At a high level, I'd like the doc to explain the following aspects in detail (but not limited to these):

  • What are activeQ, unschedulableQ and backoffQ?
  • How is a pod moved if it cannot be scheduled (to unschedulableQ/backoffQ)? How does moveRequestCycle come into play?
  • How is a pod moved back to activeQ?
  • How do cluster events (e.g., a new node is added) affect pod movement between queues? (This can be put on hold, as we're introducing new mechanics for how a cluster event triggers pod movement.)
  • Cluster admin perspective: what are the default settings for the initial/maximum backoff periods and for flushing backoffQ/unschedulableQ, and which ones can be customized, and how?
  • What are the metrics related to queueing behavior?
  • More...
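For the initial/maximum backoff question, a small sketch may help frame what the doc should explain. It assumes the backoff period doubles with every failed attempt and is capped at a maximum, and it assumes an initial value of 1 second and a maximum of 10 seconds. The function and constant names are hypothetical, and the values should be checked against the scheduler configuration of the Kubernetes version in use.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed values for illustration only; verify them against the
// scheduler configuration of your Kubernetes version.
const (
	assumedInitialBackoff = 1 * time.Second
	assumedMaxBackoff     = 10 * time.Second
)

// backoffDuration returns how long a pod would wait after `attempts`
// failed scheduling attempts: the initial period doubles with each
// further attempt and never exceeds max.
func backoffDuration(attempts int, initial, max time.Duration) time.Duration {
	d := initial
	for i := 1; i < attempts; i++ {
		// Compare with subtraction so that doubling cannot overflow.
		if d > max-d {
			return max
		}
		d += d
	}
	if d > max {
		return max
	}
	return d
}

// backoffSeconds applies the assumed values above and returns whole seconds.
func backoffSeconds(attempts int) int {
	return int(backoffDuration(attempts, assumedInitialBackoff, assumedMaxBackoff) / time.Second)
}

func main() {
	for attempts := 1; attempts <= 6; attempts++ {
		fmt.Printf("attempts=%d backoff=%ds\n", attempts, backoffSeconds(attempts))
	}
}
```

Under these assumptions the wait grows as 1s, 2s, 4s, 8s and then stays at the 10s cap, which is the kind of table the cluster admin section could include.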

@Huang-Wei:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

/assign

First version of the document: https://docs.google.com/document/d/1gy1b6dceiKCdM3oFalfJNfmiHtb46bTUz0a2pC_W5Bo/edit

Feel free to comment.

I can definitely see myself adding diagrams of the various movements of pods between queues. Does anyone know a useful tool (maybe online) that can draw nice diagrams?

Thanks @ingvagabund, very nice document. I added some comments.

> I can definitely see myself adding diagrams of the various movements of pods between queues. Does anyone know a useful tool (maybe online) that can draw nice diagrams?

I usually use draw.io or Google Drawings (better for collaboration). Both are online tools.

Thank you @ingvagabund, that's indeed a nice document! I learnt many things I didn't know before.

Given that this doc is targeted at contributors (the file path is in contributors/devel/sig-scheduling), I think it'd help to point to the source files where some of the important logic is defined. I think line number information for very specific details will become out of date very quickly, but the file name should be a good starting point.

> I think line number information for very specific details will become out-of-date very quickly, but the file name should be a good starting point.

We could link to line numbers at the current latest commit hash (I try to do this in GH comments for this reason), or include code snippets. But either way, yeah, pointing to where the logic lives is a good idea.
