Investigate whether it would be feasible to create a custom scheduler for scheduling TaskRun pods, e.g. co-scheduling pods that share a workspace PVC volume.
When the Affinity Assistant was introduced, it solved problems with concurrent access to workspace volumes and with deadlocks when pods were scheduled to different availability zones (AZs).
Using pod-affinity to achieve Node Affinity for TaskRun pods was the least complex solution that was evaluated.
The current solution works for common cases, but it is not perfect. For example, there may be problems when TaskRuns require different amounts of resources and the Nodes need to be autoscaled up, as in https://github.com/tektoncd/pipeline/issues/3049
Adding a custom scheduler will probably introduce more complexity and code, but it would probably solve the problem in a more generic way than the Affinity Assistant.
It seems like writing a custom scheduler is pretty straightforward: https://github.com/kelseyhightower/scheduler
but dealing with edge cases would probably be a lot of effort. I wonder if it would be possible to write a best-effort scheduler that runs first, but bails out to the real one in complex situations.
I suggest that we evaluate whether the Affinity Assistant can be implemented with the Scheduling Framework.
FYI: https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/20180409-scheduling-framework.md
We could also perhaps enhance the coscheduling plugin to support this requirement: https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/coscheduling
@jlpettersson @dlorenc
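As a rough illustration of what a per-node filter step could check, here is a sketch in plain Go with no Kubernetes dependencies. It is not the Scheduling Framework plugin interface and not the coscheduling plugin's code: `Pod`, `filterNode` and `feasibleNodes` are hypothetical names, and the rule assumed here (a pod may only land on the node where other pods using the same workspace PVC already run) is just one possible reading of the Affinity Assistant's behaviour.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a TaskRun pod.
type Pod struct {
	Name         string
	WorkspacePVC string // empty if the pod mounts no workspace PVC
	NodeName     string // empty while the pod is unscheduled
}

// filterNode reports whether the candidate node is acceptable for pod.
// A node is rejected when an already placed pod uses the same workspace
// PVC on a different node.
func filterNode(pod Pod, node string, placed []Pod) bool {
	if pod.WorkspacePVC == "" {
		return true // no shared volume, no constraint
	}
	for _, other := range placed {
		if other.NodeName == "" || other.WorkspacePVC != pod.WorkspacePVC {
			continue
		}
		if other.NodeName != node {
			return false
		}
	}
	return true
}

// feasibleNodes applies filterNode to every candidate node, in order.
func feasibleNodes(pod Pod, nodes []string, placed []Pod) []string {
	var out []string
	for _, n := range nodes {
		if filterNode(pod, n, placed) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []string{"node-1", "node-2", "node-3"}
	placed := []Pod{{Name: "task-1", WorkspacePVC: "ws-a", NodeName: "node-2"}}

	// Shares "ws-a" with task-1, so only task-1's node should remain.
	fmt.Println(feasibleNodes(Pod{Name: "task-2", WorkspacePVC: "ws-a"}, nodes, placed))

	// Uses a volume nobody else has mounted, so every node should remain.
	fmt.Println(feasibleNodes(Pod{Name: "task-3", WorkspacePVC: "ws-b"}, nodes, placed))
}
```

The first pod of a PipelineRun is unconstrained under this rule, so resource fit and autoscaling for that first placement would still have to be handled by other parts of the scheduler.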
@denkensk
That's very interesting, I would like to give it a try.
I think a good concrete next step here would be for someone to experiment/prototype with the scheduler framework and report back to the community with any findings/demos, and help us concretely understand what the code would look like to, for instance, replace AA with custom scheduling.
Based on those findings we could start a design doc to more concretely outline requirements and next steps, or maybe determine that delving into scheduling really isn't worth the effort and shouldn't be pursued at this time.
@vincent-pli is that something you'd be interested in exploring and driving?
@ImJasonH
Yes, as @dlorenc mentioned, the custom scheduler is straightforward and I can copy/paste from https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/coscheduling as @denkensk said.
Anyway, I will make a demo and report back here.
@ImJasonH @denkensk
I have made a very rough draft implementation here: https://github.com/vincent-pli/coscheduler-same-node
Please take a look.
This looks cool @vincent-pli
What is the next step? Can I help?
We should probably imitate the logic of the Affinity Assistant in a scheduler and add it to the experimental repository, so that we can eventually replace the Affinity Assistant with the scheduler.
@jlpettersson @ImJasonH
I think we can add it to the experimental repository first and then discuss further implementation work.
I'd be glad to create a PR for it if needed.
That would be great, I'd be happy to do any reviews and approve any PRs to add it to experimental. If we decide to try to move it into Tekton core we'd need a TEP, but it sounds like it should be usable without that in the near term at least.
Thanks!
Great, let's add it to the experimental repository first.