Pipeline: Schedule Pods in resource-constrained environments

Created on 5 Apr 2019 · 4 Comments · Source: tektoncd/pipeline

Expected Behavior

In a resource-constrained environment like a namespace with resource limits imposed (or just an insufficiently provisioned cluster), creating a TaskRun (Pod) that exceeds those limits should not fail the TaskRun, but should instead continually try to create the Pod until it either succeeds or times out.

Actual Behavior

Pods fail to start and the TaskRun is marked as failed almost immediately.

Steps to Reproduce the Problem

Define a namespace with resource constraints (e.g., 10 CPU, 10 GB RAM).
Create 15 TaskRuns, each requesting 1 CPU and 1 GB RAM and running hello world or something similarly simple.
~10 of those will be scheduled and will succeed; the rest will fail due to insufficient resources.

Additional Info

It's unclear whether users would expect TaskRuns waiting for sufficient resources to queue in order of the time they were created, or whether they'd expect the Kubernetes scheduler to do whatever it needs to do to schedule the Pods. As an initial implementation it's probably fine to have Kubernetes schedule Pods, and not have to worry about enforcing FIFO.

good first issue · help wanted · kind/feature

Source

ImJasonH

All 4 comments

/assign @sbwsg

sbwsg on 17 Apr 2019

🎉2

Been working through some of the implementation details in a POC but want to drop current working notes here since I likely won't be able to work on it more until tomorrow.

Catching a pod failure is relatively straightforward; checking the error message produced by the createPod() func in pkg/reconciler/v1alpha1/taskrun/taskrun.go reveals the reason. From here it's quick to parse out the error message and look for e.g. "exceeded quota" in the string. This relies on a somewhat brittle contract though. I'll also need to check for the different messages generated both by LimitRanges as well as ResourceQuotas since it looks like they both enforce resource limits on a pod. I'm currently looking around to see if there's a less brittle approach to this error checking.
Once the resource constraint error is detected, the pod creation needs to be retried. In my POC implementation this works by simply Enqueue()ing the TR to be re-assessed on the next reconcile loop. This results in many rapid reruns, however, when really it would be nicer to see an exponential backoff strategy similar to that used by k8s' job controller. The job controller uses a particular kind of workqueue rate limiter to implement this ("ItemExponentialFailureRateLimiter") but TaskRun's controller Impl uses the "RateLimitedQueue", which is set up via knative's controller.NewImpl() func. So I'm looking at other alternatives to implement this.

sbwsg on 23 Apr 2019

I'm going to move this into a design doc. There are enough variables here to seed some discussion and it'd be good to get broader input before committing to one approach.

sbwsg on 25 Apr 2019

👍2

I've started the design doc here including use cases, a draft implementation, some open questions and possible alternative implementations that I'm still working through.

sbwsg on 29 Apr 2019
