If I set resources.requests in each Tekton task step, I would like that setting to be kept as my original resources.requests value when the pod is created.
The tektoncd pkg/pod logic overrides the container's resources.requests value, as described in this doc:
https://github.com/tektoncd/pipeline/blob/master/docs/taskruns.md#specifying-limitrange-values
This is not good for us, because I want to control resources.requests (setting them equal to resources.limits) for my TaskRun steps/containers.
A Step embeds corev1.Container:

```go
type Step struct {
	corev1.Container `json:",inline"`

	// If Script is not empty, the Step cannot have a Command or Args.
	Script string `json:"script,omitempty"`
}
```
This means I can define resources.limits and resources.requests for each step myself:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/core/v1/types.go#L2260

```go
type ResourceRequirements struct {
	Limits   ResourceList `json:"limits,omitempty" protobuf:"bytes,1,rep,name=limits,casttype=ResourceList,castkey=ResourceName"`
	Requests ResourceList `json:"requests,omitempty" protobuf:"bytes,2,rep,name=requests,casttype=ResourceList,castkey=ResourceName"`
}
```
For example:

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: helloworld-
spec:
  taskSpec:
    steps:
      - name: helloworld1
        image: ubuntu
        resources:
          limits:
            cpu: "1"
            memory: "1Gi"
          requests:
            cpu: "1"
            memory: "1Gi"
        script: echo helloworld
      - name: helloworld2
        image: ubuntu
        resources:
          limits:
            cpu: "2"
            memory: "2Gi"
          requests:
            cpu: "2"
            memory: "2Gi"
        script: echo helloworld
```
After the pod is created, the first step's resources are changed to:

```
Limits:
  cpu: 2
  memory: 1Gi
Requests:
  cpu: 10m
  ephemeral-storage: 0
  memory: 128Mi
```
The second step is kept as before:

```
Limits:
  cpu: 2
  memory: 2Gi
Requests:
  cpu: 2
  ephemeral-storage: 0
  memory: 2Gi
```
I know this works as designed now, after reviewing the doc and code:
https://github.com/tektoncd/pipeline/blob/master/docs/taskruns.md#specifying-limitrange-values
But I think it is not correct: if I set resources.requests myself, it means I want exactly this setting; we don't need Tekton to refine it for us.
My use case is that I want to make resources.limits equal to resources.requests (so the pod gets the Guaranteed QoS class), and the current logic breaks that requirement.
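For illustration, here is the shape we want on every step; a minimal Go sketch using the corev1 types quoted above, with hypothetical values:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Hypothetical step resources: requests set equal to limits.
	// When every container in a pod is configured this way, Kubernetes
	// assigns the pod the Guaranteed QoS class.
	res := corev1.ResourceRequirements{
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("1"),
			corev1.ResourceMemory: resource.MustParse("1Gi"),
		},
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("1"),
			corev1.ResourceMemory: resource.MustParse("1Gi"),
		},
	}
	fmt.Println(res.Requests.Cpu().Equal(*res.Limits.Cpu())) // true
}
```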
The code logic is here:
https://github.com/tektoncd/pipeline/blob/master/pkg/pod/resource_request.go#L67-L77
The current logic is:

```go
for i := range containers {
	// If a container has no requests at all, apply the LimitRange minimum.
	if containers[i].Resources.Requests == nil {
		containers[i].Resources.Requests = limitRangeMin
		continue
	}
	// Otherwise, for each resource, only the container with the largest
	// request keeps it; every other container is reset to the minimum.
	for _, resourceName := range resourceNames {
		if maxIndicesByResource[resourceName] != i {
			containers[i].Resources.Requests[resourceName] = limitRangeMin[resourceName]
		}
	}
}
```
Right now, Tekton keeps resources.limits at the original values, but changes resources.requests as shown above.
My point is: if containers[i].Resources.Requests != nil, Tekton should skip this logic and keep using my ORIGINAL configuration, i.e. just:

```go
for i := range containers {
	if containers[i].Resources.Requests == nil {
		containers[i].Resources.Requests = limitRangeMin
		continue
	}
}
```

Please let me know if there is any other consideration.
Thanks!
Thank you @zhangtbj for opening this issue. I am on the same team as @zhangtbj.
In our use case, we are running Tekton in a multi-tenant IKS cluster, so the CPU/memory/disk resources of Tekton pods are very important. We want to avoid over-committing the cluster, so we want to set resource requests equal to resource limits for every container.
Also, some Tekton containers do very simple jobs, and we want to give those as small a resource footprint as possible. Thank you!
I don't like or want Tekton to do any intelligent check for me; I just want to keep my original resource settings!
I agree with this.
This is related to LimitRanges: https://github.com/tektoncd/pipeline/issues/2666
/cc @vdemeester @ImJasonH @bobcatfish
I tend to agree with this too. I think the initial thought around this was that resource requests on steps would sum up even though we do not run them all at the same time (and thus only the max of any request is actually required). But that smart behavior is proving more painful than it seems to be worth, especially with LimitRanges, where it can lead to unschedulable pods.
cc @danielhelfand
As @vdemeester says, the reason we try to be intelligent is because we know the steps won't run at the same time, and we expect users to understand that too when making resource requests/limits. By default the current implementation creates Pods which would sum up all the requests and result in a much larger Pod request than needed.
For example, a Task with 10 steps each individually requesting 10 CPU and 1Gi of RAM would result in a Pod requesting 100 CPU and 10Gi of RAM, 90% of which would sit unused during each step (and this request might be limited by a LimitRange or by available resources on the cluster).
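To make that arithmetic concrete, here is a small Go sketch (hypothetical numbers, not Tekton code) comparing the naive pod-level request (the sum) with what is actually needed at any instant (the max):

```go
package main

import "fmt"

func main() {
	// Hypothetical Task: 10 steps, each requesting 10 CPU.
	stepCPURequests := []int64{10, 10, 10, 10, 10, 10, 10, 10, 10, 10}

	var sumReq, maxReq int64
	for _, r := range stepCPURequests {
		sumReq += r
		if r > maxReq {
			maxReq = r
		}
	}
	// A naive pod would request the sum, but since steps run
	// sequentially, only the max is ever in use at once.
	fmt.Printf("naive pod request (sum): %d CPU\n", sumReq)   // 100 CPU
	fmt.Printf("needed at any instant (max): %d CPU\n", maxReq) // 10 CPU
}
```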
In the future, if K8s adds some feature that allowed us to run steps as initContainers instead of containers, K8s itself would apply the same sort of logic and consider only the max request, not the sum. Or if the API is running on a platform that is not K8s, the platform should reserve compute resources based on the max request, not the sum. Such a platform might be able to dynamically allocate resources per-step instead of per-container as K8s does, and might not even have a concept of LimitRange, since that concept is not included in the Tekton API.
From your report I'm unclear on why you want to override Tekton's smarts. Can you help me understand what's going wrong when Tekton's Pod requests get modified?
I wouldn't have a problem with a ConfigMap option to disable resource request smarts, but I would want us to keep the behavior the same for now until we find that most users are unhappy with it.
As another option we could add a field to TaskSpec or TaskRunSpec that specifies requests for the whole run, which would be invalid to specify along with per-step resources.
That makes it clearer that you're specifying for the entire run (i.e., underlying Pod) and removes the need for some of the per-step smarts.
Hi @ImJasonH and all,
Thanks for the suggestion.
I have a question about pod scheduling.
Do you mean pod scheduling is based on the sum of all containers' resources.requests?
Say we keep the current Tekton approach, and a Task has 10 steps, each individually requesting 10 CPU and 1Gi of RAM (no LimitRange in the namespace).
Does the TaskRun pod then request just 10 CPU and 1Gi of RAM, with each step/container able to use those 10 CPU and 1Gi of RAM during its execution time?
And is the total resource cost of this pod the max CPU/memory request * pod execution time, not the sum of each container's resources.limits * container execution time?
And where can I find documentation about this?
Thanks!
> Do you mean pod scheduling is based on the sum of all containers' resources.requests?

Yes, traditional K8s Pods request the sum of all resources requested by their containers. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#how-pods-with-resource-requests-are-scheduled
It does this because K8s Pods expect to start all containers at once, and continue to execute all containers indefinitely.
Tekton by contrast expects that steps will be run sequentially and won't need to share resources. It currently handles this by requesting a Pod with containers that request the max resources for one container, and minimal resources for the rest. LimitRanges complicate this by redefining the minimum valid request, but Tekton still tries to request a Pod with minimal resources requested to run all of the steps.
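A rough sketch of that rewrite, simplified from the logic in pkg/pod/resource_request.go to a single resource with no LimitRange (so the minimum is zero):

```go
package main

import "fmt"

// zeroAllButMax keeps the request only on the container with the largest
// request and zeroes the rest, so the pod-level sum equals the max step
// request. The real code also handles multiple resource types and
// LimitRange minimums.
func zeroAllButMax(requests []int64) {
	maxIdx := 0
	for i := range requests {
		if requests[i] > requests[maxIdx] {
			maxIdx = i
		}
	}
	for i := range requests {
		if i != maxIdx {
			requests[i] = 0
		}
	}
}

func main() {
	stepRequests := []int64{1, 2, 1} // per-step CPU requests
	zeroAllButMax(stepRequests)
	fmt.Println(stepRequests) // [0 2 0]: the pod requests 2 CPU, not 4
}
```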
For further context on why this behavior was put in place, see #598.
> As another option we could add a field to TaskSpec or TaskRunSpec that specifies requests for the whole run, which would be invalid to specify along with per-step resources.
> That makes it clearer that you're specifying for the entire run (i.e., underlying Pod) and removes the need for some of the per-step smarts.
I also suggested this in #2931 as I think it's misleading to allow users to set resource requests for each step.
> But that smart behavior is proving more painful than it seems to be worth, especially with LimitRanges, where it can lead to unschedulable pods.
I too have always had concerns with regard to Quality of Service for Pods and do wonder if it makes sense to allow users to define requests for each step.
Perhaps we could keep the current approach as the default behavior, but allow users to override requests, along with better documentation of use cases with Tekton.
I wrote a brief presentation to hopefully illustrate why Tekton modifies requests the way it does. I've shared it with tekton-dev@
https://docs.google.com/presentation/d/1-FNMMbRuxckAInO2aJPtzuINqV-9pr-M25OXQoGbV0s/edit
Hi @danielhelfand,
Yep, it is a little confusing that I set resources.requests but Tekton changes it to a new setting which I may not know about.
And it would be better to have better documentation on use cases with Tekton.
Thanks @ImJasonH,
The document is very clear, thanks!
I just have two questions about the "Other details" page:

- Only covered CPU requests, not memory or ephemeral-disk or other resources (GPUs)
- initContainers: Pod request is max(initContainers) + sum(containers)

Because I saw this introduction here:
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resources

> The Pod's effective request/limit for a resource is the higher of:
> - the sum of all app containers request/limit for a resource
> - the effective init request/limit for a resource

So what I thought is the Pod request is max(initContainers) OR sum(containers)?
And do you mean memory, ephemeral-disk, and other resources behave the same as the CPU request example in your slides?
I mean that the semantics of how K8s translates the requests of all the containers in a Pod into the request for that Pod as a whole when scheduling that Pod is (as far as I know) the same whether you're talking about CPUs, or memory, or ephemeral-disk, or GPUs.
In the example, those 9 CPUs could easily have been 9 GiB of memory, or 9 GiB of disk, or 9 GPUs.
> So what I thought is the Pod request is max(initContainers) OR sum(containers)?
Yes! You're absolutely right.
The request is actually max(max(initContainers), sum(containers)), not the sum. I'll update the slides accordingly. Thanks for catching that!
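In other words, something like this sketch (hypothetical numbers):

```go
package main

import "fmt"

// effectiveRequest computes a pod's effective request for one resource
// per the Kubernetes docs: max(max(initContainers), sum(containers)).
func effectiveRequest(initContainers, containers []int64) int64 {
	var maxInit, sumApp int64
	for _, r := range initContainers {
		if r > maxInit {
			maxInit = r
		}
	}
	for _, r := range containers {
		sumApp += r
	}
	if maxInit > sumApp {
		return maxInit
	}
	return sumApp
}

func main() {
	// Init containers request 5 and 2 CPU; app containers request
	// 1, 1 and 1 CPU: effective request = max(5, 3) = 5.
	fmt.Println(effectiveRequest([]int64{5, 2}, []int64{1, 1, 1}))
}
```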
Would be potentially helpful to convert the slides to a doc. I think the visuals/the brevity of it is really helpful.