Autoscaler: Vertical-pod-autoscaler - How much time does it need to change the requested CPU/Memory?

Created on 28 Nov 2018 · 8 Comments · Source: kubernetes/autoscaler

I wonder how much time vertical-pod-autoscaler needs to change the requested CPU/Memory.

Also, does it have an internal registry with the historical resources of each pod?

lifecycle/rotten vertical-pod-autoscaler


luarx

Most helpful comment

Nice! Now, I understand better how VPA recommender works, thank you!

As for a flag to control whether VPA scales up/down faster, I think both would be useful: a global flag (with a default value) and a setting per VPA object that falls back to the global flag when it is not present.
For example, I might set the behaviour for the whole cluster, but for critical Pods that can receive a huge number of requests per second within a short timeframe, I want to be able to scale up CPU/Memory faster.
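The fallback proposed above could be sketched roughly as follows. This is only an illustration of the idea: `GLOBAL_REACTION_SPEED`, `effective_speed`, and the values `"normal"` and `"fast"` are hypothetical names, not existing VPA flags or fields.

```python
from typing import Optional

# Hypothetical cluster-wide default; NOT an existing VPA flag.
GLOBAL_REACTION_SPEED = "normal"


def effective_speed(per_object: Optional[str],
                    global_default: str = GLOBAL_REACTION_SPEED) -> str:
    """Return the per-object setting when present, else the global default."""
    return per_object if per_object is not None else global_default


print(effective_speed(None))    # no per-object setting: global value applies
print(effective_speed("fast"))  # per-object override for a critical workload
```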

luarx on 29 Nov 2018


All 8 comments

Regarding the first question - if you're using the Auto or Recreate update mode, it should take up to a few minutes, assuming that:

the current resource request is outside the (lowerBound, upperBound) range, regardless of how it compares with the target recommendation,
your Pod is controlled by a Deployment or another controller,
the PodDisruptionBudget does not restrict eviction further,
your cluster is relatively idle.
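The first condition can be illustrated with a small sketch. It is not the updater's actual code: the `Recommendation` class and `needs_update` function are hypothetical, values are CPU millicores, and treating the range as an open interval (so a request exactly on a bound counts as outside) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """CPU recommendation in millicores (hypothetical container for this sketch)."""
    lower_bound: int
    target: int
    upper_bound: int


def needs_update(current_request: int, rec: Recommendation) -> bool:
    """Return True when the request lies outside (lower_bound, upper_bound).

    The target is deliberately ignored, mirroring the condition described above.
    """
    return not (rec.lower_bound < current_request < rec.upper_bound)


rec = Recommendation(lower_bound=100, target=250, upper_bound=500)
print(needs_update(300, rec))  # inside the range, although it differs from target
print(needs_update(50, rec))   # below the lower bound
print(needs_update(800, rec))  # above the upper bound
```

In this sketch a request of 300m is left alone even though the target is 250m, because only the bounds are checked.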

Regarding the second one - the VPA recommender gathers and keeps resource usage. Is that what you meant by historical resources, or was that more about how the PodSpec looked in the past?

kgolab on 28 Nov 2018

Great!
I was asking because I tested a Pod (controlled by a Deployment) that had been running for some hours: if I increase the CPU load for 10 minutes, it doesn't scale, probably because it needs more time.

But if the Pod has been running for only a few minutes, it scales faster. I suppose that's the normal behaviour of the VPA because it has to compare current CPU/memory usage with the information that the VPA recommender gathers, right?

Could it become possible to scale up/down faster in the future by specifying a flag?

luarx on 28 Nov 2018

The behaviour you observe is indeed intentional - VPA uses the 90th percentile of samples for the base recommendation, so the more samples you already have (the longer Pods have been running), the more new samples are required to "overrule" the old state.
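A toy calculation shows the effect. This sketch assumes one usage sample per minute and a plain, unweighted nearest-rank percentile; the real recommender may weight or age samples differently, and the helper names and the 100m/900m values are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def p90_after_spike(history_len, spike_len, base=100, spike=900):
    """90th percentile (millicores) after spike_len high samples follow history_len low ones."""
    return percentile([base] * history_len + [spike] * spike_len, 90)


# 10 minutes of high load after 10 minutes of history vs. after 5 hours of history.
print(p90_after_spike(history_len=10, spike_len=10))   # percentile moves to the spike value
print(p90_after_spike(history_len=300, spike_len=10))  # percentile stays at the old value
```

Under these assumptions, a Pod with 300 low samples needs 34 high samples before the 90th percentile moves, while a Pod with 10 low samples reacts after only 2.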

As for your question - I'm wondering whether a flag (a global value for the whole cluster and all its workloads) or a setting per VPA object (to control each workload separately) would be more useful. WDYT?

kgolab on 29 Nov 2018

Nice! Now, I understand better how VPA recommender works, thank you!

luarx on 29 Nov 2018


Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot on 27 Feb 2019

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot on 29 Mar 2019

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

fejta-bot on 28 Apr 2019

@fejta-bot: Closing this issue.
