After https://github.com/kubernetes/autoscaler/pull/2049 VPA is adjusting the specified limit proportionally to the requested resources.
This seems problematic when recommended resources become fairly low. E.g. in a statefulset we ran with limit: 1000m and requested: 500m and once recommended target got lowered to 25m (due to low usage) with a limit of 50m. While this works normally, any spike in consumption now means that the pod is immediately throttled, the health check fails and the pod is recreated, but never seems to cause an increase of the recommended value.
I would see three ways to prevent this:
1) set a min target for each VPA resource, which seems a bit like anti-pattern as the magic of VPA is that we have to worry less about resources.
2) remove the limits & requests from the statefulset, which works, but also means a runaway pod could affect the entire node as CPU is no longer throttled.
3) modify VPA to set the MaxAllowed as pod limits if no other limit is specified, which would allow sudden resource changes in certain boundaries
Overall I like 3) the most, would you accept a PR implementing this change? Is there anything I'm missing?
/assign bskiba
3 seems a bit problematic, as this means that effectively VPA would start changing pod's qos class which we are explicitly trying to avoid.
My recommendation is a variation of 2 - remove the resource limit for CPU on the pod.
In general we don't recommend setting limits on CPU (except in very very specific cases maybe), since with CPU, a pod will run away only in case there are free CPU resources on the node. If there are other pods requesting cpu on the same node, the pod will be throttled to what it actually requested, so it will never cannibalize the node resources.
With limit gone, the pod will be throttled only if a pod is very tightly packed, but if it has any free space to burst over the request, VPA should notice and bump the resource request up accordingly.
We do recommend limits on memory but with memory you will get Out of Memory exceptions in situations you described, which will cause VPA to bump the memory recommendation up.
I have been trying to understand the implications of letting VPA managing limit, and I have came to similar thought --- since CPU is "elastic" and cgroups can provide an effective isolation, removing CPU limit so it can't be changed in a surprised way seems to make sense; on the other hand, memory usage run-away can cause the entire system to OOM and trigger expensive reaping process and cause other issues, so limits on memory must remain at least to reduce the risk.
However, I also want to suggest that managing limit should be made optional if it is not already so. I think the main use case of vpa is to make requested resource closer to actual resource so scheduling and load balancing can be efficient. Limit does not play a role here, afaik, and can have complexity that the main use case users do not want to carry.
Instead of 3) I would also be happy if the proportional scaling of requested & limits can be disabled if the limit != requested.
I see this definitely useful when both are equal to remain inside the same QoS class, but what is the benefit of changing the limit proportional in other cases? I found that fairly surprising.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Closing this in favour of #2387 that seems to be asking for same thing (configure if limits should be modified)
/close
@bskiba: Closing this issue.
In response to this:
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Most helpful comment
I have been trying to understand the implications of letting VPA managing limit, and I have came to similar thought --- since CPU is "elastic" and cgroups can provide an effective isolation, removing CPU limit so it can't be changed in a surprised way seems to make sense; on the other hand, memory usage run-away can cause the entire system to OOM and trigger expensive reaping process and cause other issues, so limits on memory must remain at least to reduce the risk.
However, I also want to suggest that managing limit should be made optional if it is not already so. I think the main use case of vpa is to make requested resource closer to actual resource so scheduling and load balancing can be efficient. Limit does not play a role here, afaik, and can have complexity that the main use case users do not want to carry.