Serving: HPA on concurrency

Created on 8 Feb 2019 · 2 Comments · Source: knative/serving

Proposal

HPA-class PodAutoscalers should be able to scale on concurrency as well as CPU. This requires exposing the concurrency metric that Knative calculates as a K8s custom metric, and creating a v2beta2 HPA that points to that custom metric (the v1 HPA does not support custom metrics).
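
To make the proposal concrete, here is a minimal sketch of what such a v2beta2 HPA could look like, built as a plain Python dict that serializes to the manifest. The metric name `concurrency`, the `Deployment` scale target, and the helper names are assumptions for illustration only; the issue does not specify them. The second helper shows the replica count the HPA would roughly aim for with an `AverageValue` target, ignoring tolerance, readiness handling, and min/max clamping.

```python
import json
import math


def concurrency_hpa(name, target_concurrency, min_replicas=1, max_replicas=10,
                    metric_name="concurrency"):
    """Build an autoscaling/v2beta2 HPA manifest targeting a per-pod custom metric.

    metric_name is a hypothetical name; the real name depends on how the
    concurrency metric ends up being exposed through the custom metrics API.
    """
    return {
        "apiVersion": "autoscaling/v2beta2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # Assumed scale target; the issue does not say what the HPA points at.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                # "Pods" metrics are averaged across the pods of the scale target.
                "type": "Pods",
                "pods": {
                    "metric": {"name": metric_name},
                    "target": {
                        "type": "AverageValue",
                        "averageValue": str(target_concurrency),
                    },
                },
            }],
        },
    }


def desired_replicas(per_pod_concurrency, target_concurrency):
    """Approximate replica count for an AverageValue target.

    Total observed concurrency divided by the per-pod target, rounded up.
    Tolerance and min/max clamping are deliberately left out.
    """
    return math.ceil(sum(per_pod_concurrency) / target_concurrency)


if __name__ == "__main__":
    print(json.dumps(concurrency_hpa("my-revision", 10), indent=2))
    print(desired_replicas([12, 18, 15], 10))
```

With three pods reporting 12, 18, and 15 concurrent requests and a target of 10 per pod, the total of 45 would call for 5 replicas.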

The custom metrics server implementation can vary from provider to provider, but we will probably want to configure the Prometheus instance (which comes out of the box) to collect the custom metric and export it with the Prometheus adapter.

area/autoscale kind/feature

All 2 comments

I think we should implement a very lightweight version of the custom-metrics-server (https://github.com/kubernetes-incubator/custom-metrics-apiserver). Once that's done and working, we can write up guides on how to connect a Knative installation to Prometheus and configure the Prometheus custom-metrics adapter so that it works just as well.

That way people wouldn't need to buy into Prometheus (there have been concerns about its "heaviness").
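
As a rough illustration of what a lightweight custom metrics server would have to return, the sketch below builds the kind of `MetricValueList` payload that the `custom.metrics.k8s.io/v1beta1` API serves for a per-pod metric. The metric name, namespace, and helper names are hypothetical, and the field layout reflects my understanding of the v1beta1 types, so it should be checked against the API definitions before relying on it.

```python
from datetime import datetime, timezone


def metric_path(namespace, metric_name):
    """Path an HPA would query for a metric across all pods in a namespace."""
    return (
        "/apis/custom.metrics.k8s.io/v1beta1"
        f"/namespaces/{namespace}/pods/*/{metric_name}"
    )


def metric_value_list(namespace, metric_name, per_pod_values):
    """Build a MetricValueList-shaped dict from {pod_name: value}.

    Values are emitted as strings because the API represents them as
    Kubernetes quantities.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    items = [
        {
            "describedObject": {
                "kind": "Pod",
                "apiVersion": "v1",
                "namespace": namespace,
                "name": pod,
            },
            "metricName": metric_name,
            "timestamp": now,
            "value": str(value),
        }
        # Sort by pod name so the output is deterministic.
        for pod, value in sorted(per_pod_values.items())
    ]
    return {
        "kind": "MetricValueList",
        "apiVersion": "custom.metrics.k8s.io/v1beta1",
        "metadata": {},
        "items": items,
    }


if __name__ == "__main__":
    # "concurrency" and the pod names are made-up example values.
    print(metric_path("default", "concurrency"))
    print(metric_value_list("default", "concurrency", {"pod-b": 18, "pod-a": 12}))
```

A server along these lines would only need to answer this one path with values it already tracks, which is what would keep it lighter than a full Prometheus setup.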

I'm going to knock this off as part of the custom-metrics work I'm doing.

/assign
