Instead of reporting all metrics through the control plane, it would be ideal to expose metrics natively from the proxy's admin server via a Prometheus exporter endpoint.
Note that if TLS is added to the proxy's admin service (on which tap is exposed), we can likely serve the metrics from this same server rather than starting an additional HTTP server: as far as I can tell, Prometheus's collector can scrape metrics over H2 when the server supports TLS/ALPN.
The proxy should initially expose the following Prometheus metrics and labels:
inbound_request_total {dst_<X>} (counter)
inbound_request_duration_ms {dst_<X>} (counter)
inbound_response_total {dst_<X>, status_code, grpc_status_code} (counter)
inbound_response_latency_ms {dst_<X>, status_code, grpc_status_code, le} (histogram)
inbound_response_duration_ms {dst_<X>, status_code, grpc_status_code, le} (histogram)
outbound_request_total {authority, src_<X>, dst_<Y>} (counter)
outbound_request_duration_ms {authority, src_<X>, dst_<Y>} (counter)
outbound_response_total {authority, src_<X>, dst_<Y>, status_code, grpc_status_code} (counter)
outbound_response_latency_ms {authority, src_<X>, dst_<Y>, status_code, grpc_status_code, le} (histogram)
outbound_response_duration_ms {authority, src_<X>, dst_<Y>, status_code, grpc_status_code, le} (histogram)
<X> is taken from the CONDUIT_PROMETHEUS_LABELS env var. For example, if CONDUIT_PROMETHEUS_LABELS="deployment=foo" then the proxy should expose inbound_request_total {dst_deployment=foo} (depends on #426)

<Y> is taken from the Prometheus label data from the destination service. For example, if the proxy is sending a request to an address with the Prometheus label data of "job=bar" then the proxy should expose outbound_request_total {..., dst_job=bar} (depends on #429)

Open Questions:
[Edit: this comment is out of date and no longer reflects the current thinking. See description for up-to-date information]
The proxy should initially expose the following Prometheus metrics and labels:
inbound_request_total {http_method, target_deployment}
inbound_response_total {http_method, target_deployment, status_code, grpc_status_code}
inbound_response_latency_ms {http_method, target_deployment, status_code, grpc_status_code, le}
outbound_request_total {source_deployment, target_service, target_deployment}
outbound_response_total {source_deployment, target_service, target_deployment, status_code, grpc_status_code}
outbound_response_latency_ms {source_deployment, target_service, target_deployment, status_code, grpc_status_code, le}
target_deployment on inbound metrics depends on #426

target_service should be set to the normalized form of the :authority header (e.g. s.n.svc.cluster.local)

When the proxy container is shutting down, should it linger for a while to allow the last set of metrics to be retrieved by Prometheus? (I don't know, myself.)
@briansmith more generally, we probably want to honor some graceful draining period so that in-flight requests may complete while we stop accepting new ones. Aside from that, I don't think we need to do too much work to ensure that Prometheus has a 100% accurate view of the world (i.e. some requests may go unaccounted for, and that's ok).
I would like signoff here from @olix0r @siggy and @seanmonstar on the list of metrics and labels to expose. The list is given in the issue description.
le?
*_response_latency_ms is the latency of the service only. Should we have another metric that includes the time spent in the proxy as well?

le is "less than or equal to", specific to Prometheus histograms:
https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile
Histograms have _bucket, _sum, and _count metrics; they look like this:
# HELP http_response_size_bytes A histogram of response sizes for requests.
# TYPE http_response_size_bytes histogram
http_response_size_bytes_bucket{le="100"} 14
http_response_size_bytes_bucket{le="200"} 15
http_response_size_bytes_bucket{le="300"} 48
http_response_size_bytes_bucket{le="400"} 1789
http_response_size_bytes_bucket{le="500"} 1789
http_response_size_bytes_bucket{le="1000"} 4087
http_response_size_bytes_bucket{le="2000"} 4089
http_response_size_bytes_bucket{le="3000"} 4102
http_response_size_bytes_bucket{le="4000"} 4119
http_response_size_bytes_bucket{le="5000"} 4135
http_response_size_bytes_bucket{le="10000"} 4376
http_response_size_bytes_bucket{le="20000"} 5864
http_response_size_bytes_bucket{le="30000"} 5864
http_response_size_bytes_bucket{le="40000"} 5864
http_response_size_bytes_bucket{le="50000"} 5864
http_response_size_bytes_bucket{le="1e+06"} 5864
http_response_size_bytes_bucket{le="2e+06"} 5864
http_response_size_bytes_bucket{le="3e+06"} 5864
http_response_size_bytes_bucket{le="4e+06"} 5864
http_response_size_bytes_bucket{le="5e+06"} 5864
http_response_size_bytes_bucket{le="+Inf"} 5864
http_response_size_bytes_sum 2.3006904e+07
http_response_size_bytes_count 5864
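
To make the le buckets concrete, here is a hedged PromQL sketch of how the proposed latency histogram could be queried (assuming the outbound_response_latency_ms name and a dst_deployment label as described above; exact names may change):

histogram_quantile(0.95, sum(rate(outbound_response_latency_ms_bucket[1m])) by (le, dst_deployment))

histogram_quantile interpolates within the le buckets, here estimating the 95th-percentile response latency per destination deployment.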
@seanmonstar
I assume *_response_latency_ms is the latency of the service only. Should we have another metric that includes the time spent in the proxy as well?
We probably want some sort of handletime_us histogram tracking time-in-proxy.
@adleong
target_service should be the normalized form of the :authority header
Can we just call this authority? It's not target-specific. And it's not necessarily a Service.
Is there an existing convention around source_ and target_ outside of Conduit?
What about using src_ and dst_? (we don't talk about targets anywhere, but we do talk about destinations).
Somewhat related to this ticket, TIL about Prometheus' labelmap config, which automates copying Kubernetes labels to Prometheus labels at collection time:
https://github.com/prometheus/prometheus/blob/release-2.1/documentation/examples/prometheus-kubernetes.yml#L6
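
For reference, a minimal sketch of what that looks like in a scrape config (the linked example has the full job definition; this assumes the standard kubernetes_sd pod metadata labels):

scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Copy every Kubernetes pod label (__meta_kubernetes_pod_label_<name>)
      # onto the scraped series as a Prometheus label named <name>.
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)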
Some open questions for me:
1) What is the duration in metrics like inbound_request_duration_ms? Time spent in proxy? How would inbound_request_duration_ms differ from outbound_response_duration_ms?
2) Do we want to include request/response sizes?
3) #428 makes no mention of authority. How does it differ from target_deployment, and if we're not querying for it, why are we recording it?
4) Should we export health metrics as well? Memory usage, etc?
This is also unclear to me:
Should we ever reset counters?
I'm not that familiar with Prometheus. After it scrapes our /metrics endpoint, should we be resetting all counters? Is it assumed that the counts in the output are the counts accumulated between scrapes?
Prometheus assumes counters are cumulative and are not reset between scrapes; it derives per-interval rates at query time (e.g. with rate()). I'm also unclear when/why we would reset.
@siggy
What is the duration in metrics like inbound_request_duration_ms?
The amount of time from when the request starts until the request completes, i.e. effectively the time between when the HEADERS frame is sent and when the TRAILERS frame is sent.
Note that I don't think it's critical to expose this from the start, but it will be helpful for the sake of completeness.
How would inbound_request_duration_ms differ from outbound_response_duration_ms?
inbound_request_duration_ms measures the timing for request body streams for requests originating from outside the pod.
outbound_response_duration_ms measures the timing for response body streams for requests originating within the pod.
Note that inbound_request and outbound_response are totally unrelated -- inbound_ applies to streams originating outside the pod, and outbound_ applies to streams originating inside the pod.
Do we want to include request/response sizes?
Probably. Not critical to shipping this feature though.
Should we export health metrics as well? Memory usage, etc?
Yes, but let's not couple that with this.
#428 makes no mention of authority. How does it differ from target_deployment, and if we're not querying for it, why are we recording it?
It's equivalent to what we were calling target_service; see my comment above.
I suggest that the next step for this is to write the documentation for Proxy Metrics, submitted as a PR. It will help to consolidate the various parts of conversation in an authoritative doc, and also means we have to update that doc if the impl changes ;)
Question: why do we even need inbound_ and outbound_ prefixes? Can this simply be added as a label like orientation="inbound"? This would make it easy to count, for instance, the total number of requests through a proxy...
orientation could certainly be a label. One potential drawback of this is that it is harder to tell which combinations of labels will be exposed together. For example, the src_<X> label will not be populated on the incoming metrics because the proxy cannot tell which pod grouping a request came from.
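
To make the trade-off concrete: if there were a single request_total metric with an orientation label (hypothetical names, not what the description currently proposes), the total through a proxy and the per-direction breakdown would each be one query:

sum(rate(request_total[1m]))
sum(rate(request_total[1m])) by (orientation)

whereas with separate inbound_/outbound_ metric names the total requires adding two differently named series. The drawback noted above applies either way: labels such as src_<X> would simply be absent on inbound samples.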
I've opened issues #540, #541, #542, #543, and #544 to track several of the subtasks involved in this issue.
Closing in favor of individual issues.