Quarkus: Metrics and Health port

Created on 16 Mar 2020 · 19 Comments · Source: quarkusio/quarkus

Description
Please add support for serving /metrics (and /health) and the application resources on different ports. E.g. /metrics is available on localhost:9000/metrics while simultaneously the application is available on localhost:8080/.

Implementation ideas

Labels: area/smallrye, kind/enhancement, triage/invalid

All 19 comments

@tobiasstadler can you please elaborate on why you think this is necessary?

My guess is that exposing /metrics and /health on a public port is undesirable in some cases. I remember reading a discussion about this some time ago on the mailing list.

One easy solution is to add an external HTTP server that acts as a proxy and filters requests to these endpoints.

I think this would be quite complicated to do. It would need a new Vert.x HttpServer, and the Undertow equivalent, probably with completely separate configuration, HTTPS support, security settings, etc.
So I'm not sure it's really worth it. If access to the metrics/health endpoints needs to be controlled, that can be done by applying security settings on those paths, or, as @gastaldi said, by using a proxy (I suppose Kubernetes must have something for this).

Given the work this probably needs (not to mention the additional runtime overhead), I would also really like to know why this approach is used in practice.

> My guess is that exposing /metrics and /health in a public port would not be interesting in some cases. I remember reading a discussion about this some time ago in the ML

Yes, that is my intent.

Of course, I could do that with an HTTP proxy, but from a user/operator perspective it is easier to "just" set some config values.

I think there are 2 potential reasons:

  • /metrics and /health return information about the internal frameworks in use (database type, etc.), which can make it easier for an attacker to exploit known vulnerabilities.
    This could probably be mitigated by restricting these paths in the ingress/route configuration?

  • They share the same thread pools as the ones used for actual traffic. During a traffic spike (or with an undersized pool), metrics and health requests may get queued, so you could lose your monitoring when it matters most, and/or the pod gets killed, which cascades to other pods.
    I'd imagine this is more of a concern for synchronous RESTEasy endpoints (or Undertow servlets), and less so for asynchronous ones.
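
The second point can be illustrated without any framework. The following is only a sketch using plain java.util.concurrent executors, not Quarkus or RESTEasy internals: the class name, the single-thread pool sizes and the 300 ms "slow request" are all assumptions made for the demonstration. It times a trivial health probe once when it shares a worker pool with a slow application request and once when it has a pool of its own.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ProbeStarvationSketch {

    /** Occupies one appPool thread for busyMillis, then times a trivial probe on probePool. */
    public static long probeLatencyMillis(ExecutorService appPool, ExecutorService probePool,
                                          long busyMillis) throws Exception {
        CountDownLatch busy = new CountDownLatch(1);
        appPool.submit(() -> {
            busy.countDown();                 // signal that the worker is now occupied
            try {
                Thread.sleep(busyMillis);     // stand-in for a slow application request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        busy.await();                         // wait until the slow request is really running
        long start = System.nanoTime();
        probePool.submit(() -> "UP").get();   // stand-in for GET /health
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical pool sizes: one thread each keeps the effect easy to see.
        ExecutorService shared = Executors.newSingleThreadExecutor();
        ExecutorService app = Executors.newSingleThreadExecutor();
        ExecutorService management = Executors.newSingleThreadExecutor();
        try {
            System.out.println("shared pool:    " + probeLatencyMillis(shared, shared, 300) + " ms");
            System.out.println("dedicated pool: " + probeLatencyMillis(app, management, 300) + " ms");
        } finally {
            shared.shutdown();
            app.shutdown();
            management.shutdown();
        }
    }
}
```

With a shared pool the probe has to wait for the slow request to finish; with a dedicated pool it does not. If that wait exceeds the liveness timeout, the orchestrator would see a failed probe even though the process is merely busy.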

Yes, I actually ended up finding this issue because of use case no. 2 from @rquinio.
That is: with K8s, I want to be able to distinguish between "I have a spike of traffic" and "I have a liveness issue". Otherwise I must set the K8s liveness timeout higher than the user-facing request timeout, and I certainly don't want false positives on liveness (in particular, if it turns out that I don't have enough CPUs to handle all user requests, having my pods killed is going to add insult to injury).

FYI: WildFly doesn't have this problem because the health endpoints are on the management listener, if I'm not mistaken.

In addition to the security aspect, exposing metrics to the world when the application is horizontally scaled just doesn't make sense, unless you can pin requests to the specific nodes hosting the application.
When it comes to implementation, you don't necessarily need completely separate HTTP servers with completely different configurations. You could have one server listening on multiple ports and just verify the local port on the "internal" resources. This would expose the application endpoints on the metrics port, but that's much less of a problem than exposing internal resources to the world.
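
A rough sketch of the "verify the local port" idea, written against the JDK's built-in com.sun.net.httpserver rather than Vert.x or Undertow. A JDK HttpServer binds a single address, so this sketch approximates "one server on multiple ports" with two listeners that register identical routes; the class name, the helper names and the response bodies are made up for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class LocalPortGuardSketch {

    /** Wraps an "internal" handler so that it only answers on the management port. */
    public static HttpHandler onlyOnPort(int managementPort, HttpHandler internal) {
        return exchange -> {
            if (exchange.getLocalAddress().getPort() == managementPort) {
                internal.handle(exchange);
            } else {
                respond(exchange, 404, "Not Found"); // hide the resource on other ports
            }
        };
    }

    static void respond(HttpExchange exchange, int status, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(bytes);
        }
    }

    /** Starts two loopback listeners with the same routes; port 0 picks a free port. */
    public static HttpServer[] start(int appPort, int managementPort) throws IOException {
        HttpServer app = HttpServer.create(new InetSocketAddress("127.0.0.1", appPort), 0);
        HttpServer management = HttpServer.create(new InetSocketAddress("127.0.0.1", managementPort), 0);
        int boundManagementPort = management.getAddress().getPort();
        for (HttpServer server : new HttpServer[] {app, management}) {
            server.createContext("/", ex -> respond(ex, 200, "hello"));
            server.createContext("/metrics",
                    onlyOnPort(boundManagementPort, ex -> respond(ex, 200, "requests_total 1")));
            server.start();
        }
        return new HttpServer[] {app, management};
    }

    /** Issues a GET against the loopback address and returns the HTTP status code. */
    public static int status(int port, String path) throws IOException {
        HttpURLConnection connection = (HttpURLConnection)
                URI.create("http://127.0.0.1:" + port + path).toURL().openConnection();
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer[] servers = start(0, 0);
        int appPort = servers[0].getAddress().getPort();
        int managementPort = servers[1].getAddress().getPort();
        System.out.println("/metrics on management port:  " + status(managementPort, "/metrics"));
        System.out.println("/metrics on application port: " + status(appPort, "/metrics"));
        System.out.println("/ on management port:         " + status(managementPort, "/"));
        for (HttpServer server : servers) {
            server.stop(0);
        }
    }
}
```

As the comment itself notes, the application routes remain reachable on the management port in this arrangement; only the internal route is hidden from the application port.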

> When it comes to implementation, you don't necessarily need to have completely separate httpservers with completely different configurations. You could have one server listening on multiple ports and just verify the local port on the "internal" resources.

@sigmunau, I'm not that familiar with Vert.x, but is that really possible? Looking at the javadoc at https://vertx.io/docs/apidocs/io/vertx/core/http/HttpServer.html I don't see anything suggesting that one HttpServer can be listening on multiple ports. Or what else did you mean by that?

@jmartisk I guess I was a bit optimistic. It's unclear to me whether or not you can call listen() multiple times, but the fact that it has an actualPort() method suggests that you can't. So just disregard my implementation suggestion.

IMO, there is general consensus that you don’t put a Quarkus (or WildFly or whatever) thing in an internet-facing role. There is always something in between (usually a caching proxy or other gateway). Restricting access to ports (edit: I meant endpoints) should be done at that layer.

If you end up using “one listener for multiple ports”, then the “monitoring traffic vs. other traffic” distinction is gone anyway. You’re always down to one pool of threads.

While you might get a “good feeling” from running on a different port, it isn’t really bringing you any benefit. More interesting (yet still light-weight) things can be done with secure cookies (or similar) to make sure connections from outside (or inside) are or are not able to directly access endpoints.

@geoand @gastaldi any suggested workarounds? Here is how Micronaut provides this functionality:
https://docs.micronaut.io/latest/guide/management.html

Management Port
By default all management endpoints are exposed over the same port as the application. You can alter this behaviour by specifying the endpoints.all.port setting:

endpoints:
    all:
        port: 8085
In the above example the management endpoints will be exposed only over port 8085.

@tobiasstadler what was your workaround?

We don't have that capability currently and TBH I don't remember others asking for it

@sirAlexander nginx in front of quarkus

There are good reasons to split out the management endpoints:

  • security
  • resource sharing

It is common for cloud-native JVM frameworks (Spring Boot, Helidon, Micronaut, etc.) to expose a distinct management endpoint. Only with Quarkus must we implement a workaround on the incoming proxy.
This is an annoyance for the operations team: they need the arcane knowledge that a service is Quarkus-based and must be exposed to customers in a special way. Recently we received a security alert from our B2B partner about sensitive info on one endpoint...

IMO the management endpoint should have a distinct port and thread pool (usually 1 thread). No one in the JVM world complains about JMX threads. No complaints about Jolokia either. Why would it be bad design for Quarkus?

P.S.: Securing the management endpoint (like https://edwin.baculsoft.com/2020/08/securing-quarkus-metric-api/) solves the security point but not the resource point.

I'm not convinced that securing access to the metrics and health endpoints solves the problem. Both the health and metrics endpoints would still be accessible in the application context, which I believe is not the desired behavior.

Creating a separate HTTP server for the management endpoints makes more sense to me. This way I could bind it to 127.0.0.1 and make it accessible from localhost only. The application HTTP server could be bound to 0.0.0.0. This approach has been used successfully in WildFly.
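
For illustration only, here is what that split could look like with the JDK's built-in com.sun.net.httpserver. This is not how Quarkus or WildFly implement it. The ports 8080 and 9000 are taken from the issue description; the class name, the `serve` helper, the thread counts and the response bodies are assumptions. Each listener gets its own worker pool, so management requests never queue behind application traffic.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;

public class SplitListenersSketch {

    /** Starts a listener with its own fixed-size worker pool and one fixed response. */
    public static HttpServer serve(InetSocketAddress bind, int threads, String path, String body)
            throws IOException {
        HttpServer server = HttpServer.create(bind, 0);
        // Daemon workers, so stopping the server is enough to let the JVM exit.
        server.setExecutor(Executors.newFixedThreadPool(threads, runnable -> {
            Thread worker = new Thread(runnable);
            worker.setDaemon(true);
            return worker;
        }));
        server.createContext(path, exchange -> {
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Application traffic: every interface, larger pool (8 threads is an arbitrary choice).
        HttpServer app = serve(new InetSocketAddress("0.0.0.0", 8080), 8, "/", "hello");
        // Management traffic: loopback only, a single dedicated thread.
        HttpServer management = serve(new InetSocketAddress("127.0.0.1", 9000), 1, "/health", "UP");
        System.out.println("application on " + app.getAddress()
                + ", management on " + management.getAddress());
    }
}
```

Because the management listener is bound to 127.0.0.1, it is not reachable from other hosts at all, which covers the security point without any path-based rules. A probe running outside the host or pod network namespace would need a different bind address.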

Closing this issue: https://github.com/quarkusio/quarkus/issues/13602 already tracks this for all non-application endpoints, not just metrics/health.
