Caddy: Support per-host Prometheus metrics

Created on 8 Oct 2020 · 8 Comments · Source: caddyserver/caddy

Following up on this forum post (involving @hairyhenderson), I'd like to open the discussion about extending the metrics directive to support metrics (request count, etc.) on a per-host basis.

I'd like to have metrics exposed for every single site block (speaking in Caddyfile terminology), i.e. requests to different sites / vhosts should end up under different Prometheus label values.

Since determining the labels automatically based on the incoming requests' host headers is problematic (see discussion above), a solution would be to manually assign a tag to the metrics directive. For instance:

example.org {
    root /var/www/html
    file_server
    metrics /metrics {
        host "example.org"
    }
}

I'm not really involved with Caddy's code base or development process, so this is rather supposed to be a starter for further discussion.

discussion feature request

All 8 comments

Thanks for logging this @muety!

Just to bring the goal into view, the v1 prometheus module had the ability to set a value to be used by the host label, which effectively solved the cardinality problem (i.e. the fact that there isn't a bounded set of values that can be used if it's determined automatically by looking at the Host header).
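
To make the cardinality concern concrete, here is a minimal Go sketch of one way to keep the host label bounded: only host names that were configured explicitly become label values, and everything else collapses into a single "other" bucket. The `hostLabeler` type and its methods are hypothetical names made up for this illustration; they are not part of Caddy or of the v1 prometheus module.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// hostLabeler maps an incoming Host header onto a bounded set of label values.
// Hypothetical helper for illustration only; not a Caddy type.
type hostLabeler struct {
	known map[string]struct{}
}

func newHostLabeler(hosts ...string) *hostLabeler {
	l := &hostLabeler{known: make(map[string]struct{}, len(hosts))}
	for _, h := range hosts {
		l.known[strings.ToLower(h)] = struct{}{}
	}
	return l
}

// label returns the configured host name if the header matches one of them,
// and the fixed value "other" for anything else.
func (l *hostLabeler) label(hostHeader string) string {
	host := hostHeader
	// Strip an optional port; SplitHostPort returns an error when there is none.
	if h, _, err := net.SplitHostPort(hostHeader); err == nil {
		host = h
	}
	host = strings.ToLower(host)
	if _, ok := l.known[host]; ok {
		return host
	}
	return "other"
}

func main() {
	l := newHostLabeler("example.com", "example2.com")
	counts := map[string]int{}
	for _, h := range []string{"example.com", "EXAMPLE2.com:8443", "evil.test", "x.invalid"} {
		counts[l.label(h)]++
	}
	// However many distinct Host headers arrive, at most three label values exist.
	fmt.Println(counts)
}
```

With this shape, a client sending arbitrary Host headers cannot create new time series, which is the property the manually configured label in v1 provided.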

Caddy v2 is quite different from v1, and one of those differences is that request matchers are used extensively, and now the hostname in the site block is actually just a different kind of matcher.

This is maybe best explained by using caddy adapt to convert the following Caddyfile into the canonical JSON format:

example.com {
  respond /foo "hello world"
}
example2.com {
  respond /bar "hello world"
}
example2.com:8443 {
  respond /baz "hello world"
}

Here's the result of running caddy adapt:

{
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "listen": [ ":443" ],
          "routes": [ {
              "match": [ { "host": [ "example2.com" ] } ],
              "handle": [ {
                  "handler": "subroute",
                  "routes": [ {
                      "handle": [ { "body": "hello world", "handler": "static_response" } ],
                      "match": [ { "path": [ "/bar" ] } ]
                    } ]
                } ],
              "terminal": true
            },
            {
              "match": [ { "host": [ "example.com" ] } ],
              "handle": [ {
                  "handler": "subroute",
                  "routes": [ {
                      "handle": [ { "body": "hello world", "handler": "static_response" } ],
                      "match": [ { "path": [ "/foo" ] } ]
                    } ]
                } ],
              "terminal": true
            } ]
        },
        "srv1": {
          "listen": [ ":8443" ],
          "routes": [
            {
              "match": [ { "host": [ "example2.com" ] } ],
              "handle": [ {
                  "handler": "subroute",
                  "routes": [ {
                      "handle": [ { "body": "hello world", "handler": "static_response" } ],
                      "match": [ { "path": [ "/baz" ] } ]
                    } ]
                } ],
              "terminal": true
            } ]
        }
      }
    }
  }
}

So right now, the metrics support knows about the server (by name, e.g. srv0/srv1) and the handler (by module name, e.g. subroute, static_response, etc.). But there is not yet any support for matchers.

Since matchers are evaluated late (i.e. only at request time), and the only thing that matters is _whether_ a request was matched (rather than _how_ it was matched), the current code has no way to determine in the metrics which host or path was matched.

This is all a long-winded way of saying that I _think_ what we need here is some sort of integration into the different matchers so that we can track the host label for the given host matcher.

The advantage of this approach is that we could use a similar approach to track a path label too.
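
A rough Go sketch of what such a matcher integration could look like, using only the standard library: the matcher stores the configured host that matched in the request context, and the metrics side reads it back later. All names here (`labelingHostMatcher`, `hostLabelFrom`, `labelForHost`) are assumptions made for illustration and do not reflect Caddy's actual matcher code.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// matchedHostKey is the context key under which the matched host is stored.
type matchedHostKey struct{}

// labelingHostMatcher is a hypothetical host matcher that records *which*
// configured host matched, not only whether one did.
type labelingHostMatcher struct {
	hosts []string
}

// match reports whether r.Host equals one of the configured hosts. On a hit it
// returns a copy of the request whose context carries the matched host.
func (m labelingHostMatcher) match(r *http.Request) (*http.Request, bool) {
	for _, h := range m.hosts {
		if r.Host == h {
			ctx := context.WithValue(r.Context(), matchedHostKey{}, h)
			return r.WithContext(ctx), true
		}
	}
	return r, false
}

// hostLabelFrom reads the value stored by match; "" means nothing was recorded.
func hostLabelFrom(r *http.Request) string {
	h, _ := r.Context().Value(matchedHostKey{}).(string)
	return h
}

// labelForHost sends a synthetic request for host through the matcher and
// returns the recorded label together with the match result.
func labelForHost(m labelingHostMatcher, host string) (string, bool) {
	r := httptest.NewRequest(http.MethodGet, "http://"+host+"/", nil)
	r2, ok := m.match(r)
	return hostLabelFrom(r2), ok
}

func main() {
	m := labelingHostMatcher{hosts: []string{"example.com", "example2.com"}}
	requests := map[string]int{}
	for _, host := range []string{"example.com", "example2.com", "example.com", "other.test"} {
		if label, ok := labelForHost(m, host); ok {
			requests[label]++
		}
	}
	fmt.Println(requests)
}
```

Because the label value always comes from the configuration rather than from the request, the set of values stays bounded. A path matcher could record its matched pattern under a second context key in the same way.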

Anyway, this is just a vague brain-dump of possibilities 😉

In the meantime, I set up grok_exporter as a workaround to read Caddy's log files and expose metrics from those to Prometheus. However, having them provided directly by Caddy and in greater detail is still desirable :)
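
For readers who want to see the idea behind the log-based workaround, here is a small Go sketch that counts requests per host from newline-delimited JSON access logs. It is not how grok_exporter works internally, and it assumes each log line carries the host under `request.host`; check this against your own log output before relying on it.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// accessEntry holds the one field this sketch reads from each log line.
// The field layout (request.host) is an assumption about the log format.
type accessEntry struct {
	Request struct {
		Host string `json:"host"`
	} `json:"request"`
}

// countByHost tallies requests per host from newline-delimited JSON logs.
// Lines that are not valid JSON or carry no host are skipped.
func countByHost(logs io.Reader) (map[string]int, error) {
	counts := map[string]int{}
	sc := bufio.NewScanner(logs)
	for sc.Scan() {
		var e accessEntry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil || e.Request.Host == "" {
			continue
		}
		counts[e.Request.Host]++
	}
	return counts, sc.Err()
}

// countByHostInText is a convenience wrapper for in-memory log text.
func countByHostInText(text string) (map[string]int, error) {
	return countByHost(strings.NewReader(text))
}

func main() {
	sample := `{"request":{"host":"example.com","uri":"/foo"},"status":200}
{"request":{"host":"example2.com","uri":"/bar"},"status":200}
not json at all
{"request":{"host":"example.com","uri":"/foo"},"status":404}`
	counts, err := countByHostInText(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(counts)
}
```

Note that this approach takes the host straight from the logged request, so it inherits the unbounded-cardinality problem discussed above unless the values are filtered.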

I just wanted to set up Prometheus (haven't looked at it once before today, so I am quite a newbie) and ran into this issue. I wanted to add my own perspective.

For me, a single, static "host" option I can set manually would be enough.

But if cardinality is no issue, then I would really like to go all the way and be able to use a string with placeholders (e.g. https://caddyserver.com/docs/caddyfile/concepts#placeholders) to modify the label (or possibly multiple labels). Of course, this could incur significant runtime costs, and probably not all placeholders would be applicable to every metric. I haven't checked the last part though.
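
As a sketch of the placeholder idea, the following Go snippet expands `{name}` placeholders in a label template per request and counts how many distinct series result. The `expandLabel` function and the placeholder names `host` and `path` are hypothetical and chosen for illustration; they are not Caddy's placeholder implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// expandLabel substitutes {name} placeholders in a label template.
// Placeholders without a value in vals are left untouched.
func expandLabel(template string, vals map[string]string) string {
	pairs := make([]string, 0, 2*len(vals))
	for name, v := range vals {
		pairs = append(pairs, "{"+name+"}", v)
	}
	return strings.NewReplacer(pairs...).Replace(template)
}

// distinctSeries counts how many different label values a template produces
// for the given per-request placeholder values.
func distinctSeries(template string, requests []map[string]string) int {
	seen := map[string]struct{}{}
	for _, req := range requests {
		seen[expandLabel(template, req)] = struct{}{}
	}
	return len(seen)
}

func main() {
	requests := []map[string]string{
		{"host": "example.com", "path": "/foo"},
		{"host": "example.com", "path": "/foo?id=1"},
		{"host": "example.com", "path": "/foo?id=2"},
		{"host": "example2.com", "path": "/bar"},
	}
	// A template built from request data creates one series per distinct expansion.
	fmt.Println("{host}:", distinctSeries("{host}", requests))
	fmt.Println("{host}{path}:", distinctSeries("{host}{path}", requests))
}
```

The second template shows where the cost comes from: every distinct expansion becomes its own time series, so a placeholder that clients control can grow the label set without limit.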

PS: Of course, this is also just a braindump, I don't know if this is feasible.

Would love support for this, but I see the problem pointed out by @hairyhenderson.

However, I wonder if it makes sense to add generic tags that get passed on to the metrics?

This would allow you to set something like the following:

example.com {
  respond /foo "hello world"
  tags {
    service foo
    host example2
    environment testing  
  }
}
example2.com {
  respond /bar "hello world"
  tags {
    service bar
    host example2
    environment hello  
  }
}
example2.com:8443 {
  respond /baz "hello world"
  tags {
    service baz
    host example2
    environment testing  
  }
}

This would allow fine-grained queries on the metrics or during an ETL process like @muety mentioned.

Maybe as a stop-gap solution there could be just a boolean option to include per-host metrics. Yes, I understand the consequences, just give me the hosts. I spent a week porting the old v1 plugin and it kind of works stand-alone, but it crashes constantly with caddy-docker-proxy, so it would be great to have an official solution.
I'm using Caddy for TLS termination for an old legacy microservice application built around nginx. I want to migrate these nginx configs to Caddy, but I want to gather metrics to guide the transition. For now I'm using grok_exporter, but it's cumbersome, because I need to store logs and parse them to get these metrics.

Thanks for the input @maroziza - understood! The last few months have been very busy for me so I haven't been able to spend much time working on Caddy. I'll have some time off in the next few weeks and hopefully I'll be able to get around to this!

Any update on this? The addition of per-host metrics would be a game changer for the Caddy monitoring stack!

Apologies @rgdev, and everyone else! I (obviously) wasn't able to get to this over the holidays 😅... I've recently started a new job and so am _still_ swamped.

But! I now have a post-it-note on my screen telling me to work on this. So, that'll probably help 😉
