Another thing would be to also expose metrics about the promtail exporter itself, such as input records, output errors, etc.
@ventris promtail already includes metrics about itself.
@ventris we've exposed a bunch of metrics from promtail already, and have ideas for more (see #327). If you have any suggestions, let me know! We also have example dashboards from promtail here: https://github.com/weaveworks/common/pull/146
@SuperQ we're starting to think about where to expose metrics from, and supporting an mtail-like use case is at the top of our list. We're also looking at https://github.com/fstab/grok_exporter, have you seen that?
Open questions:
promtail seems obvious, but we also support fluentd-based ingestion, and it would be nice to grab metrics from that too. This implies doing it in Loki might be preferred. Perhaps we can do it in both?
count(logs{job="foo"}[1m] | "^bar$") as a recording rule of the number of records which match "^bar$" in the last minute, correctly propagating labels etc...
@slim-bean is thinking about this too.
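To make the semantics of that count expression concrete, here is a minimal Python sketch of what such a rule might compute: the number of lines matching a pattern inside a time window, grouped by each stream's full label set so the labels carry over to the result. This is not Loki code; `count_matches`, the tuple layout of `entries`, and the sample data are all assumptions for illustration.

```python
import re
from collections import Counter


def count_matches(entries, pattern, now, window_seconds, selector):
    """Count lines matching `pattern` in (now - window, now], per label set.

    `entries` is a hypothetical list of (timestamp, labels, line) tuples.
    """
    regex = re.compile(pattern)
    counts = Counter()
    for timestamp, labels, line in entries:
        in_window = now - window_seconds < timestamp <= now
        selected = all(labels.get(k) == v for k, v in selector.items())
        if in_window and selected and regex.search(line):
            # Key by the full, sorted label set so labels propagate to the result.
            counts[tuple(sorted(labels.items()))] += 1
    return counts


entries = [
    (100.0, {"job": "foo", "instance": "a"}, "bar"),
    (110.0, {"job": "foo", "instance": "a"}, "bar"),
    (120.0, {"job": "foo", "instance": "b"}, "bar"),
    (130.0, {"job": "foo", "instance": "b"}, "not bar"),  # does not match ^bar$
    (30.0, {"job": "foo", "instance": "a"}, "bar"),       # outside the window
    (125.0, {"job": "other", "instance": "a"}, "bar"),    # excluded by selector
]

# A 60 second window, mirroring the [1m] range in the expression.
result = count_matches(entries, r"^bar$", now=150.0,
                       window_seconds=60, selector={"job": "foo"})
```

Each key in `result` is one label set, so a recording rule built this way would emit one series per stream rather than a single global count.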
@tomwilkie Yes, I've seen the grok_exporter as well.
My usual suggestion for where to produce metrics from logs is at the closest point to ingestion. To me it makes sense to do this in promtail, as it is reading, tagging, and compressing logs. I don't think it would be good to waste CPU time decompressing logs just to process them for metrics. Also doing things in promtail makes things scale better. Doing things in the Loki service means having to scale it up much more to handle the processing.
I personally like the mtail way of being able to define extractions via regexp. There are lots of cases where the log lines also contain numbers to extract.
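As a rough illustration of that style of extraction, the sketch below applies one regular expression with named groups to each line, increments a per-status counter, and sums a number pulled out of the line. The log format, the `LINE_RE` pattern, and the metric names are assumptions made up for this example; this is neither mtail syntax nor promtail code.

```python
import re

# Hypothetical log format: "... status=<code> bytes=<n>"
LINE_RE = re.compile(r"status=(?P<status>\d{3}) bytes=(?P<bytes>\d+)")


def extract(lines):
    """Return (requests per status code, total bytes) for matching lines."""
    requests_total = {}
    bytes_total = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # lines that do not match contribute nothing
        status = match.group("status")
        requests_total[status] = requests_total.get(status, 0) + 1
        bytes_total += int(match.group("bytes"))
    return requests_total, bytes_total


lines = [
    "GET /index status=200 bytes=512",
    "GET /missing status=404 bytes=128",
    "malformed line",
    "GET /index status=200 bytes=256",
]
requests_total, bytes_total = extract(lines)
```

Doing this at the point where lines are read means the extraction runs once per line on uncompressed text, which is the scaling argument made above.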
The 0.1.0 release includes a pipeline that can extract data from log messages using regex/json and then create metrics from that data.
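The general shape of such a pipeline (an extraction stage feeding a metrics stage) can be sketched in a few lines of Python. This only illustrates the idea and does not reflect promtail's actual configuration or internals; `json_stage`, `counter_stage`, `run_pipeline`, and the `log_lines_total` metric name are hypothetical.

```python
import json


def json_stage(fields):
    """Stage that copies the named top-level JSON fields into `extracted`."""
    def stage(line, extracted):
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            return  # non-JSON lines pass through without extracted data
        if isinstance(doc, dict):
            for field in fields:
                if field in doc:
                    extracted[field] = doc[field]
    return stage


def counter_stage(counters, name, source):
    """Stage that increments a counter keyed by the extracted `source` value."""
    def stage(line, extracted):
        if source in extracted:
            key = (name, str(extracted[source]))
            counters[key] = counters.get(key, 0) + 1
    return stage


def run_pipeline(stages, lines):
    """Run every stage in order on each line, sharing one extracted map per line."""
    for line in lines:
        extracted = {}
        for stage in stages:
            stage(line, extracted)


counters = {}
stages = [json_stage(["level"]),
          counter_stage(counters, "log_lines_total", "level")]
run_pipeline(stages, [
    '{"level": "error", "msg": "boom"}',
    '{"level": "info", "msg": "ok"}',
    "not json",
    '{"level": "error", "msg": "again"}',
])
```

Keeping extraction and metric creation as separate stages means a regex-based extractor could be swapped in for `json_stage` without touching the counter logic.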