In many places we make sure that a metric has the thanos_ prefix even though it comes from packages outside of Thanos, e.g.: https://github.com/thanos-io/thanos/blob/8f492a9f073f819019dd9f044e346a1e1fa730bc/cmd/thanos/compact.go#L246
This is wrong, as I learned recently: it means we cannot reuse the dashboards and alerts written for other binaries (e.g. Prometheus) that use exactly the same code.
This problem was also described by Brian a long time ago: https://www.robustperception.io/target-labels-not-metric-name-prefixes
Let's fix it! This will be a breaking change for people, so let's be careful to mention it in the CHANGELOG and update the mixins (:
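To make the reuse problem concrete, here is a minimal sketch using only the Go standard library. The helper names (`selector`, `failedCompactionsExpr`), the job values and the alert expression are hypothetical and only for illustration; `prometheus_tsdb_compactions_failed_total` is an upstream TSDB metric, and the prefixed variant is an assumption about what a registry-wide `thanos_` prefix would produce.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// selector renders a PromQL selector such as name{job="x"}.
// Label keys are sorted so the output is deterministic.
func selector(metric string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return metric + "{" + strings.Join(parts, ",") + "}"
}

// failedCompactionsExpr is a hypothetical alert expression that could be
// shared by every binary embedding the same compactor code.
func failedCompactionsExpr(metric, job string) string {
	return "increase(" + selector(metric, map[string]string{"job": job}) + "[1h]) > 0"
}

func main() {
	const upstream = "prometheus_tsdb_compactions_failed_total"

	// Same metric name everywhere: one expression, only the target label differs.
	fmt.Println(failedCompactionsExpr(upstream, "prometheus"))
	fmt.Println(failedCompactionsExpr(upstream, "thanos-compact"))

	// With a registry-wide prefix (assumed name), the series is exposed under
	// a different name, so the shared expression no longer matches it and a
	// second copy of the alert has to be maintained.
	fmt.Println(failedCompactionsExpr("thanos_"+upstream, "thanos-compact"))
}
```

The first two expressions differ only in the job label, which is the approach the linked blog post argues for; the third needs its own dashboard and alert definitions.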
AC:
We had a good offline conversation with @pracucci and these are some takeaways:
thanos_ and think that only those are from Thanos (it's not true, probably it's only 70% of it as we are still inconsistent).
To me:
Point (1) gives extreme benefits. IMO it is extremely important to keep the original metric names. There is nothing worse than misunderstanding what you are looking at. To fix point (2), I would suggest extending the UI to include a quick way to look up metric names for a certain job/pod label value, e.g. {__name__=~".*", job="thanos"}, so users can quickly answer that question.
cc @brancz @brian-brazil @beorn7 any thoughts? :thinking: and @prmsrswt @juliusv for some UI idea!
(2) is what's covered by the blog post; better tooling has always been the best way to approach that, rather than misusing metric names as a crutch.
I think we need waaaaay more than one blog post. We don't want only advanced, super experienced people to use Prometheus/Thanos (and I mean USE in terms of using metrics from those, not operating them). So we need to be clearer and louder than this. More docs, examples, talks, and tools would definitely help.
So I assume such UI enhancement might make sense? :thinking:
Yeah, in general it would be good to have a metrics exploration UI, not just for this use case. I'd imagine you would allow the user to provide a number of constraints (label matchers) via a nice UI, and then you can show either just metric names, or full series that are available under those constraints.
For your use case you basically want an easy UI shortcut to do count by(__name__) ({job="thanos"}).
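As a rough sketch of what such a shortcut could do behind the scenes, the snippet below builds that query, turns it into an instant-query URL for the Prometheus HTTP API (`/api/v1/query`), and parses the response into a map of metric name to series count. It uses only the Go standard library; the function names, the base URL and the sample response are made up for illustration, and the actual HTTP request is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// nameCountQuery builds the PromQL shortcut from the comment above.
func nameCountQuery(job string) string {
	return fmt.Sprintf(`count by(__name__) ({job=%q})`, job)
}

// queryURL builds an instant-query URL for the Prometheus HTTP API.
func queryURL(base, query string) string {
	return base + "/api/v1/query?" + url.Values{"query": {query}}.Encode()
}

// queryResponse covers only the fields of an instant-vector response
// that this sketch needs.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		Result []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"` // [timestamp, "value"]
		} `json:"result"`
	} `json:"data"`
}

// parseNameCounts maps each metric name to its number of series.
func parseNameCounts(body []byte) (map[string]int, error) {
	var resp queryResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if resp.Status != "success" {
		return nil, fmt.Errorf("query failed: status %q", resp.Status)
	}
	counts := make(map[string]int, len(resp.Data.Result))
	for _, r := range resp.Data.Result {
		s, ok := r.Value[1].(string)
		if !ok {
			return nil, fmt.Errorf("unexpected sample value %v", r.Value[1])
		}
		f, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, err
		}
		counts[r.Metric["__name__"]] = int(f)
	}
	return counts, nil
}

func main() {
	// Hypothetical local Prometheus/Thanos Query address.
	fmt.Println(queryURL("http://localhost:9090", nameCountQuery("thanos")))

	// Made-up response body, standing in for the result of an HTTP GET.
	sample := []byte(`{"status":"success","data":{"resultType":"vector","result":[
		{"metric":{"__name__":"thanos_objstore_bucket_operations_total"},"value":[1580000000,"12"]},
		{"metric":{"__name__":"prometheus_tsdb_compactions_total"},"value":[1580000000,"1"]}]}}`)
	counts, err := parseNameCounts(sample)
	if err != nil {
		panic(err)
	}
	for name, n := range counts {
		fmt.Printf("%s: %d series\n", name, n)
	}
}
```

A UI could list the keys of that map to show every metric a given job exposes, whatever its name prefix.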
Yeah, a metrics exploration UI seems like a nice idea. From what I understand we need a quick shortcut to view all Thanos metrics, using the labels.
Hello 👋 Looks like there was no activity on this issue for last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.
_Still valid_
I agree. So for the fetcher metrics https://github.com/thanos-io/thanos/blob/master/pkg/block/fetcher.go#L77, maybe we can add the thanos_ prefix directly to the metric name instead of using extprom.WrapRegistererWithPrefix? WDYT?
We use them all over the place, so a general switch would be needed, which would be very painful for users. Anyway, we should at least not repeat this problem for new metrics.
Before we switch fully, we need to think about the discoverability problem as well.
Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.