Prometheus: PromQL: Parentheses should work around any node type, in all contexts

Created on 10 Jan 2020 · 55Comments · Source: prometheus/prometheus

EDIT: This should be a syntax error instead of being allowed.

ORIGINAL:

Parentheses should always just act like a no-op.

But they lead to query errors when used around some function parameters, like:

rate((foo[5m]))

...or:

label_join(foo, ("foo"), "bar")

This leads to errors like:

Error executing query: unexpected error: interface conversion: promql.Expr is *promql.ParenExpr, not *promql.StringLiteral

The reason is that some function implementations assume concrete implementation types, like here: https://github.com/prometheus/prometheus/blob/64194f7d45cbb7fa2f41c8f219706308f65d511a/promql/functions.go#L66 and https://github.com/prometheus/prometheus/blob/64194f7d45cbb7fa2f41c8f219706308f65d511a/promql/functions.go#L754-L755.

Currently we don't have any ways of generating non-literal strings or matrices without matrix selectors, so I think we only need to ensure that parentheses are skipped around these child nodes before trying to use them.

componenpromql help wanted kinbug kinenhancement low hanging fruit

Source

juliusv

All 55 comments

I disagree with this, the only place where () make sense is whith subqueries and ops.

roidelapluie on 10 Jan 2020

I can do a lot of things with parentheses that don't make sense, like quantile((1), foo) or ((((rate(foo[5m]))))), so I don't see why those would work, but the other situations shouldn't. At the very least, it should be a syntax error if we don't want to allow it, it shouldn't just be an unexpected crash somewhere in the query.

juliusv on 10 Jan 2020

() for scalar makes sense so you can do (1+1)*2

((rate(5m]))) also makes sense because you could do (3+(backend_enabled*rate(5m]))..
it is really different from rate((((xx[5m])))).

If we allow (vector) to be converted to vector, then rate((x)[4m]) could become valid? That seems really strange to me.

roidelapluie on 10 Jan 2020

I'm with @roidelapluie on this one.

brian-brazil on 10 Jan 2020

() for scalar makes sense so you can do (1+1)*2

((rate(5m]))) also makes sense because you could do (3+(backend_enabled*rate(5m]))..
it is really different from rate((((xx[5m])))).

Hm, IMO just because no non-function operations are defined with strings or matrices as operands doesn't mean that parentheses should not work. But if we do not want to allow them, then it should be a syntax error.

If we allow (vector) to be converted to vector, then rate((x)[4m]) could become valid? That seems really strange to me.

That'd at least be a syntax error already, as a matrix selector is its own node type.

juliusv on 10 Jan 2020

If we allow (vector) to be converted to vector, then rate((x)[4m]) could become valid? That seems really strange to me.

That'd at least be a syntax error already, as a matrix selector is its own node type.

@juliusv not since #6590

roidelapluie on 10 Jan 2020

@juliusv not since #6590

Yeah, wasn't aware that changed visible parsing behavior. Now it yields "invalid parameter 'query': 1:12: parse error: ranges only allowed for vector selectors". Actually if it is a full child now, I'm not even against allowing this kind of expression. But I'm sure you two would disagree :)

juliusv on 10 Jan 2020

The thing is that you can not do:
"a"+"b"

or ({a="bc"} and {c="d"})[5m]

Unless that would be allowed (but what would it do ?), seems strange to add "useless" parenthesis.

roidelapluie on 10 Jan 2020

I guess it's a matter of taste or consistency. I don't care strongly about allowing it either, but I would prefer to allow no-op parens around any expression type (why special-case?). But anyway, it shouldn't crash so late in the game, it should be a syntax error then.

Funnily I checked our PromQL docs, and we have zero mentions of parentheses, so their behavior appears to be completely undefined from a spec kind of view.

juliusv on 10 Jan 2020

I get what you mean; indeed it should be a syntax error.

roidelapluie on 10 Jan 2020

@roidelapluie Ok, edited title to that effect.

juliusv on 10 Jan 2020

@juliusv not since #6590

Yeah, wasn't aware that changed visible parsing behavior. Now it yields "invalid parameter 'query': 1:12: parse error: ranges only allowed for vector selectors". Actually if it _is_ a full child now, I'm not even against allowing this kind of expression. But I'm sure you two would disagree :)

That wasn't changed by #6590, but already by #6548.

Note that parentheses are also currently completely fine with scalars.

I guess it's a matter of taste or consistency. I don't care strongly about allowing it either, but I would prefer to allow no-op parens around any expression type (why special-case?). But anyway, it shouldn't crash so late in the game, it should be a syntax error then.

I agree with that standpoint. IMHO the only function parentheses should serve is to indicate evaluation order. I don't really see a point in complicating this for both the users and the implementation.

Funnily I checked our PromQL docs, and we have zero mentions of parentheses, so their behavior appears to be completely undefined from a spec kind of view.

Having some kind of complicated rule set there, describing when parentheses are allowed would be quite weird.

slrtbtfs on 13 Jan 2020

Yeah, it's weird, because I can't think of any other language that doesn't allow parentheses around any node type, but then again there's no other language that has value types which don't have any operators defined on them at all (which introduce the necessity of parentheses). Still I find it surprising that parentheses are illegal just because incidentally a node type has no operators defined on it.

juliusv on 13 Jan 2020

The logic of providing a syntax error here would be quite complex.

We have a wide range of different syntax nodes and basically would have to specify which may be surrounded by parentheses.

Also, how do we deal with redundant parentheses that might increase readability, e.g. a+(b*c).

I'd rather propose to change to remove the promql.ParenExpr type from the AST and evaluate everything in parentheses to its inner expression.

This would provide a reasonably elegant solution to the problem that's quite easy to implement.

slrtbtfs on 13 Jan 2020

I'd rather propose to change to remove the promql.ParenExpr type from the AST and evaluate everything in parentheses to its inner expression.

Not sure if I understand you correctly, but wouldn't removing the parentheses from the AST completely make it impossible to render out again exactly what the user typed in (and interfere with positional metadata)?

juliusv on 13 Jan 2020

wouldn't removing the parentheses from the AST completely make it impossible to render out again exactly what the user typed in (and interfere with positional metadata)?

That's already the case, we lose extra whitespace for example.

brian-brazil on 13 Jan 2020

@brian-brazil but I thought the new parser work was there to start to retain positional information of all tokens (among other things).

juliusv on 13 Jan 2020

And if we have a formatting tool, we'd want it to be able to retain parentheses where a user put them, even if they were not strictly required (but useful for clarity)?

juliusv on 13 Jan 2020

Positional information in the original string is not the same as being able to render exactly what the user typed from the AST.

brian-brazil on 13 Jan 2020

Why wouldn't it be (modulo maybe some casing around sum / SUM or so), unless you throw some tokens away / don't keep their positional information? Anyway, I think we'll want to be able to do that.

juliusv on 13 Jan 2020

We lose all the whitespace for one. If we're rendering a canonical output we'll only use the AST, not any other information from the original parsing - just like we do today.

brian-brazil on 13 Jan 2020

I feel whitespace is ok to lose in that case, but not even retaining the ability to preserve parentheses would be sad.

juliusv on 13 Jan 2020

And if we have a formatting tool, we'd want it to be able to retain parentheses where a user put them, even if they were not strictly required (but useful for clarity)?

With that goal in mind, removing the ParenExpr type is indeed not the best idea.

slrtbtfs on 13 Jan 2020

Yeah, I would keep parentheses as AST nodes, but just completely ignore them for evaluation (always dig out the first non-paren node within them).

juliusv on 13 Jan 2020

That will probably imply string support in promql.

e.g. "ok" would become a valid promql expression that will return a string; because that is the issue there.

(1+2) returns a scalar
(a) returns a vector
(a[90m]) returns a matrix
but ("xx") would be new and return a string

roidelapluie on 13 Jan 2020

All of these are already valid PromQL expressions, however some of them will cause a panic in the engine.

slrtbtfs on 13 Jan 2020

Yeah, they are already valid expressions, they just cause a panic when wrapped in parens when used in certain contexts (e.g. rate((foo[5m])) panics while (foo[5m]) works fine, without rate() wrapped around it), or when they are the root node (just "foo"). The latter should also just work.

juliusv on 13 Jan 2020

Btw., Prometheus 1.x was able to evaluate "foo" just fine and show the result as a string in the data table in the UI. That broke somewhere in 2.x, which now panics for that case.

juliusv on 13 Jan 2020

Interesting.

~However I don't think we should revert that back since there likely is a lot of other software relying on the current behaviour.~

slrtbtfs on 13 Jan 2020

I can't see how that would break anyone... if anyone is relying on the fact that root string nodes cause a query error, that would be a very weird use case.

juliusv on 13 Jan 2020

@brian-brazil @roidelapluie

Do you have a good proposal, for which cases exactly parentheses should be allowed?

I personally don't have a good idea here, except always allowing them.

slrtbtfs on 13 Jan 2020

If allowing there everywhere does not cause harm, I propose to allow them everywhere then? It is a switch from my initial position but I think it can make sense.

Note: it does not mean that we need to document it clearly. or encourage it.

roidelapluie on 13 Jan 2020

Scalars and instant vectors is where I'd expect it to work.

brian-brazil on 13 Jan 2020

@brian-brazil I still expect parens to work as a no-op around any type (in any language, ever).

Even if we decided to keep the special case to not allow them around certain node types, then rate((foo[5m])) panicking vs. (foo[5m]) being ok still seems broken.

juliusv on 13 Jan 2020

👍1

@juliusv How about foo[(5m)] :smiling_imp:

slrtbtfs on 13 Jan 2020

Are durations a node type now? With associated value type (string/scalar/vector/matrix)? 😅 If not, then no...

juliusv on 13 Jan 2020

No, but their own type in the grammar.

I was just taking your no-op statement literally. :smiley:

slrtbtfs on 13 Jan 2020

Yep, only around any valid PromQL expressions, not syntactic sub-elements :)

juliusv on 13 Jan 2020

The only parser change then would be allowing expressions like (foo)[5m], which I'm not sure is a good idea, everything else in the parser should already be as you described.

The hard part will be fixing the engine.

slrtbtfs on 13 Jan 2020

Scalars and instant vectors is where I'd expect it to work.

If we disallow it for range vectors now, this would break existing queries.

slrtbtfs on 13 Jan 2020

👍1

The only parser change then would be allowing expressions like (foo)[5m], which I'm not sure is a good idea, everything else in the parser should already be as you described.

Hmm, I'm wondering if from the user's perspective, foo[5m] should now be exposed as two separate nodes, just because of recent internal changes. So far it had always just been a single node.

juliusv on 13 Jan 2020

The issue I see with allowing parentheses here is that then it would be harder to explain, why (foo)[5m] is allowed, but (foo + bar)[5m] isn't.

slrtbtfs on 13 Jan 2020

Yeah, exactly. I wouldn't want to allow parentheses there either, and not see them as two independent nodes at all.

juliusv on 13 Jan 2020

So IMHO the sanest proposal right now seems to be changing nothing in the parser but changing the engine to always handle ParenExpr as a no-op.

slrtbtfs on 13 Jan 2020

Yep!

juliusv on 13 Jan 2020

6610 does already part of that, so it would make sense to just extend this PR.

slrtbtfs on 13 Jan 2020

👍1

Sounds good to me!

juliusv on 13 Jan 2020

@juliusv Would make sense to change back the issue title then.

slrtbtfs on 14 Jan 2020

@slrtbtfs Thanks, changed it to a newer more appropriate wording.

juliusv on 14 Jan 2020

The new wording implies (foo)[5m] should be legal.

We decided earlier that this should not be the case.

slrtbtfs on 14 Jan 2020

@slrtbtfs Ah I'm still thinking from the user's perspective, where foo[5m] should just be seen as one node and not two independently evaluated ones. Not sure how to word it then... but I guess it may be fine?

juliusv on 14 Jan 2020

Yes, I guess we can leave it like that.

slrtbtfs on 14 Jan 2020

This issue can be closed, as it is fixed in #6612.

slrtbtfs on 23 Jan 2020

👍1

Hello guys, could you explain if this fix will allow the following:

I have CPU usage as a range-vector and at the same time I have pod labels as just labels.
I would like to combine 2 queries for being able to do the following:

avg(
  rate((
    max(kube_pod_labels) by (pod, label_app)
    * on(pod) group_right(label_app) 
    container_cpu_usage_seconds_total{container!="POD", image!=""}
  )[5m])
) by (label_app, pod, container, namespace)

Also sometimes I would just want to have the graph for specific application:

avg(
  rate((
    max(kube_pod_labels{label_app="some_label") by (pod, label_app)
    * on(pod) group_right(label_app) 
    container_cpu_usage_seconds_total{container!="POD", image!=""}
  )[5m])
) by (label_app, pod, container, namespace)

Currently it is not possible. The closes solution is move the group path outside, but it will lead to decrease of performance. Second one is to use subquery but somehow it produces incorrect result.

sshishov on 12 Apr 2020

@sshishov Thanks for your report. It looks as if this is actually a question about usage and not development.

It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.

roidelapluie on 12 Apr 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Prometheus 2.2.0 missing metrics after upgrade

hgranillo · 59Comments

External Labels not available in alert Annotations

byrnedo · 52Comments

Prometheus version 2.0 memory leak ?

saady · 53Comments

Remote storage

juliusv · 170Comments

Timezone support in the display layer

LordFPL · 50Comments