Jq: A high-precedence pipe operator would be nice

Created on 31 Dec 2019 · 22 Comments · Source: stedolan/jq

This feels nice, though perhaps the new operator should be <| or <|> or something similar:

$ ./jq -n 'true and 5|if .==5 then true else false end'
false
$ ./jq -n 'true and 5>|if .==5 then true else false end'
true
$ 
diff --git a/src/lexer.l b/src/lexer.l
index c35fceb..c472f0f 100644
--- a/src/lexer.l
+++ b/src/lexer.l
@@ -62,6 +62,7 @@ struct lexer_param;
 "label" { return LABEL; }
 "break" { return BREAK; }
 "__loc__" { return LOC; }
+">|" { return HIGHPRECPIPE; }
 "|=" { return SETPIPE; }
 "+=" { return SETPLUS; }
 "-=" { return SETMINUS; }
diff --git a/src/parser.y b/src/parser.y
index bdb281f..81aa5f8 100644
--- a/src/parser.y
+++ b/src/parser.y
@@ -76,6 +76,7 @@ struct lexer_param;
 %token LABEL "label"
 %token BREAK "break"
 %token LOC "__loc__"
+%token HIGHPRECPIPE ">|"
 %token SETPIPE "|="
 %token SETPLUS "+="
 %token SETMINUS "-="
@@ -112,6 +113,7 @@ struct lexer_param;
 %precedence '?'
 %precedence "try"
 %precedence "catch"
+%right HIGHPRECPIPE


 %type <blk> Exp Term
@@ -449,6 +451,10 @@ Exp '|' Exp {
   $$ = block_join($1, $3);
 } |

+Exp ">|" Exp {
+  $$ = block_join($1, $3);
+} |
+
 Exp ',' Exp {
   $$ = gen_both($1, $3);
 } |
feature request

All 22 comments

Putting aside my subjective attitude to the proposed syntax, I would like to say that the example doesn't provide a good motivation for the feature in general.

The second command which correctly produces the expected result doesn't look any more readable to me than a classic parenthesized solution:

$ ./jq -n 'true and 5>|if .==5 then true else false end'

vs

$ ./jq -n 'true and (5|if .==5 then true else false end)'

IMO, the latter version is much clearer and as such, preferred.
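For readers wondering why the unparenthesized command prints false: | binds more loosely than and, so the program groups as (true and 5) | if ... end. A rough way to check this is to model jq filters as Python generator functions. This is only a sketch; the helper names (const, pipe, and_, is_five) are made up for illustration and are not jq internals, and and_ is a simplified model of jq's and.

```python
from typing import Any, Callable, Iterator

# A jq filter is modelled as a function from one input to a stream of outputs.
Filter = Callable[[Any], Iterator[Any]]


def const(value: Any) -> Filter:
    """Model of a literal such as `5` or `true`: ignores its input."""
    def run(_inp: Any) -> Iterator[Any]:
        yield value
    return run


def pipe(left: Filter, right: Filter) -> Filter:
    """Model of `left | right`: feed every output of left into right."""
    def run(inp: Any) -> Iterator[Any]:
        for mid in left(inp):
            yield from right(mid)
    return run


def truthy(value: Any) -> bool:
    """In jq only false and null count as false."""
    return value is not None and value is not False


def and_(left: Filter, right: Filter) -> Filter:
    """Simplified model of `left and right`; both sides see the same input."""
    def run(inp: Any) -> Iterator[Any]:
        for lhs in left(inp):
            if not truthy(lhs):
                yield False
            else:
                for rhs in right(inp):
                    yield truthy(rhs)
    return run


def is_five(inp: Any) -> Iterator[Any]:
    """Model of `if .==5 then true else false end`."""
    yield inp == 5


# 'true and 5 | if ...' with the ordinary low-precedence pipe:
low_prec = pipe(and_(const(True), const(5)), is_five)
# 'true and (5 | if ...)', which is what a high-precedence pipe would mean:
high_prec = and_(const(True), pipe(const(5), is_five))

print(list(low_prec(None)))   # [False]
print(list(high_prec(None)))  # [True]
```

In the first grouping the conditional receives true (the result of true and 5) as its input, so .==5 fails.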

I've also seen ... | [foo | bar, baz] surprising users.

@vdukhovni tells me he has to use parentheses far too often and that he'd like such an operator. He points out that F# has a |> function composition operator that has the opposite operand order of the Haskell composition operator, that some Haskellers adopt F#'s |>, and that it would be just familiar enough. Also, |> has a "pipe to this specific thing here" feel. So I'm inclined to go with |>.

Speaking of Haskell, (>>>) is the reverse of function composition (.). This topic also reminds me of the monad operators ((>>=), (>>)).

Yes, []| is like Haskell's >>=.

> Speaking of Haskell, (>>>) is the reverse of function composition (.). This topic also reminds me of the monad operators ((>>=), (>>)).

Sure, but I think Nico is looking for flipped function application, not flipped function composition. The Haskell operator for that (from Data.Function) is & (flipped $), but the F# equivalent seems more mnemonic for jq.

Perhaps the main critique is that it may not be clear a priori which of | and |> has the higher fixity, e.g. in select(x |> f and y |> g) vs. select(x | f and y | g). That said, we already know that | has a low fixity in jq, so |> would have the "greater" fixity. (That mnemonic device could be part of the documentation.)

|+ might be more mnemonic: "like |, but more better".

> |+ might be more mnemonic: "like |, but more better".

That's for you guys to decide, you know the jq ecosystem better than I. My take is that |> being familiar from F# is a strong argument in its favour over |+. But your bikeshed...

Also |> looks like a sort of arrow.

And by the way, why "right associative"? I'd expect:
x |> f1 |> f2 to mean ((x | f1) | f2) not (x | (f1 | f2))...

OK, I will accept that there are use cases where a high-priority piping operator is needed.

Playing with <, >, |, I definitely prefer |> over other combinations ~, or <|, given the latest comment from @vdukhovni we may want to introduce different associativity types.~

However, just to add my 2 cents, I'd like to throw ! into the picture. It seems relevant to me since it resembles the pipe itself, but it's also an exclamation point, emphasizing the operation.

x ! f1 ! f2 or
x |! f1 |! f2 (I don't like this option)

example:

> ./jq -cn 'def x2: .*2; [ 1 , 2 , 3 | x2 ], [ 1 , 2 , 3 ! x2 ]'
[2,4,6]
[1,2,6]
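The two outputs above come down to grouping: with the ordinary pipe the whole stream 1, 2, 3 is fed to x2, while a tighter-binding operator would attach x2 to the 3 alone. A small Python sketch of that difference, modelling filters as generator functions; the helper names (const, comma, pipe) are illustrative assumptions, not jq internals:

```python
from typing import Any, Callable, Iterator

Filter = Callable[[Any], Iterator[Any]]


def const(value: Any) -> Filter:
    """Model of a literal: ignores its input and emits one value."""
    def run(_inp: Any) -> Iterator[Any]:
        yield value
    return run


def comma(left: Filter, right: Filter) -> Filter:
    """Model of `left , right`: outputs of left, then outputs of right."""
    def run(inp: Any) -> Iterator[Any]:
        yield from left(inp)
        yield from right(inp)
    return run


def pipe(left: Filter, right: Filter) -> Filter:
    """Model of `left | right`: feed every output of left into right."""
    def run(inp: Any) -> Iterator[Any]:
        for mid in left(inp):
            yield from right(mid)
    return run


def x2(inp: Any) -> Iterator[Any]:
    """Model of `def x2: .*2;`."""
    yield inp * 2


one_two = comma(const(1), const(2))

# [1, 2, 3 | x2]: the pipe binds looser than the comma.
low_prec = list(pipe(comma(one_two, const(3)), x2)(None))
# [1, 2, 3 ! x2]: the proposed operator binds tighter than the comma.
high_prec = list(comma(one_two, pipe(const(3), x2))(None))

print(low_prec)   # [2, 4, 6]
print(high_prec)  # [1, 2, 6]
```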

@vdukhovni

> And by the way, why "right associative"? I'd expect:
> x |> f1 |> f2 to mean ((x | f1) | f2) not (x | (f1 | f2))...

I believe that the piping operation has the associative property, i.e. (a | b) | c === a | (b | c), hence left or right is not relevant here. The only thing that matters is the precedence of this operator over the other operators and control structures.
That said, for the sake of consistency the associativity is set to "right", as it is for the pipe itself.
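The associativity claim can be sanity-checked with a toy model in which a filter maps one input to a stream of outputs and | feeds each output of the left side into the right side. This is a sketch under that assumption only; pipe and the sample filters a, b, c are hypothetical and say nothing about how jq compiles the expression.

```python
from typing import Any, Callable, Iterator

Filter = Callable[[Any], Iterator[Any]]


def pipe(left: Filter, right: Filter) -> Filter:
    """Model of `left | right`: feed every output of left into right."""
    def run(inp: Any) -> Iterator[Any]:
        for mid in left(inp):
            yield from right(mid)
    return run


def a(inp: Any) -> Iterator[Any]:
    """Model of `.+1, .*10` (two outputs per input)."""
    yield inp + 1
    yield inp * 10


def b(inp: Any) -> Iterator[Any]:
    """Model of `., .+1` (two outputs per input)."""
    yield inp
    yield inp + 1


def c(inp: Any) -> Iterator[Any]:
    """Model of `.*2` (one output per input)."""
    yield inp * 2


left_grouped = list(pipe(pipe(a, b), c)(1))   # (a | b) | c
right_grouped = list(pipe(a, pipe(b, c))(1))  # a | (b | c)

print(left_grouped)   # [4, 6, 20, 22]
print(right_grouped)  # [4, 6, 20, 22]
```

Both groupings emit the same values in the same order, even when the filters produce several outputs each.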

> I believe that the piping operation has the _associative property_, i.e. (a | b) | c === a | (b | c), hence left or right is not relevant here. The only thing that matters is the precedence of this operator over the other operators and control structures.

@nicowilliams, if the above is true we may actually want to reflect that by changing %right to %precedence for both the | and HIGHPRECPIPE operators.

I've been meaning to use ! as the opposite of ? for indexing operations: the thing must exist, null won't be produced instead, and you'll get an error. Then .a!? -> empty if . has no "a", while .a! -> error if . has no "a". Hmmm.

Like in Kotlin, force-unwrapping an optional could be !!

> I believe that the piping operation has the _associative property_, i.e. (a | b) | c === a | (b | c), hence left or right is not relevant here. The only thing that matters is the precedence of this operator over the other operators and control structures.
> That said, for the sake of consistency the associativity is set to "right", as it is for the pipe itself.

Oh, I forgot that in jq all expressions are implicit functions of ., and | is therefore composition as much as it is application. While application is not (fully) associative, composition is.

In Haskell & has to be left associative: its type is a -> (a -> b) -> b, so the next function in consecutive application is a (b -> c), and one can't apply a b -> c directly to an (a -> b). So & is left-associative (while $, its flipped partner, is right-associative).

In jq it logically does not matter: a | (b | c) is indeed the same as ( a | b ) | c. Of course ultimately one often can't evaluate (b | c) without knowing a when both depend on ., so left to right evaluation seems to be more natural, but if for some reason AST construction prefers right to left, with a evaluated via a single thunk, that is fine. Whatever fits the parser's design best...

Right-associative is as I found it. I get a deja vu feeling, like I noticed the silliness of it, but ignored it.

One might argue that right associative is easier to reason about/work with when expressions short-circuit:

empty | (b | c)

backtracks directly just once, while conceptually

(empty | b) | c

has to escape two composition contexts?
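At the level of observable behaviour, a toy generator model suggests the two groupings are indistinguishable when the first filter is empty: in neither case is b or c ever run. This is a sketch of the semantics only and says nothing about jq's actual backtracking machinery; pipe, traced and the calls log are hypothetical helpers.

```python
from typing import Any, Callable, Iterator, List

Filter = Callable[[Any], Iterator[Any]]

calls: List[str] = []  # names of traced filters that actually ran


def pipe(left: Filter, right: Filter) -> Filter:
    """Model of `left | right`: feed every output of left into right."""
    def run(inp: Any) -> Iterator[Any]:
        for mid in left(inp):
            yield from right(mid)
    return run


def empty(_inp: Any) -> Iterator[Any]:
    """Model of jq's `empty`: produces no outputs at all."""
    return iter(())


def traced(name: str, offset: int) -> Filter:
    """Model of `.+offset` that records its name when it starts running."""
    def run(inp: Any) -> Iterator[Any]:
        calls.append(name)
        yield inp + offset
    return run


b = traced("b", 2)
c = traced("c", 3)

right_grouped = list(pipe(empty, pipe(b, c))(0))  # empty | (b | c)
left_grouped = list(pipe(pipe(empty, b), c)(0))   # (empty | b) | c

print(right_grouped, left_grouped, calls)  # [] [] []
```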

I'm not sure if this is a good test, but it looks like it still doesn't matter to jq
(the empty case you suggested is trivial because the program actually exits in two cycles).


a | (b | c)

> ./jq -n --debug-dump-disasm --debug-trace=all 'def a: .+1; def b: .+2; def c: .+3; a | (b | c)'
0000 TOP
0001 CALL_JQ a:0
0005 CALL_JQ b:1
0009 CALL_JQ c:2
0013 RET
a:0:
  0000 PUSHK_UNDER 1
  0002 DUP
  0003 CALL_BUILTIN _plus
  0006 RET
b:1:
  0000 PUSHK_UNDER 2
  0002 DUP
  0003 CALL_BUILTIN _plus
  0006 RET
c:2:
  0000 PUSHK_UNDER 3
  0002 DUP
  0003 CALL_BUILTIN _plus
  0006 RET

0000 TOP
0001 CALL_JQ a:0    null
0000 PUSHK_UNDER 1  null
0002 DUP    null || 1
0003 CALL_BUILTIN _plus null | null | 1
0006 RET    1
0005 CALL_JQ b:1    1
0000 PUSHK_UNDER 2  1
0002 DUP    1 || 2
0003 CALL_BUILTIN _plus 1 | 1 | 2
0006 RET    3
0009 CALL_JQ c:2    3
0000 PUSHK_UNDER 3  3
0002 DUP    3 || 3
0003 CALL_BUILTIN _plus 3 | 3 | 3
0006 RET    6
0013 RET    6
6
0013 RET        <backtracking>
0013 RET        <backtracking>



(a | b) | c

> ./jq -n --debug-dump-disasm --debug-trace=all 'def a: .+1; def b: .+2; def c: .+3; (a | b) | c'
0000 TOP
0001 CALL_JQ a:0
0005 CALL_JQ b:1
0009 CALL_JQ c:2
0013 RET
a:0:
  0000 PUSHK_UNDER 1
  0002 DUP
  0003 CALL_BUILTIN _plus
  0006 RET
b:1:
  0000 PUSHK_UNDER 2
  0002 DUP
  0003 CALL_BUILTIN _plus
  0006 RET
c:2:
  0000 PUSHK_UNDER 3
  0002 DUP
  0003 CALL_BUILTIN _plus
  0006 RET

0000 TOP
0001 CALL_JQ a:0    null
0000 PUSHK_UNDER 1  null
0002 DUP    null || 1
0003 CALL_BUILTIN _plus null | null | 1
0006 RET    1
0005 CALL_JQ b:1    1
0000 PUSHK_UNDER 2  1
0002 DUP    1 || 2
0003 CALL_BUILTIN _plus 1 | 1 | 2
0006 RET    3
0009 CALL_JQ c:2    3
0000 PUSHK_UNDER 3  3
0002 DUP    3 || 3
0003 CALL_BUILTIN _plus 3 | 3 | 3
0006 RET    6
0013 RET    6
6
0013 RET        <backtracking>
0013 RET        <backtracking>

> Of course ultimately one often can't evaluate (b | c) without knowing a when both depend on .

I think that it actually can be evaluated if by evaluation we still mean finding a filter, not a value. Technically, (b | c) is fully equivalent to some function f which evaluates b on its input and then c on the result:

/* Simplified: a real jq filter may produce zero or more outputs. */
jv f(jv input) {
    return c(b(input));
}

diff shows no difference, indeed.

I think I'll settle for |>. Using ! for this and also !! for force-unwrap seems likely to be confusing. As @vdukhovni says, it's a bikeshed. Unless @wtlangford or @stedolan pipe up about it, I'm inclined to go with |>.

Fair enough. Let's see what others have to say and if nothing then |> it is.
At the end of the day, during editing one will just need to add or remove a single character after the pipe to alter the result.
