Julia: Custom infix operators

Created on 17 Jun 2016  ·  65 Comments  ·  Source: JuliaLang/julia

There is a discussion at https://groups.google.com/forum/#!topic/julia-dev/FmvQ3Fj0hHs about creating a syntax for custom infix operators.

...

Edited to add note: @johnmyleswhite has pointed out that the comment thread below is an invitation to bikeshedding. Please refrain from new comments unless you have something truly new to add. There are several proposals below, marked by "hooray" emoticons (exploding cone). You can use those icons to skip discussion and just read the proposals, or to find the different proposals so you can vote "thumbs up" or "thumbs down".

Up/downvotes on this bug as a whole are about whether you think that Julia should have any custom infix idiom. Up/downvotes for the specific idea below should go on @Glen-O's first comment. (The bug had 3 downvotes and 1 upvote before that was clarified.)

...

Initial proposal (historical interest only):

The proposal that seems to have won out is:

    a |>op<| b #evaluates (in the short term) and parses (in the long term) to `op(a,b)`

In order to have this work, there are only minor changes necessary:

  • Put the precedence of <| above that of |>, instead of being the same.
  • Make <| group left-to-right.
  • Make the function <|(a,b...)=(i...)->a(i...,b...). (as pointed out in the discussion thread, this would have standalone uses, as well as its use in the above idiom)

Optional:

  • create new functions >|(a...,b)=(i...)->b(a...,i...) and |<(a,b...)=a(b...) with appropriate precedences and grouping.

    • Pipe first means evaluation, and pipe last maintains it as a function, while the > and < indicate which one is the function.

  • create new functions >>|(a...,b)=(i...)->b(i...,a...) and <<|(a,b...)=(i...)->a(b...,i...) with appropriate precedence and grouping.
  • create synonyms », , and(/or) pipe for |>; «, , and(/or) rcurry for <|; and(/or) lcurry for <<|; with the single-character synonyms working as infix operators.
  • create an @infix macro in base which does the first parser fix below.

Long term:

  • teach the parser to change a |>op<| b to op(a,b), so there's no extra overhead involved when running the code, and so that operators can actually be defined in infix position. (This is similar to how the parser currently treats the binary a:b and the ternary a:b:c differently. For maximum customizability, it should do this for matched synonyms, but not for unmatched synonyms, so that e.g. a |> b « c would still be treated as two binary operators.)
  • teach the parser to understand commas and/or spaces so that the ellipses in the above definitions work as expected without extra parentheses.

(relates to https://github.com/JuliaLang/julia/issues/6946)

parser speculative

All 65 comments

Echoing the julia-dev thread, I think it would be useful to quote Stefan's main comment on this proposal:

Just to set expectations here, I don't think there's going to be much in the way of "syntactic innovation" before Julia 1.0. (The only exception I can think of is the new f.(v) vectorized calling syntax.) While having some way of making arbitrary functions behave as infix operators might be nice, it's just not a pressing issue in the language.

As someone who's participated in a good proportion of the history of Julia development, I think it would be better to focus energy on semantic changes rather than syntactic ones. There are lots of extremely important semantic problems left to solve before Julia reaches 1.0.

Note in particular that implementing this feature isn't simply a one-off diff that only the author needs to think about: everyone will have to think about how their work interacts with this feature going forward, so the change actually increases the long-term workload of every person who works on the parser.

I think that johnmyleswhite's comments are very apropos regarding the "long term" parser changes suggested. But the "minor changes" and "optional" groups are, as far as I can see, pretty self-contained and low-impact.

That is: the parser changes needed to enable the minimal version of this proposal involve only precedence and grouping for normal binary operators, the kind of changes that are more-or-less routine in other cases. A parser developer working on something unrelated would no more need to keep track of this than they need to keep track of the meaning of all of the numerous already-existing operators.

Personally I find this syntax quite ugly and difficult to type. But I do agree it would be good to have more general infix syntax.

I think the right way to think about this is as a syntax-only issue: what you want is to use op with infix syntax, so defining other functions and operators to get that is roundabout. In other words it should all be done in the parser.

I would actually consider reclaiming | for this, and using a |op| b. Arguably general infix syntax is more important than bitwise or. (We've talked about reclaiming bitwise operators before; they do seem like a bit of a waste of syntax as it is.)

a f b is available outside of array concatenation and macro call syntaxes.

a f b might work, but it seems pretty fragile. Imagine trying to explain to somebody why a^2 f b^2 f c^2 is legal but a f b c and a+2 f b+2 f c+2 aren't. (I know, that last one assumes that the precedence is prec-times, but no matter what the precedence is, this general kind of thing is a concern).

As to a |op| b: initially I favored a similar proposal, a %op% b, as you can see in the google groups thread. But the nice thing about the proposed |> and <| is that they are each individually useful as binary operators, and they naturally combine to work as desired (given the right precedence and grouping, that is.) This means that you can implement this in the short term using existing parser mechanisms, and thus avoid creating headaches for parser developers in the future, as I said in my response to johnmyleswhite above.

So while I like a |op| b and certainly wouldn't oppose it, I think we should look for a way to have two different operators to simplify the required parser changes. If we're going for maximum typeability and not opposed to having | mean "pipe" rather than "bitwise or", then what about a |op\\ b or a |op& b?

"headaches for parser developers" is the lowest possible concern.

"headaches for parser developers" is the lowest possible concern.

As a parser developer, I unequivocally agree with this.

|> and <| are both perfectly good infix operators, but there is zero benefit to implementing general operator syntax using two other operators. And much more needs to be said on just how verbose and unappealing that syntax is.

there is zero benefit to implementing general operator syntax using two other operators.

To be clear, the long term vision here is that there would be binary f <| y, binary x |> f, and ternary x |> f <| z, where the first one is just a function but the second two are implemented as transformations in the parser.

The idea that this could be implemented using two ordinary functions |> and <| is just a temporary bridge to that vision.

And much more needs to be said on just how verbose and unappealing that syntax is.

That's a fair point. How about replacing |> and <| with | and &? They make sense both as a pair and individually, although they might be a bit jarring to a bit hockey player.

Stealing both | and & for this would not be a good allocation of ASCII, and I suspect many would prefer the delimiters to be symmetric.

If people want a x |> f <| y ternary operator for other reasons, that's fine, but I think it should be considered separately. I'm not sure the parser should transform |> to a flipped <|. Other similar operators like < don't work that way. But that's also a separate issue.

Stealing both | and & for this would not be a good allocation of ASCII, and I suspect many would prefer the delimiters to be symmetric.

OK.

I understand that > and < are hard to type. In terms of symmetry and typability on a standard keyboard, I guess the easiest might be something like &% and %&, but that's seriously ugly, R parallel or no. /| and |/ might be worth considering too.

...

I'm not sure the parser should transform |> to a flipped <|

I think you've misunderstood. a |> b should parse to b(a). (The version without special parsing would be ((x,y)->y(x))(a,b), which evaluates to the same thing, but with more overhead.)
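The equivalence described here can be checked directly. This is only a sketch of the semantics (the variable names are illustrative); it says nothing about how the parser would implement the transformation:

```julia
a = 0.5

piped    = a |> sin                    # the existing pipe operator
explicit = ((x, y) -> y(x))(a, sin)    # the "version without special parsing"

@assert piped == sin(a) == explicit

# Chained pipes read left to right.
@assert (a |> sin |> exp) == exp(sin(a))
```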

a |> b should parse to b(a)

Ah, ok, got it.

I think that we could bikeshed about which characters to use for years. I'd trust @StefanKarpinski (as the most senior person in this conversation so far) to make a ruling, and I'd be fine with that. Even if it's something I've argued against (such as a f b.)

Here are some options to see what appeals:
a |>op<| b (leaving current |> unchanged)
a |{ op }| b (nearby and same shift state on many common keyboards, not too ugly. A bit strange as standalones.)
a \| op |\ b or a /| op |/ b or combinations thereof
a $% op %$ b (relatively typable, R-inspired. But kinda ugly.)
a |% op %| b
a |- op -| b
a |: op :| b
a | op \\ b
a | op ||| b
a op b

Stefan is not more senior than me.

Looks as if you just nominated yourself, then, for BDFL powers on this issue! ;)

a @op@ b ?

I guess my vote is to use all 4 of \|, |\, /|, and |/. Down for evaluation, up for currying; bar towards the function. So:
a \| f (or f |/ a) -> f(a)
a /| f (or f |\\ a) -> (b...)->f(a,b...)
f |\ b (or b //| f) -> (a...)->f(a...,b)
and thus:
a \| f |\ b (or a /| f |/ b) -> f(a,b)
a \| f |\ b |\ c (or a /| b /| f |/ c) -> f(a,b,c)

Each of the 4 main operators, except perhaps |/, is useful on its own. The redundancy would certainly be un-Pythonic, but I think that the logical neatness is Julian. And as a practical matter, you can use whichever version of the infix idiom you find easier to type; they are both equally readable, in that once you've learned one you naturally understand both.

Obviously, it would make equal sense if you swapped all slashes, so that up arrows were for evaluation and down for currying.

I'm still waiting for word from On High (and I apologize for my newbie clumsiness in guessing what that meant). But if anybody taller than this bikeshed makes a ruling, for this or any other version with at least two new symbols, I'd be happy to write a short term patch (using functions) and/or a proper one (using transformations).

We try to avoid having a BDFL to the extent possible :)

I just thought I'd note a few quick things.

First, the other benefit (the "standalone uses") of the notation that is being proposed is that <| can be used in other contexts, in a way that improves readability. For example, if you have an array of strings, A, and want to pad all of them on the left to 10, right now, you have to write map(i->lpad(i,10),A). This is relatively difficult to read. With this notation, it becomes map(lpad<|10,A), which I think you'll agree is significantly cleaner.

Second, the idea behind this is to keep the notation consistent. There's already a |> operator, which exists to change the "fix" of a function call from prefix to postfix. This just extends the notation.

Third, the possibility of using direct infix as a f b has a bigger problem. a + b and a * b would end up having to have the same precedence, since + and * are function names, and it would be infeasible for the system to have variable precedence. That, or it would have to treat existing infix operators differently, which could cause confusion.
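For comparison, here is the lpad example from the first point in runnable form. The helper rcurry is a hypothetical named stand-in for the proposed <| operator, so this sketch only shows the "standalone use", not the proposed spelling:

```julia
# Hypothetical stand-in for `<|`: fix trailing arguments, await the leading ones.
rcurry(f, b...) = (i...) -> f(i..., b...)

A = ["foo", "barbaz"]

with_lambda = map(i -> lpad(i, 10), A)   # current syntax
with_curry  = map(rcurry(lpad, 10), A)   # "lpad by 10"

@assert with_lambda == with_curry
@assert with_curry == [" "^7 * "foo", " "^4 * "barbaz"]
```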

For example, if you have an array of strings, A, and want to pad all of them on the left to 10, right now, you have to write map(i->lpad(i,10),A). This is relatively difficult to read. With this notation, it becomes map(lpad<|10,A), which I think you'll agree is significantly cleaner.

I emphatically do not agree. The proposed syntax is – forgive me – ASCII salad, verging on some of the worst offenses of Perl and APL, without precedent in other languages to give the casual reader a clue of what's happening. The current syntax, while a few characters longer (five?), is pretty clear to anyone who knows that i->expr is a lambda syntax – which it is in a large and growing set of languages.

a + b and a * b would end up having to have the same precedence, since + and * are function names, and it would be infeasible for the system to have variable precedence. That, or it would have to treat existing infix operators differently, which could cause confusion.

I don't think this is a real problem; we can just say what the precedence of a f b infix is, and keep all existing precedence levels as well. This works because precedence is determined by the name of the function; any function called "+" will have "+" precedence.

Yes, we already do this for the 1+2 in 1+2 syntax, and it hasn't been a problem.
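The point that precedence follows the operator's name can be seen by parsing a few expressions without evaluating them. A small sketch using Meta.parse; the Unicode operators ⊕ and ⊗ are chosen only as examples of names that parse with addition and multiplication precedence respectively:

```julia
# Built-in names: * binds tighter than +.
@assert Meta.parse("a + b * c") == :(a + (b * c))

# Unicode operators inherit the precedence class of their name,
# whether or not any method is defined for them.
@assert Meta.parse("a ⊕ b ⊗ c") == :(a ⊕ (b ⊗ c))
@assert Meta.parse("a ⊗ b ⊕ c") == :((a ⊗ b) ⊕ c)

# The word operator `in` parses with comparison precedence, below +.
@assert Meta.parse("a in b + c") == :(a in (b + c))
```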

I don't think this is a real problem; we can just say what the precedence of a f b infix is, and keep all existing precedence levels as well. This works because precedence is determined by the name of the function; any function called "+" will have "+" precedence.

I didn't mean it's difficult to write the parser to make it work. I meant it leads to consistency issues, hence me saying "or it would have to treat existing infix operators differently, which could cause confusion". Among other things, consider that ¦ and don't look all that different in concept, yet one is a predefined infix operator, while the other is not.

I emphatically do not agree. The proposed syntax is – forgive me – ASCII salad, verging on some of the worst offenses of Perl and APL, without precedent in other languages to give the casual reader a clue of what's happening. The current syntax, while a few characters longer (five?), is pretty clear to anyone who knows that i->expr is a lambda syntax – which it is in a large and growing set of languages.

Perhaps I should be clearer on what I'm saying. I'm saying that being able to describe the operation as "lpad by 10" is a lot clearer than i->lpad(i,10) makes it. And in my view, lpad<|10 is the nearest you can get to that, in a non-context-specific form.

Maybe it would help if I describe where I'm coming from. I'm a mathematician and mathematical physicist, first and foremost, and "lambda syntax", while sensible from a programming standpoint, isn't the clearest for those who are less experienced in programming. Julia is, as I understand it, primarily aimed at being a scientific computing language, hence the strong resemblance to MATLAB.

I must ask - how is lpad<|10 any more "ASCII salad" than, say, x|>sin|>exp? Yet the |> notation was added. Compare with, say, bash scripting, where | is used to pass the argument on the left to the command on the right - if you know it's called "pipe", it makes a _little_ more sense, but if you're not skilled in programming, it's not going to make sense. In that regard, |> actually makes more sense, as it looks vaguely like an arrow. And then <| is a natural extension to the notation.

Compare with some of the other suggestions, such as %func%, which _does_ have a precedent in another language, but which is completely opaque for people who don't have extensive knowledge of programming in the language.

Mind you, I looked back a bit at one of the older discussions, and I see that there HAS been a notation used in another language that would be quite nice, in theory. Haskell apparently uses a |> b c d to represent b(a,c,d). If spaces following a function name allowed you to specify "parameters" in this way, it would work nicely - map(lpad 10,A). The only problem arises with the unary operators - map(+ 10,A) would produce an error, for instance, as it would be interpreted as "+10" instead of i->+(i,10).

On a f b: the precedence issues may not be as bad as Glen-O suggested, but unless user-defined infix functions have the very lowest precedence, they do exist. Say, for the sake of argument we give them prec-times. In that case,
a^2 f b^2 => f(a^2,b^2)
a+2 f b+2 => a+f(2,b)+2
a^2 f^2 b^2 => (f^2)(a^2,b^2)
a f+2 b => syntax error?

This is all a natural consequence of how you'd write the parser, so it's not particularly a headache in that sense. But it's not particularly intuitive for the casual user of the idiom.

On the usefulness of a curry idiom
I agree with Glen-O that (i)->lpad(i,10) is simply worse than lpad<|10 (or, if we so choose, lpad |\ 10, or whatever). The i is an entirely extraneous cognitive burden and potential source of errors; in fact, I swear that when I was typing that just now, I unintentionally typed (i)->lpad(x,10) initially. So, having an infix curry operation seems to me like a good idea.
However, if that's the intention, then whatever infix idiom we settle on, we can create our own curry operation. If it's a f b, then something like lpad rcurry 10 would be fine. The point is readability, not keystrokes. So I think this is only a weak argument for <|.

On a |> b c d
I like this proposal a lot. I think that we could make it so that |> accepted spaces on either side, so a b |> f c d => f(a,b,c,d).

(Note: If both my suggestion of a b |> f c d and Glen-O's of map(lpad 10,A) were adopted, this would create a corner case: (a b) |> f c d => f((x)->a(x,b),c,d). But I think that's tolerable.)

This still has similar issues in terms of operator precedence as a f b. But somehow I think they're more tolerable if you can at least talk about them in terms of the precedence of the operator |>, rather than being the precedence of the ternary operator of with.

Try lpad.(["foo", "bar"], 10) on 0.5. The existing |> isn't exactly loved by all.

@tkelman: I see the issue, but what's your point? You think we should fix the existing |> before we add extra uses for it? If so, how?

I personally think we should get rid of the existing |>.

Try lpad.(["foo", "bar"], 10) on 0.5. The existing |> isn't exactly loved by all.

I think you've missed the point. Yes, the func.() notation is nice, and bypasses the issue in some situations. But I use the map function as a simple demonstration. Any function that takes a function as argument would be benefited by this setup. As an example, purely to demonstrate my point, you might want to sort some numbers based on their least common multiple with some reference number. Which looks neater and easier to read: sort(A,by=i->lcm(i,10)) or sort(A,by=lcm 10)?
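A sketch contrasting the two situations in this exchange: dot syntax covers the map case, while a function-valued keyword argument such as by still needs either a lambda or some currying helper. The helper rcurry below is hypothetical and stands in for the proposed sort(A,by=lcm 10) spelling:

```julia
# Hypothetical helper: fix trailing arguments, await the leading ones.
rcurry(f, b...) = (i...) -> f(i..., b...)

# The map case: dot syntax does the job.
@assert lpad.(["foo", "bar"], 10) == map(i -> lpad(i, 10), ["foo", "bar"])

# The `by` case: a function has to be passed, so dot syntax does not apply.
A = [4, 15, 7, 10]
by_lambda = sort(A, by = i -> lcm(i, 10))
by_curry  = sort(A, by = rcurry(lcm, 10))

# lcm with 10 gives 20, 30, 70, 10 for the elements of A.
@assert by_lambda == by_curry == [10, 4, 15, 7]
```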

I'd like to note once again that any way to define infix operators will allow creating an operator that does what Glen-O wants <| to do, so that at worst he'll be able to write something like sort(A,by=lcm |> currywith 10). The point of this page is to discuss how to make some a...f...b => f(a,b). I understand that whether the existing |> or the proposed <| are worthwhile operators has some relationship to that point, but let's try not to get too sidetracked.

Personally, I think the a |> b c proposal is the best one so far. It follows an existing convention from Haskell; it is logically related to the existing |> operator; it is both reasonably readable and reasonably easy-to-type. The fact that I feel that it naturally extends to other uses is secondary. If you disagree, please at least mention your feelings on the core idiom, not just the proposed secondary uses.

I meant it leads to consistency issues, hence me saying "or it would have to treat existing infix operators differently, which could cause confusion".

I agree it's difficult to decide on the precedence for a f b. For example, the in operator clearly benefits from comparison precedence, but it's quite likely many functions used as infix would not want comparison precedence. However, I don't see any consistency issue. Different operators have different precedence. Adding a f b doesn't force our hand to give + and * the same precedence.

Note that |> already has precedence adjacent to comparison. For any other precedence, frankly, I think parentheses are fine.

If you don't agree with me, and if we were using a |> f b, then there could be similar operators |+>, |*>, and |^>, which worked the same as |>, but had the precedence of their central operator. I think that's overkill but it's a possibility.

Another possibility for solving the precedence issue is to use a syntax for custom infix operators that includes parentheses of some kind, eg (a f b).

Related discussions: https://github.com/JuliaLang/julia/issues/554, https://github.com/JuliaLang/julia/issues/5571, https://github.com/JuliaLang/julia/pull/14476, https://github.com/JuliaLang/julia/issues/11608, and https://github.com/JuliaLang/julia/issues/15612.

I must ask - how is lpad<|10 any more "ASCII salad" than, say, x|>sin|>exp? Yet the |> notation was added.

I imagine that @tkelman argues

we should get rid of the existing |>.

in part because _both_ lpad<|10 and x|>sin|>exp venture into ASCII-salad territory :).

I think @toivoh's (a f b), with mandatory parens, is the best proposal so far.

Related to https://github.com/JuliaLang/julia/issues/11608 (and thus also https://github.com/JuliaLang/julia/issues/4882 and https://github.com/JuliaLang/julia/pull/14653): If (a f b) => f(a,b), then it would make sense if (a @m b) => (@m a b). This would allow replacing the existing special case macro logic for y ~ a*x+b with normal (and thus much more transparent) (y @~ a*x+b).

Also, the "parens required" could be the preferred idiom for concise infix definitions. Instead of saying (to use a stupid example) a + b = string(a) * string(b), you'd be encouraged (by lint tools, or by compiler warnings) to say (a + b) = string(a) * string(b). I realize that this is not actually a direct consequence of choosing the "parens required" option for infix, but it is a convenient idiom that would allow us to warn the people using infix on the LHS mistakenly but lay off of the people doing it on purpose.

My feeling is currently that if you are applying a function infix (rather than prefix), then it is an operator, and should look and act like an operator.

And we have had support for infix operators defined using unicode since https://github.com/JuliaLang/julia/issues/552.

I guess it might be nice to have that extended so you can add the keywords as in the original suggestion.
So we could have, for example, 1 ⊕₂ 1 == 0
Being able to have arbitrary names for your infix seems a bit excessive.
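A short sketch of what the existing Unicode support allows. It uses the plain ⊕ symbol rather than the subscripted ⊕₂ from the comment above, and the definition (addition modulo 2) is only an assumed meaning chosen to reproduce the 1 ⊕ 1 == 0 example:

```julia
# ⊕ parses as an infix operator with addition precedence; the method is ours.
⊕(a::Integer, b::Integer) = mod(a + b, 2)

@assert 1 ⊕ 1 == 0
@assert 1 ⊕ 0 == 1
@assert 1 ⊕ 1 ⊕ 1 == 1   # groups as (1 ⊕ 1) ⊕ 1
```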

should look and act like an operator.

I agree that there should be strong naming conventions for infix operators. For instance: one character of unicode, or ends in a preposition. But those should be conventions that develop organically, not requirements enforced by the compiler. Certainly, I don't think that #552 is the end of the story; if there are dozens of hard-coded operators, there should be a way to add more programmatically, if only for prototyping new features.

...

For me, the (a f b) (and (a @m b)) proposal is head and shoulders above the rest of the proposals in this bug. I'm almost tempted to make a patch.

(a f b)=>f(a,b)
(a f b c d)=>f(a,b,c,d)
(a f)=>syntax error
(a+2 f+2 b+2)=>(f+2)(a+2,b+2)
(t1=a t2=f t3=b)=>(t1=f)((t2=a),(t3=b)) (space has lowest possible precedence, as in macros)
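The first two rewrites can be approximated today with a prefix macro, since macro calls already take space-separated arguments. This is a sketch only: the macro name @infix is hypothetical (it echoes the @infix idea from the original proposal), and it handles just the simple cases, not the precedence corner cases listed above:

```julia
# Hypothetical macro: `@infix a f b c...` is rewritten to `f(a, b, c...)`.
macro infix(a, f, rest...)
    esc(Expr(:call, f, a, rest...))
end

@assert (@infix 17 div 5) == div(17, 5)    # (a f b)   => f(a,b)
@assert (@infix 1 max 2 3) == max(1, 2, 3) # (a f b c) => f(a,b,c)
```

The rewrite happens once, at macro expansion time, so the call itself carries no extra overhead.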

...

Would it be inappropriate for me to submit a patch?

I didn't understand the last two cases:

(a+2 f+2 b+2)=>(f+2)(a+2,b+2)
(t1=a t2=f t3=b)=>(t1=f)((t2=a),(t3=b))

I find the (a f b c d) syntax very strange. Since 1 + 2 + 3 can be written as +(1,2,3) then shouldn't f(a,b,c) be written as (a f b f c)?

Overall I'm personally not convinced Julia should support custom infix operators beyond what is currently allowed.

I can see two problems with (a f b c d).

First, it will be difficult to read when you've got a more complicated expression - one of the reasons why brackets can be frustrating is that it can often be hard to tell, at a glance, which brackets pair with which other brackets. That's why infix and postfixing (|>) operators are desirable in the first place. Postfixing in particular is liked because it allows a nice, neat left-to-right reading without having to deal with brackets every time.

Second, it leaves no way to nicely do things like make it elementwise. My understanding is that f.(a,b) is going to be a notation in 0.5 to make f operate elementwise on its arguments with broadcasting. There will be no neat way to do the same thing with the infix notation, if it's (a f b). At best, it would have to be (a .f b), which in my view loses the niceness of symmetry that .( affords with .+ and .*.
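For reference, this is the elementwise notation being discussed, in a form that runs on current Julia. The function f is an arbitrary example:

```julia
f(a, b) = a + 2b

xs = [1, 2, 3]

# f.(a, b) applies f elementwise, broadcasting the scalar argument.
@assert f.(xs, 10) == [21, 22, 23]
@assert f.(xs, [10, 20, 30]) == [21, 42, 63]

# The dotted operators follow the same rule.
@assert xs .+ 10 == [11, 12, 13]
```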

Compare, for example, the case of wanting to use the example from Haskell. shashi on #6946 made the point that has an equivalent here. In Haskell, you would write circle 10 |> move 0 0 |> animate "scale" "ease". Using this notation, this becomes ((circle(10) move 0 0) animate "scale" "ease"), which isn't any clearer than animate(move(circle(10),0,0),"scale","ease"). And if you wanted to copy your circle to multiple places, using |> notation, you might have circle 10 .|> copy [1 15 50] [3 14 25]. In my view, that is the neatest way to implement the idea - and then, brackets do their normal role of dealing with order of operation issues.

And as I've pointed out, a|>f b c has the benefit of also having a natural extension allowing the same notation to have more use - f b c would parse as "function f with parameters b and c set", and thus would be equivalent to i->f(i,b,c). This allows it to work not just for infixing, but for other situations where you might want to pass a function (especially an inbuilt function) with parameters (noting that the standard is to have the object of the function first).

The |> also makes it clear which one is the function. If you had, say, (tissue wash fire dirty metal), it would be quite hard to, at a glance, recognise wash as the function. On the other hand, tissue|>wash fire dirty metal has a big indicator saying "wash is the function".

Some of the latest objections sound to me like saying "but you could abuse this feature!" My response is: of course you could. You could already write utterly unreadable code using macros if you wanted. The parser's job is to enable legit uses; to stop abuses, we have conventions/idioms and in some cases delinters. Specifically:

I didn't understand the last two cases:

These are not meant in any way to be an example to follow; they are just showing the natural consequences of the precedence rules. I think both of the last two examples would qualify as abusing the syntax, though (a^2 ಠ_ಠ b^2) => ಠ_ಠ(a^2,b^2) is clear enough.

shouldn't f(a,b,c) be written as (a f b f c)

My proposal of (a f b c d) was, frankly, an afterthought. I think it makes sense, and I could come up with examples where it's useful, but I do not want to hang up this proposal on this issue if it's controversial.

[For instance:

  • f is an "object method" of an object a, probably complicated, using b, c, and d, probably simpler.
  • f is a "naturally broadcast" method like push!]

While (a f b f c) would make sense if f were like +, I think that most operators are not actually like +, so it should not be our model.
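The special status of + can be seen in how expressions parse. A small sketch using Meta.parse; ⊕ is used only as an example of another infix operator with addition precedence:

```julia
# A chain of + is flattened into a single n-ary call.
@assert Meta.parse("1 + 2 + 3") == :(+(1, 2, 3))

# Other operators of the same precedence nest pairwise, left to right.
@assert Meta.parse("a ⊕ b ⊕ c") == :((a ⊕ b) ⊕ c)
@assert Meta.parse("a - b - c") == :((a - b) - c)
```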

it will be difficult to read when you've got a more complicated expression

Again, my answer would be, "so don't abuse it".

Say we want some way to write a complicated expression like a / (b + f(c,d^e)) with f infix. In @toivoh's proposal, that would be a / (b + (c f d^e)). In Haskell-like usage, it would be a / (b + (c |> f d^e)) or at "best", if |> precedence was changed to fix this one particular example, a / (b + c |> f d^e). I think that @toivoh's is easily as good here.

(tissue wash fire dirty metal)

I think the solution to this is strong naming conventions for infix operators. For instance, if there were a convention that infix operators should be one character of unicode, or end in a preposition or underscore, then the above would be something like (tissue wash_ fire dirty metal) which is as clear as that expression could ever hope to be.

...

elementwise

This is a valid concern. (a .f b) is a bad idea, because it could be read as ((a.f) b). My first suggestion is (a ..f b) but it doesn't make me very happy.

circle 10 |> move 0 0 |> animate "scale" "ease"

I've used jquery, so I definitely see the advantage of function chaining like that. But I think that it's not the same issue as infix operators. Using the (a f b) proposal, you could write the above as:

circle 10 |> (move <| 0 0) |> (animate <| "scale" "ease")

... which is not quite as terse as the Haskell version, but still pretty readable.

Maybe it can be limited to only three things inside the ():
(a f (b,c))
.(a f (b,c)) using the operator .(

Finally, a response to the general point:

Overall I'm personally not convinced Julia should support custom infix operators beyond what is currently allowed.

Obviously we're all entitled to our opinions. (I'm not clear whether the thumbs-up referred to that part of the comment, but if so, that goes triple.)

But my counterarguments are:

  • Julia already has dozens of infix operators, many of them extremely niche. It is inevitable that more will be proposed. When somebody says "how can you have but not §?", I'd much rather respond "do it yourself" and not "wait until the next version is widely adopted".
  • Something like (a § b) is eminently readable, and the syntax is lightweight enough to learn from one or two examples.
  • I'm not the first person to raise this issue, and I won't be the last. I understand that language designers should be very very skeptical of creeping (mis)features, because once you add an ugly feature it's basically impossible to fix later. But as I said above, I think (a f b) is clean enough that you won't regret it.

I'm really not sure on the clarity of (a f b)

Here is a possible use-case:
select((((:emp_id, :last_name) from employee_tbl) where (:city, == ,'indianapolis')) orderby :emp_id);

This is certainly viable use of infix functions.
The select function is either the identity function, or sends the built query to the database.

Is this clear code?
I just don't know.

.(a f b)

Yes, that makes sense. But it's not very readable.

Is (a @. f b) more readable? Because the @. macro to enable that would be a simple one-liner.

[[[Come to think of it, if we allowed infix macros without requiring parens, @Glen-O could use them to do what he wants: circle(10) @> move 0 0 @> animate "scale" "ease"=>@> (@> circle(10) move 0 0) animate "scale" "ease" =macro> animate(move(circle(10),0,0),"scale","ease"). I think that solution is uglier than (a f b), but at least it would resolve this overall bug in my eyes.]]]

...

select((((:emp_id, :last_name) from employee_tbl) where (:city, = ,'indianapolis')) orderby :emp_id);

I would definitely rather use a macro for "where" so that the conditional expression didn't have to be strangely quoted. So:

select((((:emp_id, :last_name) from employee_tbl) @where city == 'indianapolis') orderby :emp_id);

The parens are mildly annoying, but on the other hand I see no reasonable way for the parser to deal with this kind of expression without them.

select((((:emp_id, :last_name) from employee_tbl) @where city == 'indianapolis') orderby :emp_id);

The parens are mildly annoying, but on the other hand I see no reasonable way for the parser to deal with this kind of expression without them.

On second thought, the precedence in that expression is just right to left. So, using infix macros, it could just as well be:

select((:emp_id, :last_name) @from employee_tbl @where city == 'NYC' @orderby :emp_id)

or even:

@select (:emp_id, :last_name) @from employee_tbl @where city == 'NYC' @orderby :emp_id

So while I still like (a f b), I'm beginning to see that infix macros are a good answer too.

Here's the full proposal through examples, followed by the advantages and disadvantages:

main uses:

  • a @m b => @m a b
  • a @m b c => @m a b c
  • a @m b @m2 c => @m2 (@m a b) c
  • @defineinfix f; a @f b => macro f(a,b...) :(f($a,$b...)) end; @f a b => f(a,b)
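A sketch of the macro that the last bullet describes, written out by hand and called in the prefix form that Julia parses today (the infix form a @f b is the proposal, not current syntax). The function name between is hypothetical; the interpolation is written $(b...) and the result is escaped so that the arguments are evaluated in the caller's scope:

```julia
# An ordinary function we would like to call "infix".
between(x, lo, hi) = lo <= x <= hi

# Hand-written version of what a @defineinfix helper might generate.
macro between(a, b...)
    esc(:(between($a, $(b...))))
end

x = 5
@assert (@between x 1 10)       # expands to between(x, 1, 10)
@assert !(@between x 6 10)
```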

Corner cases: (not intended to be good code, just to show how the parser would work)

  • t1=a @m t2=b t3=c => @m t1=a t2=b t3=c (though this is not good programming style)
  • t1 + a @m t2 + b => @m t1+a t2+b (though this is not good programming style)
  • a b @m c => syntax error (??)
  • a @m b [c,d] => please don't, but @m a b[c,d] (ETA: Nope, with the patch this comes out as @m a b ([c,d]) which is probably better.)
  • a @m b ([c,d]) => @m a b ([c,d])
  • [a @m b] => bad style, please use parentheses to clarify, but [a (@m b)] (??)
  • a @> f b => @> a f b => f(a,b)
  • @outermacro a b @m c d => @outermacro a (@m b c d)
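The `a @> f b => @> a f b => f(a,b)` entry can likewise be tried in prefix form. This is only a sketch: the macro is named `@thread` here (an assumed name, standing in for `@>`), and the infix spelling is still hypothetical.

```julia
# Hypothetical stand-in for the proposed `@>`: `@thread a f args...`
# rewrites to the call `f(a, args...)` at macro-expansion time.
macro thread(a, f, args...)
    esc(:($f($a, $(args...))))
end

# Prefix form of the proposed `10 @> div 3`; expands to div(10, 3).
q = @thread 10 div 3

# Prefix form of the proposed chain `10 @> div 3 @> max 7`,
# i.e. max(div(10, 3), 7).
m = @thread (@thread 10 div 3) max 7
```

The rewrite happens once, when the macro is expanded, so no extra function call is added at run time.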

Advantages:

  • define infix macros, get infix functions for free (with one-time overhead of macro evaluation. That's not quite as low-overhead as parser magic, but much better than having extra function calls every evaluation)
  • can lead to powerful DSLs, as seen in the SQL-like example above
  • Removes the need for a separate |> operator, since that's a one-liner macro. Similarly for <| and the rest of @Glen-O's proposals.
  • explicit, so very low risk of being used by accident, unlike (a f b)
  • As shown, the @defineinfix macro could allow shorthand use for functions not macros.

(Minor) Disadvantages:

  • precedence and grouping seem to work well in most cases with RtoL, but there would be exceptions which would require parens.
  • I think that a @> f b or even a @f b isn't quite as readable as (a f b) (though they're not too horrible either.)

Given how active this thread has become, I'm going to remind people of my original concern with this topic: issues about syntax often generate a huge amount of activity, but that amount of activity is generally out of proportion to the long-run value of the change being debated. In large part, that's because threads about syntax end up being close to pure arguments about tastes.

that amount of activity is generally out of proportion

I'm sorry. I'm probably guiltiest of getting into back-and-forth.

On the other hand, I think this thread has clearly made "useable" progress. Either of the latest suggestions (a f b) or [a @> f b, with a @f b definable as a shortcut] is clearly superior in my view to the earlier suggestions like a %f% b or a |> f <| b.

Still, I think that further back-and-forth comments are probably not going to make any further progress, and I'd encourage people to use thumbs-up or thumbs-down from now on unless they have something truly new to suggest (that is, not just an orthographic change to an existing proposal). I've added "hooray" emoticons (exploding cone) to the "votable proposals". If you believe that we should not have a specialized syntax for arbitrary functions in infix position, then downvote the bug as a whole.

...

ETA: I think that this discussion is now mature enough to get a decision tag.

For reference (and I expected someone else to point this out):
If you want to embed SQL-like syntax, the right tool for the job is Nonstandard String Literals, I think.
Like all macros they have access to all variables in scope when called,
and they allow you to specify your own DSL, with your own choice of priority, and they run at compile time.

select((((:emp_id, :last_name) from employee_tbl) where (:city, == ,"indianapolis")) orderby :emp_id);

Is better written

sql"SELECT emp_id, last_name FROM employee_tbl WHERE city == 'indianapolis' ORDER BY emp_id"

Nonstandard string literals are a seriously powerful bit of syntax.
I can't find any good examples of them being used for embedding a DSL.
But they can do it.

And in this case I think the result is a lot cleaner than any infix operation that can be defined.
Though it does have the overhead of having to write your own microparser/tokenizer.
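
As a rough illustration of that microparser overhead, here is a minimal sketch of a nonstandard string literal. The `SQLQuery` type and the tiny `SELECT ... FROM ...` grammar are assumptions made up for this example; a real embedding would need a proper tokenizer and would handle `WHERE`, `ORDER BY`, and so on.

```julia
# Hypothetical container for a parsed query (illustration only).
struct SQLQuery
    columns::Vector{String}
    table::String
end

# Defining `sql_str` is what enables the sql"..." literal syntax.
# The parsing below runs when the macro is expanded, not at run time.
macro sql_str(s)
    m = match(r"^SELECT\s+(.+?)\s+FROM\s+(\w+)"i, s)
    m === nothing && error("unsupported query: $s")
    cols = String.(strip.(split(m[1], ",")))
    tbl = String(m[2])
    :(SQLQuery($cols, $tbl))
end

query = sql"SELECT emp_id, last_name FROM employee_tbl"
```

Even this toy version needs a regular expression and error handling, which is the cost being weighed against infix macros in this thread.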


I really don't see the need for a decision tag.
This has no implementation as a PR, nor any usable prototype
that lets people test it out.
Contrast to https://github.com/JuliaLang/julia/issues/5571#issuecomment-205754539 with its 8 usable prototypes

My feelings toward this go up and down every time I read the thread. I don't think I'll really know until I try it. And right now I don't even know what I would use it for. (Unlike some of the definitions for |> and <| which I have used in F#)

SQL-like syntax, the right tool for the job is Nonstandard String Literals

Whether or not SQL is best done with NSLs, I think there is a level of DSL that is complex enough that inline macros would be very helpful, but not so complex that it's worth writing your own microparser/tokenizer.

right now I don't even know what I would use it for. (Unlike some of the definitions for |> and <| which I have used in F#)

The inline macro proposal would enable people to, among other things, roll their own |>-like or <|-like macros, so you could use it for whatever you've done in F#.

(I don't want to get into back-and-forth bikeshedding arguments, but I was responding anyway because of the below, and I do think that the inline-macro proposal kills multiple birds with one relatively-smooth stone.)

I really don't see the need for a decision tag.

I asked earlier if it was appropriate for me to create a parser patch, and nobody answered. The only word on that so far is:

I don't think there's going to be much in the way of "syntactic innovation" before Julia 1.0.

Which would seem to argue against making a patch now, as it might just sit around and bit-rot. However, now you're saying that it's not worth making a decision on this (including the decision not to decide right now?) unless we have an "implementation as a PR [or] usable prototype".

What does that mean? (What is a PR?) Would a macro that used the character '@' instead of the token @ do the job, so that @testinline a '@'f b=>@f(a, b)? Or should I submit a patch to julia-parser.scm? (I've actually begun initial looking at writing such a patch, and it looks as if it should be simple, but my Scheme is very rusty.) Do I need to create test cases?

Right now, there are 13 participants in this bug. There are a total of 5 people who have voted on one or more of the proposals and/or downvoted the bug itself, and only one of those (me) did so after the inline macro proposal was on the table. That doesn't make me confident that it's time for prototyping yet. When the number of people who have voted since the last serious proposal is more like half the number of participants, I hope some kind of rough consensus will be becoming clear, and then it will be time for prototyping and testing and deciding (or, as the case may be, giving up on the idea).

By "implementation as a PR [or] usable prototype".
I mean something that can be played with.
So it can be seen how it feels in practice.

A PR is a pull request, so a patch is the term you've been using.

If you made a PR it could be downloaded and tested.
More simply though, if you implemented it with macros
or nonstandard string literals,
it could be tested without having to build julia.

Like, it ain't my call, but I doubt I'll be able to make up my own opinion without something I can play with.

Also +1 to not going back and forth bikeshedding.

...or maybe an Infix.jl package with macros and nonstandard string literals.

We have definitely reached the "working code or GTFO" point in this conversation.

OK, here's working code then: https://github.com/jamesonquinn/JuliaParser.jl

ETA: Should I reference a specific commit, or is the above link to the latest master OK?

...

(That does not have any of the convenience macros I'd expect you'd want, such as the equivalents for |>, <|, ~, and the @defineinfix from my example above. Nor does it deprecate the now-useless special-case logic for ~ or the |> operator. It's just the parser changes to get it working. I've tested basic functionality but not all corner cases.)

...

I think that the current ugly hack with ~ shows that there's a clear use case for this kind of thing. Using this patch, you'd say @~ when you needed macro behavior; much cleaner, with no special case. Or does anyone seriously believe that ~ is utterly unique and nobody will ever want to do that again?

Note that the patch (it's not a PR yet because it targets the native bootstrapped parser, but for now the scheme one should come first in terms of PRs) is more generally useful than the issue name here. The issue name is "custom infix operators"; the patch gives infix macros, with infix operators only coming as a side effect of that.

The patch as it stands is not a breaking change, but I expect that if this became the plan the next step would be to deprecate the currently-existing ~ and |>, which would eventually lead to breaking changes.

...

Some simple tests added.

#11608 was closed with a pretty clear consensus that many of us do not want infix macros and the one current case of ~ parsing was a mistake (made early on for R compatibility and no other especially good reason). We intend to deprecate and eventually get rid of it, just haven't done it (along with the work of modifying the API for the formula interface in JuliaStats packages) yet.

Macros are now technically generic, but their input arguments are always Expr, Symbol, or literals. So they aren't really extensible to new types defined in packages the way functions (infix or otherwise) are. Possible use cases for infix macros are better served by prefix-annotated macro DSL's or string literals.

(Sorry I posted prematurely; fixed now.)

In #11608, I see several negative arguments:

===

What would the following transform into?
...
y = 0.0 @in@ x == 1.0 ? 1 @in@ 2 : 3 @in@ 4

This was dealt with in the thread:

Cases like that are why I always use parenthesis...

and

same precedent ... apply without being macros: 0.0 in 1 == 1.0 ? 2 in 2 : 3 in 4

===

more functionality to Julia that people have to implement, maintain, test, learn to use, etc.

which is (partially) answered (and seconded) here by:

"headaches for parser developers" is the lowest possible concern.

===

is there no way for 2 packages to simultaneously have definitions for the same macro-operator that could be used together unambiguously in a single user code base?

This is an interesting point. Obviously, if the macro just calls a function, then we have all the dispatch power of the function. But if it is a true macro, as with ~, then it's more complicated. Yes, you could imagine hackish workarounds, like attempting to call it as a function, and catching any errors to use it as a macro... but that kind of ugliness should not be encouraged.

Still, this is just as much of an issue for any macro. If two packages both export a macro, you simply can't have both with "using".

Is this likely to be more of a problem with infix macros? Well, it depends what people end up using them for:

  • Just a way to have user-defined infix functions. In that case, they're no worse than any other function; dispatch works fine.
  • As a way to use other programming styles, using operators like the |> and <| that @Glen-O discusses above. In that case, I think there will quickly develop common conventions about what macro means what, with little chance of collision.
  • As a way to make special-purpose DSLs, like the SQL example above. I think these will be used in specific contexts and the chance of collision is not too bad.
  • For things like R's ~. At first, this looks the most problematic; in R, ~ is used for several different things. However, I think that even there, it's manageable, with something like:

macro ~(a,b) :(~(:$a, quote($b))) end

Then, the function ~ could dispatch based on the type of the LHS, but the RHS would always be an Expr. This kind of thing would allow the principal uses it has in R (regression and graphing) to coexist, that is, to dispatch correctly despite coming from different packages.

(note: the above has been edited. Initially, I thought that an R expression like a ~ b + c used the binding of b and c through R's lazy evaluation. But it doesn't; b and c are the names of columns in a data frame passed explicitly, not names of variables in local scope that are thus passed in implicitly.)

===

The only way forward here would be to develop an actual implementation.

Which I have done.

===

Macros are now technically generic, but their input arguments are always Expr, Symbol, or literals. So they aren't really extensible to new types defined in packages the way functions (infix or otherwise) are.

This relates to the point above. Insofar as an infix macro calls a specific function, that function is still extensible through dispatch in the normal way. Insofar as it doesn't call a specific function, it is doing something structural/syntactic (such as what |> does now) that should not be extended or redefined. Note that even if it calls a function, the fact that it is a macro can still be useful; for instance, it can quote some of its arguments, or process them into callbacks, or even interact simultaneously with the name and the binding of a variable, in a way that a direct function call cannot.

===

Possible use cases for infix macros are better served by prefix-annotated macro DSL's or string literals.

As was pointed out in the referenced thread:

[Infix is] easier to parse (for English and most western speakers), because our language works that way. (The same thing generally holds for operators.)

For example, which is more readable (and writeable):

select((:emp_id, :last_name) @from employee_tbl @where city == 'NYC' @orderby :emp_id)

or

send(orderby((@where selectfrom((:emp_id, :last_name), employee_tbl) city == 'NYC'), :emp_id))

?

===

Finally:

#11608 was closed with a pretty clear consensus

Looks pretty evenly split to me, with "who's gonna do the work" casting the deciding vote. Which is now at least partly moot; I've done the work in JuliaParser and I'd be willing to do it in Scheme if people like this idea.

This is my last post in this thread, unless there's positive reaction to my hacked juliaparser. It is not my intention to impose my will; just to present my point of view.

I'm arguing in favor of infix macros (a @m b=>@m a b). That doesn't mean I'm not aware of the arguments against. Here's how I'd summarize the best argument against:

Language features start at -100. What do infix macros offer that could possibly overcome that? By their very nature, there is nothing you could accomplish with infix macros that couldn't be accomplished with prefix macros.

My response is: Julia is first of all a language for STEM programmers. Mathematicians, engineers, statisticians, physicists, biologists, machine learning people, chemists, econometricians... And one thing that I think most of those people realize is the usefulness of a good notation. To take an example I'm familiar with in statistics: adding independent random variables is equivalent to convolving PDFs, or even to convolving derivatives of CDFs, but often expressing something using the former can be an order of magnitude more concise and understandable than the latter.

Infix versus prefix versus postfix is, to some degree, a matter of taste. But there are also objective reasons to prefer infix in many cases. Whereas prefix and postfix lead to indigestible precipitates of back-to-back operators like the ones that make Forth programmers sound like German politicians, or the ones that make Lisp programmers sound like a Chomskian caricature, infix puts the operators in what's often the cognitively most natural place, as near to all their operands as possible. There's a reason nobody writes math papers in Forth, and why even German mathematicians use infix operators when writing equations.

Yes, infix macros could be used to write obfuscated code. But existing prefix macros are just as prone to abuse. If not abused, infix macros can lead to much clearer code.

  • (a+b @choose b) beats binomial(a+b,b);
  • score ~ age + treatment beats linearDependency(:score, :(age + treatment));
  • domSelect("#logo") @| css "color" "red" @| fadeIn "slow" @thenApply addClass "dummy" beats the holy hell out of addOneTimeEventListener(fadeIn(css(domSelect("#logo"),"color","red"),"slow"),"done",(obj,evt)->addClass(obj,"dummy")).

I realize that these are just toy examples but I think the principle is valid.

Could the above be done with nonstandard string literals? Well, the second and third examples would work as NSLs. But the problem with NSLs is that they give you too much freedom: unless you're familiar with the particular grammar, there's no way to be sure even what the tokens of an NSL are, let alone its order of operations. With infix macros, you have enough freedom to do all of the above examples, but not so much that it isn't clear on reading the "good" code what the tokens are and where the implied parentheses go.

The it needs certain things to be moved from unknown unknowns to known unknowns. And unfortunately, there is not a mechanism to do this. Your arguments need a structure which does not exist.

Now that <| is right-associative (#24153), does the initial a |>op<| b proposal work?

I have made a package for the hack mentioned by Steven in https://github.com/JuliaLang/julia/pull/24404#issuecomment-341570934:

I'm not sure how many potential infix operators this affects, but I'd really like to use <~. The parser won't cooperate -- even if I space things carefully, it wants a <~ b to mean a < (~b).

<- has a similar problem.

Sorry if this is already covered by this or another issue, but I couldn't find it.

We could potentially require spaces in a < ~b; we've added rules like that before. Then we could add <- and <~ as infix operators.

Thanks @JeffBezanson, that would be great! Would this be a special case, or a more general rule? I'm sure there are some details in what the rule should be to allow more infix operators, give clear and predictable code, and break as little as possible existing code. Anyway, I appreciate the help and the quick response. Happy new year!

If a <~ b is going to mean something different from a < ~b, I would like a =+ 1 to be an error (or at least a warning).

I know this is quite an old discussion, and the question asked was asked quite some time ago, but I thought it was worth answering:

Now that <| is right-associative (#24153), does the initial a |>op<| b proposal work?

No, unfortunately, |> still gets the precedence. The update done makes it so that, if you define <|(a,b)=a(b), then you can successfully do a<|b<|c to obtain a(b(c))... but this is a different concept.
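
For concreteness, the right-associative chaining described above can be sketched as follows. This assumes a Julia version that includes the change from #24153; `double` and `inc` are illustrative names.

```julia
# `<|` parses as an infix operator; here we give it the
# "apply function to argument" meaning mentioned above.
<|(f, x) = f(x)

double(x) = 2x
inc(x) = x + 1

# Right-associative: parsed as double <| (inc <| 3), i.e. double(inc(3)).
chained = double <| inc <| 3
```

With left-to-right grouping the same expression would try to call `double(inc)` first and fail, which is why the associativity change matters for this idiom.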

Frozen for 2 years, then a comment and a commit 2 and 5 days ago!

See Document customizable binary operators f45b6be
