Julia: julep: a plan for backticks

Created on 14 Jul 2015 · 68 Comments · Source: JuliaLang/julia

It has not infrequently seemed a shame to me that Markdown has trained us all to quote code by wrapping it in backticks. `a + b` is how we want to write the quoted expression a + b. That's considerably nicer than :(a + b) (the frownieface operator is just kind of weird), and the current syntax has some issues, since the parens are actually part of the expression being quoted, not the quotation syntax; this has tripped quite a few people up.

Currently, backticks are used for quoting external commands using a convenient shell-like syntax. You don't want to use single or double quotes for this since it's quite common to want to use those quote characters in command expressions. But there's one bit of syntax we haven't exploited yet: backtick custom-literal strings. (This option only just occurred to me the other week.) So I would propose the following syntaxes:

  1. Use bare backticks to quasiquote Julia code: `a + b + $ex`. The dollar sign splices expressions into the quoted code as it does inside of :(...) currently.
  2. Use cmd-prefixed backticks to write commands: pipe(cmd`find -name *.$ext`, cmd`head -n$n`). The dollar sign splices values into commands as it does into backticks currently.
  3. Use colon for symbol literals, allowing double quotes to write symbols that aren't valid identifiers: e.g. :foo for symbol("foo"), :"foo bar" for symbol("foo bar") or :123 for symbol("123").
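
For comparison, here is a minimal sketch of what the existing quoting syntax already does, which is what items 1 and 3 would respell. It assumes a recent Julia release, where the constructor is spelled `Symbol` rather than the `symbol` used in this thread; the variable names are only illustrative.

```julia
# Existing syntax: :( ... ) quasiquotes and $ splices an expression in.
ex = :(c * d)
quoted = :(a + b + $ex)

# A chained + parses as one call with three operands.
@assert quoted isa Expr
@assert quoted.head == :call
@assert quoted.args == [:+, :a, :b, :(c * d)]

# Symbols that are not valid identifiers currently need the constructor.
odd = Symbol("foo bar")
@assert odd isa Symbol
@assert string(odd) == "foo bar"
@assert :foo === Symbol("foo")
```

Under the proposal, the first expression would be written with bare backticks and the odd symbol with a colon followed by a double-quoted string.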

Using backticks for quasiquoting has the advantage that it's what lisp does. Getting to this point without breaking everything will require a substantial deprecation process:

  • [ ] Introduce custom backtick literals – foo`...` – and allow people to use those for a while.
  • [ ] Introduce cmd`find -name *.$ext` as a syntax for external commands.
  • [ ] Deprecate bare backticks for commands.
  • [ ] Wait a release cycle to let the deprecation "take".
  • [ ] Change `a + b` to mean quasiquotation – this breaks code using `...` for commands.
  • [ ] Deprecate :(a + b) as quasiquotation.
  • [ ] Wait a release cycle to let the deprecation set in.
  • [ ] Disable :(a + b) for quasiquotation, enable :"foo bar" for non-identifier symbols.

That's a long process, but I think it's a better use of backticks. It has the advantages of matching how we write quoted code in Markdown and how most Lisps use backtick for quasiquotation (in Lisp style just at the front, of course), so I think it will be more familiar to Lispers.

design julep speculative

All 68 comments

-1 from me. Although I like the proposal, I don't think what we gain here is worth the massive breakage.

If we're going to change the command syntax, why not have cmd be an ordinary custom string literal? It's not clear to me what we'd gain from backtick custom literals besides confusion and extra special cases in the code. We can always use cmd"""x""" if there are quotes, but I'm not sure there are many cases where you can't use single quotes in the command. (In fact I'm not sure there are many cases where you actually want quotes at all if you have interpolation.)

Ref #9945 for the proposed symbol changes.

I agree with @simonster: cmd"""xxx""" or cmd"xxx" instead of backticks for commands.
I don't think backticks are used for commands _that_ frequently. In base, I found 9 files that used them, plus 8 files in the pkg directory, and a lot more places where backticks were part of documentation.
In packages, it seems that most places where backticks are used as commands are literal arguments to run, so for those cases, why can't run also accept a normal string, and treat it as if it were in backticks?
Add cmd"..." and cmd"""...""", along with run("...") and run("""..."""), deprecate backticks for commands at the same time, and _then_ think about using backticks for other things.

> I'm not sure there are many cases where you can't use single quotes in the command.

The cmd shell on Windows doesn't handle single quotes the same way a posix shell does, as one case.

cmd".." is also easy to add to Compat, no need for parser changes.
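
To illustrate why no parser change is needed for a prefixed double-quoted form, here is a rough sketch of a non-standard string literal. It is a hypothetical, deliberately naive definition (it only splits on whitespace and performs no interpolation or shell-style quoting), not the implementation anyone in the thread proposed.

```julia
# Hypothetical macro: Julia calls @cmd_str for any literal written cmd"...".
# This naive version splits on whitespace only; real command parsing
# would also need quoting rules and $-interpolation.
macro cmd_str(s)
    words = String.(split(s))
    return :(Cmd($words))
end

c = cmd"echo hello world"
@assert c isa Cmd
@assert c.exec == ["echo", "hello", "world"]
```

Because the macro receives the raw string, any interpolation behavior would have to be reimplemented inside it, which is part of what the later comments debate.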

-1 to run et al accepting plain strings, because interpolation works differently in strings and in commands (e.g. for arrays; see http://julialang.org/blog/2013/04/put-this-in-your-pipe/).

+1 to this proposal, and the end goal.
Beyond making quoting much more readable, it also introduces a distinction between quoted symbols and bare ones, which I seem to recall is needed to improve macro hygiene.

One thing that would be lost though: it would no longer be possible to nest quasiquotes lexically. But I think the needs of that would be infrequent enough that you could easily work around it.

This seems like two independent issues.

  1. Enable backtick custom string literals, and how they should work.
  2. Reclaim unprefixed backtick literals for a more widely used purpose.

Previously we had @*_str and @*_mstr macros, but they were merged when the deindentation function for triple-quoted strings was moved to the parser. Should prefixed backtick quoting be just another string literal that calls @*_str, with different parser behavior with regard to escaping, or do we want a different concept?

2 will be a long process with deprecation periods to allow people to migrate to the new solution, so there is no hurry deciding what we will use the syntax for.

Would triple back-ticks replace quote-blocks?

@hayd Pardon my ignorance, could you give an example of why run("string") could not be treated as the equivalent of run(`string`)? I read the link, but couldn't see just where that said or implied that that wouldn't work. Thanks.

@toivoh Example of nested quasiquotes please? Thanks.

+1, this seems like a nice improvement. If nothing else I'm not going to be able to unsee the frownyface operator now.

@toivoh I don't think we'd necessarily lose that ability. We can already nest strings as e.g. "foo $("bar") baz", and the parser realises that " doesn't end the string because it's inside an expression. `foo(`bar`)` could work in the exact same way.

@ScottPJones Command interpolation works differently. For instance, assume we had a program nargs which returns argc, and let arg = "one two". run("nargs $arg") would return 3, but run(`nargs $arg`) would return 2.
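
The difference can be checked without running anything, since building a Cmd does not execute it. This is a small sketch using the same hypothetical nargs program; the `exec` field holds the argument words the command would be launched with.

```julia
arg = "one two"

# String interpolation: the space inside arg becomes a word boundary
# once the string is split into words.
as_string = "nargs $arg"
@assert length(split(as_string)) == 3

# Command interpolation: arg stays a single argument.
as_cmd = `nargs $arg`
@assert as_cmd.exec == ["nargs", "one two"]

# Arrays splice in as one argument per element.
files = ["a.txt", "b.txt"]
listing = `ls $files`
@assert listing.exec == ["ls", "a.txt", "b.txt"]
```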

+1 for the @StefanKarpinski proposals 1 and 3 (backticks become expressions, colons are symbols), and also the @mauro3 suggestion to use triple back-ticks to replace quote/end blocks. Also agree that cmd"..." and cmd"""...""" are sufficient and don't require special back-tick notation. I was burned yesterday by the subtlety of the frownyface notation, and I think there should be a clear distinction between expressions and symbols. As @ScottPJones pointed out, there are very few current uses of back-ticks so I say just go ahead and break stuff.

@pao Thanks. That's a rather subtle difference I would think.
With the backtick quoting, what would one do if they want things split up like in the first example (with " quotes)?

@ScottPJones, please read http://julialang.org/blog/2012/03/shelling-out-sucks/ and http://julialang.org/blog/2013/04/put-this-in-your-pipe/ for more background on why Julia's backticks exist, work the way they do, and are important for calling external programs reliably. I went through it there in a great bit of detail with lots of examples. No point in rehashing that unnecessarily.

@one-more-minute wrote:

> @toivoh I don't think we'd necessarily lose that ability. We can already nest strings as e.g. "foo $("bar") baz", and the parser realises that " doesn't end the string because it's inside an expression. `foo(`bar`)` could work in the exact same way.

Good point. Since you can parenthesize expressions you could always write `(...)` for nested quasiquotation. I also like the idea of ```...``` for quote ... end; again, it fits nicely with how triple backticks are used in Markdown.

OK, I did read the second one, that @hayd mentioned, I'll read the other one. Thanks.

It is not a very strong argument that just because the cmd syntax is not used often in Base, it can just be freely deprecated without too much impact. Of course it is not used in Base, you should make minimal assumptions about your environment if you want to be cross platform. Command syntax is used often in data processing pipelines. It is often faster to call unix functionality through shelling out than to use Julia code to munge your data as the unix utilities are currently much faster.

The real deprecation here is not with the cmd syntax but with expression quoting (which is used everywhere). I agree that the backtick is marginally nicer syntax, but is it _that_ much nicer to go through all this code churn? @tbreloff you say that the current syntax is subtle, could you give a concrete example?

At the current release rate, this proposal would have us adapting to deprecations and rewriting code for ~2 years. To go through and fix packages is a lot of effort for often little gain. I'm just raising the red flag that we should actually be gaining something tangible from this proposal (other than it is more aesthetically pleasing) before committing to it.

@ivarne wrote:

> Previously we had @*_str and @*_mstr macros, but they were merged when the deindentation function for triple-quoted strings was moved to the parser. Should prefixed backtick quoting be just another string literal that calls @*_str, with different parser behavior with regard to escaping, or do we want a different concept?

Ah, I see you beat me to this proposal.

@jakebolewski, this is a good point, but I do think that aesthetics matter and this is something that will be in the language forever. I don't really want to live with the frowneyface operator forever, especially when there's this other _much_ nicer syntax so tantalizingly close.

regarding subtlety... here's a few quick examples which are non-obvious with a quick glance (for non-expert users anyways):

julia> x = :(); typeof(x)
Expr

julia> x = :(x); typeof(x)
Symbol

julia> x = :(+); typeof(x)
Symbol

julia> x = :(+5); typeof(x)
Int64

julia> x = :(+(5)); typeof(x)
Expr

I feel like it would be much clearer to see:

``   # equivalent to :()
:x
:+
`+5`
`+(5)`

Aesthetics matter a ton. I want to be able to scan code in 1-2 seconds to understand what it's doing. I don't want to spend my time looking for matching parens and reasoning about what something means in context. This is doubly valuable if I can add logic to my syntax highlighter that clearly identifies expressions in the code. I can't easily do that if symbols and expressions share syntax.

:(x) == :x is particularly fiddly, because it means that you can't reliably return quasiquote syntax from macros, and instead have to use :($(Expr(:quote, x))) everywhere (linking back to the nested-quasiquotes issue). Making a distinction between symbols and quoted expressions makes a ton of sense.
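
A short sketch of the behavior being described, assuming a recent Julia release. `Meta.quot` is the standard helper that builds `Expr(:quote, ...)`; the variable names are only illustrative.

```julia
# Quoting a lone identifier yields a Symbol, not an Expr.
@assert :(x) === :x
@assert :(x) isa Symbol
@assert :(x + 1) isa Expr

# To get code that evaluates to a symbol, wrap it in an explicit quote node.
x = :y
wrapped = :($(Expr(:quote, x)))
@assert wrapped isa Expr
@assert wrapped.head == :quote
@assert wrapped == Meta.quot(x)

# Evaluating the wrapped form gives back the symbol itself.
@assert eval(wrapped) === :y
```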

@tbreloff, @one-more-minute wouldn't using explicit quote ... end blocks solve most of the points you raise (except +5, which is transformed in the parser)?

Possibly, although wrapping things in a redundant Expr(:block) isn't always convenient either.

@jakebolewski yes you can obviously get around these problems, but quote ... end adds its own confusion and messiness.

Julia is still 0.4 (dev)... if there are good solutions to making the language easy to understand/read, we should do it.

@tbreloff what is the confusion and messiness with quote ... end blocks? Block syntax is fundamental to Julia.

Users who are manipulating quoted expressions have entered "sufficiently advanced user territory". We don't even commit to having a stable Expr AST representation.

I think there'd be a certain elegance to have `…` always wrap its contained expression in an Expr(:quote, …) Expr, akin to how quote … end is always Expr(:block, …) (and that could become ```…```). Then you'd no longer need to worry about potentially getting AST literals back, either.

(Edited, thanks @toivoh)

Thinking about the commonalities between quasi-quotation and command line syntax, they're both some sort of executable string with syntax. Perhaps the custom foo`…` literals should be encouraged for writing DSLs or other interop like sql`…`. With that in mind, should there be any differences in the parsing or macro name between foo"…" and bar`…`?

+1 to eventually using `…` as default quoting syntax. I also think that having the shell> mode lessens some of the impact here since that's, at least for me, the most common use of shelling out in Julia.

@mbauman brings up a good point. Maybe the convention going forward is foo".." _string_ literals return objects, while foo`...` _backtick_ literals actually call a method of some kind, i.e. execution.

quote … end is not Expr(:quote, Expr(:block, …)), just Expr(:block, …), and I think it should stay that way. When you are working with an AST, sometimes you want to quote it, but more often not I would say (and it's easier to add the quotation than to remove it).
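
The correction can be confirmed directly. This sketch assumes a recent Julia release, where the block also carries `LineNumberNode` entries recording source positions; those are filtered out before comparing.

```julia
blk = quote
    x
end

# quote ... end produces a plain :block expression, not a :quote wrapper.
@assert blk isa Expr
@assert blk.head == :block

# Ignoring line-number bookkeeping, the block holds just the symbol x.
body = filter(a -> !(a isa LineNumberNode), blk.args)
@assert body == [:x]

# Adding the quotation afterwards is a one-liner.
quoted_blk = Meta.quot(blk)
@assert quoted_blk.head == :quote
@assert quoted_blk.args[1] === blk
```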

Of course, sorry for the misinformation and thanks for the correction. I was thinking one level too deep (:(:(…)) and :(quote end)). I had initially written that it wouldn't be possible to always return Expr without bigger changes, but in playing around at the REPL I was quoting quotes and got excited.

> With that in mind, should there be any differences in the parsing or macro name between foo"…" and bar`…`?

Maybe the backtick macros always get file and line number information in a second argument? (Cf. #9577 and #9579)

@jakebolewski wrote:

> Users who are manipulating quoted expressions have entered "sufficiently advanced user territory". We don't even commit to having a stable Expr AST representation.

That's no excuse. We will at some point have to commit to a stable AST representation. The fact that we have not at this point is merely an artifact of being pre-1.0 and AST manipulation being a relatively niche thing.

@mbauman wrote:

> Thinking about the commonalities between quasi-quotation and command line syntax, they're both some sort of executable string with syntax. Perhaps the custom foo`…` literals should be encouraged for writing DSLs or other interop like sql`…`. With that in mind, should there be any differences in the parsing or macro name between foo"…" and bar`…`?

Yes, this is precisely what I had in mind: backticks become a general way of quoting code.

> Maybe the backtick macros always get file and line number information in a second argument? (Cf. #9577 and #9579)

That's an excellent idea. I quite like it.

> AST manipulation being a relatively niche thing.

That was my point. I just don't see the argument that the quote ... end syntax having a redundant Expr(:block) is really that big a deal for users who manipulate quoted syntax.

It's certainly usable as it is, but it's not really that great. The unification of backticks as how one quotes code in general is a very nice generalization. We use it for that currently, but with the wrong default – the default kind of code should be Julia code, not external commands.

> We can already nest strings as e.g. "foo $("bar") baz", and the parser realises that " doesn't end the string because it's inside an expression. `foo(`bar`)` could work in the exact same way.

How so? The only reason the former nesting quotes inside interpolation works is because the parser has special handling for string interpolation. Are you proposing special parser handling of ( inside backticks?

There may be a better use for backticks than what we have now (though shell mode really isn't a substitute when you're writing non-interactive scripts), but I'm with @jakebolewski here - I'm not sure Julia Expr quoting is that much of a better use for them, it's a lot of churn, and there are downsides with nesting and making Cmd objects not work as well.

I like the aesthetics of the backtick for quoting expressions.

+1 from me. The frownyface operator has caused me sadness (and hours of debugging) in the past.

@tkelman, it would work pretty much the same: if the quoted expression is complete when you encounter a ` then it closes the quoted expression; otherwise you try parsing it as opening an inner quoted expression. That means that `foo(`bar`)` would work since `foo(` is incomplete.

The key is that the parsing of julia quasi-quotes and custom backtick quotes will be necessarily different.

I don't think that just changing the spelling of Julia quotes from :(…) to `…` is really that meaningful. Sure, it makes :… always return a symbol, which is nice, but `$x` will still sometimes return AST literals. And there will still be crazy edge cases in the disambiguation between Colon(), ranges, ternary ?: syntax, and symbols.

It's the standardization of interop code blocks that makes this worthwhile in my view. I really like the unification of quoted code and code blocks for cmd`…`, cxx`…`, sql`…`, etc. +1 from me.

One could even write

```cxx
int foo() ...
```

as an equivalent to cxx`...` to really unify things with Markdown, although that's a different bikeshed.

> special parser handling of ( inside backticks

You could call it special handling if you want, but it's really no different to the way that :( doesn't always stop at the first ), or quote doesn't stop at the first end.

Well it's a pretty major change to propose applying the Julia parsing rules inside backticks, which we certainly don't do inside strings or cmd objects right now (other than inside interpolation). That seems like it could make backticks less useful than what's being proposed for other-format prefixed overloads.

It would only apply to bare backticks which are specifically for quoting Julia code. This is exactly the same as how bare double quotes allow interpolation of expressions that use double quotes.

Except that backtick "interpolation" would be automatic, silent, and default rather than set off by $(

@tkelman, I don't really get your issue with this – it is exactly analogous to string interpolation.

No, I still think that's an imprecise analogy - it works the way code inside an interpolation (or existing quoting) works, not the way interpolation inside strings works. Interpolation is its own parsed context within the string. You're proposing making backticks a parsed context, except when prefixed by a formatting macro? Seems maybe useful, but not a dramatic improvement. The funny lowering of custom string literals is already kind of hidden and confusing, now we're going to add another version of it?

Or to take this another step, if we're going to do this, why not apply the exact same treatment to single quotes while we're at it. I'm sure there's a better use for them than chars, we can just use char'a' for that. (not sure if joking)

That is quite nice. `1 + 1` is quasi-quoted (Julia) code, cmd`echo -e "\033[2J"` is code (a command with args in the execvp sense), and bash`echo -e "\E[2J"` is some specific shell code, etc.

> You're proposing making backticks a parsed context, except when prefixed by a formatting macro?

Isn't the point that you _could_ define parsing rules on prefixed backticks? e.g. cmd/cxx/sql.

The conventions between parsing and executing are a bit unclear: sql/cxx execute on construction (IIUC), cmd/:( don't and need to be run/eval'd.

Yes, there's a bit of a distinction to be made between constructs that construct code versus evaluate code. It might be worth making it more uniform, but it's unclear why one would want to construct but not evaluate SQL or C++ code objects.

If we really are going to change the syntax, I think we should start deprecation process in this release so that at least we can write non-deprecated code going forward in the 0.5 release cycle.

This would cost us the chance to get custom infix operators. Backticks are the only nice way to achieve custom infix operators that I'm aware of, and if we decide to use this syntax for quasi-quotation we will never get that.

I don't see why backticks are a desirable choice for infix operator syntax.

Because they are a minimal piece of syntax that we can put surrounding a custom operator:
a `operator` b is much nicer than a $operator$ b or something like that.

Unless we can define a symbol to be an infix operator without surrounding it with syntax, say:

@infix :operator

Can we not worry about infix operators when we're getting close to a release and need to get things moving sooner rather than later?

Got it

I've never liked backticks for infix operators, so I'm not too worried about that.

Yeah, I used to think having generalized infix operators was super important, but over time, it's become less and less important IMO. I think the natural Julia style with multiple dispatch generalizes to more than just two arguments, so infix suddenly isn't as important. Using backticks for quoting code is much more natural.

x f y is currently a syntax error outside of array and macro context, which would be more appealing, especially since it generalizes the x in y special syntax. To me the main question is how (if) one would generalize it to more arguments, associativity, etc.

Back on topic, I think the order of Stefan's checklist is right. We first need to allow and implement foo`...`. How we do that is still in discussion (line numbers? conventions for construction vs execution?) and a bit of work. I'm not sure it's worth delaying 0.4 more for this.

Leaving huge breaking syntax changes up in the air doesn't really help anyone. If this is going to change it should happen sooner rather than later.

I personally feel that we should put this on the backburner for now, and discuss earlier in the 0.5 cycle.

I agree with @ViralBShah ; at this point I don't want a single additional thing to worry about for 0.4. The "do it sooner rather than later" argument has merit, but can only go so far. Otherwise as soon as somebody raises a breaking syntax idea, we have to delay whatever release we're working on until it's settled. It's sort of a breaking-change-filibuster.

I like backticks for code quoting, but I'm worried about the nesting behavior. It vaguely reminds me of the syntax of prefix $ before we made it parse just one atom. One was never sure exactly how much syntax it would eat. People may end up preferring to disambiguate by writing `( ... )`, but then we're in a similar situation as we are with :(). See for example #11611. So we might want to make parens part of the syntax from the beginning, i.e. the open backquote is backtick-open-paren, and the close backquote is close-paren-backquote.

While I'm still in favor of this, I agree that jumping into it too fast is a bad idea. This is post 0.4.

I think I'm in the vast minority here, but I actually prefer the frowny face operator; having :(...) construct an Expr seems a natural extension of having :x construct a Symbol. If anything, I'd prefer something like :{...} as a replacement for frowns over backticks, or even go full on Lisp and use '(...).

I also think that the triple backticks for block syntax is much less clear than quote/end, and is less consistent with how we do blocks in other contexts (e.g. do, for, begin). It bears resemblance to the multiline string syntax, but multiline strings contain arbitrary stuff whereas blocks contain code. I think it's safe to assume that quoted blocks more often contain code than arbitrary text.

I agree with Jake that it's a lot of ecosystem-wide breakage for, in my very humble opinion, marginal benefit.

I was (perhaps unnecessarily) a bit worried about how nesting expressions will work nicely, which is the nice thing about having them surrounded by brackets. Here's just an idea:

`I am a symbol with spaces and #hashes and $string-like interpolation`
`(I am an expression with $expr-like interpolation #comment)`

with triple backticks being more like the latter, I suppose? Having symbol makes it seem a bit more like it is just another type of string (an interned string - though I'm not sure whether or not to add _all_ the String-like methods to make it so...)

I don't think we're going to do this. The problem with backticks in place of :( ) is that it doesn't nest, which is very annoying.
