Nim: new parsing rules of do: notation cause problems

Created on 18 May 2017 · 25 Comments · Source: nim-lang/Nim

All 25 comments

Yes, the rules have changed as explained here:
https://github.com/nim-lang/Nim/issues/5733

It's a bit unfortunate that quote do is indeed such a common form that ended up broken. You can fix the code by surrounding the quote in parentheses.
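A minimal sketch of that workaround, assuming a hypothetical macro named answerParen (not taken from the thread). The whole quote do: block sits inside the parentheses of the add call, so it is passed to add as a single argument:

```nim
import macros

macro answerParen(): untyped =
  # Hypothetical macro, for illustration only.
  result = newStmtList()
  # The parentheses make the quoted block one argument of `add`.
  result.add(quote do:
    40 + 2
  )

echo answerParen()
```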

Well, I was looking for a long time for something that does not require me to surround quote in parentheses. And I was very happy I had something. I was, until now. I just want to mention, I don't like this change at all.

I just mentioned the problems in one project. I have another private project that is equally filled with errors. Please revert this change.

I don't think this breaking change is justified. Try to parse the following:

  myProc arg1, arg2:   #C0-> myProc(a1, a2, stmt)
    doSomething
  myProc arg1 arg2:    #C1-> myProc(a1(a2), stmt)
    doSomething
  myProc arg1 do:      #C2-> myProc(a1, stmt)
    doSomething        #       WAS: myProc(a1(stmt))
  myProc(arg1)arg2:    #C3-> myProc(a1)(a2, stmt)
    doSomething
  myProc(arg1)do:      #C4-> myProc(a1, stmt)
    doSomething
  myProc(arg1 arg2:    #C5-> ObjConstr(myProc(a1(a2): ident))
    doSomething
    )
  myProc(arg1 do:      #C6-> myProc(a1(stmt))
    doSomething
    )
  myProc arg1 arg2 do: #C7-> myProc(a1(a2), stmt)
    doSomething        #       WAS: myProc(a1(a2(stmt)))
  myProc arg1,arg2 do: #C8-> myProc(a1, a2, stmt)
    doSomething        #       WAS: myProc(a1, a2(stmt))
  • To restore the previous behavior of the call C2, you need to use the relatively uglier C6.
  • The command call syntax of Nim evaluates from right to left. The previous behavior of C2 is more consistent.
  • The previous behavior of C7 and C8 is more consistent and understandable.

Well, I'd like to disagree. Whenever I need the do notation, I need to pass the result to something else. Naked statements with a do notation simply don't exist. Therefore I am now forced to put braces everywhere. I normally would not mind putting braces, but Nim is indentation based, and I think it's extremely ugly to have braces around indentation-based blocks everywhere.

As discussed in the other thread, the main rationale for the change is that the language should not treat differently statements like:

foo:
  ...

foo do:
  ....

bar(x):
   ...

bar x:
   ...

bar x do:
   ....

Doing otherwise seems inconsistent with the nkCommand syntax.

This feature is considered a mechanism to create control-flow and function-like abstractions:

consoleCommand "fov" do (angle: int):
    # here, we are implementing a quake-like console in a game
    ...

retry 3:
    # retry is a custom control-flow construct that tries to re-execute
    # a block specific number of times if an exception is thrown
    connection = connect(some_address, some_port)
    ...

undoRedoAction "Resize Shape" do:
   # perform the operation
   ...
do:
    # write the code that will be executed on undo
    ...

window.on "resize" do:
    # this form is useful when working with APIs intended for dynamic languages

I agree that procs like quote suffer from this change significantly. If you hate the parentheses, you can also assign the result to a variable, but perhaps we could use something like Haskell's $ operator:
http://stackoverflow.com/questions/940382/haskell-difference-between-dot-and-dollar-sign
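The variable-based alternative could look like the following sketch. The macro name answerVar is hypothetical; the point is only that quote do: stands alone on the right-hand side of let, so the question of which preceding argument it binds to never comes up:

```nim
import macros

macro answerVar(): untyped =
  # Hypothetical macro, for illustration only.
  result = newStmtList()
  # Bind the quoted AST to a name first ...
  let expr = quote do:
    40 + 2
  # ... then pass the name with plain command syntax.
  result.add expr

echo answerVar()
```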

I also once did the console command thing, and it works very well without the weird syntax of the do notation:
https://github.com/krux02/opengl-sandbox/blob/master/examples/console.nim#L95

interpreterProcs:
  proc command_add(arg1: int, arg2: int): void =
    let res = arg1 + arg2
    echo arg1, " + ", arg2, " = ", res

  proc command_add3(arg1,arg2,arg3: int): void =
    let res = arg1 + arg2 + arg3
    echo arg1, " + ", arg2, " + ", arg3, " = ", res

  proc command_mult(arg1,arg2: int): void =
    echo arg1, " * ", arg2, " = ", arg1 * arg2

Personally I think that's much nicer looking than what you suggested with the do notation. An alternative to this block could also be to just add a pragma to the exported functions.

I don't think the language needs the do notation at all. The only reason I used the do notation in the past was that it allowed me to add a block of code to an expression at the end of the line. Now you took this feature away and every instance of the do notation is parsed as a statement.

All the examples you show for the do notation here are examples where you basically pass anonymous functions to a macro/procedure. That's what λ-expressions are for. Not a weird language feature that only one language offers.

Here is what I came up with avoiding the do notation:

proc resize_shape(w,h: float): void =
  # perform the operation
  discard

proc undo_resize_shape(w,h: float): void =
  # write the code that will be executed on undo
  discard

macro undoRedoAction(name: static[string], action, undo: typed): typed =
  # do stuff to register this action
  discard

undoRedoAction("resize shape", resize_shape, undo_resize_shape)

import tables

type
  Window = object
    resizeHook : proc(window: Window, w,h: int): void
    stringNamedHooks : Table[string, proc(window: Window): void]

proc call(window: Window; name: string) =
  window.stringNamedHooks[name](window)

var window: Window

window.resizeHook =
  proc(window: Window, w,h: int): void =
    echo "resize ", w, " ", h

window.stringNamedHooks = {
  "resize": proc(window: Window): void {.closure.} =
                echo "window got resized"
  ,
  "foobar": proc(window: Window): void {.closure.} =
                echo "foobar"
}.toTable

window.call("resize")
window.call("foobar")

Well, you are happy to introduce variables and additional constructs to emulate my examples, but you deem it unacceptable to introduce a variable or a pair of parentheses in your case. That's not very fair. And then my primary argument remains about the command syntax, which you didn't address.

nkCommand notation requires a comma to separate arguments:

myProc arg1 arg2 arg3 do:
  doSomething
myProc arg1,arg2 arg3 do:
  doSomething
myProc arg1 arg2,arg3 do:
  doSomething
myProc arg1,arg2,arg3 do:
  doSomething

What do you expect these statements to do?

The previous behavior is more consistent.

Well, I introduced new variables because I think it makes sense to have symbols for hook functions. That allows them to be called even when the hook itself is not the trigger. I did it because I thought it is useful to have a symbol in this use case. But here λ-expressions do work very well:

undoRedoAction "resize shape",
  action:
    proc(x,y: float): void =
      echo "lambda resize"
  , # yes this comma is weird
  undo:
    proc(x,y: float): void =
      echo "lambda undo resize"

To talk about the command syntax: while I was writing this, @jxy made a very good point. I would like to add one more example:

myProc arg1, arg2, arg3, do:
  doSomething

If you want a more consistent syntax, please add a new syntax (this would be a syntax error with the current Nim compiler):

myProc arg1,arg2,arg3,do:
  doSomething

Instead of trying to use spaces to separate your arguments.

And @krux02 just beat me to it

+1 for comma. This would be the most versatile and consistent syntax.

@zah Is this enough feedback to revert this breaking change?

We'll have to wait for a decision by @Araq on this.

I've explained my arguments and I would still vote for the breaking change. Besides everything I've already said, we should keep in mind that this syntax exists in other languages such as Ruby and Boo and there it behaves like our current compiler. inb4 principle of least surprise.

@zah your arguments do not mention the comma syntax though. What do you think? Wouldn't it be more consistent? If we treat do as an expression, it would make perfect sense to delimit do with a comma in a command call:

# myCommand(arg, do)
myCommand arg, do():
  discard

# myCommand(arg(do))
myCommand arg do():
  discard

Regular call syntax still remains as it was:

myProc(arg) do():
  discard

@zah I was watching a conference talk about programming paradigms today (https://www.youtube.com/watch?v=Pg3UeB-5FdA) and thought of this conversation. It was mentioned that in the object-oriented programming paradigm the term object.foo(arg1,arg2) could in theory be seen as object.send("foo", arg1, arg2), because it is about sending messages. What I see is that you use the do notation to implement such a message receiver. I think it would be nicer if implementing a message handler that interfaces with a dynamically typed programming language did not need a different notation than implementing a simple procedure that is used on other objects. I updated my example here:

proc add(arg1: int; arg2: int): void {.genCommandFacade.} =
  ## adds two numbers
  let res = arg1 + arg2
  echo arg1, " + ", arg2, " = ", res

proc add3(arg1,arg2,arg3: int): void {.genCommandFacade.} =
  ## adds three numbers
  let res = arg1 + arg2 + arg3
  echo arg1, " + ", arg2, " + ", arg3, " = ", res

proc mult(arg1,arg2: int): void {.genCommandFacade.} =
  ## multiplies two numbers
  echo arg1, " * ", arg2, " = ", arg1 * arg2

proc ls(): void {.genCommandFacade.} =
  ## list all functions
  for name, _, comment in registeredCommands.items:
    echo name, "\t", comment

proc help(arg: string): void {.genCommandFacade.} =
  ## prints documentation of a single function
  for name, _, comment in registeredCommands.items:
    if name == arg:
      echo comment
      return
  echo "ERROR: no such function found"

It uses the pragma notation to run a macro that generates all the glue code. Therefore procedures that interface with an interpreter are not only simple to write, but their syntax is also indistinguishable from a regular procedure definition. I think that makes this syntax preferable to what you had as an example for the do notation:

window.on "resize" do:
    # this form is useful when working with APIs intended for dynamic languages

This is just an example that doesn't depend on anything else of the project. So you should be able to run it on the newest version of Nim.

I am sorry that I cannot really comment on the undo/redo syntax thing or give something better. All I can think of is that undo could probably be implemented much more easily with a state history and some diffing for compression (that is how it is implemented in Braid), and not some manual state reversal that could end up being hard to maintain. My point is simply that for a breaking change there should be some hard evidence that the change is worth it, and I don't see undo/redo as such an example.

I do agree that the do notation looks unnatural in Nim and people will always strive to use more natural syntax (like the aforementioned pragmas).

I do have some nefarious reasons to support my position though. Semantically, the do notation is preferable, because it doesn't require the introduction of additional helper macros and when closures are involved, the implementation of a proc like consoleCommand can be much more straight-forward.

The nefarious part is that I do plan to work on a dialect of Nim in the future, where regular procs and custom procs will look exactly the same when it comes to "naturalness". This grammar will have "blocks" as first-class citizens and there won't be keywords such as proc. Instead proc is a regular magic function that takes an identifier and a block and you can easily define similar functions yourself:

proc foo [a b]
   ...

async bar [a b]: int
   ...

consoleCommand "help"
   ...

items.each [item]
   ...

So, this is perhaps a lost cause, but I do encourage people to use the semantically-natural do notation instead of the syntactically-natural procs with attached pragmas.

Regarding the breaking change, the do notation was always intended to work like Ruby and Boo. It was an accident that we have broken this in a previous release and it did go unnoticed.

Well, I am neither a Ruby nor a Boo programmer, so I don't know how the do notation in such languages works. All I see is that the examples you gave to break the language feature that I used a lot are not really good examples of using the do notation, in my opinion. When you want to make your own dialect of Nim where you break the compatibility of the do notation, I am totally fine with whatever you do there, because it doesn't affect me. But this change here in Nim does affect me, and I don't like it. It means I not only have to replace a few occurrences in my code, it also means that I have to change a pattern that I got used to and started to really like, and replace it with something that is much less satisfying.

I actually thought about making a little tutorial about the neat trick that you can prepend any method in front of a quote do:. The problem with quote do: is that it only produces a StmtList, and nothing else. Whenever I need something different than a StmtList, I need to quote something that contains the node that I want, and then extract it. And the old syntax allowed me to just prepend the quote do: with the extractor proc. Also, in a majority of cases I want to append the result of a quote do: to a StmtList or some other node. These are all cases and patterns that I got used to, and I don't want to change my patterns, especially when I started to really like them.

This also breaks an example in my book: https://forums.manning.com/posts/list/41175.page#p115468

I can still modify it, but not for long. So, should I fix it?

I vote for "reverse" the parser change. I don't see the merit in this consistency and it breaks too much code. We can then release 0.17.2 and pretend it was just a bug... :-)

To be clear, the specific reversal proposal may be the following:

"Do binds strongly to the preceding param only when used with the command syntax"

foo bar do: body is parsed as foo(bar(body)).

This is in contrast with the following forms:

foo bar: body continues to be parsed as foo(bar, body).

foo(bar) do: body continues to be parsed as foo(bar, body)

Should the last two forms be the same? The old parser was throwing in an occasional nkLambda only in some of the situations where do is used.

Instead of param, I would call it symbol, but yes. Let my result.add quote do: and my result.add extractSymbol quote do: work again as they did before.

"Do binds strongly to the preceding param only when used with the command syntax"

Not sure I agree with this way of putting it but I agree with the examples you present.

Here is the example from my book:

  var body = newStmtList()
  body.add quote do:
    var `objIdent` = parseFile(`filenameIdent`)

Unless somebody tells me otherwise I will keep it as-is and not change it. (Please let me know ASAP, I will submit this chapter within a day so not much time to change this!)

In fact, I have another piece of code in the book that would require changing. As far as I can see it would need two extra lines (x4) which would reflow the book too much (my publisher wouldn't be happy)
