Black: Syntax errors reported on the wrong line due to multiple grammars tested

Created on 27 Jul 2018 · 9Comments · Source: psf/black

Operating system: Ubuntu 16.04
Python version: 3.7.0
Black version: 18.6b4
Does also happen on master: yes

For this minimal example, the syntax error is reported on the second line instead of the first:

print(f"spam", ham="eggs")
pancakes = [,,]

It happens with at least several other syntax errors too (eg pancakes = [..], pancakes =).
It happens no matter how many valid statements are added before, after, or between the lines.
It only happens with the print function, and only when passing it a keyword argument.

design help wanted

Source

noahstier

👍1

Most helpful comment

Is that a bug in the last grammar?

No. Python 2 and Python 3 are different in this regard as you might now. Black can be used to format both Python 2 and Python 3 code. We default to the Python 3 grammar because we're not Luddites. However, for users who are still on Python 2 we try to be helpful to them, too, and when the Python 3 grammar fails to parse, we attempt Python 2 and see what happens.

The problem here is a bit tricky to solve because out of many syntax errors that we see, we don't know which syntax error is "the worst", in other words which is the one we should report to the users.

Let me give you an example:

async def g():
    print 'Hey'

This is clearly an invalid file but Black can't know what the user meant, e.g. which part is wrong and should be addressed by the user. It can be either.

You might think, maybe a sensible heuristic would be to choose the syntax error out of all available errors that comes from the farthest line in the file? That means the parser was able to get through all the previous "errors" that other grammars found problematic! But in this case, if somebody puts a Python 2-only piece of code somewhere high in their file, we will report valid Python 3 code below as invalid syntax. So, there's no silver bullet.

I think Jelle is right here. If your code cannot be compiled by Python, you need to address this.

I will leave this open if anybody has a better idea, especially that I want to add more grammars (Cython!) so this is only going to be a bigger problem in the future.

ambv on 17 Aug 2018

👍2

All 9 comments

What are you doing exactly? Black is not meant for reporting syntax errors, it's a code formatter.

JelleZijlstra on 27 Jul 2018

This is because of how black tries multiple possible python grammars when encountering a syntax error. It can't tell the "real" cause of the syntax error, because the last grammar it tries doesn't accept print as a function.

zsol on 28 Jul 2018

What are you doing exactly?

Just trying to format some code, and getting confusing error messages.

Black is not meant for reporting syntax errors

Black reports syntax errors, and not by accident. I would argue that it is meant to report syntax errors, and if it reports them incorrectly that's a bug, even if reporting syntax errors is not listed in the mission statement.

black tries multiple possible python grammars ... the last grammar it tries doesn't accept print as a function

Interesting. Although it apparently does accept print as a function, just not when it is given keyword arguments. Is that a bug in the last grammar?

noahstier on 29 Jul 2018

Is that a bug in the last grammar?

Let me give you an example:

async def g():
    print 'Hey'

This is clearly an invalid file but Black can't know what the user meant, e.g. which part is wrong and should be addressed by the user. It can be either.

I think Jelle is right here. If your code cannot be compiled by Python, you need to address this.

I will leave this open if anybody has a better idea, especially that I want to add more grammars (Cython!) so this is only going to be a bigger problem in the future.

ambv on 17 Aug 2018

👍2

Ok thank you for the explanation. I think Black's behavior makes sense here, and maybe a slightly more detailed error message would be enough to avoid confusion.

For example, the error and the grammar could both be printed for each grammar tried.

$ echo "print(f'spam', ham='eggs'); pancakes = [,,]" > test.py
$ black test.py
error: cannot format test.py:
Failed parsing as python 3: 1:30: pancakes = [,,]
Failed parsing as python 2: 1:18: print(f"spam", ham="eggs")
All done! 💥 💔 💥
1 file failed to reformat.

noahstier on 17 Aug 2018

👍1

@ambv @zsol : Isn't the fix obvious?
Allow the user to restrict black regarding which grammars it will try (can be restricted to: pick one, default any)?

The current behaviour is really problematic, especially when:

You use black within the editor e.g. vim
You are in prototyping phase and use print(), since your logging is not yet set up. E.g. in lambdas, e.g. in list comprehensions, e.g. with kwargs (file=sys.stderr)

Then you can basically forget about getting any indication on why your code is not formatted...

... and this even if you try get black have that last py2 grammar check not fail via from __future__ import print_function - and also even if you try have black do py36 syntax only:

# echo -e 'from __future__ import print_function\nlambda: print()\nif i:\ni=1' | black --py36 -
from __future__ import print_function
lambda: print()
if i:
i=1
error: cannot format -: Cannot parse: 2:8: lambda: print()   # <--- aaaargh

Yes, it is a formatter and not a syntax checker - but it was just so cool to have both steps done in one and it was a low hanging fruit for you to deliver also that, considering the advanced way you do things. It may be not intended by you but it is one big selling point.... So taking the handwaving route (of @JelleZijlstra) on that one would be really sad...

Now:

I understand why the __future__ import would be hard to see and apply for you.
I don't understand why it still tried 2.7 although I said --py36
Also I think checking ANY grammar is not scalable when it comes to more, you mentioned cython, there are others (e.g. Transcrypt)

=> So, why not give the user the chance to say: Hey, do apply grammar xyz (e.g. python3.7) only?

AXGKl on 21 Sep 2018

Ok, I had a quick check in the code and it would be pretty simple, unless I overlooked sth.
Not digging into click/options, just use an env var (same testfile than above):

# black r.py
error: cannot format r.py: Cannot parse: 2:8: lambda: print()  # <-- bad
All done! 💥 💔 💥
1 file failed to reformat.

# export BLACK_GRAMMAR=py3
# black r.py
error: cannot format r.py: Cannot parse: 7:4:     i = 1    # <---- good
All done! 💥 💔 💥
1 file failed to reformat.

really easy, too small for a PR - basically give the grammars a name, so that the user can pick one:

GRAMMARS = [
-    pygram.python_grammar_no_print_statement_no_exec_statement,
-    pygram.python_grammar_no_print_statement,
-    pygram.python_grammar,
+    ["py3", pygram.python_grammar_no_print_statement_no_exec_statement],
+    ["py2_with_futures", pygram.python_grammar_no_print_statement],
+    ["py2", pygram.python_grammar],
 ]
+# TODO: should be click option:
+user_req_grammar = os.environ.get("BLACK_GRAMMAR")
+if user_req_grammar and not [g for g in GRAMMARS if g[0] == user_req_grammar]:
+    user_req_grammar = None # fall back to checking all


 def lib2to3_parse(src_txt: str) -> Node:
     """Given a string with source, return the lib2to3 Node."""
-    grammar = pygram.python_grammar_no_print_statement
     if src_txt[-1:] != "\n":
         src_txt += "\n"
-    for grammar in GRAMMARS:
+
+    for grammar_name, grammar in GRAMMARS:
+        if user_req_grammar and grammar_name != user_req_grammar:
+            continue
         drv = driver.Driver(grammar, pytree.convert)
         try:

At least you should, imho, use pygram.python_grammar_no_print_statement last - can't see a reason why you let errors seen by this one be hidden by errors seen by the "dumbest" of them all (dumb=ignores future print_function imports), which is pygram.python_grammar in your current version.

AXGKl on 22 Sep 2018

Ok, I did make a PR: https://github.com/ambv/black/pull/528 now, really would like to have it in if possible...

AXGKl on 23 Sep 2018

Since #618 --target-version allows you to choose the grammar, so the original minimal example reports a syntax error on the first line with -t py27, and the second line with -t py36

zsol on 2 May 2019

Was this page helpful?

0 / 5 - 0 ratings