Dhall-haskell: Poor parse error message with parenthesis-enclosed function application

Created on 6 Dec 2019 · 12 Comments · Source: dhall-lang/dhall-haskell

If a parse error occurs inside parentheses in a function application, the error is not reported at the point of the actual parse error. Instead, you get unexpected '(' at the open parenthesis. For example:

> dhall <<< '(3f)'
Error: Invalid input

(stdin):1:3:
1 | (3f)
  |   ^
unexpected 'f'

vs

> dhall <<< 'a (3f)'
Error: Invalid input

(stdin):1:3:
1 | a (3f)
  |   ^
unexpected '('

This is not particularly problematic in itself, until the expression within the parentheses is complex and the parse error is difficult to see at a glance. For example:

1 | Optional/fold Text optional Text (λ(text : Text) → "\latexcommand{${text}}") ""
  |                                  ^
unexpected '('

You can stare at the above for quite a long time before you notice the real problem, namely that \l is not a valid escape sequence, and it should be "\\latexcommand{${text}}". (Don't ask me how I know this...)

error messages parser

All 12 comments

As far as I can tell, the only way to fix this is to make two changes:

  • Remove support for customizable import parsing (e.g. the Dhall.Parser.exprA utility)

    ... which is a prelude to:

  • Switch to a parsing library that does not rely on backtracking (e.g. Earley) or to an LALR parser generator (e.g. happy)

    ... which would fix most of the parsing issues that we've been running into

I don't think there are any shortcuts left at this point for fixing this particular issue

This is a particularly aggravating issue when you have a long file. For example, I had this typo:

screenshot

on line 1238, and it caused this error message:

> dhall --file ./log2.dhall
dhall: 
Error: Invalid input

./log2.dhall:1217:11:
     |
1217 |           [ tweag
     |           ^
unexpected '['
expecting ',', ->, :, ], operator, or whitespace

That’s two screen heights above the actual error at the beginning of the nested list!

I recently had a run-in with this issue. I think I would have torn my hair out, if it was long enough to do so! ;)

@Gabriel439 I see that you've mentioned generating the parser with happy above. I've done small edits on GHC's happy code, but I don't have a good understanding of it. Would we generate the parser from dhall.abnf, or a modified version of it?

"Would we generate the parser from dhall.abnf, or a modified version of it?"

Happy doesn't work directly with that format, so we would need to port it to happy syntax. The only problem we might encounter when porting dhall.abnf to happy syntax is that happy works much better on a right-recursive grammar, and the dhall.abnf spec is mostly left-recursive (indirectly).

@Gabriel439 Thanks! With Earley we'd still have a hand-written parser, right? It can't be generated from dhall.abnf as some other implementations seem to do?

@sjakobi: Yeah, it would still be hand-written, but it would no longer require backtracking (which is the main reason for poor error messages) or tricks to avoid pathological performance. The main downside of Earley is that it might have slower constant factors because it doesn't support fast bulk operations like megaparsec does
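To make the backtracking point concrete, here is a minimal sketch of how a backtracking repetition can swallow the real error. It is a toy parser written from scratch for illustration, not dhall-haskell's parser and not megaparsec; the names Parser, manyBacktracking, lexeme and application are all hypothetical, and the grammar (a name followed by parenthesised numbers) is an assumption chosen only to mirror the '(3f)' and 'a (3f)' examples from the issue.

```haskell
import Data.Char (isAlpha, isDigit)

-- Hypothetical toy parser, NOT dhall-haskell's real one.
-- State is (offset, remaining input); a failure is (offset, message).
type State   = (Int, String)
type Failure = (Int, String)

newtype Parser a = Parser { runParser :: State -> Either Failure (a, State) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Left e        -> Left e
    Right (a, s') -> Right (f a, s')

instance Applicative Parser where
  pure a = Parser $ \s -> Right (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Left e        -> Left e
    Right (f, s') -> case pa s' of
      Left e         -> Left e
      Right (a, s'') -> Right (f a, s'')

-- Consume one character satisfying the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \(n, input) -> case input of
  c : rest | ok c -> Right (c, (n + 1, rest))
  c : _           -> Left (n, "unexpected " ++ show c)
  []              -> Left (n, "unexpected end of input")

char :: Char -> Parser Char
char = satisfy . (==)

-- Succeed only at the end of the input.
eof :: Parser ()
eof = Parser $ \s@(n, input) -> case input of
  []    -> Right ((), s)
  c : _ -> Left (n, "unexpected " ++ show c)

-- Backtracking repetition: when an attempt of p fails, even after it has
-- consumed input, rewind to where that attempt started and DISCARD its error.
manyBacktracking :: Parser a -> Parser [a]
manyBacktracking p = Parser go
  where
    go s = case runParser p s of
      Left _        -> Right ([], s)
      Right (a, s') -> fmap (\(as, s'') -> (a : as, s'')) (go s')

-- Run p, then skip trailing spaces.
lexeme :: Parser a -> Parser a
lexeme p = p <* manyBacktracking (char ' ')

number :: Parser String
number = (:) <$> satisfy isDigit <*> manyBacktracking (satisfy isDigit)

parens :: Parser String
parens = lexeme (char '(' *> number <* char ')')

-- A name applied to zero or more parenthesised arguments.
application :: Parser (String, [String])
application = (,) <$> name <*> manyBacktracking parens <* eof
  where
    name = lexeme ((:) <$> satisfy isAlpha <*> manyBacktracking (satisfy isAlpha))

main :: IO ()
main = do
  -- Parsed directly, the error points at the offending 'f' (offset 2).
  print (fst <$> runParser (parens <* eof) (0, "(3f)"))
  -- prints: Left (2,"unexpected 'f'")

  -- As an argument, the failure inside the parentheses is thrown away by
  -- manyBacktracking, and eof then complains about the '(' it rewound to.
  print (fst <$> runParser application (0, "a (3f)"))
  -- prints: Left (2,"unexpected '('")
```

Offsets here are zero-based, so offset 2 corresponds to column 3 in the dhall output quoted at the top of the issue. In this sketch the useful error (the 'f' at offset 4 of 'a (3f)') exists briefly but is dropped at the Left _ branch, which is the general mechanism being described; how the real parser loses the error may differ in detail.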

Morte … I just now realized that the Planescape naming scheme is older than Dhall :)

@Profpatsch: Yep! That's correct :)

@ocharles just shared his Earley-based parser in https://hub.darcs.net/ocharles/dhalli/browse/lib/Dhalli/Syntax.hs with me. Might be a good resource when we finally make the switch! :)

BTW, Happy lets you export multiple parsers these days, and it can report expected tokens (the output may need some tidying, but it is doable).
