Openrefine: GREL parsing is too lazy and doesn't throw syntax errors

Created on 1 Apr 2020  路  11Comments  路  Source: OpenRefine/OpenRefine

Describe the bug
Arrays are not parsed properly for syntax issues.

To Reproduce
Steps to reproduce the behavior:

  1. Go to Edit Cells -> Transform
  2. Paste the example GREL [1,['a','b','c']]
  3. Notice no Parsing error is given and output is [ 1, [ "a", "b", "c" ] ]
  4. Paste the example GREL [1['a','b','c']]
  5. Notice it evaluates to [ null ]

Expected behavior
I would expect the GREL [1['a','b','c']] would throw a Parsing exception as it currently does for something like [1,['a','b','c']3] where the 3rd element 3 in the outer array is missing a comma before it. Instead, ['a','b','c'] as an index token of the element [1] is evaluated to null instead of IndexOutOfBoundsException

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser Version: Firefox
  • JRE or JDK Version: JDK8

OpenRefine (please complete the following information):

  • Version [e.g. OpenRefine 3.3 Beta]

Additional context
Interestingly, GREL doesn't seem to fully evaluate all the elements in a continuing expression. For instance, GREL ['a','b','c']3] does not advise the user of a Parsing exception and ignores remaining part of the expression 3].

Additionally,

  1. [a] evaluates to [null] where a was not even evaluated properly.
  2. ['a'] evaluates to ["a"] as expected.
  3. 1[] evaluates to null as unexpected.
  4. 1['a'] evaluates to null as unexpected.
  5. 1[0] evaluates to 1 as expected
  6. 1[1] evaluates to Error: java.lang.StringIndexOutOfBoundsException: String index out of range: 2 as expected.
bug error handling expression language grel

All 11 comments

Correctly, [] is an empty array as shown with [].type()
But then when we mix in Indexes of an Arrays, it is there that GREL syntax evaluates lazily, IMHO, maybe for good reason, I cannot recall what David and I expected, it's been too long. Maybe @tfmorris you remember?

This sounds very similar to #169 posted by @thadguidry in 2010. ;) (and no, I don't know of any good reason for it to behave this way -- I think it's just a bug)

Ah that issue #169 ... yeah that one as explained to me by David at the time was something like "I didn't expect folks to type paragraphs of Greek into GREL expressions! ...so I didn't work very hard on look-ahead parsing evals" or something like that if I recall he told me.

Anyways, indeed, I forgot about that particular issue. And it would be nice to get it fixed.
So...

169 seems to be around parentheses force evaluation order (a bug left over)

2507 this one, seems to be around the index/array evaluation (another bug, which seems non-trivial if we keep the GREL syntax exactly as it is. My hunch is that GREL will need to grow up a bit now for Array handling support and we might need to make it a bit more Ruby/Pythonic here)

@thadguidry: just to be sure, when you talk about indexes of arrays, you talk about vVar[0]?
So [0[1]] would be a confusion of the GREL interpreter and he would see an index of an array here?

Regards, Antoine

It's about making sure our Scanner is finding the right Token (and Token Type) at a given Index.
Go crazy:

  1. https://github.com/OpenRefine/OpenRefine/blob/master/main/src/com/google/refine/grel/Scanner.java
  2. https://github.com/OpenRefine/OpenRefine/blob/master/main/src/com/google/refine/grel/Parser.java

Notice the //swallow comments in there... I think somewhere we are swallowing (ignoring) prematurely.

Hi, @thadguidry I would like to work on this.

Why is this labelled high priority? Is it something critical to fix for 3.4?

Why is this labelled high priority? Is it something critical to fix for 3.4?

My mistake. I corrected it.

Regards, A.

Thinking critically, this issue is not a bug, but more of an enhancement. #169 is a bug. So relabeling this issue.

I would call this a bug.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ralcazar-oeg picture ralcazar-oeg  路  3Comments

kushthedude picture kushthedude  路  3Comments

stellasia picture stellasia  路  4Comments

lapoisse picture lapoisse  路  3Comments

ettorerizza picture ettorerizza  路  4Comments