OpenRefine: Using "Apply to All Identical Cells" gives error when cell contains Number or Boolean

Created on 30 May 2018 · 13 Comments · Source: OpenRefine/OpenRefine

When reporting a bug please provide the following information to help reproduce the bug:

Version of OpenRefine used (Google Refine 2.6, OpenRefine 2.8, another distribution?):

3.0 Beta

Steps followed to create the issue:

Edit two cells to contain the same Number or Boolean
Click edit in one of the cells, change the value, and click "Apply to All Identical Cells"

Current Results:

Error in pop-up:
JSONArray[0] not a string.

Expected Results:

The identical cells should all update to the edited value

See also #332

bug High

All 13 comments

From console:

23:56:25.124 [                  command] Exception caught (1ms)
org.json.JSONException: JSONArray[0] not a string.
    at org.json.JSONArray.getString(JSONArray.java:405)
    at com.google.refine.operations.cell.MassEditOperation.reconstructEdits(MassEditOperation.java:124)
    at com.google.refine.commands.cell.MassEditCommand.createOperation(MassEditCommand.java:59)
    at com.google.refine.commands.EngineDependentCommand.doPost(EngineDependentCommand.java:78)
    at com.google.refine.RefineServlet.service(RefineServlet.java:178)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
    at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
    at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:938)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:755)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

This works in 2.8.
In both 2.8 and 3.0 the same JSON message is passed from the front end:

OR 3.0 mass edit number:
columnName: Column 1
expression: value
edits: [{"from":[2],"to":"2","type":"text"}]
engine: {"facets":[],"mode":"row-based"}

OR 2.8 mass edit number:
columnName: Column 1
expression: value
edits: [{"from":[2],"to":"2","type":"text"}]
engine: {"facets":[],"mode":"row-based"}

None of:

  • com.google.refine.operations.cell.MassEditOperation
  • com.google.refine.commands.cell.MassEditCommand
  • com.google.refine.commands.EngineDependentCommand

changed between 2.8 and 3.0 beta

Looking at the Git commit history and the impact of the org.json.* changes, it is highly likely that it came from either

https://github.com/OpenRefine/OpenRefine/commit/19f98b7ea2a012065a9e21999a9f9dc30347daf7
or
https://github.com/OpenRefine/OpenRefine/commit/f58d963dbd459036bbe3cbf505ba38b093e14266

Turn on debugging and step through with a breakpoint on com.google.refine.operations.cell.MassEditOperation.reconstructEdits(MassEditOperation.java:124)
to see what the JSON array's index 0 value really is.

It's also interesting to note the description of JSONArray itself, which to me provides the clue:
https://stleary.github.io/JSON-java/org/json/JSONArray.html

Through testing, I think the issue was introduced in

1398 (commit c4b0ff6bea1278032855c56bb8c6393619c42acd)

which was a big change. I'm not sure what in this commit caused the problem, but it gives a place to start the investigation.

My guess is the update of the json jar used:
json-20100208.jar -> json-20160810.jar

is where the problem lies. @jackyq2015 any ideas?

OK - this commit https://github.com/stleary/JSON-java/commit/f4cb14728f13629972a0ea76bb3dc0705a735fa8 changes the behaviour of JSONArray.getString() to throw an exception when the value it finds is not a string. Previously it tried to convert any object to a string.

This is the problem (and error message) we are seeing here
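
To make the behaviour change concrete, here is a minimal sketch in plain Java. It deliberately does not use org.json: a `List<Object>` stands in for the parsed `from` array, and `lenientGetString` / `strictGetString` are hypothetical helpers that only imitate the old and new behaviour as described above.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in sketch: a List<Object> plays the role of the parsed "from" array.
// Both helpers are hypothetical and are NOT org.json methods.
class GetStringBehaviour {

    // Old behaviour as described above: any value is converted to a string.
    static String lenientGetString(List<Object> array, int index) {
        return String.valueOf(array.get(index));
    }

    // New behaviour as described above: a value that is not a String is rejected.
    static String strictGetString(List<Object> array, int index) {
        Object value = array.get(index);
        if (value instanceof String) {
            return (String) value;
        }
        throw new IllegalStateException("JSONArray[" + index + "] not a string.");
    }

    public static void main(String[] args) {
        // "from":[2] in the request holds a number, not a string.
        List<Object> from = Arrays.<Object>asList(2);

        System.out.println(lenientGetString(from, 0));
        try {
            strictGetString(from, 0);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the front end sending `"from":[2]`, the lenient version yields a string while the strict version fails, which matches the same request working in 2.8 and failing in 3.0 beta.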

So basically com.google.refine.operations.cell.MassEditOperation.reconstructEdits needs to be smarter. My first instinct is that we will need to use getJSONObject instead of getString and then decide what to do with the retrieved object, but I have not looked closely yet.

The generic JSONArray.get() might be useful for a first-pass inspection.

Alternatively, you could possibly use https://stleary.github.io/JSON-java/org/json/JSONArray.html#optJSONObject-int- for that first-pass inspection, since the opt methods do not throw an error and instead return null.

Get the optional JSONObject associated with an index. Null is returned if the key is not found, or null if the index has no value, or if the value is not a JSONObject.

Up to you @ostephens how you think best to handle it. Some ways are easier for test setup, but there might be more thorough tests needed in those cases.
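
One possible shape for that first-pass inspection, sketched in plain Java. This is not the actual fix and makes several assumptions: a `List<Object>` stands in for the raw values that a generic get would return, and the class `FromValueInspection` with its methods `describe` and `reconstructFrom` are hypothetical names. The idea is to look at the runtime type of each value before deciding how to turn it into a string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the OpenRefine implementation.
// A List<Object> stands in for the raw values of the "from" array.
class FromValueInspection {

    // Inspect one raw value and decide how to represent it as a string.
    static String describe(Object raw) {
        if (raw == null) {
            return null; // no usable value at this index
        }
        if (raw instanceof String) {
            return (String) raw;
        }
        if (raw instanceof Number || raw instanceof Boolean) {
            return raw.toString(); // e.g. 2 -> "2", true -> "true"
        }
        throw new IllegalArgumentException(
                "Unsupported 'from' value of type " + raw.getClass().getName());
    }

    // Convert every raw value, skipping entries that have no value.
    static List<String> reconstructFrom(List<Object> rawValues) {
        List<String> result = new ArrayList<>();
        for (Object raw : rawValues) {
            String asString = describe(raw);
            if (asString != null) {
                result.add(asString);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> from = Arrays.<Object>asList(2, true, "abc");
        System.out.println(reconstructFrom(from));
    }
}
```

Whether numbers and booleans should be matched as strings or keep their type is a design decision this sketch does not settle; it only shows that inspecting the value first avoids the exception.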

Note from #1639: this issue also occurs if you create a text facet on a column containing numbers/booleans and attempt to edit a value in the facet.

Just an update - I'm working on this and work in progress is at https://github.com/ostephens/OpenRefine/tree/mass-edit-fix

tl;dr: fixing #1631 is relatively straightforward, but I'm trying to write tests and fix #1632, #180 and #332 at the same time, which is proving more complicated and taking more time.

@ostephens I am so impressed with your digging in, solving real problems, and coming up with great solutions to our issues. Thank you so much for stepping up and learning beyond what you originally intended of helping out on "easy fixes where you can"! Keep up the great work, we really appreciate it!

Fixed by #1642
