Version of OpenRefine used: master
Operating Systems and version: linux
open a project created with an earlier version of OpenRefine
11:42:35.336 [..ject_metadata_utilities] load metadata failed: /home/antonin/.local/share/openrefine/2329149621165.project/metadata.old.json (1ms)
11:42:35.336 [..ject_metadata_utilities] java.time.format.DateTimeParseException: Text '2018-03-21T11:49:28.992' could not be parsed at index 23
at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1851)
at java.time.Instant.parse(Instant.java:395)
at com.google.refine.util.ParsingUtilities.stringToLocalDate(ParsingUtilities.java:196)
at com.google.refine.model.medadata.ProjectMetadata.extractModifiedLocalTime(ProjectMetadata.java:296)
at com.google.refine.model.medadata.ProjectMetadata.loadFromJSON(ProjectMetadata.java:216)
at com.google.refine.model.medadata.ProjectMetadata.loadFromStream(ProjectMetadata.java:583)
at com.google.refine.model.medadata.ProjectMetadata.loadFromFile(ProjectMetadata.java:555)
at com.google.refine.io.ProjectMetadataUtilities.loadFromFile(ProjectMetadataUtilities.java:162)
at com.google.refine.io.ProjectMetadataUtilities.loadMetaDataIfExist(ProjectMetadataUtilities.java:109)
at com.google.refine.io.ProjectMetadataUtilities.load(ProjectMetadataUtilities.java:98)
at com.google.refine.io.FileProjectManager.loadProjectMetadata(FileProjectManager.java:122)
at com.google.refine.io.FileProjectManager.recover(FileProjectManager.java:452)
at com.google.refine.io.FileProjectManager.initialize(FileProjectManager.java:83)
at com.google.refine.RefineServlet.init(RefineServlet.java:130)
at javax.servlet.GenericServlet.init(GenericServlet.java:241)
at edu.mit.simile.butterfly.Butterfly.init(Butterfly.java:180)
at org.mortbay.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:440)
at org.mortbay.jetty.servlet.ServletHolder.doStart(ServletHolder.java:263)
at com.google.refine.RefineServer.configure(Refine.java:296)
at com.google.refine.RefineServer.init(Refine.java:208)
at com.google.refine.Refine.init(Refine.java:114)
at com.google.refine.Refine.main(Refine.java:108)
the old metadata should be parsed correctly. We should accept timestamps without a Z character at the end.
@jackyq2015 it seems related to your change?
Your project maybe created from one of the dev version which don't have the Z at the end. Later in order to support old version, the Z is added to keep the same format as the old versions.
@jackyq2015 I think the Z should also be added at parsing time if not present, then. My workspace is completely screwed by that…
@jackyq2015 my test5 workspace is also screwed up by this. Although it is a test workspace and I backup using iterative "testX" workspace paths. I guess Antonin doesn't know about our best practices of creating backups of the workspaces while developing with OpenRefine :) @wetneb Now you know. But seriously Jacky, yes it would be nice to have my test5 workspace fixed up. Hmm.. perhaps just a preferences array for crap that we break and change along the way, so we can fix without disruptions sometimes ? ["dateAddZ":"True","newGrelContainsCommandFallback":"True"]
I had the same issue - fixed it up by fixing all the data via find/replace across all metadata.json
I tend to feel that adding the Z at parsing time as suggested by @wetneb might be worth it to avoid issues for any other testers, although I don't have strong feelings about it
@ostephens yeah, or the simple fix... find/replace in the metadata.json :) Good enough for now and works.
@thadguidry haha, thanks for the lecture ^^ I do have backups, that's not the point.
@jackyq2015 just noticed another place where this change causes a bug: in the Data Extension API, if the service returns a date in the previous format (which the wikidata interface does), parsing fails. That makes it impossible to fetch dates.
22:08:44.887 [ command] Exception caught (856ms)
java.time.format.DateTimeParseException: Text '2006-01-01T00:00:00+00:00' could not be parsed at index 19
at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1851)
at java.time.Instant.parse(Instant.java:395)
at com.google.refine.util.ParsingUtilities.stringToDate(ParsingUtilities.java:191)
at com.google.refine.expr.functions.ToDate.call(ToDate.java:88)
at com.google.refine.model.recon.ReconciledDataExtensionJob.collectResult(ReconciledDataExtensionJob.java:196)
at com.google.refine.model.recon.ReconciledDataExtensionJob.extend(ReconciledDataExtensionJob.java:122)
at com.google.refine.commands.recon.PreviewExtendDataCommand.doPost(PreviewExtendDataCommand.java:124)
at com.google.refine.RefineServlet.service(RefineServlet.java:178)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:938)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:755)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
22:09:01.687 [ ProjectManager] Saving some modified projects ... (16800ms)
More broadly, this is also an issue that affects the toDate GREL function: things like toDate('2006-01-01T00:00:00+00:00') worked in earlier versions, and not in master.
In short: please ensure that date formats that were supported before are still supported. Those are three different examples of bugs introduced by this change, how many do I need to collect so that this mess is cleaned up?
I tend to leave the code unchanged. It is unfortunate that it happened and screwed up the tester's workspace. But my impression is that it is not in a official release so very fewer user got impacted. So there is no backward comparability required. It is a mistake and it has been fixed already. Adding code to support the result introduced by the mistake is another mistake and will cause confusion.
For the toDate('2006-01-01T00:00:00+00:00') issue, I will take a look.
Well, I think it is in our interest that date parsing utilities accept a wide range of formats, so adding support for that format would definitely make sense regardless of the bug… How would it "cause confusion" to accept timestamps without a Z? That would make the toDate function more flexible.
For the file format, we should have only one definition. Either have Z or without Z. Allowing 2 will cause confusion. However, to parse a date sting to date is a different story. I agree we should support more if possible.
The date parsing utilities class is used at various places in the code, so we should definitely make sure the fixes go there, and not just in the toDate function. Get date parsing right for once and then use that everywhere else. Making ParsingUtilities.stringToDate accept dates without Z does not "cause confusion", it just widens the formats accepted when reading dates in OpenRefine. This will be very useful to our users and authors of reconciliation services.
The metadata format for OpenRefine remains the one that is defined by the serializer, which includes Z. No confusion about that.