When reporting a bug please provide the following information to help reproduce the bug:
Comparing master+wikidata extesion with OR 2.8
Another date issue I'm afraid. Not sure if this is related to general date handling or something about the wikidata extension - not had a chance to look yet.
Using the code from master branch + wikidata extension
Adding a column using "Add columns from reconciled values" with a date type (e.g. 'date of birth').
The retrieved date is not identified as a 'date' type in OR but rather as a 'Java.time.OffsetDateTime'

however, then opening the same project (no changes) in OR2.8 the same cell shows correctly as 'date' type

In addition this seems to stop date based functions working on the value in master branch:

But the same function works fine in 2.8

Expect the data to register as a 'date' type, and for date functions to work as in 2.8
Thanks for catching this before the release! I'll investigate.
@jackyq2015 I'm now totally confused - I don't get why we made changes to date parsing in the first place. My understanding was that you had changed date parsing while dropping support for JDK7, but as far as I can tell the code we had before that runs fine with JDK8. I am tempted to just restore ParsingUtilities.stringToDate to the version before your change 2afdcbf54 because I don't see what is wrong with it?
Maybe we should have a call about this because I feel like I have misunderstood something…
So, here is how I understand the situation:
when @jackyq2015 worked on last modified dates he undertook to migrate from old-style Date and Calendar classes to the modern implementations, which was not necessary in the migration from JDK7 to JDK8 but is generally a sensible thing to do. However, this work has not been finished so there are discrepancies in the code (some parts still work with the old-style classes, some use the new ones), which causes the bug observed by @ostephens (which btw is not specific to wikidata-extension, it can be observed in master too).
There are two ways out of this situation:
@jackyq2015 any info would be welcome…
I say we go forward.
And its not hard to go forward... it just means everyone putting better Tests in place where we are lacking them.
Like here especially lacking:
https://github.com/OpenRefine/OpenRefine/blob/master/main/tests/server/src/com/google/refine/tests/expr/functions/ExpressionUtilsTests.java
But basically adding good tests for these 44 source files that involve Date:
https://github.com/search?utf8=%E2%9C%93&q=Date+repo%3Aopenrefine%2Fopenrefine+path%3A%2Fmain%2Fsrc+language%3AJava+language%3AJava&type=Code&ref=advsearch&l=Java&l=Java
@thadguidry a full migration would be very costly (as you said there are many different classes which still use that). There might be ways to isolate that usage (for instance, it is probably hard to migrate the internals of CalendarParser to use the new classes, but it should be doable to just change its interface).
Personally I will not have time to do a full migration before at least 6 weeks…
i will take a look. but date is an internal type of openrefine. we are
moving to java 7 date to java 8 time api. there is nothing wrong with the
type. the broken part is the function.
On Tue, May 1, 2018, 3:01 AM Antonin Delpeuch notifications@github.com
wrote:
@thadguidry https://github.com/thadguidry a full migration would be
very costly (as you said there are many different classes which still use
that). There might be ways to isolate that usage (for instance, it is
probably hard to migrate the internals of CalendarParser to use the new
classes, but it should be doable to just change its interface).
Personally I will not have time to do a full migration before at least 6
weeks…—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/OpenRefine/OpenRefine/issues/1588#issuecomment-385612547,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AKCGizcG9l9m0PbloZvihTi2k2RrDDSYks5tuAhLgaJpZM4TrYsr
.
toDate('2017-01-01T00:00:00Z').type() gives java.time.OffsetDateTime whereas toDate('2017-01-01').type() gives date because since your partial migration the toDate function can return various types depending on the input… I can provide a quick fix for that but it does not fix the root cause.
This situation where both old and new date classes are allowed as cell values is dangerous - now I realize that this change was quite ill judged. It was a mistake to do it lightly: in most applications, you could have relied on the type checker to do the migration, but because OpenRefine uses Object for cell values, we cannot rely on that.
I think the java 8 time API is the right way to go even it will take effort. It is safe to just keep the java 7 Date and Calendar API but it is not the long term solution. The original code has a pattern to deal with the date like this:
if (instance of date)
do something;
else if (instance of calendar)
do something else;
I doubt if it's necessary. Even it is, I think some of them is for backward compatibility. We can consider to use the OffsetDateTime to unify those kind of block if possible. At least I know for the cell date type, we can store the "date" string to indicate its type, not the java class. So potentially it's possible to only change the code but keep the file format untouched.
I don't think dates should be stored as strings - this would go against the current design of storing strings, numbers, patterns and dates as different Java objects. For now, we need to make sure all dates stored in cells use the Java 7 classes, not the Java 8 ones, so that we don't break compatibility with GREL and extensions which rely on that. This is especially important because this introduces bugs that cannot be detected at compile time.
If we were months away from the next release we could try to do a full migration but right now I don't think we can take that risk.
Tested with #1597 and this seems to have fixed this particular problem
We are still fixing bugs related to the change of the internal Date type including #1908 and #2241.
Another impact of this change was redefining the historical OpenRefine handling dates without timezones as being datetimes in the local timezone to instead cast everything as UTC datetimes.
There is absolutely no reason that dates in in metadata need to be treated uniformly with the user data type. They are completely different, disjoint contexts. Things like modifiedAt metadata should be stored as an ISO 8601 UTC datetime and converted to the user's locale by the browser. This has nothing to do with how dates in user data are parsed, computed on, etc.