In Presto 0.178 (or earlier) running on Java 8u101 (or later), we get:
presto> select date_trunc('day', FROM_UNIXTIME(1496548289) AT TIME ZONE 'Pacific/Easter');
_col0
----------------------------------------
2017-06-02 23:00:00.000 Pacific/Easter
This bug is mitigated by commit 12188921904004d969423e2181930773f731c851. However, the root cause persists.
The cause is that Presto uses both Joda-Time and java.time. When the tzdata versions behind the two libraries do not match, the result can be completely wrong (that is, wrong with respect to any tzdata version).
Because Joda-Time and Java are upgraded independently, tzdata mismatches are inevitable. Whenever one occurs, an issue like the one above will happen again (in a different time zone).
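The mechanism can be reproduced with java.time alone. The following is a minimal sketch, not Presto code: two fixed offsets stand in for the two libraries' views of Pacific/Easter, and the choice of -05:00 for truncation and -06:00 for rendering is an assumption picked to illustrate a one-hour disagreement. The class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

final class TzdataMismatchSketch {
    // Truncates to local midnight under one view of the zone's offset,
    // then renders the resulting instant under a second view.
    // Both offsets are illustrative stand-ins, not values read from tzdata.
    static ZonedDateTime truncateThenRender(long epochSeconds,
                                            ZoneOffset truncationOffset,
                                            ZoneOffset renderingOffset) {
        Instant localMidnight = Instant.ofEpochSecond(epochSeconds)
                .atOffset(truncationOffset)
                .truncatedTo(ChronoUnit.DAYS)
                .toInstant();
        return localMidnight.atZone(renderingOffset);
    }

    public static void main(String[] args) {
        // Same instant as in the query above.
        System.out.println(truncateThenRender(
                1496548289L, ZoneOffset.ofHours(-5), ZoneOffset.ofHours(-6)));
    }
}
```

When both offsets agree, the rendered time is midnight. When they differ by an hour, the "truncated" value lands at 23:00 on the previous day, which is not midnight under either view.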
Permanent solutions are more complicated. Possibilities under investigation by @electrum include
In the particular case shown above:
DateTimeFunctions.truncateDate uses Joda
@JsonValue SqlTimestampWithTimeZone.toString() uses java.time
Shall we rename the issue to "Combined usage of Joda Time and Java Time leads to repeated problems due to time zone database discrepancies"?
Possibilities [...] Use joda exclusively in Presto
Perhaps it would be worth noting why using java.time exclusively is not being considered as an option.
I just renamed it. Is it better now?
@dain did a lot of investigation into java.time performance a long time ago. The primary problem is that java.time does not work directly with UTC millis (as a long). You are required to provide an actual object, which has its year/month/day/hour/... fields eagerly populated. This introduces both CPU and GC overhead.
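To make the allocation concern concrete, here is a rough sketch of truncating a UTC-millis value to the start of its day. The class and method names are hypothetical. The first method goes through java.time and has to build intermediate objects. The second is only a contrast for a fixed offset, and is not a claim about how Joda or Presto implement truncation.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

final class TruncateDaySketch {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // java.time path: the long is wrapped in an Instant and a ZonedDateTime
    // (with populated date and time fields) before it can be truncated.
    static long truncateToDayJavaTime(long utcMillis, ZoneId zone) {
        return Instant.ofEpochMilli(utcMillis)
                .atZone(zone)
                .truncatedTo(ChronoUnit.DAYS)
                .toInstant()
                .toEpochMilli();
    }

    // Pure long arithmetic, valid only when the offset is fixed and known.
    static long truncateToDayFixedOffset(long utcMillis, long offsetMillis) {
        long localMillis = utcMillis + offsetMillis;
        return localMillis - Math.floorMod(localMillis, MILLIS_PER_DAY) - offsetMillis;
    }
}
```

Both methods return the same value for a fixed offset, but only the second avoids per-call object creation.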
@dain can give you more context on this matter.
I think it would be preferable to use Joda exclusively. Updating the Joda version is easier than changing the JVM version, and Joda's time zone database is updated more frequently.
@electrum @findepi
Joda gets its default zone provider this way:
private static Provider getDefaultProvider() {
    // approach 1
    try {
        String providerClass = System.getProperty("org.joda.time.DateTimeZone.Provider");
        if (providerClass != null) {
            try {
                Provider provider = (Provider) Class.forName(providerClass).newInstance();
                return validateProvider(provider);
            } catch (Exception ex) {
                throw new RuntimeException(ex);
            }
        }
    } catch (SecurityException ex) {
        // ignored
    }
    // approach 2
    try {
        String dataFolder = System.getProperty("org.joda.time.DateTimeZone.Folder");
        if (dataFolder != null) {
            try {
                Provider provider = new ZoneInfoProvider(new File(dataFolder));
                return validateProvider(provider);
            } catch (Exception ex) {
                throw new RuntimeException(ex);
            }
        }
    } catch (SecurityException ex) {
        // ignored
    }
    // approach 3
    try {
        Provider provider = new ZoneInfoProvider("org/joda/time/tz/data");
        return validateProvider(provider);
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    // approach 4
    return new UTCProvider();
}
As a result, we can implement a custom org.joda.time.tz.Provider (a public interface in Joda) ourselves (probably by delegating to ZoneInfoProvider).
We can put all versions of tzdata in the Presto distribution. Our implementation of Provider would detect the tzdata version of the current JVM, invoke ZoneInfoCompiler (which compiles to a Joda-specific format) on the fly, and then construct a ZoneInfoProvider with the compiled tzdata. All calls to our implementation of Provider would then delegate to it.
Alternatively, we can precompile the tzdata with ZoneInfoCompiler and ship that (to avoid the compilation step at runtime).
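For the precompiled variant, the lookup step might look roughly like the sketch below. Everything in it is hypothetical: the resource root "presto/tzdata/", the class name, and the decision to fail fast when no matching data is bundled. It only maps a JVM tzdata version to the location of the matching precompiled Joda data.

```java
import java.util.Set;

final class BundledTzdataResolver {
    // Hypothetical classpath root under which one directory per tzdata
    // version (for example "2017b") holds precompiled Joda zone data.
    static final String RESOURCE_ROOT = "presto/tzdata/";

    // Returns the resource path for the given tzdata version, or fails
    // if the distribution does not ship data for that version.
    static String resolve(String jvmTzdataVersion, Set<String> bundledVersions) {
        if (!bundledVersions.contains(jvmTzdataVersion)) {
            throw new IllegalStateException(
                    "No bundled Joda zone data for tzdata " + jvmTzdataVersion);
        }
        return RESOURCE_ROOT + jvmTzdataVersion;
    }
}
```

The returned path could then be passed to the ZoneInfoProvider(String) constructor seen in approach 3 above, with the custom Provider delegating to that instance.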
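Now, the question is how to detect which tzdata the current JVM is using. We could hardcode the mapping according to https://www.oracle.com/technetwork/java/javase/tzdata-versions-138805.html , but that poses two problems:
1. Users won't be able to run with a non-Oracle Java version or a future Java version (we can probably provide an escape hatch that allows users to manually specify a system property).
2. tzdata can be updated independently of the Java version using the Oracle-provided Timezone Updater Tool.
Since Java 8, there is a public API to get the current tzdata version in the JVM: see the answer from Andreas (currently 2nd) in https://stackoverflow.com/questions/7956044/java-find-tzdata-version-in-use-regardless-of-jre-version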
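A minimal sketch of the detection step, assuming the Java 8+ API in question is java.time.zone.ZoneRulesProvider.getVersions. The system property name "presto.tzdata-version" and the class name are hypothetical. The property acts as a manual escape hatch for JVMs where automatic detection is unsuitable.

```java
import java.time.zone.ZoneRulesProvider;

final class JvmTzdataVersion {
    // Hypothetical property name for a manual override.
    static final String OVERRIDE_PROPERTY = "presto.tzdata-version";

    // Returns the override if one is set; otherwise asks java.time for the
    // latest rules version it knows for the "UTC" zone ID.
    static String detect() {
        String override = System.getProperty(OVERRIDE_PROPERTY);
        if (override != null && !override.isEmpty()) {
            return override;
        }
        return ZoneRulesProvider.getVersions("UTC").lastEntry().getKey();
    }
}
```

Because this reads the version from the running JVM rather than from a hardcoded table, it should also reflect updates applied with the Timezone Updater Tool, though that has not been verified here.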
@hellium01 Using Joda exclusively could be hard. I'll let @electrum and @martint weigh in on whether that is doable if there is a strong will to do it.
Yes, I think we can get the tzdata version via the getVersion method on ZoneInfoFile. So the method @haozhun proposed does seem favorable.
Reopening this issue as we removed the joda-to-java-time-bridge registration until the cache thrashing issue is resolved.
This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.