Extending the discussion started in #2071 and re-starting the old discussion from #388, we should support time zones on a per-query basis.
A quick example might look like:
select mean(value) from cpu
where time >= today()
group by time(10m)
time_zone(PST)
Or you could also do time_zone(+8) or time_zone(-2).
Any other suggestions?
Doing a simple +/- is not sufficient for full timezone support because of daylight saving time (PST vs PDT depending on the time of year).
For example, if you have a GROUP BY time(1d), the size of the day/bucket can vary depending on the time of year. This makes it extremely messy to implement with the current method. That said, it is exactly what I need for my application.
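To make that concrete, here is a minimal Go sketch (illustration only, not InfluxDB code; it just uses the standard time package with America/Los_Angeles as an example zone) showing that a 1d bucket in a DST-observing zone is sometimes 25 hours long:

package main

import (
    "fmt"
    "time"
)

func main() {
    // Load the DST rules for US Pacific time from the tz database.
    loc, err := time.LoadLocation("America/Los_Angeles")
    if err != nil {
        panic(err)
    }
    // Walk three consecutive local days around the November 2015 fall-back transition.
    start := time.Date(2015, time.October, 31, 0, 0, 0, 0, loc)
    for i := 0; i < 3; i++ {
        dayStart := start.AddDate(0, 0, i)
        dayEnd := start.AddDate(0, 0, i+1)
        fmt.Printf("%s lasts %v\n", dayStart.Format("2006-01-02"), dayEnd.Sub(dayStart))
    }
    // Prints 24h0m0s, 25h0m0s (clocks fall back on 2015-11-01), 24h0m0s.
}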
Right now, I am planning on implementing an index on top of influx for this purpose, but would love to not have to do that.
IMO there should be both options.
A query-wide parameter covers most of the cases, but a function that just returns the offset for a supplied timezone (which would be necessary for a per-query operator anyway) is much more convenient than doing that client-side.
Regarding "any other suggestions": can support be provided for falling-edge vs. rising-edge groupings?
If time_zone takes a "number", it should also accept a four-digit string, since some timezones have a :30 offset (e.g. +0530).
Not sure if you're still debating format, but I'd recommend supporting the Olson / tz database. See http://joda-time.sourceforge.net/timezones.html. It supports named timezones, as well as UTC offsets (which are confusingly backwards, i.e. Etc/GMT+5 is actually GMT-5). This is the closest thing to a standard I've ever seen.
+1 for the Olson tz database
... with timezone('America/Phoenix')
Why not add a timezone setting to the config file?
+1
+1
another +1 for the Olson db rather than literal offsets or any other named time zones
Though I should also point out that timezones aren't the only problem. If I do a group by time(30m), I might want my groups to run from :15 to :45, so that the "middle" of the grouping is on the hour and half hour, when I expect my peak load due to user traffic patterns. And that has nothing to do with timezone - it's more "offset the grouping by a certain amount".
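For what it's worth, the bucket-offset half of this was later addressed independently of timezones: InfluxQL's GROUP BY time() accepts an optional second offset argument. A hedged example, reusing the cpu measurement from the opening post:

select mean(value) from cpu
where time >= now() - 1d
group by time(30m, 15m)

Here the 15m offset shifts every bucket boundary, so the groups run from :15 to :45 and :45 to :15 rather than :00 to :30.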
select mean(value) from cpu
where time >= today()
group by time(10m)
time_zone(+8)
This may be the best option.
I would use a configurable default timezone + query param to override it
I'm eagerly awaiting this feature.
Without this feature, I cannot use group by time(1d).
Making the timezone configurable (in influxdb.conf) might be easier.
Can this feature be added in the 0.9.2 release first?
+1
We are fetching sums and grouping by 1 day and can not get accurate stats because of this timezone issue.
+1 for setting default timezone in config file.
Currently, I'm using a quick workaround which consists of using a cron script or program to retrieve the average of the day (executed at 00:01 AM, for example) for the desired series from InfluxDB, using the correct start and end timestamps in the WHERE clause of the query.
After that, insert the data in a new series with the timestamp of the beginning of the day and you'll be able to retrieve the correct data in your queries.
I really want to know which version will include this feature. Is there a PR proposed already?
There is no PR for this yet, and the work is currently unscheduled.
+1
I'm surprised such a simple and basic feature is unscheduled :( So far grouping by 1d is useless for me :(
...or at least a timezone setting in a config file. In my case it's always the same timezone.
@hotsnow77 we will happily accept a PR for this feature, but I don't think you'll find it simple. It is definitely a highly desired addition but likely won't be tackled until clustering and other stability/availability concerns are considered mature.
This is pointless. If the devs are busy with other features that they consider more important why bother complaining? Either get together some golang developers who can send a patch or let them work on their priorities without pointless complaints that will only slow them down because they are nice and they'll find the time to answer politely.
We want this feature as bad as you so don't slow them down with useless comments.
PS: this is an open source project, "without warranty of any kind"
+1 for per-query timezone
Just bumped into this as well when trying to chart solar gains per day; they were all offset vs UTC instead of what I wanted.
Looking forward to getting native support for this in InfluxDB. I fixed it in Grafana for my use case :)
jnutzmann's comments about timezones and daylight saving time are spot on.
Just configuring the service to aggregate under a specific timezone is weak. The US & Canada fall into at least 6 different timezones, and if your app is global then obviously this just isn't sufficient if accuracy is important.
Also, a lot of people in this thread are referring to this affecting only group by day queries, but it also affects group by hour, as there are timezones like Newfoundland which are UTC−03:30.
My personal opinion is that support for this should only take timezone names, so that devs can't screw up daylight saving time. Dates are more complicated than most realize.
+1!
My suggestion is to use names (like in PHP). A lot of countries change their time in summer. For example, in Europe there are countries that are +2 in summer and +1 in winter.
select mean(value) from cpu
where time >= today()
group by time(1d)
time_zone("Europe/Madrid")
The hard part here (implementation wise) will probably be that timezones change (or something, I'm not familiar with the process).
moment-timezone for example is updated periodically (https://github.com/moment/moment-timezone/tree/develop/data/meta)
There seem to be 6+ releases just for this year: https://www.iana.org/time-zones
@Zolmeister that's actually pretty easy. Most distributions keep the tzdata in a separate package that stays up to date, so you just read in the files when you need them. Worst case, you have to restart Influx when they get updated. I'm not a Go developer, but https://codereview.appspot.com/153720043 seems to imply that Go as a language already picks up the tzdata packages as they are updated.
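As a rough illustration of "read in the files when you need them" in Go (a sketch with a hypothetical mustLoadZone helper; it assumes the host has a tzdata package installed):

import "time"

// mustLoadZone is a hypothetical helper. time.LoadLocation reads the host's
// tz database (e.g. /usr/share/zoneinfo on Linux, or the ZONEINFO path if set),
// so OS-level tzdata updates are picked up without rebuilding the binary.
func mustLoadZone(name string) *time.Location {
    loc, err := time.LoadLocation(name)
    if err != nil {
        return time.UTC // unknown zone name: fall back to UTC
    }
    return loc
}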
Also, a lot of people in this thread are referring to this affecting only group by day queries, but it also affects group by hour, as there are timezones like Newfoundland which are UTC−03:30.
And one other quirk is the fact that a day does not always start at midnight. In many production facilities which run 24 hours a day, reporting is associated with a "mill day" which changes when the morning shift comes to work. In the paper mill where I work, that is 6 am, so ALL daily reporting is associated with this "mill day" from 6 am to 6 am.
Another bit of complexity, tottote - and I agree, I've seen the requirement for the same and would love support for this. However this risks becoming so complex it doesn't get done, so I would like to point out too that I think one big area of complexity which might be avoided is support for timezones by name which have daylight savings. While I'm sure there are plenty of people who could +1 that they live in such an area and "need that feature", in reality, most organisations collecting any serious amount of time series data over geographically diverse areas have found that support for daylight savings time is unworkable. When clocks step forward you have a gap, but when they revert, you have duplicate data. In reality, most (look to big utility companies collecting metering data across timezones) avoid this by using a time offset that _doesn't_ change over the year (despite spanning 5 states and three timezones, the Australian East coast utility grid uses GMT+10 across the board). If I was to make a suggestion, it would be to shoot for support for fixed (non-DST) timezones first and then see how many people are left clamouring to see their data with annual gaps and overlaps (I suspect not that many).
If I was to make a suggestion, it would be to shoot for support for fixed (non-DST) timezones first and then see how many people are left clamouring to see their data with annual gaps and overlaps (I suspect not that many).
I don't think the main problem with DST is those gaps and overlaps on the two days per year when DST starts or ends. The main problem is the fact that you have one hour time offset in your aggregated data for a whole DST period. But that's just my point of view, others may have other preferences.
DST is a problem, in every aspect of life. If I ever get elected to King Of The World, my first task will be to criminalize DST. :-) Until then, we just have to deal with it. The fixed offset for handling "mill day" is pretty easy compared to the complexity of handling the DST.
Fortunately, most programming environments (even the standard C library) have working API for converting between local time and UTC, which can even handle the DST properly.
+1
Fortunately, most programming environments (even the standard C library) have working API for converting between local time and UTC, which can even handle the DST properly.
This.
If someone can point to a pure Go library that solves all of this I'm sure we would take a look.
I'm no Go person, but https://golang.org/pkg/time/ references timezones.
Continue saving datetimes in the database in UTC by convention (this might seem unnecessary but is important).
Then use a few lines of code to convert from UTC to any other timezone before doing the GROUP BY.
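The original four-line snippet isn't included above, but a minimal Go sketch of that UTC-to-local conversion (assuming timestamps are stored in UTC and the target zone is an IANA name; utcToLocal is a hypothetical helper) might look like this:

import "time"

// utcToLocal re-expresses a UTC timestamp in the given IANA zone so that a
// subsequent group-by-day works on local calendar days.
func utcToLocal(t time.Time, zone string) (time.Time, error) {
    loc, err := time.LoadLocation(zone) // e.g. "Europe/Madrid"
    if err != nil {
        return t, err
    }
    return t.In(loc), nil
}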
:+1:
This is really important. I thought I'd provide some sample use cases that are impossible to solve without timezone support.
Daily Counters
Lots of IoT systems have some sort of daily counters or summations. In this example, a solar PV system controller sums up the daily production of electricity, resetting at local midnight.
Being in California, UTC midnight becomes 16:00 local time. Because of this, simple calculations like production by day/week/month are impossible, as max(value) grouped by 1d is wrong:

Total Counters
In many cases IoT sensor systems have total summation counters, in this example a smart grid meter. Daily/weekly/monthly consumption or production can be calculated, but will never match up to numbers from other systems due to the same 16:00h day break.

Another example of daily counters
In this case, output from a rainfall sensor that reports daily rainfall and resets at midnight local time. This makes it impossible to calculate the sum of rainfall for the past week/month/year, as the 16:00h day break double-counts rainfall across two days in Influx.

Agree that this is super important. The current workaround is to just work with hourly data and handle all the timezone stuff outside of influxDB. Correct?
@savraj correct, all timezone calculations currently need to be handled outside of InfluxDB.
+1
+1
+1!
Another clarification: you can write a time to influx db that has an offset (for example: '2016-04-09T16:34:25.663409-04:00' -- the -04:00 is currently US/Eastern time) but whenever you read from influx db, you get UTC back. It's almost better that it is consistent like this -- as timezones get complex quickly.
@savraj You can write a time like '2016-04-09T16:34:25.663409-04:00' because it's a past time, but you cannot write a time like '2016-04-09T16:34:25.663409+08:00' (if the current time is 2016-04-09T16:34:25.663409), because it is a future time.
@thetcc actually it's the opposite.
Time(+0000) = 2016-04-09T16:34:25.663409
Time(+0400) = 2016-04-09T20:34:25.663409
Time(-0400) = 2016-04-09T12:34:25.663409
So if it's the same clock time in different zones:
Time(-0400) = 2016-04-09T16:34:25.663409
Time(+0000) = 2016-04-09T20:34:25.663409
This means that if the clock time is the same as the current time but you set the timezone to a negative offset from UTC, then you're working in future time.
Likewise:
Time(+0400) = 2016-04-09T16:34:25.663409
Time(+0000) = 2016-04-09T12:34:25.663409
A positive offset from UTC will be in the past (if the clock time is the same).
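The same relationship, spelled out with Go's time parser (a standalone illustration, not InfluxDB behaviour):

package main

import (
    "fmt"
    "time"
)

func main() {
    // The same wall-clock time written with different offsets names different instants.
    t1, _ := time.Parse(time.RFC3339Nano, "2016-04-09T16:34:25.663409-04:00")
    t2, _ := time.Parse(time.RFC3339Nano, "2016-04-09T16:34:25.663409+04:00")
    fmt.Println(t1.UTC()) // 2016-04-09 20:34:25.663409 +0000 UTC (later instant)
    fmt.Println(t2.UTC()) // 2016-04-09 12:34:25.663409 +0000 UTC (earlier instant)
}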
In 0.12, I cannot use this function.
Can I modify the source code to change the timezone?
@thetcc There's nothing in InfluxDB that prevents timestamps in the future, that's fully supported. They aren't returned by default when querying, however. The default upper bound on time for queries is now(), so if you write points in the future you must explicitly set the upper bound to a time further in the future. http://docs.influxdata.com/influxdb/v0.12/troubleshooting/frequently_encountered_issues/#querying-after-now
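For example, to see points written with future timestamps, the upper bound has to be widened explicitly (a sketch reusing the cpu measurement from earlier in the thread):

select * from cpu where time <= now() + 1d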
If anyone's interested, here's what I do in Go to save the time in whatever timezone you want. There are probably better ways to do this, so I'm open to ideas for improvement.
func timeInLocation(t time.Time, loc string) (time.Time, int, error) {
    l, err := time.LoadLocation(loc)
    if err != nil {
        return time.Time{}, 0, err
    }
    // Zone() reports the offset from UTC, in seconds, at this instant in the target location.
    _, offset := t.In(l).Zone()
    // Shift the timestamp by that offset so its UTC fields read as local wall-clock time.
    return t.Add(time.Duration(offset) * time.Second), offset, nil
}
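A hypothetical usage of that helper, shifting a UTC point into Pacific wall-clock time before bucketing (assumes the fmt, log, and time imports):

shifted, offset, err := timeInLocation(time.Now().UTC(), "America/Los_Angeles")
if err != nil {
    log.Fatal(err)
}
// offset is -28800 (PST) or -25200 (PDT) depending on the date.
fmt.Println(shifted, offset)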
Isn't the original idea of this topic (and its ancestor ticket) for the user to control what keywords such as GROUP BY DAY mean (as in, group data in such a way that DAY matches the timezone of the server etc. in question)?
I am fine submitting data as utc and it being stored inside influxdb as utc.
Best practice for saving times in a database is saving in UTC. Clients can later convert to the local TZ. As consense writes, the original topic addressed GROUP BY DAY queries. This should not be more difficult than something like:
select mean(value) from cpu group by time(1d) time_zone(CET)
And in the backend:
t := ...                          // time read from the DB, stored in UTC
tz, _ := time.LoadLocation("CET") // get the TZ identifier from the SELECT clause
t = t.In(tz)                      // convert the time before grouping
After this, do the normal GROUP BY.
This does seem to add a constant offset to the comparison so it does not handle summer/winter time changes in the dataset, right?
Yeah, not sure you can close out a timezone requirement with a time offset solution. Sorry guys. Only partly resolved.
https://github.com/influxdata/influxdb/pull/6504#issuecomment-216334289
@giko45 if you can explain a bit what you mean, we could add something more. What kind of output are you expecting? Times all get represented as UTC so the main concern is having the client interpreting the result. Even when you cross timezone boundaries due to going from something like PDT to PST, you can just do this:
SELECT value FROM cpu WHERE time > '2016-11-05T00:00:00Z-07:00' AND time <= '2016-11-07T00:00:00Z-08:00'
These times will work properly and only select data within those time periods. It's just the output will always be in UTC. If you have an idea for something that can't be done with the currently existing tools, can you open an issue describing your use case?
SELECT SUM(value) FROM energy WHERE time>'2016-02-05T00:00:00Z-07:00' AND time<'2016-07-05T00:00:00Z-06:00' GROUP BY time(1d);
The offset bucket you added now allows me to account for the fact that I want my buckets at 06:00Z, but it does _not_ account for the fact that in the specified interval for this query, one of the days has a bucket that is 23 hours, and that the UTC time of the boundaries between buckets will change from 07:00Z to 06:00Z on that day. This is _not_ just a presentation problem, because the bucket size needs to dynamically adjust based on when the DST transition is. Also note that in the query as written right now, there's no way to know _when_ the DST transition is. The offsets -0700 and -0600 don't tell you that - only the actual timezone ('America/Denver') has enough information to know when DST transitions are.
Also note that while the offset solution is enough to help half-hour timezones (like you mentioned - India) get by (ignoring DST problems), if you provide a way for a timezone to be provided, you shouldn't also need to specify an offset, since you can infer the offsets from the time zone (i.e. current offset % 60 will tell you if it's a 30 minute, or possibly even 15/45, or :04 -- I'm pretty sure one of those zones exist). In fact, I would expect something like "SELECT MEAN(value) FROM series GROUP BY time(1h) AT TIME ZONE 'Asia/Kolkata'" to automatically infer an offset of 30m from UTC times for the grouping.
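A sketch of how that sub-hour remainder could be derived from the zone itself rather than being supplied by hand (Go, with a hypothetical groupOffset helper):

import "time"

// groupOffset returns the zone's UTC offset modulo one hour at the given
// instant, e.g. 30m for Asia/Kolkata (+05:30) and 0 for whole-hour zones.
// Note: zones west of UTC (e.g. America/St_Johns at -03:30) yield a negative remainder.
func groupOffset(zone string, at time.Time) (time.Duration, error) {
    loc, err := time.LoadLocation(zone)
    if err != nil {
        return 0, err
    }
    _, secs := at.In(loc).Zone() // offset from UTC in seconds at that instant
    return (time.Duration(secs) * time.Second) % time.Hour, nil
}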
I've created #6541 and #6542 for tracking those two use cases. This implementation should work for the use case described in this ticket and will be a good jumping off point for implementing full timezone support.
Thank you for your feedback.
A syntax like #6541 or @ccutrer's:
SELECT mean(value) FROM cpu WHERE time > now() - 7d GROUP BY time(1d) TZ("America/New York")
would be my use case too.
However, implementation requires the following (a rough sketch follows the list):
1: Times in the database are in UTC.
2: Before the grouping, convert all times into the required TZ.
3: Do the grouping (since "time" is now a local time, the grouping will handle DST; yes, this implies one day per year with 23h and one with 25h, but that's OK).
4: The last step I am not totally sure about: should the result be converted back to UTC? In the case of continuous queries, yes - you don't want to save local times in your DB. In the case of client queries, probably yes too: only return UTC times to the client. This might seem a bit strange, since you will get daily aggregated values at an offset in UTC, but it's the client's responsibility to present this correctly.
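A rough Go sketch of steps 2-4 (a hypothetical helper, not actual InfluxDB code): convert the UTC timestamp into the target zone, truncate to the local calendar day, and report the bucket boundary back in UTC.

import "time"

// localDayBucketUTC maps a UTC timestamp to the start of its local calendar day
// in loc, returned as a UTC instant. Because truncation happens in local time,
// DST transitions naturally produce 23h and 25h buckets.
func localDayBucketUTC(t time.Time, loc *time.Location) time.Time {
    lt := t.In(loc)                                                         // step 2: convert to the required TZ
    dayStart := time.Date(lt.Year(), lt.Month(), lt.Day(), 0, 0, 0, 0, loc) // step 3: local midnight for that day
    return dayStart.UTC()                                                   // step 4: express the boundary in UTC
}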
+1
It would be nice if the time_zone option were available.
@mattbasta assuming 8h could be augmented with minute based offset, there's still the issue of DST which can't be dealt with like this :(
@mattbasta all good and well to hold that opinion, however there are ample use cases, and it's a fairly standard requirement for a database dealing with datetimes to be capable of dealing with the real world and its complexities. IMHO this is a long-outstanding issue that needs resolving by the Influx core team, and I know I am not alone. Until that happens, should we implement Influx to meet our tsdb needs, we will be limited to aggregating at a half-hour level in the database and then completing any aggregation in the application. This puts it at a disadvantage when compared with other options.
If one saves data in the database in UTC, then daily aggregates can simply be done by converting to the right TZ before aggregation; this will handle DST too.
Aggregation should be something like this:
select mean(value) from cpu group by time(1d,"CET")
And in the backend (I am not a Go programmer):
t := ...                          // time read from the DB, stored in UTC
tz, _ := time.LoadLocation("CET") // get the TZ identifier from the SQL SELECT clause
t = t.In(tz)                      // convert the time before grouping
...                               // after this, do the normal GROUP BY
Normal select queries will return data in UTC (as the data is saved in UTC), but clients can convert to the local TZ (which is good practice anyway, since clients might be in different TZs).
@jufemaiz Does any other tsdb handle that better ? Genuinely curious, haven't played with other ones much
@mattbasta PST and PDT are _not_ time zones. They are simply defined offsets (-8 and -7). Pacific Time is a time zone, and includes the information about when it changes offsets (DST), as well as nicknames for those offsets. And it's not hard for software to support it. There is a well maintained database of time zone information that software can use to know what the DST rules are for any given time zone in any given year (within practicality): https://en.wikipedia.org/wiki/Tz_database. Usually software will support the tzinfo data files, and rely on the operating system to keep them up to date.
I deleted my comment. I don't want to be involved in this discussion. Feel free to argue all you want.
Influx should store data with a UTC timestamp. Formatting timestamps should be a display detail.
The complete solution to timezone handling is to provide a timezone formatting mechanism. It should be possible to just use something like this with time: format(time,'Central Standard Time'). There can not be any solution based on using "server time" or "client time" etc. Queries simply need to be able to format time. Any query inferring time, such as group by and others, should be using the timezone of the locale that influx can detect at the server to shift time relative to UTC.
@greggwon you're correct except for the rather large use case of aggregation methods by calendar days (weeks, months, years etc), then InfluxDB needs to be able to manage timezone offsets. In this case it is not merely a display detail but a functional requirement :)
+1
+1
Aggregation becomes a "display detail", because you are now visualizing. The functional definition as shown in the sample query is not my point. My point is that there should be a common time reference that is automatically known for all data. Data should not float around in varied Timezones and then have to be coerced to another in random fashion. Instead, there should be a single base timezone reference and then "visualization" can specify what timezone is important.
In the sample query, the 'time' comparison to "today()" also implies a visualization (extraction of the data for use) and so today() should also be offset by a timezone offset. today() needs the parameter, not a new function which will randomly interject problems to all other 'time' values in an expression. Each use of a time function, should by default use the server timezone offset but allow a parameter which specifies the timezone offset applicable.
Yes, that becomes more places that the offset must be specified, but it allows the developer/application the explicit control needed to make sure that we don't suddenly have offset math strewn about the query with various +21600000, and -21600000 etc.
I will be locking this issue. Litigating the result of a closed issue that was opened two years ago and resolved six months ago is unfair to the people subscribed to the issue and I think it is unfair to forcibly involve others in the discussion.
With that said, I am happy to share the rationale for why the issue was resolved in the way it was resolved. But, the proper location for that question is on our community forums and not the issue tracker. @greggwon if you want to continue discussion, please create an account on that website and post a new topic. I will keep an eye out for the next day to see if there is any discussion about time zones there.
If you have issues with the way it was resolved that pertain to a valid use case of time zone math, please open an issue. I do not want to fail to address valid use cases since I want the database to be as useful to as many people as possible. If there are problems with the existing implementation, I want to address them so that everyone benefits from a better database.
One clarification that might be needed for most people (since I have seen this confusion before), the TZ parameter _does not_ affect any time values in the actual query. It affects time boundaries. The time 2000-01-01T00:00:00Z already has a time listed. It's UTC and the time zone parameter cannot and will not affect that.
Time zones are a disabled feature by default. If you don't include the TZ() parameter, no time zone math will be done. And the previous group by offset feature introduced in 1.1 or 1.2 is still there for other use cases.
I hope that clarification helps and look forward to your continued engagement with the InfluxData community.
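For reference, a hedged example of that parameter as it shipped in InfluxQL 1.x (to the best of my understanding the clause is written as tz() after GROUP BY, with a single-quoted IANA zone name):

SELECT mean("value") FROM "cpu" WHERE time >= '2017-01-01T00:00:00Z' AND time < '2017-02-01T00:00:00Z' GROUP BY time(1d) tz('America/Chicago')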