When adding or subtracting a Span or MonthSpan to a Time object that isn't UTC, the result is off by an hour. It appears that it's adding a specific number of seconds instead of days, weeks, months or years, as you would expect.
time = Time.utc(1997, 10, 21, 9, 0, 0)
time + 1.weeks #=> 1997-10-28 09:00:00.0 UTC
time = Time.new(1997, 10, 21, 9, 0, 0)
time + 1.weeks #=> 1997-10-28 08:00:00.0 -05:00 Local
time = Time.new(1997, 10, 25, 9, 0, 0, location: Time::Location.load("America/Toronto"))
time + 1.days #=> 1997-10-26 08:00:00.0 -05:00 Local
I won't comment on whether this is good or not (it does look good to me, though), but yes, time math goes through seconds, always.
Actually, for months it doesn't go through seconds. So:
time = Time.new(1997, 10, 25, 9, 0, 0, location: Time::Location.load("America/Toronto"))
p time # => 1997-10-25 09:00:00.0 -04:00 America/Toronto
p time + 1.month # => 1997-11-25 09:00:00.0 -05:00 America/Toronto
So yes, one of the behaviors should _maybe_ be fixed, though I can imagine it being really hard to implement, specially when right now you can combine units (except months).
Think about implementation though. When you're thinking of 1 week from today, you expect that to be the same hour, not an hour earlier because you set your clocks forward in the meantime.
The workaround I implemented in my project is only 7 lines long. I used a comparison of DST on the receiver and the argument.
I don't know. Let's say I say "in one hour" and dst starts having effect. Then your 7 lines diff will change the hour again. Then in real life, an hour wouldn't have passed, but maybe zero or two.
Maybe with days and weeks it's different, because they are usually tied to a date.
How does Ruby's activrsupport behave regarding this?
No, my workaround doesn't change the hour if the span is 1.hour.
I've been trying to figure out how Ruby's Active Support does this, but I'm not sure yet.
Active Support does exactly what one would expect. A week after 9am on October 21st is 9am on October 28th.
HOW they do it is more interesting. They actually construct and return a new time object constructed from the values of the old object, substituting day and week values when new values are provided.
Yes, this problem has already been mentioned before (see #4556), but I don't think there has been a dedicated issue yet.
Time libraries usually solve this by having two different concepts for time span, one accurate, based on elapsed time (like Time::Span
) and the other nominal based on calendrical units (like Time::MonthSpan
but with higher granularity).
seconds, nanoseconds
and days, months, years
properties.Duration
and Interval
. The latter is essentially a Duration anchored at a specific start instance and can such be interpreted as calendrical unit.Some however only have one, time-based type (for example Rusts chrono Duration
, Golang Duration
). This obviously makes it hard to use calendrical calculations.
I guess it could also be an option to have one type with individual fields for (nano)seconds, hours, days, months, years and it can be used both ways depending on how you populate the fields (either 1 day
or 24 hours
). But that's probably going to be more confusing than anything else because you never know if an instance is meant to be time-based or calendrical.
They actually construct and return a new time object constructed from the values of the old object, substituting day and week values when new values are provided.
That's actually the most sane way to do this. Because it is a calendrical concept, it needs to go through the calendrical interface. That's much easier than figuring out the exact amount clock shifts to calculate an offset of elapsed time.
I would suggest to leave Time::Span
as is (but maybe rename it to Time::Duration
to be more concise) and add an additional Time::Period
type, probably with properties nanoseconds
, hours
, days
, months
, years
. The details need to be discussed, though.
Well, it's an important discussion to have, to be certain. I suspect that many will get tripped up by this before it's fixed. I'm more than happy to help in any way I can.
I don't think Time::Period
is the best naming, it doesn't read very well as English. Perhaps Time::DaySpan
to be analogous to the MonthSpan
that already works in the way you'd expect it.
As asterite pointed out, when people are talking about hours, they're going to mean the specific hours. Same goes for smaller values. This behaviour only really comes into play when people are talking about intervals of days or weeks.
The problem comes with, for example, expiration dates. You can say something expires in 3.days, for example a cookie or a session, and with that you mean 72 hours. The problem is that days
is ambiguous.
Maybe we should remove those convenience methods from the standard library, just like in Go. Alternatively we can document it always refers to computed seconds.
The thing about that is, 3.days should be treated differently than 72.hours. If you want to describe a span of 72 hours specifically, you're going to say 72.hours, not 3.days. On the other hand, if you want an event to repeat every 3 days, you want it happening at the same time every day. Your work schedule isn't going to shift by an hour just because of daylight savings time.
I vote against removing the convenience methods. I think they're brilliant, and very useful.
How about, expanding on straight-shoota's idea, we leave span the way it is, and add another struct for date spans. I think DateSpan or Interval would be good names for it. While the Span still operates on seconds and milliseconds, this new object would hold 3 values: seconds, days and months. That way, addition and subtraction can work in the ActiveSupport way, with days being added evenly, regardless of DST, and months can be added without worrying about leap years.
The behaviour and existence of the convenience methods is actually a subordinate issue.
Foremost we need to discuss how to express the different semantics in terms of data structures.
Yeah, making days and week return a different struct, similar to MonthSpan, sounds good. Or even representing all time spans as a combination of seconds, days and months, as suggested (I don't mind it will be a bit larger, it's not like time spans are stored and used as much as time instances).
it's not like time spans are stored and used as much as time instances
That depends. If you're doing a lot of time measurements, it might be quite important. In my opinion it doesn't make much sense to merge both concepts into one type. The size is one issue, but more importantly, there should be a clear distinction between a time duration (i.e. what's returned by Time.measure
) and an abstract time period.
@straight-shoota How would you distinguish them when constructing them?
Time::Duration.new(days: 5)
vs Time::Period.new(days: 5)
. Let's just leave the convenience methods out for now and figure that out later.
What's wrong with treating days, weeks and months as such, instead of a fixed amount of seconds?
Not sure I understand you question... there's nothing wrong with that, I actually propose adding a type for treating them individually.
I mean, 1.day would mean period, 1.hour would mean duration. Those are the constructors. We just need to fix the semantic.
I know it's just semantic, but I really don't like the name Period. Elixer's Interval name sounds far better to me.
Here are my thoughts on how it would work:
span = Time::Span.new(72, 0, 0)
interval = Time::Interval.new(0, 0, 3, 0, 0, 0) #=> 3 days
time + span #=> 2018-11-06 08:00:00.0 -05:00 Local
time + interval #=> 2018-11-06 09:00:00.0 -05:00 Local
Having Period
, Duration
, Interval
, or any combination of that is bad, because the names are interchangeable and almost mean the same, so it'll be confusing as hell.
I propose what you, @HCLarsen, proposed: a Time::Span
should consist of:
seconds
, milliseconds
, microseconds
nanoseconds
, minutes
and hours
days
and weeks
months
and years
Then there's no confusion at all. The only confusion might arise if someone does 3.days
expecting that to be exactly 72 hours. For that case, we should document that one must use 72.hours
and not 3.days
, and that's it. I prefer that small confusion that can be explained in the docs over having two or more new types, which behave slightly different from one another, and where it's not clear what name belongs to which behavior.
a Time::Span should consist of:
Actually, my idea was that the Interval class would consist of that, and the Span would remain as is, just seconds and milliseconds. Although, I recognize that this idea has validity as well.
The problem with having two types is that one can no longer do: 1.day
. Is it one type or the other? So we must use the more verbose Time::Interval.new(...)
or Time::Period.new(...)
. Or we could have 1.day(period: true)
or something like that, but it's still more complex than just having 1.day
as the only possibility with only one possible meaning.
Maybe those 2.days
convenience methods are not that important and we could drop them, I don't know. I like them :-)
Crystal seems to handle that just fine now with the Span and MonthSpan types. 1.months
returns a MonthSpan, and 31.days
returns a Span. In my new idea, it would work like this:
72.hours #=> 3.00:00:00 Time::Span
3.days #=> 3.00:00:00 Time::Interval
Technically, the same time period, but different methods return different types.
Sure, that could work.
The only inconvenience is that right now you can do:
# You can add spans
span = 1.hour + 5.seconds
# However, you can't add a span with a month span
span = 1.month + 3.hours # => doesn't compile
With just a single type holding all this information, it could work. And then if you have anything in the "days" or "months" fields, you can treat "hours" as fixed hours, so "1.day + 3.hours" after 9am would always give the next day at 12am, regardless of DST. Or maybe these fields shouldn't be combined to avoid this new confusion.
You could still do that if you overloaded the + method.
struct Time::Interval
def +(span : Time::Span) : Time::Interval
@seconds + span.total_seconds.to_i
end
end
Please let's not worry about the utility constructors right now. This is one step ahead. They're certainly nice and I'd like to find a way to keep them with improved semantics.
@asterite Yes, having two different types with similar names can be confusing. But maybe we can find names that clearly express purpose and avoid that.
Having only one type is very much confusing, because then there's no type safety. If you have two instances, there is no way to tell by the type if they're meant to represent elapsed time or calendrical units. I'd expect this could be a source of even more confusion when applying mathematical operations.
In many applications, such as benchmarking, time measurements, timeouts etc. you really just need to handle elapsed time. If the same type could also have days, month and year fields, all these use cases would have to make sure to properly handle these as well, or raise an error (but that could only happen at runtime).
I haven't found a single instance of a multi-purpose time span type in any time library. I have seen they either have two separate types or only support elapsed-time.
What about activesupport? According to OP it seems io be doing what he wants, and you only create them with the utility methods.
Can we just look at what go provides and leave the rest of the time complexity to a shard?
I've looked at what go provides and Go just has time.Add(time.Hours * 24)
or time.AddDate(0, 0, 1)
to add a day. The former returns 24 hours and the latter returns 25.
We don't need a whole new datatype for this, just a new method for adding days, weeks, and years. For the record though, go doesn't provide a time.Weeks
or a time.Months
, so it effectively segments the API between "hours" and shorter being a length in seconds and "days" and fewer being calendar-based.
That being said it'd be a lot more crystally to have a new datatype, for a TimeSpan
and a CalendarSpan
, the former holding seconds and the latter holding a days, months, and years integer offset.
We already have a MonthSpan
, we just need to rename it and add year and day offsets.
@asterite Yeah, fair enough, ActiveSupport::Duration
could be considered as combining both purposes.
Yet the plain Ruby Time
class still expresses time durations as simple numeric values. Measuring time by subtracting instances does not return Duration
:
Time.now - Time.now # => -1.07e-05
But, obviously ActiveSupport::Duration
can express a duration based on elapsed time as well as calendrical. That makes it different from Duration
/Period
in the Java Time API for example, where the former is strictly time based (stores only (nanos)seconds) and the latter strictly date-based (stores days, months and years).
Having a data type combine both concepts can effectively implement a duration as specified in ISO 8601 (5.5.4.2 Representation of time-interval by duration only).
@RX14 It would certainly be an option to only include the current duration-based Time::Span
in stdlib. And leave anything beyond that for a shard. That's essentially the status quo. And there is noting wrong with it, per se.
Maybe add something like Go's time.AddDate
or ActiveSupport's Time#advance
which would be a place for a shard to hook in.
Then we should probably remove the convenience methods to let them be defined by a shard as well.
On the other hand, I'd really like to extend Crystal's Time API to support calendrical time spans. It's nice to have this included by default as it is very commonly used for time-based calculations in all kinds of applications. Having a solid implementation in stdlib would be great as people can just just use it and can have the nice convenience methods like 5.days
readily available.
I do like the idea of having two. Leaving Time::Span the way it is, for nanosecond-accurate passages of time, whereas the new struct would handle calendar date passages of time.
I've been working on a rough draft of my idea, so let me know what you think:
Just to know what we're talking about, I compiled a list of time units supported by the Time API and their relations:
1 nanosecond
: base unit for all time types, but essentially just a SI prefix1 second
= 1_000_000_000 nanoseconds
: the conceptual (and mostly used) base unit1 minute
= 60 seconds
: This is generally true (leap seconds are smeared). A zone-offset difference with non-zero seconds could theoretically change this relation but I don't think there is any real significance.1 hour
= 60 minutes
: Generally true except for non-zero minutes in a zone-offset difference. This is still rare but actually happens. For example when Venezuela switched from -4:30 to -4:00 on 2016-05-01, the hour between 2:00 and 3:00 lasted only 30 minutes (and 9 year before that, it was the other way around).1 day
= 24 hours
: Regularly differs because of daylight savings time setting clocks back and forth an hour (usually).1 week
= 7 days
: Generally true. Alternations are very rare but not without precedent. The week from 2011-12-25 to 2012-01-01 for example lasted only 6 days in Samoa, because it skipped 2011-12-28 due to changing time zone from -11:00 to +13:00.1 month
: no general relation to smaller units whatsoever (typically 28/30/31 days)1 year
= 12 months
Without relying on general-but-not-necessarily-every-time relations this boils down to a few base units which can be used to represent others as well.
With these, you can describe calendarical time spans and don't lose accuracy because of relying on implicit semantics:
nanoseconds
(also representing seconds
, minutes
)hours
days
weeks
(maybe)months
(also representing years
)A different issue are fractions of a unit. Should they be supported at all? ActiveRecord::Duration
for example does.
And if yes, how are they to be interpreted. For example, what does 1.5 days
or 0.3 months
mean?
A few thoughts on this.
First, if someone is looking at calendrical time spans, will they be concerned with nanoseconds? It seems to me that an application where someone is looking at that much accuracy in the time that's passed isn't the same application where you're concerned with calendar accuracy. That's why I opted for seconds
instead of nanoseconds
.
I'm very hesitant to add weeks as a stored value. This would get VERY messy. Every other value, even the ones not directly stored, fit together well. Weeks are the oddball that are 7 days, but don't fit into months evenly. That's likely the reason why the ISO8601 duration value allows for either date-time or week representation, but not both together.
The hours thing seems like an extreme edge case. Should we add another stored value for this kind of extreme edge case?
First, if someone is looking at calendrical time spans, will they be concerned with nanoseconds?
Probably not. But I see no reason to set an artificial limit. Bot units are just a multiplier of 1_000_000_000
apart. In some cases, higher resolution than second might actually be necessary and then you should go the entire way instead of going to milliseconds. Plus, it seems pretty logical to use nanoseconds
as base unit in all time related types.
The only reason that might speak against it is that it requires an 8 byte value to express seconds as nanoseconds. But the API for accessing seconds would still be Int32 probably.
I wouldn't say it is definitely required, but up for discussion. I just wouldn't simply discard nanoseconds as "nobody needs that".
I agree that having a duration type that stores different units of time can be a bit confusing because the order of application can change the output.
But that is true for any unit, not just weeks. I don't see any reason why weeks would be different in any form from the other units. And it is very common to express time durations in numbers of weeks. But of course, it is an absolutely rare edge case that a week does not equal 7 days. Still, I think a somewhat respectable implementation should cover that.
The same goes for hours. It is a part of the typical calendrical representation of a time instance and due to the influence of time zone offsets, the elapsed time between to instances an hour apart is not intrinsically equal to 60 seconds. Such a case is pretty rare, but a real thing, so I think it should be covered.
And the thing with time zone changes is, that they can come pretty fast with short notice. Countries can pretty much do what they want. There might be a relevant change coming up sooner than anyone might think (and faster than anyone could patch) and change to some weird offset. Who knows... politics is complicated.
ActiveRecord::Duration
btw. supports years, months, weeks, days, hours, minutes, seconds. This is perhaps actually the best thing (plus maybe nanoseconds) because it covers all relevant time units. Even if months and years are strictly proportional related (at least I can't think of any exception) and the potential incongruity between minutes and seconds doesn't necessarily need to bother. This form of completeness certainly has something to it. I think that's what I would prefer because it's most explicit.
I agree that having a duration type that stores different units of time can be a bit confusing because the order of application can change the output.
I never said that. All along I've said that it's necessary to store different units of time, and my example makes use of that, storing seconds, days and months explicitly.
What I'm saying is that storing weeks is a bad idea, because it doesn't fit evenly into the scale. Every other unit of time fits evenly into the next larger unit, without any remainder. Weeks only fit evenly into February. The rest of the months, it's a partial week. That will make a complete mess of things. That's why I'm extremely loathe to include weeks as a stored unit.
Isn't a week just 7 days? Or what's the problem with seeing it like that?
Technically, a week is the next time the same day of the week arrives. The mass majority of the time, that's 7 days, although @straight-shoota did demonstrate an example where those two weren't equal.
That being said, I'm far more in favour of looking at weeks as just 7 days.
@asterite see https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412666986
example:
samoa = Time::Location.load("Pacific/Apia")
a = Time.new(2011, 12, 25, 0, 0, 0, location: samoa)
b = Time.new(2012, 1, 1, 0, 0, 0, location: samoa)
a.day_of_week # => Sunday
b.day_of_week # => Sunday
(b - a).days # => 6
@HCLarsen
That will make a complete mess of things.
How so? I don't see why this would cause any more trouble than any other unit.
ActiveRecord::Duration
works with weeks.
How so? I don't see why this would cause any more trouble than any other unit.
What I said in the very same paragraph that you quoted:
The rest of the months, it's a partial week.
1 year = 12 months
1 day = 24 hours
1 hour = 60 minutes
1 minute = 60 seconds
1 month = 28, 29, 30 or 31 days
All solid integers. Meanwhile, month to weeks...
1 month = 4.428571428571429 weeks
Yeah, I get that the ratio between month and week is fractional. But why do you think this is a problem?
Besides, these integer multipliers only apply as long as there is no change in the time zone offset.
For example, at every clock change for daylight savings, the elapsed time of a month is actually 31.04 or 30.96 days (assuming a switch of one hour back and forth in October and March).
The code example I showed you don't use number of days to calculate the passage of a month.
There's no initializer for Time with a weeks
parameter, so you can't directly add a weeks value the way you do with all the other values.
Further, how do you deal with overflow? When adding anything else, you can "carry" it over to the next largest unit of measure. Because the relationship between weeks and months is fractional, you can't add it to an integer without losing information.
We shouldn't bother with weekspans, just have a Time::CalendarSpan
storing integer years months days and that's it. If you want to add calendar days and hours at the same time - and thats rare - just add twice: Time.now + 5.days + 5.hours
. Keep it simple then wait for someone to come up with a usecase for more. We shouldn't attempt to cover 100% of the Time
complexity inside the stdlib, but we already have prescedent with MonthSpan
so this is a fine, small, addition.
@RX14 that's not a bad idea. Although the ISO8601 spec does allow years, months, days, hours, minutes and seconds, it would be easy enough to just do it as suggested above.
@straight-shoota one thing to note about ActiveSupport::Duration. It does store a weeks value, but when performing addition and subtraction, it converts the weeks to 7 days, and adds that to the days value. It seems that it wouldn't compensate for that Samoa edge case that you mentioned.
@RX14 Yeah, that's exactly like Duration
and Period
in the Java Date Time API.
@HCLarsen Weeks are not an integral part of the regular calendar format, but calendar weeks are still common enough in business applications. Right now, Time
doesn't support that at all, but it should be added at some point.
The thing is, weeks will work differently than anything else. Every other time unit corresponds directly to one of the arguments for Time.new. The only way to calculate a proper week is to look for the next time the same day of the week occurs.
The only way to calculate a proper week is to look for the next time the same day of the week occurs.
A week apart is just the difference equivalent to seven days in UTC. The length of a week and any other time unit is fixed to the standard length in UTC. It's only ever the representation in a non-UTC timezone that can lead to deviations.
What do you think about this proposal for going forward:
Time::DateSpan
(name not settled) to represent a calendrical duration with fields days, month, years
.Int#days, Int#weeks, Int#months, #Int#years
to return instances of DateSpan
.Time#+
and Time#-
to accept DateSpan
.DateSpan
(Time#-(Time)
should still return Time#Span
). Maybe DateSpan#between
would be a?Time#MonthSpan
.Time#add_span
similar to Timex.shift/2
and Time#advance
. In the process, I'd also favour to rename this method, maybe to Time#shift
.@RX14 I propose DateSpan
over CalendarSpan
because hours, minutes, seconds are also calendar units but not expressed in this type.
A week apart is just the difference equivalent to seven days in UTC. The length of a week and any other time unit is fixed to the standard length in UTC. It's only ever the representation in a non-UTC timezone that can lead to deviations.
So if you were going to code that, could you take the time zone time, convert to UFC, add 7 days, then convert back to the original time zone?
@straight-shoota that's not too far from what I've already written to test my ideas out. It shouldn't take much to change it to those proposed specs. You guys are more than welcome to copy as much of my code as you want.
So if you were going to code that, could you take the time zone time, convert to UFC, add 7 days, then convert back to the original time zone?
Essentially yes. It just requires some extra effort because you need to convert the wall clock reading (or rather just the date) to and from UTC, and there is currently no method for that. Time#in
converts the instance, but for this calculation we want the exact same wall readings just in UTC and back. That's why we need to take away the offset and re-apply the new one after adding seven days:
samoa = Time::Location.load("Pacific/Apia")
start = Time.new(2011, 12, 25, 0, 0, 0, location: samoa)
start_utc = (start + start.offset.seconds).in(Time::Location::UTC)
week_utc = start_utc + 7.days
week_samoa = (week_utc - samoa.lookup(week_utc).offset.seconds).in(samoa)
week_samoa - start # => 6.days
This is conceptually the same as going through the calendrical constructor which probably illustrates the logic better:
start_utc = Time.utc(start.year, start.month, start.day, 0, 0, 0)
week_utc = start_utc + 7.days
week_samoa = Time.new week_utc.year, week_utc.month, week_utc.day, start.hour, start.minute, start.second, location: samoa
Oh. Different than what I was thinking.
Add a method to subtract two dates from each other resulting in a DateSpan
Is this really required?
Add more arguments to Time#add_span similar to Timex.shift/2 and Time#advance.
Why does this method exist?
Apart from those details, this looks fine to me.
I updated my rough draft. While it needs some refactoring, is this the gist of what you're looking for?
Add a method to subtract two dates from each other resulting in a DateSpan
Is this really required?
I'd say so, yes. There should be a means to calculate the conceptual time span (in days, months, years) between to dates. We can probably leave it out of the initial implementation, though, and focus on the details afterwards.
Add more arguments to Time#add_span similar to Timex.shift/2 and Time#advance.
Why does this method exist?
I think it provides a efficient interface to add (or subtract) some amount without having to go through Time::Span
.
@HCLarsen Yes, that looks good! Care to put that in a PR?
Sure, I can do that.
One question though. I did all my testing using the minitest.cr gem. For the PR though, would you like me to write some specs using the built in spec framework?
Also, have we all agreed with DateSpan as the name? I'll agree that it works better than Interval, and it's more succinct than CalendarSpan, although the latter is more descriptive.
@HCLarsen For the standard library your only option is spec
, you can't use minitest.cr
And you want full test coverage, right?
We can probably leave it out of the initial implementation, though, and focus on the details afterwards.
On second thought, this should really be implemented first so that it can be used as implementation for Time#+(Time::DateSpan)
(and Time#-(Time::DateSpan)
).
Ok, I'm a bit confused about this newest development. Are we now saying that Time#+(Time::DateSpan)
and Time#-(Time::DateSpan)
should implement Time:add_span
? To be honest, I don't think that'd work to well. I see the logic in add_span with the Span and MonthSpan objects, but with DateSpan, the addition and subtraction methods get a bit more complicated, due to the calculations of overflow and the like. Just take a look at my +/- methods:
https://github.com/HCLarsen/time_ext/blob/master/src/time_ext/time.cr
The other question is, should Time - Time return a DateSpan, or a Span?
I think it is actually quite simple: Time#add_span
(or better Time#shift
) should be the only place where actual time transformation is applied. That's actually how it is now, and because of the additional complexity, I don't see any reason to change that. Time#+
and Time#-
should just delegate to that method with appropriate arguments, depending on whether it is called with Time::Span
or DateSpan
. These are just data formats representing some kind of time difference which can be applied to a time instance.
I'll publish some code shortly.
should Time - Time return a DateSpan, or a Span?
It should return Time::Span
. That's probably what this is most likely to be used for, to calculate the difference in elapsed time between two instances (like "this happend ... ago").
But I think there should also be a method to generate a DateSpan
from the difference between to Time
instances. I don't know a great name for such a method.
We could consider to just use a constructor DateSpan.between(Time, Time)
. But I think this is not optimal because in contrast to #-
it doesn't clearly communicate which one is interpreted as start point.
What about Time#between(Time)
or Time#from(Time)
?
The datespan between would neccessrily loose precision, whereas timespan would not. So it definitely should be a very specifically named method
How can Time - Time
return a DateSpan? What does this return?
2018-08-16 - 2018-07-16
Does it return a DateSpan of 1 month? Or 29 days?
For me it makes very little sense to return a DateSpan
, mostly because the difference would be ambiguous. But returning an absolute number of seconds (just Time::Span
) always makes sense (the current behavior).
@asterite exactly.
I think it is fine to leave it at that for now. We can discuss how to calculate the difference between two time instances in calendrical units in a follow-up issue. This could be a rather complex feature.
I want to run something by you guys, just to make sure we're all in agreement here. It relates to dealing with the difference in days of the month. 1 month after October 31st, is that November 30th, or December 1st? I think it's the former, and that's how I've coded these methods so far.
Currently, I'm thinking that the months are added (or subtracted), then any compensation for a shorter month is made, and then the days are added (or subtracted). Are we in agreement on this?
This is actually the current behaviour (Time.utc(2018, 8, 31) + 1.month # => 2018-09-30
) and it should stay like this. I think it makes more sense than overflowing a random 1 or 3 days into the next month. Most time libraries handle it like that.
Good. Glad we're in agreement with that. I'll get to working on a PR, although it may be a couple of days before I can finish it.
From go's stdlib:
AddDate normalizes its result in the same way that Date does, so, for example, adding one month to October 31 yields December 1, the normalized form for November 31.
I said most, not all. Timex, Java Date Time API, Swift, Rails implementations all don't overflow to the next month.
Going in line with the way the current + and - methods work for the Time
class, I refactored mine into add_day and add_month methods. The years are taken care of in the add_month method, but I'm wondering if I should extract that further into add_years, or if that'd be redundant.
https://github.com/HCLarsen/time_ext/blob/master/src/time_ext/time.cr
Having two methods for adding months and days separately is not very performant because the transformations between timestamp and calendrical representation and back need to be calculated twice.
I have an implementation of such a combined method ready and opened a PR at #6598.
It may just be my sleep deprived brain, but it looks to me like there's a couple of things missing.
Also, are you just converting it to UTC, doing the additions, and then converting back to the time zone, in order to avoid any and all time zone complications such as DST and time zone changes?
Yes, calculating in UTC means we can simply work with second-based durations and don't worry if an hour is actually an hour in elapsed time. See https://github.com/crystal-lang/crystal/issues/6522#issuecomment-413462971
day
is always in range 1..31
as result of year_month_day_day_year
.month
is the remainder of dividing by 12
, so it is always in range 0..11
.month is the remainder of dividing by 12, so it is always in range 0..11
If that's true, then is this code not redundant?
if month < 1
month = 12 + month
year -= 1
end
Oh, that was a bit imprecise. The value range of months.remainder(12)
is -11..11
because negative months have a negative remainder. That's why month < 1
is necessary.
Ok. I think I get it now. That's not a concern for negative days because of the line seconds += days * SECONDS_PER_DAY
, whereas months are added using the Time#absolute_days method.
Once that PR is merged, I'll create my PR using the #shift method to implement the DateSpan struct, and the + and - methods that use the #shift method.
@HCLarsen #6572 has been merged, you're ready to go :+1:
Question that I should have thought of earlier. Should I delete the MonthSpan object, since that'll be basically obsoleted by the DateSpan, or leave both?
MonthSpan
is obsolete. Most people probably just used the convenience methods Int#months
etc. anyway, so it's not even going to break much.
Just discovered that no tests were written for the MonthSpan. Certainly makes this easier.
@HCLarsen It's tested indirectly through 1.months
, etc. But I don't expect anyone in the world to have used the MonthSpan
name.
That'sold statement, @asterite: https://github.com/search?l=Crystal&q=MonthSpan&type=Code
But, yeah, it's just a handfull and most of the uses don't seem to be very serious.
I think we had some miscommunication here. It wasn't #6572 that I was waiting for, it was #6598. I need the Time#shift method to implement the Time#+(span : DateSpan) and Time#-(span : DateSpan) methods.
Yeah, sorry I confused those two.
But you can actually already start developing on top of my branch.
Really? Does that mean that I would have to make a fork from your fork?
You can just checkout the pr branch from the crystal-lang repo at pull/6598/head
. See https://help.github.com/articles/checking-out-pull-requests-locally/
Sorry I haven't done anything with this in a while. I've been swamped with other projects, but I should be able to get back to this now.
DST time is being phasing out in Europe after a sucessful vote at the parliament.
Some states will still using it in the world, but IMHO dealing with DST can be delegated to an external library.
Good joke.
for the record, this "vote at the parliament" @j8r talks about were public consultations really.
@Sija yes that's not really the right word, there was a resolution too before.
If we remove Europe from the map https://en.wikipedia.org/wiki/Daylight_saving_time, remains USA and Canada as major countries.
We could see some actions starting at 2020.
You love it, personally I really hope this mess will be a thing of the past - I won't invest on it.
Yes, Europe voted to move out of summer/winter time changes (finally). By October 2019 we should have gotten rid of it, depending on what time each country decides to keep (summer or winter).
But, that doesn't change anything for programmers: countries are still using DST, it will have been in use for 1976 - 2019 in France (for example), and any date computation that is within this range needs to be correct around DST changes, whatever if it becomes a "thing of the past" :(
Now that #6598 has been merged, we can proceed with this. Following the proposal in https://github.com/crystal-lang/crystal/issues/6522#issuecomment-413344502, the next step would be to replace Time::MonthSpan
with a more broader Time::DateSpan
type with fields days
, months
and years
(although years can be stored internally as multiples of months
), similar to https://github.com/HCLarsen/time_ext/blob/master/src/time_ext/date_span.cr
@HCLarsen Are you still up for this?
/cc @jgaskins @anykeyh WDYT?
Wait, are we going to have multiple types to represent a time/date span? Can't we just have a single one?
My two cents:
After what I read, I think the proposal is way better than the current implementation. However, I'm confused about the reason why we should keep two different type of structure (TimeSpan and DateSpan) instead of combining them.
The argument of the size of the structure seems for me non-legit here; basically, a structure can be stored in two Q-word (16 bytes):
struct TimeSpan
@months : Int32
@days : Int32
@microseconds: Int64
end
I don't think it would impact anything in terms of performance or memory. Moreover, with the proposal sent above (aka keep two structures), we face some problems in term of design:
Date
object, and prefer to really on more generalist Time
object. I do love the concept, and I think this approach must be kept for Span
too.Imagine I want to recreate a very simple cron
. Basically I want to relay on Span to store my crontab values. Imagine I want to say "Every last day of the month at 12 noon"
My code would be then:
span = 1.month - 1.day + 12.hours
time = Time.now.at_beginning_of_month
time + span
In the case of the two structures, I will need to store both span in a custom structure. It really feel not as slick as I would expect from Crystal.
I really think we should stick to a structure already proofed in different software. Only one struct, three counters :smile: . Or there's the ActiveSupport approach: Array of transforms. Not really performant but Time::Span
seems to be mostly used in business-level code.
I understood that we were agreed on having two different types since August. The reason is not just the size, but also semantics. See https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412469608 for details.
@anykeyh Your size draft doesn't hold. It is not possible to convert between the individual fields, thus each one needs to be able to store the full range of values. Microsecond precision is also not enough, we need nanoseconds. That's at least 96 bits (Int64 + Int32) for nanoseconds, like the current Time::Span
implementation.
And once we start with combining date and time fields into a single data type, we should also support hours, minutes maybe weeks as individual fields. As detailed in https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412666986, these are not necessarily simple multiples of smaller units.
I don't see why days and months would be like apples and oranges. Conceptually, that's the same thing, just different granularity. But it's semantically different from a elapsed time measurement like Time::Span
.
I agree that Time::DateSpan
by itself is not suitable for storing a full date and time span as described in the ISO format and used in databases etc. However, it makes it very simple to create such a type by combining Time::Span
and Time::DateSpan
. Database bindings would maybe require custom types anyway for dealing with different timezone behaviour etc.
A viable alternative would be to expand DateSpan
to also include weeks, hours, minutes, seconds, nanoseconds each as individual fields. This would be a fully ISO-compatible data type for storing a duration in calendrical units. But I wouldn't want to mix this with measuring elapsed time in Time::Span
.
I really think we should stick to a structure already proofed in different software.
I agree. Most sophisticated time libraries differentiate between durations of elapsed time and durations of calendrical units.
@straight-shoota
Having only one type is very much confusing, because then there's no type safety. If you have two instances, there is no way to tell by the type if they're meant to represent elapsed time or calendrical units
The type must store both things. It can represent any of these. If you say "days", "years" or "months" it means calendrical units, if you say "hours", "minutes" or "seconds" it means elapsed time. The type should store both separately and don't mix them. Then when doing operations, both are applied.
Is this possible?
Sorry if I'm not explaining very well:
There's only 3 counters which matters:
1) Months, as x + 1.month != x + 30.days
. Store Month and year (= 12 months)
Semantically, 30 Jan + 1 month should equals 28 February.
2) Days, as Summer/Winter leaps can make x + 1.day != x + 24.hours
. Store days and week (7.days)
Semantically, [anyday 12:00:00] + 1 day
should return [tomorrow 12:00:00]
no matter which day and winter/summer rules applied.
3) Nanoseconds, for everything else. (hours, minutes, seconds, microseconds, nanoseconds)
Counter for weeks are useless, there's no leap in weekday possible, so x + 1.week == x + 7.days ALWAYS and day counter is enough.
Assuming 1 nanosecond is 10**-9
and an Int64
store ±2**63
, the maximum of second span would be ±292 years before overflow, without using the months and day counters. I bet it would not be an issue for 99.999% of the app built or being build with Crystal. I guess application dealing with nanosecond-precise historical or future prediction data will implement their own way to count time :rofl:
EDIT: I can see a problem with nanoseconds registry in the case of substracting dates far away in the time (literally centuries away). I really think here the problem is the precision level, as microsecond would offer a range of 292'000 years instead. Storing nanoseconds in Time Span object sounds overkill; nanoseconds are used mostly in monotonic clocks in benchmarking or hft to compute tiny time range, and don't really care about days and month and use a tick counter instead.
I don't see it an issue if people try to benchmark their app using Time.now - before_now
and complain that it's not nanosecond precise.
Last note: I think you may not understand the business interest to write in the same structure:
duration = 1.day + 1.hour + 30.minutes
I'm mostly writing ERP system which use a lot of time operations and having a common structure to express both calendar and time distance is a real plus when writing application.
Having this working and not working at the same time is not really beginner friendly (assuming business logic developers are not rockstar devs usually):
Time.now + 1.day + 1.month #OK compile
t = Time.now
s = 1.day + 1.month # no overload matches 'Time::Span#+' with type Time::MonthSpan << Confusing now.
t + s
@asterite Hours, minutes, seconds are also calendrical units and it makes a difference if a duration is specified as 1 hour
or 3600 seconds
because "one hour later" is not necessarily equal to "60 minutes later".
Elapsed time measures only a single dimension, in our case nanoseconds. That's what you get when measuring time with a monotonic clock. The difference of two Time
instances can either be expressed as the amount of time elapsed between both instances on the time line, or as a description of duration in calendrical units of different granularities.
Both concepts can be expressed in a single data type. But this leads to semantic ambiguity.
@anykey If we want a data type to fully model a duration using calendrical units, it needs fields for hours, minutes and weeks as they cannot simply be expressed as multiples.
See https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412666986 for detailed explanations and especially https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412988101 for a demonstration of a week that is not equal to 7 days.
Only seconds, milliseconds, microseconds and nanoseconds can be treated as a single field, as they are simply SI prefixes of the same dimension. And 1 year
is always equal to 12 months
. All other relations are mostly fixed, but not always.
If Crystal had a Date
type, having a Date::Span
would make _perfect_ sense to me. If Time
is expected to handle dates as well as times, it doesn't seem to be a stretch to include calendar-level granularity inside Time::Span
.
In this forum post, @anykeyh pointed out how Postgres stores the values (months : Int32, days : Int32, microseconds : Int64
), which I think is valuable information if we're discussing the memory footprint because databases have some of the strictest requirements on minimizing it. Neo4j serializes its Duration
type as a 4-element struct: months, days, seconds, nanoseconds. My PR #7454 isn't far off from this idea (just missing days). This includes all of the things we're talking about in this thread.
Loading from a database is likely to be _the_ most taxing use case on memory (in the case of huge result sets loaded eagerly) as well as the most common use case that is taxing at all. Wrapping two instances of different Span
types to store a single instance of the DB's native interval / duration / period / span type would likely consume quite a bit of memory. Can anyone suggest a real-world scenario where having a single, unified Time::Span
would use more memory in a typical app than that? Other than fetching them from some data source, I can't imagine many Time::Span
instances lasting longer than the stack frame in which they were created.
@jgaskins Exactly. If we need to make a Time::Span
instance include all of the calendric and time info needed, and it makes it like 40 bytes big, I still don't see that as a problem. Sure, it's big, but I can't imagine it being a bottleneck somehow.
To be clear: I'm all in for a full fledged duration data type including date and time fields which would be compliant to ISO 8601 and typical duration types of databases etc.
However, it has not been clear if such a type should exist in the stdlib or would better fit in an external time library. I think the recent arguments make a strong case for having it in stdlib.
But: It should be a separate type from Type::Span
which should keep it's meaning of describing a fixed length on the instance time line.
But: It should be a separate type from
Type::Span
which should keep its meaning of describing a fixed length on the instance time line.
Why?
They're different things and behave differently. Thus they should not be melted into a single type.
When you add to a Time
instance the amount of a elapsed duration (Time::Span
), it means shifting the instant on the time line, which results in an instant offset by exactly the amount of the time span. This is true for any instant, in any time zone and any value of the duration.
Shifting a Time
instance by a duration expressed in calendrical units, the result on the time line depends on the instant, the time zone and the dimensions and values of the duration.
Wow, long discussion here.
Yes @straight-shoota , I am working on the Time::DateSpan struct. You can see my progress here:
https://github.com/HCLarsen/crystal/blob/feature/date_span/src/time/span.cr
I'm currently working on the Time + and Time - methods that use DateSpan instead of MonthSpan.
Well I think the discussion is stalled, because this is more a conceptual question:
Time::Span
structure define multiple operations behaving differently at the same time?That's currently how it's treated in the main databases of the market:
https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html
https://www.postgresql.org/docs/9.1/functions-datetime.html
...But C# give @straight-shoota credit :smile:
A TimeSpan value represents a time interval and can be expressed as a particular number of days, hours, minutes, seconds, and milliseconds. Because it represents a general interval without reference to a particular start or end point, it cannot be expressed in terms of years and months, both of which have a variable number of days. It differs from a DateTime value, which represents a date and time without reference to a particular time zone, or a DateTimeOffset value, which represents a specific moment of time.
For Ruby's ActiveRecord, it behave strangely ...:
ActiveSupport::Duration.build(2716146).parts # => {:months=>1, :days=>1}
(seriously: month from seconds is no-go)
Otherwise, it stores the combination of duration as an operation list and resolve it when applied to DateTime
, which is cool but not super performant.
Scala gives @asterite (and I) credit:
The definition of a period also affects the equals method. A period of 1 day is not equal to a period of 24 hours, nor 1 hour equal to 60 minutes. This is because periods represent an abstracted definition of a time period (eg. a day may not actually be 24 hours, it might be 23 or 25 at daylight savings boundary). To compare the actual duration of two periods, convert both to durations using toDuration, an operation that emphasises that the result may differ according to the date you choose.
^--- interesting enough, it compute the duration of the periods using a date parameter, which makes sense.
They're different things and behave differently.
I agree that they are _technically_ different things, but conceptually they're both about distance between two timestamps. I don't feel that my app should need to know that there's a distinction between shifting some number of seconds/nanoseconds and shifting days/months, but making them separate types would require an app developer to work around that when storing them in ivars, for example.
To be clear: I'm all in for a full fledged duration data type including date and time fields which would be compliant to ISO 8601 and typical duration types of databases etc. … I think the recent arguments make a strong case for having it in stdlib
I think @asterite means that, if having a unified time/date span in the stdlib makes sense, what purpose does having two additional separate ones serve?
I don't feel that my app should need to know that there's a distinction between shifting some number of seconds/nanoseconds and shifting days/months
@jgaskins your app may not need to know the difference between the two. However, there are apps that will need to know the difference, such as calendar apps, like the one I was working on when I discovered this issue.
@HCLarsen What constraints would both of them being the same type impose on that application?
What I'm proposing is to store both the calendric difference and the duration difference in a single type, without mixing those values. That is, 24.hours and 1.day would be stored differently, but always in the same time. Can that be done or is it something impossible?
@asterite It's definitely doable, especially now that Time#shift
is a thing. #7454 is just missing days
(already has months). Some current specs do specify things like hours, minutes, and seconds converting directly to days, so those will have to change but otherwise it shouldn't be terribly painful.
Though, I'm thinking how to visually represent such type. For example, right now if you do 24.hours.to_s
you get 1.00:00:00
, which means "1 day, 0 hours, etc.". Now you could do (1.day + 24.hours).to_s
... what would that show? Maybe if there are values in both components (calendar units and duration units) it appears as something different. Or maybe always the same, showing 0
for any component that's missing.
I admit that it's a hard topic and mixing the two concepts might be confusing and helpful at the same time :-P
@jgaskins
but conceptually they're both about distance between two timestamps.
I don't think you understand the concepts correctly. Only elapsed time (Time::Stamp
) represents a distance between two instances on the time line. A duration expressed with calendrical fields is a distance between two local time representations (wall clocks) and can actually mean a variety of different distances of individual instances.
there's a distinction between shifting some number of seconds/nanoseconds and shifting days/months
It's not about date vs. time. It's about a discrete amount of time vs. a conceptual distance in calendrical units.
@asterite
What I'm proposing is to store both the calendric difference and the duration difference in a single type, without mixing those values. That is, 24.hours and 1.day would be stored differently, but always in the same time.
Yes, this can be done and it is definitely possible. That's almost exactly how it should work. The only problem is that there is no difference between both concepts in a single type. It must always be treated as a calendric difference. That's mostly okay as you can express elapsed time simply as nanoseconds in this type.
This distinction however solely depends on the individual value of an instance. It is not type safe. Yet I'd consider such type safety rather important for a clean API. Usually there probably won't be many intersections of these types. When you're working with calender values in you business logic, you'll be using calendrical fields; when you're measuring time, you'll be using elapsed time. But when two instances of supposedly different types meet, the result is ambiguous. For example, the sum of 1 hour
and 1 second
can either be interpreted as elapsed time and the hour component is implicitly converted to it's default duration resulting in 3601 seconds
, or it can be interpreted as individual calendrical fields meaning 1 hour 1 second
. Which one to pick depends on the context and the intention of the value. This intention is best communicated using different types. if 1 second
is an elapsed time type, the result would be an elapsed time type with a duration of 3601 seconds
, if it is calendrical field, the result would be a calendrical duration of 1 hour 1 second
. Obviously both types can be converted, but this needs to be explicit.
So that means good bye to being able to write 1.hour
, right?
I would have thought that 1.hour
would give you a Span
, while 1.day
would give you a DateSpan
. That's how I've been writing it.
I don't think you understand the concepts correctly. Only elapsed time (
Time::Stamp
) represents a distance between two instances on the time line. A duration expressed with calendrical fields is a distance between two local time representations (wall clocks) and can actually mean a variety of different distances of individual instances.
I assure you I'm not misunderstanding the concepts. Think of it this way: if you walked 1km, but you're only 750m from your origin because the path you took was not a straight line, that doesn't mean that your path wasn't a distance.
Similarly, just because a day isn't 24 hours long doesn't mean the distance between 12:00 one and 12:00 the next day is not 1 day. You don't have to use a monotonic unit to call it a distance.
Similarly, just because a day isn't 24 hours long doesn't mean the distance between 12:00 one and 12:00 the next day is not 1 day. You don't have to use a monotonic unit to call it a distance.
I do believe that's the point that we're all trying to make here. 1 day isn't always 24 hours, so we need separate structs to define those two. That's why, if you look at the code I wrote, 24.hours
returns a Span
, and 1.day
returns a DateSpan
. One defines a specific passage of time, and one represent a passage of a calendrical unit.
@HCLarsen
I would have thought that
1.hour
would give you aSpan
, while1.day
would give you aDateSpan
. That's how I've been writing it.
That was the original plan we discussed in this issue, yes.
But I think the discussion is now tending towards essentially adding more units to DateSpan
. But that arises the question if we would still need a separate Time::Span
or if it should be a single type.
If we go with the proposal to expand the calendrical duration to contain time units, 1.hour
should obviously return an instance of that calendrical type (whether there is another one or not).
1 day isn't always 24 hours, so we need separate structs to define those two.
@HCLarsen Why do we _need_ separate structs to distinguish them? How would it make developing an application more difficult if they were combined into the same type?
@jgaskins https://github.com/crystal-lang/crystal/issues/6522#issuecomment-465263698
@straight-shoota so we're talking about a single struct that would individually store some combination of seconds, minutes, hours, days, months and years? Could work, but could get messy as well.
Yes, exactly. It does work and I don't think it can get any more messy than the alternative. That's actually what I had initially proposed in my very first comment in this issue. https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412265226
Essentially, the plan we had until now has been a model similar to Period and Duration types in Java 8 (also mentioned in that post).
@HCLarsen Why do we _need_ separate structs to distinguish them? How would it make developing an application more difficult if they were combined into the same type?
@jgaskins because if you've got a Time
instance that represents a few minutes before DST kicks in, then there's a difference between 24 hours later, and 1 day later.
@jgaskins because if you've got a
Time
instance that represents a few minutes before DST kicks in, then there's a difference between 24 hours later, and 1 day later.
@HCLarsen Right, I agree with this part. Absolutely, 100% on the same page there. What I don't understand is how this implies that a separate type is required.
For example:
struct Time::Span
@months : Int64 = 0
@days : Int64 = 0
@seconds : Int64 = 0
@nanoseconds : Int32 = 0
# Condensing this method for brevity despite knowing it won't compile as written
def initialize(@months = 0, @days = 0, @seconds 0, @nanoseconds = 0)
end
end
struct Int
def hours
Time::Span.new(hours: self)
end
def days
Time::Span.new(days: self)
end
end
struct Time
def +(span : Time::Span)
shift(
months: span.months,
days: span.days,
seconds: span.seconds,
nanoseconds: span.nanoseconds,
)
end
end
I'm trying to understand what a second type enables that this approach cannot enable either at all or as nicely from the perspective of an application or library developer.
I'm asking because the only way I can see the multiple-types approach being better is that the internal implementation might be cleaner if you only ever use them in an expression like time + span
. If you have to store it for any reason (in an ivar, in a database, etc), distinguishing between Time::Span | Time::DateSpan
pushes the complexity of dealing with that distinction out to the application or library.
struct WebHook
@frequency : Time::Span | Time::DateSpan
end
struct CalendarEvent
@duration : Time::Span | Time::DateSpan
end
A WebHook
can be set to fire every minute, every hour, or every day. A CalendarEvent
can be a meeting lasting a matter of minutes/hours but it could also be a multi-day conference. An application shouldn't need to add type checking to work around the distinction. It doesn't feel meaningful enough to encode into the type system.
It seems there are a few things mixed up here.
The example mentioned by @HCLarsen about adding 24 hours vs. 1 day near a DST change actually does not work as an argument for separate types. Neither of them defines a fixed amount of time. Instead both, hours and days are calender units and thus would be expressed using a calendar-based type.
@jgaskins
There should not be much need for supporting both types of durations simultaneously. Their use cases are relatively distinct, and your business logic would most likely only need to deal with either one of them.
Looking at your examples, both instance vars should probably just be of the calendrical unit-based type because they're used in a calendrical context. Despite that, the values should probably be time-based, at least when operating with them, to avoid influences by time zones So one could also make a case for using a time-based type in the first place for either of these examples. But that depends on the context. However, none of them should need to be a union of different duration types.
Instead both, hours and days are calendar units and thus would be expressed using a calendar-based type
But in Java the days are present in both Period
and Duration
. In one, a day is a calendric unit. In the other a day is exactly 24 hours.
I'm starting to think... the problem of mixing the two types is that if you say 1.day
that could either mean a day after the next one, or exactly 24 hours (1440 minutes, 86400 seconds). So if you say 1.day
, which one is it?
Well, maybe it depends on how you _treat_ that value? Could we have two different Time
methods, one to treat such value as a calendric unit and another to treat it as a duration? In fact, this is what we (almost) have right now with +
and shift
.
Maybe we can remove the +
method and have these methods: plus_duration
, plus_period
, minus_duration
, minus_period
?
There's a problem, though. minus_duration(1.month)
doesn't make sense because we can't convert a month to a duration in seconds, it depends on the specific month and year. In that case we could raise an exception at runtime. And this is where we lose type safety, because if we had two different types to represent duration and period, you could't invoke minus_duration(period)
.
However... I'm not sure the benefit of this particular type safety outweighs that of having a single type.
On the other hand, maybe 1.day
looks nice but won't be terribly missed if we have something like Time::Duration.days(1)
or Time::Period.days(1)
. I mean, it's a bit more typing but there's no confusion, and we retain type safety.
@straight-shoota Is my reasoning correct? Both possibilities seem to have pros and cons, unless there's something I'm not seeing in the "mixed in a single type" proposal.
The example mentioned by @HCLarsen about adding 24 hours vs. 1 day near a DST change actually does not work as an argument for separate types. Neither of them defines a fixed amount of time. Instead both, hours and days are calender units and thus would be expressed using a calendar-based type.
Not true. An hour is not a calendar unit. It's a specific passage of time, 3600 seconds to be exact. For example, 1 hour after 1:00AM on March 10, 2019 1:00AM 3:00AM on the same day. Alternatively, 1 day after 1:00AM is always 1:00AM.
While any calendar based app is going to use calendar units, any sort of scientific calculation is going to need to look at the specific passage of time. The actual rotation of the Earth isn't 1 day, it's 86,400 +/-20 seconds. That's the reason for two separate classes, the two very separate use cases.
I must confess I overthought the thing the whole night but eventually I thought about a consensual idea.
Assuming below Duration/Span is an absolute time in seconds, while Period is a calendar time in seconds, days and months where the absolute length (in seconds) depends of the context.
I do understand than period must be explicitly converted to span, while the other way is not true. A duration can be a period or composed of a period without any problem, period. (pun intended)
So basically, the arguments are:
Below this prosal should answer to all the good points above; tell me if it sounds good:
2 structures exists:
struct Time::Span
@nanoseconds: Int64
@seconds: Int64
end
struct Time::Period
@nanoseconds : Int64 #Or microseconds? Do we really need nano on period?
@days : Int32
@months : Int32
end
Now, the helpers on Int
should behave like this:
x = 1.hour # => return Time::Span(seconds: 3600)
y = 1.month # => return Time::Period(month: 1)
z = x + y # => return Time::Period(month: 1, nanoseconds: 3_600_000_000_000_u64)
Both can be added or subtracted from a date. While both can be mixed, you cannot revert from
Time::Period
to Time::Span
unless using a referential point in time (see below).
However, you can easily cast from Span
to Period
:
x = 1.hour.to_period #For storage purpose
# or
x = Time::Period.new(hour: 1)
# or even (tricky one)
x = 0.day + 1.hour
Period have to_period
member returning self, which allow Union type to be merged easily:
class MyClass
@x : Time::Period
def period=(x)
@x = x.to_period
end
end
MyClass.new.period = 1.hour #< Note here we cast the Time::Span
MyClass.new.period = 1.day #< Note here we return self
day
, week
, month
and year
returns a Period.hours
, minutes
, seconds
returns a Span.To get a duration from a period, it's also possible:
(1.month+1.day).to_span(Time.now) # => Equal to Time::Span.new(seconds: ((1.month+1.day).from_now - Time.now))
A referential must be however provided. Idea: This referential can be Time.now
by default?
Example of output:
13.month + 1.day + 171.hours + 82.minutes # => "1 yr 1mo 1d 172:22:00"
1.day - 1.hour # => "1d -1:00:00"
8.day + 24.hour #=> "1w 1d 24:00:00"
Time::Span#to_s is more simple to compute, as it's an absolute time counter.
It still face the issue of the order of transformation:
Time.now + (1.day + 1.month)
# and
Time.now + 1.day + 1.month
# MAY NOT RETURN THE SAME TIME!!!!
This is because a Period is applying always month, followed by day,
while applying day then month can lead to different result on month leap.
However this issue exists in all case unless we assume than period is
a list of transformations played in a specific order. Which cause others problem
(how to store it properly, how to export it in JSON or to a DB,
performance issue, memory consumption etc...)
WDYT? Sounds not too complex nor memory/computation heavy and a good compromise?
TLDR:
24.hours
helper to express one day.@asterite
But in Java the days are present in both
Period
andDuration
. In one, a day is a calendric unit. In the other a day is exactly 24 hours.
The Duration
class does not represent days
, only seconds (and nanoeconds). It provides a few methods for working with days, but they're immediately converted to (nano)seconds. You can't tell a difference whether an instance was created as 1 day or 24 hours or 86400 seconds.
So if you say
1.day
, which one is it?
A day is a calendar unit, thus 1.day
should obviously be a calendrical type. It can be converted to a time-based type with a conversion method (for example #to_span
).
Maybe we can remove the
+
method and have these methods:plus_duration
,plus_period
,minus_duration
,minus_period
?
I don't think that would be a great solution. It just makes the API more complicated. And these are not the only methods. All operations with two duration values essentially would need such a distinction as well.
A better solution is to simply use different types in the first place, giving the benefit of type safety.
Both possibilities seem to have pros and cons, unless there's something I'm not seeing in the "mixed in a single type" proposal.
I'm pretty sure most of the examples we've been discussing here are mostly about calendrical type. The argument for separate types also bases on the use cases for a time-based type. That is, where you only need actual time spans and not have to deal with ambiguous calendrical units. Examples for this are all applications related to Time.monotonic
/Time.measure
, like measuring elapsed time or calculating timed events.
@anykeyh
Your proposal for a Period
class is not sufficient to fully represent the units of the ISO calendar. It goes a part of the way, but it's better to either go fully or don't even try. For example, the value of 1 hour
should be represented in a unit of hours, not seconds. 3600 seconds
is not equivalent to 1 hour
.
If we agree on having two types, one for calendrical and one for fixed-time durations, there are IMHO two alternatives for the calendrical one:
The small option is like Java 8's Period
class which only models the date components years, months, days
(where years
can internally be represented as months
). The big option is like the Period
class in Joda Time (which the Java 8 Datetime API is based on) supporting years, months, weeks, days, hours, minutes, seconds, nanoseconds
.
However, I'd like to not get to much into the details of this, unless we're agreed on the general idea of having two different duration types.
Yeah, I'm now more inclined to having two types.
And about my proposal, what's about the seamless integration of both type with minute
and hour
etc... with both types?
@anykeyh What do you mean with "seamless integration"?
3600 seconds
is not equivalent to1 hour
.
@straight-shoota what exactly are you basing that statement on?
The small option is like Java 8's
Period
class which only models the date componentsyears, months, days
(whereyears
can internally be represented asmonths
).
That's essentially what I've created with the DateSpan class. Internally, it possesses @days
and @months
, but it takes years in it's constructor, multiplying them by 12 and adding them to @months
.
Definition of second:
The duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium-133 atom
Definition of minute
[...] equal to 1⁄60 (the first sexagesimal fraction[1]) of an hour, or 60 seconds. In the UTC time standard, a minute on rare occasions has 61 seconds, a consequence of leap seconds [...] minute is accepted for use with SI units
Definition of hour
[...] reckoned as 1⁄24 of a day and scientifically reckoned as 3,599–3,601 seconds, depending on conditions.
Definition of day
A day, symbol d, defined as 86 400 seconds
Definition of week
A week is a time unit equal to seven days.
TL;DR: we can easily support second (SI) and its derived units: minute, day and week. There values are constant. However, all hour, month and year units varies.
@j8r A day isn't 86400 seconds when the boundaries for that day straddle a DST boundary. Leap seconds are less problematic because they're rare and cause calculations to be off by only a second, but 1.day
being off by an entire hour for all calculations made for 2 full days each year causes real problems. That's the issue at hand. 😄
@jgaskins that depends of if we are talking of a civil day, which can have plus or minus an hour or a SI day (d), which is always 86 400 seconds.
That's important. Do we stick with SI, or change the behavior depending of the local time?
@j8r as the author of this issue, I can tell you that the need for software to match the behaviour of local time is the entire reason for this discussion.
Before we move foward, just few false assumptions I want to discuss:
For example, the value of
1 hour
should be represented in a unit of hours, not seconds.3600 seconds
is not equivalent to1 hour
.
False assumption. An hour is equivalent to 3600 seconds, even during DST leap.
Adding one hour at DST shifting moment to winter time will return the same hour but will change the timezone.
Basically, you may face 2am (IST) + 1.hour = 2am (GMT)
. And that's normal behavior.
Because DST is treated as timezone shift, not hour shift :smile:. Algorithm is:
* Add 1 to the hour counter
* Check whether the modification trigger DST rules
* If yes, change the Timezone accordingly
It's normal behavior, because the business case of adding or substracting 1 hour is really 1 hour. If I have a system which ping every hour a specific service, I assume it to works even on DST leap, and ping at 2am then 2am then 3am. I expect my system to work the same way if I wrote "add 1 hour", "add 60 minutes" or "add 3600 seconds".
However if I write a system which add a day to a time, I expect it to return the next day, same hour.
Basically in case of DST leap, it will return a duration of 23/25 hours.
To summarize:
time + 1.hour != time + 3600.seconds
. time + 1.day != time + 24.hours
. time + 1.week != time + 7.days
time + 30.day != time + 1.month
. time + 12.months != time + 1.year
The only counters which matters are months
, day
and [nano]seconds
.
So the ISO norm might declare interval of week, day, hours etc... But at it ends, the only interest I see in this system is: Having counter for each and any ISO field keep the "intention" of the user.
ISO is made to normalize systems, and to be very clear with what data you deal. But it's not mean to be implemented as-is, as it may interfers with perfomance and memory.
And the trade is real, should I really do a benchmark to show it to you? The ironical part is that if we use ISO counters storage, we will ends up treating week as 7 days, minutes as 60 seconds and so on for performance reason during addition and substraction.
My proposal above covers ALL cases with the best ratio space/performance and that's why it's what used to store in database software.
I'm not digging into the nanoseconds subject, I may have a lot to say on how much innaccurate and useless it is in a general purpose and standard library time computation system, but this one is not so much of an issue in my opinion.
Leap seconds are less problematic because they're rare and cause calculations to be off by only a second
You must understand that leap seconds should just disappear from the time counter. So basically they are not computed at all in both duration and calendar system.
At the perspective of a computer, they should not exists. That's why UT1 has been created.
So basically they should not even a subject of discussion in this thread.
With or without leap seconds, a normal (non-DST) day will ALWAYS compute to 86 400 seconds, even if the "physical" day will be 86 401 seconds.
It matters only in computing time range when range may be negative and for sorting things by date.
Also, leap seconds happens everyday in computer, when it resynchronize by NTP. Some implementation smear leap seconds, making adjustment to just accelerate (or slowdown) temporary the clock, allowing it to never fallback to a previous state and making it ALMOST IMPOSSIBLE to see at computer-level of awareness.
EDIT: Time class use unix_timestamp, so leap seconds are not covered at all, and thanks god. Using smearing or adjustment must be done on system-wise, not in the Crystal's software. Time.now.to_s
should never return 2013/12/31 23:59:60
; por favor, this will lead to a lot of more bugs while interfacing with systems than just ignoring the leap second. System relaying on absolute time should not use Time.now
but use an additional library.
Also, monotonic clock is made for treating this case. Anyway it's not our subject of discussion.
If someone need to compute a precise interval of time, they need to use something else than the time clock which can leap, end of story.
@anykeyh What do you mean with "seamless integration"?
I mean that a with the false assumption above covered, a Time::Span
IS a subset of a Time::Period
. And thus, allowing to write this is full of sense:
period = 1.month + 1.hour
struct AStoragePlace
@period : Time::Period
end
It has a business sens, it has 100% sens. I really don't want to store two objects while having only a Time
object in Crystal.
For me it's really strange and unconsistent: the team agree that Date object should not exists and must relay on Time (which I'm 100% with), but now we need to treat addition and substraction of time with two differents structures, one covering the calendar and one covering the seconds.
Edit: I rewrote a bit, I think I was a bit toxic on the first version, sorry for that ;) :smile:
It looks correct to have 2 types like @straight-shoota said:
@anykeyh Time zone shifts are not always +/- 1 hour, thus there are more cases where the typical relations between calendrical units don't apply. Please take a look at https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412666986 which provides practical examples for every case you claimed to be impossible (except for the relation 1 year = 12 months
, which is uncontested).
You must understand that leap seconds should just disappear from the time counter. So basically they are not computed at all in both duration and calendar system.
This is correct. In fact, Crystal's time library is already based on a calendar which ignores leap seconds and assumes leap smear instead.
I really don't want to store two objects while having only a
Time
object in Crystal.
Yes, we only have a single Time
type, but actually support two different concepts of time. Monotonic time is simply expressed as a Time::Stamp
. This makes perfect sense since the monotonic clock returns an amount of elapsed time. That's also why we need a separate type for representing elapsed time vs. conceptual calendar units. And these types can not be subsets of each other, because they represent different concepts. You can convert between them, but this needs to be done explicitly. But that should not be used very often because as I already stated, their applications are pretty distinct.
That's also why we need a separate type for representing elapsed time vs. conceptual calendar units. And these types can not be subsets of each other, because they represent different concepts. You can convert between them, but this needs to be done explicitly. But that should not be used very often because as I already stated, their applications are pretty distinct.
Precisely. I can't foresee it being a common situation where one would need a union of the two types, or convert from one to the other, simply because the two types represent two different purposes.
Ok, I'm giving up my idea. _“If you're alone, that's probably because you're wrong”_, say the proverb. I guess I am on this one...
So you want the big real thing. As the proverb say (Yeah I'm proverbial tonight), _go big or go home_.
Okay then, let's go big. Let's make it real, let's make me seriously feel that I was wrong on the whole line :neutral_face:. Now, forgetting our disagreement, I might help you if you want on this crazy idea, because ~this shit~ the work is real.
region/city
, it sounds logicial than a region which has changed their mind the last 15 years should reflect it in the calendar. I don't think we should handle very distant events, yet I think it would be great if we can add them / customize, to handle also the new events which will probably occurs without the need to a new crystal version. I guess the first step should be to decide how to maintain a table of the decisions made, easy enough to populate every time a country decide something (sometime only few weeks prior to the application).On another subject, let's think how we are going to deal with second leaps and scientific calculus precision, and how you are going to implement TAI timezone. I understand than Time::Span
should be an absolute of time spent, and therefore it must deal with leap-seconds, no? One of the problem would lay in the fact than Time is an interface from unix_timestamp, and so the precision relay on the clock under Crystal. To be honest, even if second leaps, according to what I read above I feel like we really should go the precise way.
Also, any plan to deal with others calendar systems? I'm thinking about التقويم الهجري (hijri, sort-of hard) or ปฏิทินสุริยคติ (super easy to deal with, unless we want to handle BEFORE 1940). Both of them are used on daily-basis based on my experience, in addition to Gregorian calendar. Java stdlib implements different calendars, now we are going to add ISO-compliant period type, I think it's useful to deal with different calendars too. In some way, the hijri is used more or less by 1 billion people. If i understand then, adding 1.month
to a time should add one islamic month if the calendar is hijri-based; actually that's sounds good :smile:.
I think the initial PR should not contains all the things above. It would be a sabotage by overcharge of work, and would let you think I'm not sincere. However it should prepare the land for it; honestly, after all this discussion and the precision given to all the details, there's no middle-ground anymore. Or we decide to go full-support, or we don't. I'm genuinely interested by this subject and willing to help :smile:
Cheers,
@anykeyh You're mostly talking about problems that have already been solved. Time::Location
supports zone offset with second precision and time zone data is provided by the operating system, or a custom copy of the time zone database. So by default, no time zone data is baked into the binary (this is obviously possible, though) and can be easily updated externally. Leap seconds are ignored and assume leap smear to be applied by the originating clock.
Currently, there are no plans for supporting anything else but the ISO calendar, and I don't think there will be anytime soon. Making the Time API calendar-agnostic would add immense complexity to all parts of the library. The ISO calendar is the most commonly used calendar and an international standard. That's sufficient for the standard library. Additional calendar implementations can be provided by a shard.
However, we're drifting away from the topic of this issue which has become way too long anyway.
It seems we're pretty much in agreement on the general idea of having two separate types for time durations. So the next step would be diving deeper into the design questions of a calendrical duration type. I've been collecting some insights and will be ready to submit a propsal in the next days.
If I may make a request @straight-shoota, please include in your proposal an explanation of your statement that
3600 seconds
is not equivalent to1 hour
.
I'm honestly, extremely confused by this claim, as I know of no situation on this Earth where that is true (with the exception of leap seconds, which we all agree aren't being considered).
@HCLarsen We're talking about the representation in a calendrical duration type. And there it can make a difference whether the time span of an "hour" is supposed to be treated as a SI unit (then it's simply defined as the duration of 3600 seconds) or as a calendrical unit representing the difference in the hour component of local time (in which case it usually but not always equals to the duration of 3600 seconds).
When the time zone offset changes by an amount other than a full hour, the wall clock will eventually show a difference of one hour but parts of this hour have either been skipped of repeated, which means that the "hour" actually lasted fewer or more than 3600 seconds.
That's exactly the same thing as a day not always lasting 24 hours, just one level below. And not very common. However, I've already cited a real example of a time zone change of 30 minutes ( https://github.com/crystal-lang/crystal/issues/6522#issuecomment-412666986).
So in practice, this distinction is rarely relevant. But sometimes it is. And it's generally important to preserve the intention and be able to tell apart whether a duration instance is meant to represent 3600 seconds or 1 hour.
@straight-shoota if I understand correctly, you mean referring to an hour as the time passed between any two full hours on the clock, say between 1:00 and 2:00. I have to disagree that there's any practicality of doing so, as no system, nor person I know, does things that way.
Take a look at RFC5545, the iCalendar specification, which lists hours as an "accurate duration" along with seconds and minutes, as opposed to a "nominal duration" like weeks or days. When talking about time zone changes, the RFC gives a specific example:
March 11, 2007 at 3:30 A.M. EDT (UTC-04:00), one hour after 1:30 A.M. EST (UTC-05:00).
We also have to consider that in some cases, it's impossible to measure a duration in this manner. Take my upcoming DST change that comes in 3 weeks. The clocks will simply go from 1:59:59 to 3:00:00. Now, if you consider what's 1 hour after 1:00, and you look at as 2:00, well, that's impossible, since there is no 2:00 on that particular day.
Consider, if you will, how people talk about the DST changes. In the spring, they refer to it as "losing an hour," and the fall change is referred to as "gaining an hour."
I do hope you'll consider these things when writing your specification.
Yes, the practical impact is almost negligent. I could still construct an example where it matters, but actually, that's besides the point. This is a technical oddity that's somewhat worth considering, but the argument for having separate fields for every common calendrical unit is even more practically relevant.
Technically, the time span of 1 hour
is equivalent to 60 minutes
and 3600 seconds
thus these values are theoretically interchangeable. But it still makes a semantic difference whether a duration is defined in hours, minutes or seconds. For example, the length of movies is usually measured in minutes, even if the running time is several hours.
If all time units are only stored in seconds, there is no distinction between different input values which equal the same amount of seconds. That effectively changes the semantics of the value. Consider a user entering a value of 120 minutes
. I would expect that value to be represented in the same way as it was entered. But when there is no way to tell a difference, it's going to be implicitly converted to 2 hours
(or worse: 7200 seconds
).
Such a conversion might be intended in some cases, but often it is not. That's why it should only be explicitly applied. It must not be implicitly enforced because the data type can't represent the full data.
An example where it matters would actually be very important to this conversation. As it is, both of the examples you provided here are better solved without having separate fields for each unit.
In the case of the running time of movies, it makes more sense for someone just to store/measure that as an integer (or float), and use Time#shift
by the number of minutes.
In the case that you want the struct to return the same value that you entered, well, new getters based on each calendrical unit would solve that. Something like Span#total_minutes
that returns @seconds / 60
, or 60.0
if you want a float. If you're writing software that's going to be looking at one specific unit, it's easy to just use the appropriate getter.
On the other hand, if we implement storage of all the different units; seconds, minutes, hours, etc..., then it becomes a big mess if you want to go the other way. 2.hours + 60.minutes
suddenly becomes 2 hours and 60 minutes. Not something that would ever be practical. Whereas 2.hours + 60.minutes = 3.hours
makes far more sense.
I don't doubt that the examples you're talking about could happen. However, I think that they're so extremely few and far between that it doesn't make sense to implement it that way in the core library, when the other implementation will be useful far more often. That's why I'm saying that we should stick to the original plan, where Time::Span
represents seconds, minutes and hours as the passage of time, and Time::DateSpan
represents days, months and years as calendrical units.
On my local branch, I've created a Time::DateSpan
struct with Time#+(span : DateSpan)
and Time#-(span : DateSpan)
methods on the time class itself. Currently DateSpan only deals with days, months and years, and internally stores @days
and @months
, with years just being 12 months.
I'm not going to do a PR, because it seems like there isn't a consensus on whether or not to add smaller units of time to the DateSpan, and how they may be represented internally. Let me know once you have that proposal, and what everyone in the core team agrees should be done.
Most helpful comment
Yes, Europe voted to move out of summer/winter time changes (finally). By October 2019 we should have gotten rid of it, depending on what time each country decides to keep (summer or winter).
But, that doesn't change anything for programmers: countries are still using DST, it will have been in use for 1976 - 2019 in France (for example), and any date computation that is within this range needs to be correct around DST changes, whatever if it becomes a "thing of the past" :(