We currently allow the time zone in brackets at the end of the ISO string.
With the Calendar proposal, I suggested adding a calendar hint to the ISO string as well. Without it, round-tripping across the serialized string would not be possible as toString() would be lossy.
const d1 = Temporal.Date.from({ calendar: "hebrew", year: 5780, month: 1, day: 1 });
const d2 = Temporal.Date.from(d1.toString());
assertEquals(0, Temporal.Date.compare(d1, d2));
The initial suggestion was to simply append the calendar identifier to the end of the string: 2019-12-06T16:23+00:50[America/NewYork][hebrew]. However, @gibson042 points out that this could be problematic: "There are many aliases without / ... [including] #156. And it gets worse with author-defined time zone and calendar names."
I suggested adding additional c= and z= qualifiers to the brackets: 2019-12-06T16:23+00:50[z=America/NewYork][c=hebrew].
I think we should be _very_ careful about committing to the interchange incompatibilities that deviating even further from standard formats will incur. Bracketed time zones already aren't part of ISO 8601, though they are supported in Java (and I believe stem from Joda-Time). That said, = will not appear in a time zone name conforming to documented guidelines, so it would at least not suffer from internal ambiguity. But the whole thing still feels risky; it might be better to leave calendar-specific serialization up to authors.
@gibson042 also posted in the other thread:
2019-12-06[hebrew] reads to me like "Hebrew year 2019", not like "Hebrew calendar representation of Gregorian year 2019"—and the probability of confusion will be greater for calendars with closer alignment to Gregorian. I don't have a proposed fix, but felt like the issue deserved mention.
If we don't want to add syntax for calendars in the ISO string, then I would prefer to throw TypeError on toString when the calendar slot is not ISO. I really like the string round-trip invariant, and if we silently throw away the calendar field, it could create situations where the programmer unexpectedly loses critical information about the date.
If we want to maintain compatibility with other systems that apparently obey the non-standard [America/NewYork] syntax, that's fine. We can still use [c=blah] for calendar, assuming the time zone identifier won't contain an = sign.
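To make the disambiguation argument concrete, here is a minimal sketch of how a consumer might tell the annotations apart. The function `parseBracketAnnotations` is hypothetical and not part of Temporal; it assumes, as discussed above, that a bare bracket is a time zone and that time zone identifiers never contain an = sign.

```javascript
// Hypothetical helper, not part of Temporal: splits trailing [...] annotations
// off an ISO-like string and classifies each one by its prefix.
function parseBracketAnnotations(input) {
  // Everything before the first "[" is treated as the plain ISO 8601 part.
  const firstBracket = input.indexOf("[");
  const iso = firstBracket === -1 ? input : input.slice(0, firstBracket);

  let timeZone = null;
  let calendar = null;
  const pattern = /\[([^\[\]]*)\]/g;
  let match;
  while ((match = pattern.exec(input)) !== null) {
    const body = match[1];
    if (body.startsWith("c=")) {
      calendar = body.slice(2);
    } else if (body.startsWith("z=")) {
      timeZone = body.slice(2);
    } else {
      // Assumption: a bare identifier is a time zone (the Java-style form).
      timeZone = body;
    }
  }
  return { iso, timeZone, calendar };
}

const parsed = parseBracketAnnotations(
  "2019-12-06T16:23+00:50[America/NewYork][c=hebrew]"
);
console.log(parsed.iso, parsed.timeZone, parsed.calendar);
```

With this approach the bare [America/NewYork] form and the qualified [z=America/NewYork] form yield the same result, so existing time zone annotations would keep working.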
FYI, @kaizhu256 pointed out string serialization as an important use case in #295 (which I closed as a duplicate of this issue).
Meeting Jan. 27: [c=Name] is preferable to [Name] due to the possibility for ambiguity with time zones. Concerns about introducing a new annotation whereas the time zone annotation is semi-widely used already. We have two competing principles; toString() shouldn't lose information but it also shouldn't throw. @littledan to post idea here about splitting toString() and toISOString().
If we used Partial ISO as the default calendar for #292, I might be okay with toString() or toISOString() being lossy, because this would prevent the user from having an implicit behavior change when round-tripping between Temporal object and string.
To follow up concretely on https://github.com/tc39/proposal-temporal/issues/293#issuecomment-578921860 : I think our requirements are:
I want to suggest that we do the following:
[object Temporal.DateTime], etc. This would be annoying but very far from unprecedented in JS.
toISOString method, which converts to the Gregorian/ISO calendar and omits the calendar. This isn't unexpected, as ISO 8601 doesn't include a representation of a calendar. (Alternative: it throws on things which have non-ISO calendars.)
(I would also be fine with just always omitting the calendar and converting to ISO; I'm not convinced that it needs to be a requirement that toString round-trips all the information.)
I'm okay with @littledan's suggestion from an Intl perspective. However, would this have unexpected negative impact on the ergonomics given that toString doesn't produce a useful string? It seems like many JavaScript built-in types like RegExp, Array, etc., all try to produce useful strings on toString.
As mentioned in #517, I think toString is also being used for binary comparison operators < and >, so returning something different would break comparison.
Another catch I realized: how do we represent YearMonth and MonthDay in a calendar-agnostic way in the toString() method?
The data model (#391) is that we store a year/month/day tuple in the ISO calendar under the hood in YearMonth and MonthDay. To illustrate the problem, what is the expected output of the following code, with an explicit non-ISO calendar for MonthDay?
let md = Temporal.MonthDay.from({
calendar: "hebrew",
month: 1,
day: 1
});
console.log(md.toString()); // ???
console.log(md.toISOString()); // ???
Meeting, May 7: About the YearMonth and MonthDay problem mentioned above in https://github.com/tc39/proposal-temporal/issues/293#issuecomment-617361469, we tried to identify all the options, good and bad, and will continue discussing them in this thread.
undefined
'[object Temporal.YearMonth]'
Examples of strings for options 4 and 5; these shouldn't be taken to be proposals for what toString() should actually return in those cases, but they are examples of the information that those strings might contain.
may2020 = Temporal.YearMonth.from({ calendar: 'iso8601', year: 2020, month: 5 })
iyar5780 = Temporal.YearMonth.from({ calendar: 'hebrew', year: 5780, month: 8 })
mayR2 = Temporal.YearMonth.from({ calendar: 'japanese', era: 'reiwa', year: 2, month: 5 })
may7 = Temporal.MonthDay.from({ calendar: 'iso8601', month: 5, day: 7 })
iyar13 = Temporal.MonthDay.from({ calendar: 'hebrew', month: 8, day: 13 })
// 4.i. (calendar-dependent string able to roundtrip)
may2020.toString() === "2020-05"
iyar5780.toString() === "5780-08[c=hebrew]"
mayR2.toString() === "R002-05[c=japanese]"
may7.toString() === "05-07"
iyar13.toString() === "08-13[c=hebrew]"
// 4.ii. (calendar-dependent string not able to roundtrip)
may2020.toString() === "2020-05"
iyar5780.toString() === "5780-08"
mayR2.toString() === "R002-05"
may7.toString() === "05-07"
iyar13.toString() === "08-13"
// 5.i. (internal data model string able to roundtrip)
may2020.toString() === "2020-05-01"
iyar5780.toString() === "2020-04-25[c=hebrew]"
mayR2.toString() === "2020-05-01[c=japanese]"
may7.toString() === "1972-05-07"
iyar13.toString() === "1976-05-13[c=hebrew]"
// 5.ii. (internal data model string not able to roundtrip)
may2020.toString() === "2020-05-01"
iyar5780.toString() === "2020-04-25"
mayR2.toString() === "2020-05-01"
may7.toString() === "1972-05-07"
iyar13.toString() === "1976-05-13"
Meeting, May 21: We'll initially ship the polyfill with a modified version of option 5.i from above, where if the calendar is ISO, then the extra reference field is _not_ printed:
may2020.toString() === "2020-05"
iyar5780.toString() === "2020-04-25[c=hebrew]"
mayR2.toString() === "2020-05-01[c=japanese]"
may7.toString() === "05-07"
iyar13.toString() === "1976-05-13[c=hebrew]"
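The rule decided at this meeting can be sketched as a pair of formatting functions. This is only an illustration: the function names and the field names (`isoYear`, `isoMonth`, `isoDay`, `refISODay`, `refISOYear`) are assumptions made for the sketch, not the polyfill's actual internals, and years outside 0000 to 9999 are not handled.

```javascript
// Hypothetical formatters, assuming each value stores ISO fields plus a
// calendar identifier, as in the data model discussed in this thread.
const pad = (n, width) => String(n).padStart(width, "0");

function yearMonthToString({ isoYear, isoMonth, refISODay, calendar }) {
  const yearMonth = `${pad(isoYear, 4)}-${pad(isoMonth, 2)}`;
  // ISO calendar: the reference day is not printed.
  if (calendar === "iso8601") return yearMonth;
  // Other calendars: print the full ISO date plus the calendar annotation.
  return `${yearMonth}-${pad(refISODay, 2)}[c=${calendar}]`;
}

function monthDayToString({ refISOYear, isoMonth, isoDay, calendar }) {
  const monthDay = `${pad(isoMonth, 2)}-${pad(isoDay, 2)}`;
  // ISO calendar: the reference year is not printed.
  if (calendar === "iso8601") return monthDay;
  return `${pad(refISOYear, 4)}-${monthDay}[c=${calendar}]`;
}

console.log(yearMonthToString({ isoYear: 2020, isoMonth: 4, refISODay: 25, calendar: "hebrew" }));
console.log(monthDayToString({ refISOYear: 1972, isoMonth: 5, isoDay: 7, calendar: "iso8601" }));
```

The ISO fields passed in here are taken directly from the examples above (for instance, 2020-04-25 for the Hebrew YearMonth); the sketch does no calendar arithmetic of its own.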
In addition, we'll revisit this before Stage 3, and in the meantime @sffc will reach out to other organizations to see if we can reach some agreement about the [c=...] annotation that is generally acceptable.
It would be great if the MonthDay syntax were not ambiguous to casual interpretation.
E.g., is 05-07 May 7 or July 5? Your instinct will differ depending on whether you are accustomed to M/D/Y or D-M-Y conventions.
@poulsbo that's a good point, but i'm not sure how to avoid that (don't forget the Y-M-D convention, which strengthens the choice of "M-D")
Comments from some Googlers on the calendar hint syntax:
@mihnita says:
I looked at the doc, and LGTM.
No comments other than "virtual votes" for naming (if I had the right to vote :-)
I would vote against differentiating by case only.
So I would prefer {islamic} or c=... or ca= (with [] or {}).
But not [islamic].
I would vote for [ca=...]. "ca" because I like the consistency with CLDR.
@macchiati says:
I like the ca= notation also.
When there's both a time zone and a calendar on an ISO string, is the calendar annotation always going to be after the time zone? I assume yes since no one is arguing about it above 😊, but wanted to make sure.
I agree it would be best to put the calendar always after the time zone. That way other software that understands the time zone annotation and not the calendar annotation will see the calendar annotation as "junk after the end of the string" but still get the time zone.
ISO-8601 includes the offset, so it makes sense that the IANA timezone is an annotation associated with the offset, adjacent to it in the string.
It could go at the beginning of the string, but @pipobscure pointed out that this would break lexicographic sorting of strings (at least amongst those with positive years).
That therefore leaves the end of the string as the only sensible place to put the calendar annotation.
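The sorting argument can be checked with a small experiment. The prefixed form below is a hypothetical placement used only for comparison; the dates are ISO dates with positive four-digit years, which is the case the argument covers.

```javascript
// Same three ISO dates, annotated at the end versus (hypothetically) at the start.
const suffixed = [
  "2020-05-22[c=hebrew]",
  "2019-12-06[c=japanese]",
  "2020-01-15[c=hebrew]",
];
const prefixed = [
  "[c=hebrew]2020-05-22",
  "[c=japanese]2019-12-06",
  "[c=hebrew]2020-01-15",
];

// Default Array.prototype.sort compares strings by UTF-16 code units.
const sortedSuffixed = [...suffixed].sort();
const sortedPrefixed = [...prefixed].sort();

console.log(sortedSuffixed); // chronological: 2019-12-06, 2020-01-15, 2020-05-22
console.log(sortedPrefixed); // grouped by calendar name; 2019-12-06 sorts last
```

With the annotation at the end, the ISO date dominates the comparison, so plain string sorting stays chronological. With the annotation at the start, the calendar name is compared first and the chronological order is lost.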
This is implemented in the polyfill, and what's left to do is to revisit the form of the annotation based on our conversations with calendar-related organizations. So I'll unassign myself and mark it "feedback".
I'm still opposed to TC39 unilaterally extending ISO formats, but here's another data point in case that is a minority opinion (with bonus emphasis favoring omitting calendar): https://tools.ietf.org/html/rfc7529#page-4
When a Chinese calendar date is shown in text, it will use the format "{C}YYYYMM[L]DD" -- i.e., the same format as Gregorian but with a "{C}" prefix, and an optional "L" character after the month element to indicate a leap month [_following regular month MM, e.g. "{H}577405L08" is 8 Adar I 5774—cf. Section 4.2_]. Similarly, {E} and {H} are used in other examples as prefixes for Ethiopic (Amete Mihret) and Hebrew dates, respectively. The "{}" prefix is used for purely illustrative purposes and never appears in actual dates used in iCalendar or related specifications.
RFC 7529 also introduces SKIP=BACKWARD and SKIP=FORWARD, which may be useful in naming and/or reasoning about disambiguation within Temporal.
I looped in the IETF calsify mailing list on this conversation. Here is the thread:
https://mailarchive.ietf.org/arch/msg/calsify/1JwxQLUlkq07pGZtDpGHIpdrZxU/
Neil Jenkins says:
My view is that this is worth a revision to 3339; this RFC is widely used, but does not currently offer a representation for date-times with time zone or in non-Gregorian calendars. There is a clear demand for this and standardising a format seems worthwhile for interoperability.
Ken Murchison says:
If we're going to parameterize the date/time, why not use something more generic, familiar, and extensible, e.g.:
2020-05-22T07:19:35.356-04:00;tzid=America/Indiana/Indianapolis;cs=islamic-umalqura;foo=bar
or
2020-05-22T07:19:35.356-04:00[tzid=America/Indiana/Indianapolis;cs=islamic-umalqura;foo=bar]
I find this quite encouraging. I’d respond that since Java already supports parsing yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin] we should keep compatibility with that. And extend it with yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin;ca=hebrew;foo=bar]
In short: drop the requirement of prefixing the time zone with tzid= and otherwise go with his suggestion. Precedent for this is in HTTP headers like content-type: text/html;charset=utf-8
I’d respond that since Java already supports parsing yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin] we should keep compatibility with that. And extend it with yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin;ca=hebrew;foo=bar]
This was exactly my thought, but then I tried it in Java and it throws. :-( So does 2017-06-16T21:25:37.258+05:30[Asia/Kolkata][ca=hebrew][foo=bar]. So other than the zone-only format which we should definitely suggest for Java compat, I'm not sure which ca= or foo= format would be best for interop.
My view: since [] is already in common use for time zone IDs, we should follow that same de-facto standard. For non-Gregorian calendars, it seems like we can propose something reasonable, and there hasn't been much pushback.
I agree with Ken's idea of being extensible to future key-value pairs. ca (technically u-ca) is already defined by UTS 35 as the prefix for calendar systems in locale identifiers.
I fear using [Europe/Berlin;ca=hebrew;foo=bar] because of interop concerns, like @justingrant mentioned. Using a separator like ; would at least mean you can slice off the whole string following ; if you are in an environment that doesn't yet understand the new key-value syntax.
A survey of ISO 8601-1 and ISO 8601-2 (2019 versions) does not find instances of ; being yet used in syntax. So, the following form might be safe:
yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin];ca=hebrew;foo=bar
I did find ; used in RFC 5545 in strings such as "FREQ=MONTHLY;BYDAY=MO,TU,WE,TH,FR;BYSETPOS=-1". I don't know if we're concerned about that.
One other note. I think it's probably smart to adopt a strategy such as the one laid out in BCP 47 section 3.7, which explicitly allows other standards bodies to maintain the definitions of the extension keywords. Unicode has the u- prefix, as documented in RFC 6067. Since Unicode maintains the list of calendar names, it makes sense that Unicode should maintain the calendar keyword in the RFC 3339 strings. Like in BCP-47, the extensions can be prefixed with u- for Unicode and x- for Private Use.
This makes my latest iteration:
yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin];u-ca=hebrew;x-foo=bar
I'll post this and the other thoughts above to the IETF email thread.
This makes my latest iteration:
yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin];u-ca=hebrew;x-foo=bar
One nice thing about this format: ergonomic assembly and disassembly with split and join which is easy in other languages that don't have Temporal, or in legacy JS:
['yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin]', 'u-ca=hebrew', 'x-foo=bar'].join(';');
'yyyy-mm-ddThh:mm:ss+01:00[Europe/Berlin];u-ca=hebrew;x-foo=bar'.split(';')[0];
Unfortunately, Java still throws when it sees the first semicolon.
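Building on the split-based disassembly above, here is a minimal sketch of reading the key-value extensions into an object. The function `parseExtendedString` is hypothetical, and the ;key=value syntax is only the iteration proposed in this thread, not a standardized format.

```javascript
// Hypothetical parser for the proposed "base;key=value;key=value" form.
function parseExtendedString(input) {
  // The first segment is the ISO string with its optional [time zone].
  const [base, ...pairs] = input.split(";");
  const extensions = {};
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // skip segments that are not key=value
    extensions[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return { base, extensions };
}

const result = parseExtendedString(
  "2020-05-22T07:19:35+01:00[Europe/Berlin];u-ca=hebrew;x-foo=bar"
);
console.log(result.base);               // "2020-05-22T07:19:35+01:00[Europe/Berlin]"
console.log(result.extensions["u-ca"]); // "hebrew"
```

A consumer that does not understand the extensions can ignore the `extensions` object and pass `base` to its existing parser.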
(A side note, I saw on that IETF thread that combining duration weeks with other units is also an ISO 8601-2 extension in the 2019 edition. I opened #871 to reflect this in our ISO extensions page and documentation.)
Sorry for mass posting but wanted to ensure that we're aware of this:
There is a new project in ISO/TC 154 called ISO 8601-3 proposed by CalConnect last year as the third part of ISO 8601.
The syntax is as follows:
Date: {gre}2019-10-11; {jul}2019-10-11
Time: {utc}09:30:20; {gps}09:30:20
Datetime (Timezone): {uset}20191011T09:30:20; {iso.34200.uset}2019-10-11T09:30:20
This syntax is based on the three ISO standards under process, ISO 34100 (reference time scales), ISO 34200 (timezones) and ISO 34300 (calendars).
In this syntax, values can be namespaced, so for example, the ISO 34100 value of {gps} can also be expressed as {iso.34100.gps}, which allows us to handle codes from other authorities e.g. IANA TZs.
During development of ISO 8601-2 we intentionally avoided the ; semicolon due to VCALENDAR syntax, where the symbol is used to delimit attributes which can contain date-time values. But if compatibility with VCALENDAR/VCARD isn't a concern, it's fine.
I'd say that CalConnect DATETIME and ISO TC 154/WG 5 (Date and time) are willing to work together on a unified syntax, so if we can agree on something that would be great for all users...
Thanks, this is new to me.
{gre}2019-10-11; {jul}2019-10-11
An issue that @pipobscure raised early on about this syntax is that it breaks sort order. By putting the calendar ID at the end instead of the beginning, you can (usually) sort dates by string comparison.
This syntax is based on the three ISO standards under process, ISO 34100 (reference time scales), ISO 34200 (timezones) and ISO 34300 (calendars).
Could you put us in contact with the group working on ISO 34300?
The following table appears in RFC 7529: Non-Gregorian Recurrence Rules in the Internet Calendaring and Scheduling Core Object Specification (iCalendar):
+--------------+--------------------------+
| Hebrew Date  | Gregorian Date           |
+--------------+--------------------------+
| {H}577405L08 | 20140208 - DTSTART value |
| {H}57750608 | 20150227 |
| {H}577605L08 | 20160217 |
| {H}57770608 | 20170306 |
| {H}57780608 | 20180223 |
+--------------+--------------------------+
However, the document says the following about the {H} syntax:
When a Gregorian calendar date is shown in text, it will use the
format "YYYYMMDD", where "YYYY" is the 4-digit year, "MM" the 2-digit
month, and "DD" the 2-digit day (this is the same format used in
iCalendar [RFC5545]). The Chinese calendar will be used as an
example of a non-Gregorian calendar for illustrative purposes. When
a Chinese calendar date is shown in text, it will use the format
"{C}YYYYMM[L]DD" -- i.e., the same format as Gregorian but with a
"{C}" prefix, and an optional "L" character after the month element
to indicate a leap month. Similarly, {E} and {H} are used in other
examples as prefixes for Ethiopic (Amete Mihret) and Hebrew dates,
respectively. The "{}" prefix is used for purely illustrative
purposes and never appears in actual dates used in iCalendar or
related specifications.
@sffc - How are you thinking that non-ISO additional properties (like era in the japanese calendar) should be encoded into a string? I ran across this while working on the "Array Model" chart for #887 and wasn't sure how to handle. I'll probably change the chart to use another calendar like islamic that doesn't use extra properties, but figured it was worth capturing the issue here.
@sffc - How are you thinking that non-ISO additional properties (like era in the japanese calendar) should be encoded into a string?
They aren't. That's the very foundation of the data model we proposed. If you want to represent September 11, Reiwa 2, your string is "2020-09-11,u-ca=japanese". By not representing any calendar-specific fields in the string, we eliminate the need for microsyntaxes specific to each and every calendar.
Got it. So there's no case where a non-ISO field can actually change the ISO content. That's great.
Meeting, Sept. 17: we reaffirmed that we are committed to putting this annotation on a standards track, such as eventually including it in RFC 3339. Until a notation is settled on that is standardized, we'll keep the status quo, noting in the proposal that it's intended to match whatever format gets standardized. Moving this to the "Stage 4" milestone.
@ptomato browsers ship in stage 3, which is when web compatibility would kick in, possibly preventing a change of the format (stage 4 is just when it lands in the spec). Would changing this in the future be possible in a non-breaking way?
I agree. This seems like a Stage 3 thing.
Yes, we have intentionally designed in a forward compatible way so as to allow any future standardisation of this by other bodies.
My thinking was having it on a standards track, but not yet standardized, shouldn't preclude advancement to Stage 3. Since it's noted in the spec that it's intended to match whatever is standardized, presumably browser implementers would be careful about exposing this particular part to the web before that standardization happened.
Generally entire proposals are shipped at once; given that it's forward compatible, it should be fine, but if not, then the whole proposal might want to wait for advancement on it.
I've shared our timeline with the IETF people and I'm hoping we can have a draft proposal (rough equivalent of Stage 3) by the end of the year.