Csswg-drafts: [css-text-3] Should enclosed counting rods / tai xuan jing / yi jing hexagrams be space-discarding?

Created on 23 Apr 2020 · 8 Comments · Source: w3c/csswg-drafts

In #337 we decided to key line-break transformation behavior by Unicode block. Most of the blocks are pretty straightforward: Han, Kana, Yi, and CJK punctuation blocks discard, and everything else converts to a space. But there are a few interesting cases...

One interesting case is a set of symbols that seem to originate primarily in CJK usage:
https://en.wikipedia.org/wiki/Yijing_Hexagram_Symbols_(Unicode_block)
https://en.wikipedia.org/wiki/Taixuanjing
https://en.wikipedia.org/wiki/Counting_Rod_Numerals_(Unicode_block)

Our intent is to discard if it's safe to do so (Chinese / Japanese context) but not otherwise (Korean, English, etc.). Note that we only discard if both sides (before and after) the line break are part of the space-discarding character set.
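
The both-sides rule described above can be sketched roughly as follows. This is a minimal illustration, not the spec's algorithm: the function names (`is_space_discarding`, `transform_line_break`, `join_lines`) are hypothetical, the block list is a small subset chosen for the example rather than the normative set in css-text-3, and other parts of line-break transformation (such as collapsing white space around the break) are ignored.

```python
# Illustrative subset of space-discarding Unicode blocks.
# ASSUMPTION: this is NOT the normative list from css-text-3.
SPACE_DISCARDING_BLOCKS = [
    (0x3000, 0x303F),  # CJK Symbols and Punctuation
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xA000, 0xA48F),  # Yi Syllables
]
# The blocks this issue asks about are deliberately absent here:
#   U+4DC0..U+4DFF   Yijing Hexagram Symbols
#   U+1D300..U+1D35F Tai Xuan Jing Symbols
#   U+1D360..U+1D37F Counting Rod Numerals


def is_space_discarding(ch: str) -> bool:
    """True if the character falls in one of the listed blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SPACE_DISCARDING_BLOCKS)


def transform_line_break(before: str, after: str) -> str:
    """Return what replaces a line break between two characters.

    The break is dropped only when BOTH neighbours are space-discarding;
    otherwise it becomes a single space.
    """
    if is_space_discarding(before) and is_space_discarding(after):
        return ""
    return " "


def join_lines(lines: list[str]) -> str:
    """Join source lines, applying the rule at each line break."""
    if not lines:
        return ""
    out = lines[0]
    for nxt in lines[1:]:
        if out and nxt:
            out += transform_line_break(out[-1], nxt[0])
        else:
            out += " "  # nothing to inspect on one side: keep a space
        out += nxt
    return out


print(join_lines(["日本", "語"]))          # 日本語
print(join_lines(["日本", "abc"]))         # 日本 abc
print(join_lines(["\u4dc0", "\u4dc1"]))    # two hexagrams, space kept
```

With this table, two hexagrams separated by a line break keep a space between them, because U+4DC0..U+4DFF sits just below the CJK Unified Ideographs range and is not listed.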

What should we do with these blocks?

Closed Accepted by CSSWG Resolution Testing Unnecessary Tracked in DoC css-text-3 i18n-clreq i18n-jlreq i18n-klreq i18n-tracker

All 8 comments

From the example pictures submitted to Unicode, none of them use spaces to delimit words, so I would prefer to include these blocks, but I'm fine with leaving them out if others prefer that.

Is it worth keeping the hexagrams’ behavior consistent with the monograms’ and trigrams’ in Miscellaneous Symbols?

> discard if it's safe to do so (Chinese / Japanese context) but not otherwise (Korean, English, etc.).

I wasn't able to find the text in https://drafts.csswg.org/css-text-3/#line-break-transform that indicates how the browser determines whether it's in a CJ context or not.

My current thinking is that it will be important to identify language settings before applying the discard rules.

For example, the counting rods block also contains Western tally marks, and it may be better to keep spaces between those if they appear on either side of a line break in English content.

> I wasn't able to find the text in https://drafts.csswg.org/css-text-3/#line-break-transform that indicates how the browser determines whether it's in a CJ context or not.

For now, it doesn't. It could be changed to take the lang attribute into account if we wanted to introduce some notion of a language-dependent context.

@r12a @frivoal I think the CSSWG wanted to avoid introducing language-dependency for the space-discarding rules.

My take on this, based on @dscorbett’s comment, is to exclude these characters from the space-discarding set. Based on that I propose to close this issue as no change.

Looks good to me.

The CSS Working Group just discussed [css-text-3] Should enclosed counting rods / tai xuan jing / yi jing hexagrams be space-discarding?, and agreed to the following:

  • RESOLVED: Close no change

The full IRC log of that discussion
<dael> Topic: [css-text-3] Should enclosed counting rods / tai xuan jing / yi jing hexagrams be space-discarding?

<dael> github: https://github.com/w3c/csswg-drafts/issues/4993#issuecomment-633723924

<dael> fantasai: Line breaks between these character categories are dropped. Do we include these symbols in that set? Prop in issue is no

<dael> fantasai: Reason is to keep hexagrams consistent with misc symbols block. koji and I think this is good idea, checking with WG. Prop: close no change

<dael> astearns: Richard's opinion?

<dael> fantasai: Mentioned counting rods might be used in western context so keeping space is better idea. That's in favor of no change

<dael> astearns: Other comments?

<dael> astearns: Prop: Close no change to current spec

<florian> +1

<dael> astearns: Anything clarifying?

<dael> fantasai: No, it's an explicit list of codepoints

<dael> RESOLVED: Close no change
