Note that the SCS escape sequence doesn't work in the Linux text console [...]
You're absolutely right here.
I also realized I was wrong with PuTTY. Up to version 0.70 (which I tested) PuTTY didn't support line drawing in UTF-8 (as per Markus Kuhn's recommendation for UTF-8 being stateless). You either have to have a legacy charset, or version 0.71 with "Window -> Translation -> Enable VT100 line drawing even in UTF-8 mode". I now tried the latter, and it indeed converts the underscore to a space.
So it looks like Windows Terminal and VTE are the buggy ones here. I've just filed VTE 157.
Just out of curiosity: are you aware of any application which emits this? Why would any app do so, given that the regular space is also a space? :)
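
To make the underscore behaviour concrete, here is a minimal sketch of SCS handling for the G0 set only. It assumes `ESC ( 0` selects DEC Special Graphics and `ESC ( B` returns to ASCII; the table is deliberately partial, and `translate_scs` is a hypothetical helper name, not code from any of the terminals mentioned. Mapping 0x5F to a plain space follows the PuTTY behaviour described above; other emulators may pick a different blank.

```python
ESC = "\x1b"

# Partial DEC Special Graphics table (illustrative subset, not the full set).
DEC_SPECIAL_GRAPHICS = {
    "_": " ",       # 0x5F: blank (assumed to be a plain space here)
    "`": "\u25c6",  # 0x60: diamond
    "j": "\u2518",  # lower-right corner
    "k": "\u2510",  # upper-right corner
    "l": "\u250c",  # upper-left corner
    "m": "\u2514",  # lower-left corner
    "q": "\u2500",  # horizontal line
    "x": "\u2502",  # vertical line
}


def translate_scs(data: str) -> str:
    """Apply G0 charset switching to a string, returning the printable result."""
    out = []
    graphics = False  # True while G0 is DEC Special Graphics
    i = 0
    while i < len(data):
        if data.startswith(ESC + "(0", i):
            graphics = True
            i += 3
        elif data.startswith(ESC + "(B", i):
            graphics = False
            i += 3
        else:
            ch = data[i]
            out.append(DEC_SPECIAL_GRAPHICS.get(ch, ch) if graphics else ch)
            i += 1
    return "".join(out)
```

With this sketch, an underscore sent while line drawing is active comes out as a blank, while the same byte after `ESC ( B` stays an underscore.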
As for the choice of diamond character, I don't think the width is something that can be "fixed" in the terminal code. I believe the dimensions of an ambiguous width character are decided by the font.
I firmly disagree here. In terminal emulation, apps have to be able to print something and keep track of the cursor, whereas they by design have no idea of the font being used. In many terminals the font can also be changed at runtime and it's absolutely not feasible to then rearrange the cells. In some other cases there is no font at all (e.g. the libvterm headless terminal emulation library, or a detached screen/tmux), or there are multiple fonts at once (a screen/tmux attached from multiple graphical emulators).
The only way to do that is via some external agreement on the number of cells, which is typically the Unicode EastAsianWidth, often accessed via `wcwidth()`. It's not perfect (changes through Unicode versions, has ambiguous characters, etc.) but is still the best we have.
glibc's `wcwidth()` reports 1 for ambiguous width characters, so the de facto standard is that in terminals they are narrow.
If the glyph is wider then the terminal has to figure out what to do. It could crop it (newer versions of Konsole, as far as I know), overflow to the right (VTE), shrink it (Kitty I believe does this), etc.
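
As a rough illustration of a font-independent width agreement, the sketch below derives a cell count from the Unicode East Asian Width property using Python's `unicodedata`. It is a simplification and not glibc's `wcwidth()`: the function names and the `ambiguous_wide` switch are hypothetical, and only combining marks are treated as zero-width.

```python
import unicodedata


def cell_width(ch: str, ambiguous_wide: bool = False) -> int:
    """Approximate the number of terminal cells one code point occupies."""
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0  # combining marks do not advance the cursor
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2  # Wide and Fullwidth take two cells
    if eaw == "A":
        # Ambiguous: narrow by default, matching the de facto standard above
        return 2 if ambiguous_wide else 1
    return 1


def cursor_advance(text: str, ambiguous_wide: bool = False) -> int:
    """Columns the cursor moves after printing text, with no font knowledge."""
    return sum(cell_width(ch, ambiguous_wide) for ch in text)
```

An application and a terminal that both use the same rule agree on the cursor position regardless of which glyph the font eventually supplies.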
_Originally posted by @egmontkob in https://github.com/microsoft/terminal/issues/2049#issuecomment-513588977_
From @egmontkob's note above, and from seeing how some other terminal emulators do this, it looks like this might be the correct choice. There are some affordances in certain projects for supporting "legacy" ambiguous character widths, but by and large terminals have agreed that they should be a single cell wide.
And for what it's worth, here's what I get when I try it:
| before | after |
|-|-|
| (screenshot not preserved) | (screenshot not preserved) |
@DHowett-MSFT how does this play with emoji? Aren't they usually ambiguous, but actually double wide?
Nah, emoji are specifically double-width:
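
The distinction can be checked against the East Asian Width property itself. This is a small sketch using Python's `unicodedata` (so the answer depends on the Unicode version bundled with the interpreter); the sample characters and the `width_class` helper are chosen here for illustration, and not every emoji code point is necessarily classified the same way.

```python
import unicodedata


def width_class(ch: str) -> str:
    """Return the East Asian Width class, e.g. 'A' (ambiguous) or 'W' (wide)."""
    return unicodedata.east_asian_width(ch)


samples = {
    "grinning face emoji": "\U0001F600",
    "black diamond": "\u25c6",
    "circled digit one": "\u2460",
}

for name, ch in samples.items():
    print(f"{name}: U+{ord(ch):04X} -> {width_class(ch)}")
```
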

This is a good approach. It seems to solve part of the Unicode rendering issue, which might fix Chinese/double-width character issues and quite a lot of emoji issues. But I wonder if it only solves some of them, as Unicode 9 will soon be a headache.
VS Code and hyper.js use xterm.js as their terminal engine, and they are working on a similar Unicode handling solution. They had a long history with only a wcwidth-ish solution, and now UTS #51 is a big issue, especially the missing support for Unicode 8/9 (up to the latest, 13) and for Unicode modifiers/sequences.
Also, iTerm2, a popular terminal app on macOS, made a lot of changes years back to support Unicode.
Since terminal/console/WSL is a system app, I hope a more mature and comprehensive solution is discussed, proposed, reviewed and implemented, with room for further extension. Current Unicode support is partial and mostly bugfix-driven.
@DHowett-MSFT
Maybe you can make an option to run WT in “old far east application mode” to keep CP 932/936/949/950 compatibility:
- [...] \ and ~ into ¥ and ‾;
- [...] \ into ₩;

No.
@DHowett-MSFT
So keep all the weird CP things in CONHOST (V1)?
Codepages have proven, almost without exception, to be an unmitigable disaster. They complicate the text buffer, they complicate the handling of DBCS characters, they provide little to no value in modern UTF-8-aware applications.
The codepage stuff will stay on the far side of ConPTY and be rendered to the terminal in nice good and clean UTF-8. :smile:
@DHowett-MSFT Well, what I mean is that some far east console applications may assume that a character's width follows its code page byte count, so turning them into single-width may break these applications (though... you can still throw them into ConHost V1).
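
The assumption described above can be sketched as "width equals encoded byte count". This is only an illustration using Python's built-in `cp932` codec; `legacy_width` is a hypothetical helper, not something ConHost or Windows Terminal exposes.

```python
def legacy_width(ch: str, codepage: str = "cp932") -> int:
    """Width a legacy DBCS application might assume: one cell per encoded byte."""
    return len(ch.encode(codepage))


# The ambiguous-width diamond is a two-byte character in CP932, so such an
# application would expect it to take two cells, while ASCII takes one.
print(legacy_width("\u25c6"))  # double-byte in CP932
print(legacy_width("A"))       # single-byte in CP932
```

Under this rule the diamond is two cells wide, which conflicts with the single-cell treatment discussed earlier in the thread.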
Another issue may include:
①: Many fonts (afaik Pragmata Pro) will make it double-width since they are “complex”.

I get that, but to quote the initial post that spawned this issue:
I firmly disagree here. In terminal emulation, apps have to be able to print something and keep track of the cursor, whereas they by design have no idea of the font being used.
@DHowett-MSFT
Hmmm, can we make use of OpenType tags?
- [...] `hwid` to them, so font makers can switch their glyphs to a narrower one.
- [...] `fwid` instead.

This is somewhat like how UAX #50 works: analyze runs first, then apply `vert` on upright runs and `vrtr` on rotated runs.
:tada:This issue was addressed in #2928, which has now been successfully released as Windows Terminal Preview v0.6.2951.0.:tada: