PowerShell: Console does not print Unicode correctly

Created on 6 Dec 2018  ·  13 Comments  ·  Source: PowerShell/PowerShell

I'm trying to print some Unicode characters in the (Admin) PowerShell console.
However, the default fonts available in the PS console don't seem to contain glyphs for the characters I am trying to print.

I am using the next-to-latest PowerShell Core, 6.1.1.

$ $PSVersionTable

Name                           Value
----                           -----
PSVersion                      6.1.1
PSEdition                      Core
GitCommitId                    6.1.1
OS                             Microsoft Windows 6.3.9600
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

> [Console]::OutputEncoding


Preamble          :
BodyName          :
EncodingName      : OEM United States
HeaderName        :
WebName           : ibm437
WindowsCodePage   :
IsBrowserDisplay  :
IsBrowserSave     :
IsMailNewsDisplay :
IsMailNewsSave    :
IsSingleByte      : True
EncoderFallback   : System.Text.InternalEncoderBestFitFallback
DecoderFallback   : System.Text.InternalDecoderBestFitFallback
IsReadOnly        : False
CodePage          : 437



> [Console]::InputEncoding


Preamble          :
BodyName          :
EncodingName      : OEM United States
HeaderName        :
WebName           : ibm437
WindowsCodePage   :
IsBrowserDisplay  :
IsBrowserSave     :
IsMailNewsDisplay :
IsMailNewsSave    :
IsSingleByte      : True
EncoderFallback   : System.Text.InternalEncoderBestFitFallback
DecoderFallback   : System.Text.InternalDecoderBestFitFallback
IsReadOnly        : True
CodePage          : 437



$ Get-Variable OutputEncoding

Name                           Value
----                           -----
OutputEncoding                 System.Text.UTF8Encoding

$ Get-ItemProperty -Path $key

# Output Pastes HERE as: 

932          : *MS ゴシック
949          : *굴림체
00           : Consolas
0            : Lucida Console
950          : *細明體
936          : *新宋体
PSPath       : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont
PSParentPath : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console
PSChildName  : TrueTypeFont
PSDrive      : HKLM
PSProvider   : Microsoft.PowerShell.Core\Registry

But it looks like this in the console:
image

Steps to reproduce

Run this python script:

import os, sys
import curses

def isColorCapable():
    # Return True if the output console is color capable.
    # This is not an easy problem, and detection often fails on Windows
    # for all sorts of technical reasons.
    try:
        curses.setupterm()
    except Exception:
        return False
    return curses.tigetnum('colors') > 0

isColorTerm = isColorCapable()
def color(text, color_code):
    if not isColorTerm:
        return text
    return '\x1b[%dm%s\x1b[0m' % (color_code, text)

def red(text):    return color(text, 31)
def green(text):  return color(text, 32)
def yellow(text): return color(text, 33)
def blue(text):   return color(text, 34)
def purple(text): return color(text, 35)
def cyan(text):   return color(text, 36)

cc = u'\u2585' # Unicode Character for a "box" (U+2585)
print("\nPrinting the Unicode character for a \"box\" (U+2585): \"%s\"" % cc)
print( " {} = RED box".format(red(cc)) )
print( " {} = Green box".format(green(cc)) )
print( " {} = Yellow box".format(yellow(cc)) )
print( " {} = Blue Box!".format(blue(cc)) )
print( " {} = Magenta box".format(purple(cc)) )
print( " {} = Cyan box\n".format(cyan(cc)) )

$ python3.6m.exe .\pycol.py
# (See picture)

$ powershell -c "write-host -fore Cyan This is Cyan text"
# This is correct color
This is Cyan text

> powershell -c "'?[1;31mRed ?[32mGrn ?[33mYel ?[35mMag ?[36mCya ?[m'.Replace('?', [char]27);"
# This is not
←[1;31mRed ←[32mGrn ←[33mYel ←[35mMag ←[36mCya ←[m

$ [char]0x2585
# and neither is this; it shows up as a "[?]" (although it renders fine when pasted into this issue)
▅

image

Expected behavior

image

Here is the correct definition/rendering of U+2585.

Issue-Question Resolution-Answered

All 13 comments

This is not under PowerShell's control: you need to pick a font for the console window that can render these characters; the preinstalled TrueType fonts don't all support _all_ Unicode characters.

One that would work in your case is NSimSun.

Additionally:

  • If you want Python to be able to use VT (Virtual Terminal) escape sequences, you must turn on support for them via the registry; to do it globally for all console windows, run
    Set-ItemProperty HKCU:\Console VirtualTerminalLevel -Type DWORD 1

  • For your Python script to actually produce colored output (once enabled via the registry), I had to make isColorCapable() return True _unconditionally_ - the color support was otherwise not recognized.
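
As a per-process alternative to the registry setting, a script can try to switch on VT processing for its own console. The following is only a sketch: `enable_vt_output` is a hypothetical helper name, and it assumes the standard Win32 calls `GetStdHandle`, `GetConsoleMode`, and `SetConsoleMode` reached through `ctypes`. It returns False on non-Windows systems or whenever the console rejects the flag.

```python
import ctypes
import os

STD_OUTPUT_HANDLE = -11                      # Win32 constant for the stdout handle
ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004  # console output mode flag


def enable_vt_output():
    """Try to enable VT escape-sequence processing for this process's console.

    Hypothetical helper; returns True only if the console accepted the flag.
    """
    if os.name != "nt":
        return False  # nothing to do outside Windows

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [ctypes.c_ulong]
    kernel32.GetStdHandle.restype = ctypes.c_void_p
    kernel32.GetConsoleMode.argtypes = [ctypes.c_void_p,
                                        ctypes.POINTER(ctypes.c_ulong)]
    kernel32.SetConsoleMode.argtypes = [ctypes.c_void_p, ctypes.c_ulong]

    handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    mode = ctypes.c_ulong()
    # GetConsoleMode fails when stdout is redirected or is not a console.
    if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
        return False
    new_mode = mode.value | ENABLE_VIRTUAL_TERMINAL_PROCESSING
    return bool(kernel32.SetConsoleMode(handle, new_mode))


if __name__ == "__main__":
    if enable_vt_output():
        print("\x1b[32mVT sequences enabled\x1b[0m")
    else:
        print("VT sequences not available on this console")
```

A script like the one in the issue could call such a helper instead of relying on curses to decide whether to emit color codes. As far as I know, consoles that predate Windows 10 do not accept this flag at all, in which case the function simply returns False.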

This is not under PowerShell's control: you need to pick a font for the console window that can render these characters; the preinstalled TrueType fonts don't all support all Unicode characters.

I would argue that it is PowerShell (or its developers) that decides which fonts to use by default, so it's quite surprising that the ones chosen barely support any of these characters. It's too bad that one should have to resort to horrible hacks like having to recompile and manually install "proper" fonts, like DejaVu.

I managed to resolve this issue only by installing that font! (This IMO should not have to be the case.)

One that would work in your case is NSimSun.

I just tried installing that, following the standard procedure, but it doesn't show up in the console settings.
Perhaps it is not compatible with the console? Are there new procedures for doing this?

If you want Python to be able to use VT (Virtual Terminal) escape sequences, you must turn support for it on via the registry; do it globally for all console windows...

This is funny, because although I have done that, the following output from ConoutMode seems to indicate that it is not picked up... Very confusing, indeed.

$VTkey = 'HKCU:\Console'
Get-ItemProperty -Path $VTkey |grep VirtualTerminalLevel
# VirtualTerminalLevel   : 1

# .\ConinMode.exe
mode: 0x1f
ENABLE_PROCESSED_INPUT        0x0001 ON
ENABLE_LINE_INPUT             0x0002 ON
ENABLE_ECHO_INPUT             0x0004 ON
ENABLE_WINDOW_INPUT           0x0008 ON
ENABLE_MOUSE_INPUT            0x0010 ON
ENABLE_INSERT_MODE            0x0020 off
ENABLE_QUICK_EDIT_MODE        0x0040 off
ENABLE_EXTENDED_FLAGS         0x0080 off
ENABLE_AUTO_POSITION          0x0100 off
ENABLE_VIRTUAL_TERMINAL_INPUT 0x0200 off   <===

# .\ConoutMode.exe
mode: 0x3
ENABLE_PROCESSED_OUTPUT            0x0001 ON
ENABLE_WRAP_AT_EOL_OUTPUT          0x0002 ON
ENABLE_VIRTUAL_TERMINAL_PROCESSING 0x0004 off   <====
DISABLE_NEWLINE_AUTO_RETURN        0x0008 off
ENABLE_LVB_GRID_WORLDWIDE          0x0010 off
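
The mode values reported above are bit masks, so they can be decoded by hand or with a few lines of Python. This is a minimal sketch; `decode_mode` is a hypothetical helper, and the flag table simply restates the values listed in the ConoutMode output.

```python
# Console output mode flags, as listed in the ConoutMode output above.
CONSOLE_OUTPUT_FLAGS = {
    0x0001: "ENABLE_PROCESSED_OUTPUT",
    0x0002: "ENABLE_WRAP_AT_EOL_OUTPUT",
    0x0004: "ENABLE_VIRTUAL_TERMINAL_PROCESSING",
    0x0008: "DISABLE_NEWLINE_AUTO_RETURN",
    0x0010: "ENABLE_LVB_GRID_WORLDWIDE",
}


def decode_mode(mode, flags=CONSOLE_OUTPUT_FLAGS):
    """Return the names of all flags whose bit is set in mode."""
    return [name for bit, name in flags.items() if mode & bit]


print(decode_mode(0x3))  # the value reported above
print(decode_mode(0x7))  # what the value would be with VT processing on
```

A mode of 0x3 sets only the two lowest bits, which matches the listing: bit 0x0004 (VT processing) is not set for that console.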

For your Python script to actually produce colored output (once enabled via the registry), I had to make isColorCapable() return True unconditionally - the color support was otherwise not recognized.

:+1: Hmm, that's interesting too. Which Python version are you using?

BTW. What does this mean??

• The font must be FF_MODERN for TrueType fonts.
• The font must be OEM_CHARSET for non TrueType font.

I would argue that it is PowerShell (developers) that decide what fonts to use by-default,

Still, it is a feature of the Windows Console (conhost.exe), not any given shell's, so please discuss the issue with the Windows Console team.

I don't know ConoutMode, so I can't speak to why it doesn't see the VT sequences as enabled, but de facto they do work once the registry is patched; e.g.,
python -c 'print (\"\x1b[32mword\x1b[m up\")' prints word in green.

I just tried installing that [NSimSun]

On my Microsoft Windows 10 Pro (64-bit; Version 1803, OS Build: 17134.407), this font is available _by default_.

Re Python version: Mine is 3.7.0

Re FF_MODERN and OEM_CHARSET: My - superficial - understanding is that these attributes signal basic suitability as a console font by indicating support for the characters in the legacy OEM code pages.

FF_MODERN is a property of the font family, which describes the "look" of the font in general terms. It means the font has a non-variable stroke width. Most (but not all) fixed-width fonts are also modern. See the docs.
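
To make the family attribute concrete: in a Win32 LOGFONT, the `lfPitchAndFamily` byte packs the pitch into the two low bits and the family into the high nibble. The sketch below decodes such a byte using the constant values from `wingdi.h`; `split_pitch_and_family` is a hypothetical helper name, and the example values are made up for illustration rather than read from any real font.

```python
# Pitch values occupy the two low-order bits of lfPitchAndFamily.
PITCHES = {0: "DEFAULT_PITCH", 1: "FIXED_PITCH", 2: "VARIABLE_PITCH"}

# Family values occupy bits 4-7 (the high nibble).
FAMILIES = {
    0x00: "FF_DONTCARE",
    0x10: "FF_ROMAN",
    0x20: "FF_SWISS",
    0x30: "FF_MODERN",
    0x40: "FF_SCRIPT",
    0x50: "FF_DECORATIVE",
}


def split_pitch_and_family(value):
    """Split an lfPitchAndFamily byte into (pitch name, family name)."""
    pitch = PITCHES.get(value & 0x03, "unknown")
    family = FAMILIES.get(value & 0xF0, "unknown")
    return pitch, family


print(split_pitch_and_family(0x31))  # fixed pitch, modern family
print(split_pitch_and_family(0x22))  # variable pitch, swiss family
```

Note that the family is a small number stored in the high nibble, not a set of independent bits, so combining two family constants with a bitwise OR does not describe a font that belongs to both families.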

You _can_ set your console font to any font (including non-fixed width fonts) _via the API_. I don't recommend using non-fixed-width fonts, because they break anything like Format-Table 😉

You can use my WindowsConsoleFonts module to set fonts that don't show up in the property dialog, if you want to -- or to see what sorts of contortions you'll need to figure out to do it yourself.

Having said all that:

The East Asian characters in the output of your Get-ItemProperty are the names of the fonts that you should use if you want to see the characters 😉 -- probably DotumChe and GulimChe (Korean), MingLiU, NSimSun, SimSun-ExtB (Chinese). They're all FF_MODERN | FF_ROMAN fonts (or at least, they claim to be) and are usable in the console through the regular property dialog. They support a lot more extended characters (basically just Chinese).

Consolas also has a full set of Light, Heavy, and Double box characters. It doesn't have any support for Chinese, Japanese or Korean...

However, if you have an East Asian environment chosen in Windows, there are additional caveats

However, I have to agree with @E3V3A on this: PowerShell is responsible for the default font choice. It chooses the "Lucida Console" font, even when the conhost default is "Consolas" ... The font is stored _in the shortcut and the registry_, 😒 along with the color scheme 😡

However, the notion that we should use a unifont by default is preposterous. Unifonts (Asian fonts, in general) are HUGE and most people can't actually read every language script anyway. The original GNU Unifont is about 11,998 KB of font -- some of the Microsoft East Asian fonts mentioned above are as large as 30,409 KB -- while Consolas is about 1,693 KB.

P.S. If you want to use a unifont, pick one that's _merged_ with a decent console font, like this one, merged with DejaVu Sans. If you don't need all that, you can try NerdFonts which adds vector icons to fixed-width fonts...

Yeah, I have spent half the day finding out what is going on. The problem I have is really only with printing a few glyphs. The glyph I chose above is outside the IBM 437 OEM_CHARSET; if I had used another box character from that page, it would have worked with any console font. However, one always wonders why there aren't more glyphs...
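
The point about code page 437 can be checked from Python, whose `cp437` codec mirrors the OEM code page reported by `[Console]::OutputEncoding` above. This is a small sketch with a hypothetical helper, `in_code_page`. It only tests the code-page mapping, not whether a particular font contains the glyph.

```python
def in_code_page(ch, codec="cp437"):
    """Return True if the character can be encoded in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# U+2585 is the box from the issue; U+2584 and U+2588 are neighbouring
# block elements that are part of code page 437.
for ch in ("\u2585", "\u2584", "\u2588"):
    status = "in" if in_code_page(ch) else "not in"
    print("U+{:04X} is {} cp437".format(ord(ch), status))

# With a lossy error handler, an unencodable character degrades to "?".
fallback = "\u2585".encode("cp437", errors="replace")
print(fallback)
```

The lower half block (U+2584) and the full block (U+2588) are in the code page, while U+2585 is not, which is consistent with the character degrading to a question mark when text passes through a 437-encoded channel.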

The thing that is still an unknown, is the definition of the statement constant stroke width. What does that really mean?

Anyway, I really appreciate your effort in trying to help here. :) However, I don't think there is much more to be done about this issue until this ConPty stuff actually gets more stable and is implemented widely. The other mega issue is how color and color-escape codes are handled on different terminals.

At the end of the day, it would have been nice if the pwsh console would just allow you to select whatever compatible font you have, so the user doesn't have to bother with all these registry hacks etc. Thanks, @Jaykul, for trying to simplify that with your tool, which I have yet to try.

Thanks, @Jaykul - great info.

PowerShell is responsible for the default font choice

Good point, but in terms of the _scripts_ (writing systems, alphabets, i.e., sets of characters) that Lucida Console supports, it's the same as what Consolas supports, according to https://en.wikipedia.org/wiki/List_of_typefaces_included_with_Microsoft_Windows: Latin, Greek, Cyrillic

As for what fonts are _available_, that's definitely not PowerShell's responsibility.

Yeah, they technically support the same scripts, but ... in every subrange, Consolas has more characters than Lucida. In each supported subrange, Consolas has more than double the character glyphs 😉 ...
image
image

Interesting, @Jaykul - I had no idea.

So, is it worth opening an issue to have the installer configure Consolas as the default font, assuming it is a true superset of the Lucida Console font?

@Jaykul
Any ideas how to make a merged font like the one you linked?
Both the Unifont and DejaVu it's based on are a bit dated. I'd like to provide a new release.

Apparently there is something called a fontlink in the registry that can be used to link different fonts together. So if I understood it correctly, you can use this to give a certain font a kind of search path to other fonts, to be consulted when a glyph is not available... Has anyone tried this?

I would suggest continuing this discussion in this issue https://github.com/Microsoft/console/issues/226
