PowerShell: The help function breaks when [Console]::OutputEncoding is set to Unicode

Created on 23 Jan 2018 · 8 Comments · Source: PowerShell/PowerShell

Given that PowerShell is moving to Unicode (see #4681, https://github.com/PowerShell/PowerShell-RFC/issues/71#issuecomment-306614751, etc.), it might be a good idea to fix the help function for this particular case. The relevant code is here:

https://github.com/PowerShell/PowerShell/blob/5b5168d72e0a51679667ec26e31d426b5ab4a122/src/System.Management.Automation/engine/InitialSessionState.cs#L4261-L4264

Known cases where [Console]::OutputEncoding is set to Unicode:

  • PowerShell is launched inside Visual Studio Code
  • PowerShell is launched in a non-Windows environment (even Bash on Ubuntu on Windows)
  • [Console]::OutputEncoding has been manually set to Unicode

Blocks #7233


A possible workaround is to install less for Windows and set $env:PAGER to less.exe, which supports UTF-8 properly, unlike the old more.com.
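A minimal sketch of that workaround, assuming a Windows build of less is already installed and that less.exe can be found via PATH (both are assumptions; how less gets installed is not covered here):

```powershell
# Assumption: less.exe is installed and discoverable via PATH.
$less = Get-Command less.exe -ErrorAction SilentlyContinue

if ($less) {
    # The help function uses the pager named in $env:PAGER instead of more.com.
    $env:PAGER = 'less.exe'
}
else {
    Write-Warning 'less.exe was not found on PATH; help will keep using the default pager.'
}
```

This only affects the current session. To make it stick, the same assignment could go into your PowerShell profile.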

WG-Engine WG-Interactive-HelpSystem

All 8 comments

more has a related comment.

Indeed, we need a replacement for more.com.

Note that "Unicode" in Windows speak often refers to UTF-16LE specifically, not to Unicode encodings in general.
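To make the naming distinction concrete, here is a small sketch comparing the .NET encoding called "Unicode" with UTF-8. The variable names are illustrative only; the sample character is built from its code point so the result does not depend on how the script file itself is encoded.

```powershell
# Build the euro sign (U+20AC) from its code point.
$euro = [string][char]0x20AC

$utf16le = [System.Text.Encoding]::Unicode            # "Unicode" in Windows/.NET speak
$utf8    = [System.Text.UTF8Encoding]::new($false)    # UTF-8 without a BOM

$utf16le.WebName                  # utf-16
$utf8.WebName                     # utf-8

# The same character is encoded with different byte counts.
$utf16le.GetBytes($euro).Length   # 2
$utf8.GetBytes($euro).Length      # 3
```

So "setting [Console]::OutputEncoding to Unicode" and "setting it to UTF-8" are two different configurations, which is why they are tested separately in this thread.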

UTF-8 is PowerShell Core's default, though it is not yet fully implemented in the console / terminal; see #7233 and #7634.

And with UTF-8 also assigned to [console]::InputEncoding (not just [console]::OutputEncoding and $OutputEncoding), as will be the case, more.com goes haywire.

(Curiously, using UTF-16LE output results only in minor misbehavior: characters outside the Basic Latin and Latin-1 Supplement ranges render as ?.)

PS> $OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = [text.utf8encoding]::new($False); 'ü', '€' +  0..5 | more.com
(gobbledygook along the lines of:
ü
€            ♛ᅳ̀耀                                    ♞￑Ѐ耀                                    ♕ᅯԀ耀                                    ♨→؀耀                   ♯○܀耀                                    ♢¥ࠀ耀"C:\WINDOWS\system32\more.com"
)

For the sake of completeness, here is the UTF-16LE case:

PS> $OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = [text.encoding]::unicode; 'ü', '€' +  0..5 | more.com
ü
?
0
1
2
3
4
5

€, which is outside the Latin-1 Supplement Unicode range, rendered as ?.

Since more.com is important to the command-line user experience in general, I wonder if we should make a plea to the WSL / Console team to improve this utility? Heck, maybe, like they did with curl and tar (and ssh), a future drop of Windows could include the GNU less utility. That might be a "bit" much for Windows users given the vi-style key bindings, but as an alternative to more.com, I'd certainly use it. And conveniently, PowerShell allows you to pick a "pager" utility that it will use.

cc @bitcrazed

@rkeithhill The issue is that less is GPL licensed, whereas ssh and tar are BSD licensed and curl is licensed under an MIT/X derivative.

Edit: After inspecting the less source code, I’ve found that it’s also distributable under the less license, which is a BSD‑style license.

OK, well then perhaps an overhaul of more.com or, more likely, a new pager for Windows. It would probably be better to have more typical Windows key bindings anyway, like Ctrl+F for find, etc.

We're keen to add more *NIX command-line tools where we can. Bear with ;)

@bitcrazed There’s also UnxUtils (chocolatey package), which does precisely that, but it’s not been updated since Windows XP, and the patch utility doesn’t have a manifest for disabling the need for elevation.

And many of the other utilities in it have subtle, or not so subtle (e.g. whoami not working), incompatibilities with modern Windows.
