I have a Windows executable that produces unicode (utf-16) output. In PowerShell 5.1, I can set the [Console]::OutputEncoding property so the output of that command gets correctly interpreted. On PowerShell Core 6.2.3, that doesn't appear to work.
I've also tried setting [Console]::InputEncoding and $OutputEncoding, but the problem persists.
For example, I use the wsl.exe binary here, so this should repro on any system that has the Windows Subsystem for Linux installed.
[Console]::OutputEncoding = [System.Text.Encoding]::Unicode
wsl.exe --list -v | ForEach-Object { $_ }
Expected output (as produced by PowerShell 5.1):
PS C:\Users\svgroot> [Console]::OutputEncoding = [System.Text.Encoding]::Unicode
PS C:\Users\svgroot> wsl.exe --list -v | ForEach-Object { $_ }
NAME STATE VERSION
* Ubuntu Stopped 2
Ubuntu-18.04 Stopped 2
Alpine Stopped 1
Actual output (PowerShell Core 6.2.3):
PS C:\Users\svgroot> [Console]::OutputEncoding = [System.Text.Encoding]::Unicode
PS C:\Users\svgroot> wsl.exe --list -v | ForEach-Object { $_ }
N A M E S T A T E V E R S I O N
* U b u n t u S t o p p e d 2
U b u n t u - 1 8 . 0 4 S t o p p e d 2
A l p i n e S t o p p e d 1
Name Value
---- -----
PSVersion 6.2.3
PSEdition Core
GitCommitId 6.2.3
OS Microsoft Windows 10.0.19001
Platform Win32NT
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion 2.3
SerializationVersion 1.1.0.1
WSManStackVersion 3.0
That's not good - the bug is still present as of PowerShell Core 7.0.0-preview.4.
Here's a repro that doesn't require WSL:
[Console]::OutputEncoding = [text.encoding]::unicode; sfc /? | Write-Output
As a Pester test:
[Console]::OutputEncoding = [text.encoding]::unicode; sfc /? | Write-Output | Should -Not -Match "`0"
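The one-liner above can be wrapped in a full Pester test that also restores the console encoding afterwards. This is only a sketch: it assumes Pester 4 or later (for the `Should -Not -Match` syntax) and a Windows machine where `sfc.exe` is available. The `Describe` and `It` names are made up for illustration.

```powershell
Describe 'Decoding UTF-16 output from a native command' {
    It 'produces no NUL characters when [Console]::OutputEncoding is Unicode' {
        # Remember the current console encoding so the test leaves no side effects.
        $previous = [Console]::OutputEncoding
        try {
            [Console]::OutputEncoding = [System.Text.Encoding]::Unicode
            # Each output line is checked; a NUL means the UTF-16 bytes were mis-decoded.
            sfc /? | Write-Output | Should -Not -Match "`0"
        }
        finally {
            [Console]::OutputEncoding = $previous
        }
    }
}
```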
Has this ever worked in PowerShell Core, or has it been broken ever since 6.0.0?
@vexx32: It's also broken in 6.0.0.
This is due to a breaking change in .NET Core. You should initialize ProcessStartInfo.StandardInputEncoding/StandardErrorEncoding/StandardOutputEncoding if they're redirected. .NET Framework defaults to using Console.OutputEncoding if you don't initialize StandardOutputEncoding, but .NET Core defaults to calling Process.GetEncoding((int)Interop.Kernel32.GetConsoleOutputCP()), which is UTF-8 (on my system).
This is the code that creates ProcessStartInfo: https://github.com/PowerShell/PowerShell/blob/master/src/System.Management.Automation/engine/NativeCommandProcessor.cs#L1088-L1150
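To illustrate why the wrong default encoding produces the "N A M E" output in the original report, here is a small sketch that needs no external program. It encodes a string as UTF-16LE and decodes the same bytes as UTF-8. The variable names are only illustrative.

```powershell
# "NAME" encoded as UTF-16LE: every ASCII character is followed by a 0x00 byte.
$bytes = [System.Text.Encoding]::Unicode.GetBytes('NAME')

# Decoding those bytes as UTF-8 (the wrong encoding) turns each 0x00 byte
# into a NUL character, which the console renders as a blank.
$wrong = [System.Text.Encoding]::UTF8.GetString($bytes)

# Decoding with the matching encoding recovers the original text.
$right = [System.Text.Encoding]::Unicode.GetString($bytes)

$wrong.Length               # 8
$wrong.IndexOf([char]0)     # 1
$right.Length               # 4
```

This is also why the Pester test earlier in the thread checks for "`0": the NUL characters are the visible symptom of the encoding mismatch.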
If I understand correctly, then, a fix should be to set the ProcessStartInfo.StandardInput(/Output)Encoding to match [console]::Input(/Output)Encoding values explicitly?
Should this respect [console] encoding settings, or $OutputEncoding? From what I recall, those values don't always align, if I'm not mistaken?
I don't know if it should use $OutputEncoding or Console.OutputEncoding, but the code would be something like this:
C#
bool redirectStdOut = true;
bool redirectStdErr = true;
bool redirectStdIn = false;
var startInfo = new ProcessStartInfo();
if (redirectStdOut)
{
    startInfo.RedirectStandardOutput = true;
    startInfo.StandardOutputEncoding = Console.OutputEncoding;
}
if (redirectStdErr)
{
    startInfo.RedirectStandardError = true;
    startInfo.StandardErrorEncoding = Console.OutputEncoding;
}
if (redirectStdIn)
{
    startInfo.RedirectStandardInput = true;
    startInfo.StandardInputEncoding = Console.InputEncoding;
}
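The same idea can be tried directly from PowerShell by driving System.Diagnostics.Process yourself and setting StandardOutputEncoding explicitly. This is a rough sketch and not what PowerShell does internally. The function name Get-NativeOutput and its parameters are hypothetical.

```powershell
# Hypothetical helper: run a native command and decode its stdout with an explicit encoding.
function Get-NativeOutput {
    param(
        [string] $FilePath,
        [string] $Arguments,
        [System.Text.Encoding] $Encoding
    )
    $startInfo = [System.Diagnostics.ProcessStartInfo]::new($FilePath, $Arguments)
    $startInfo.UseShellExecute = $false          # required for stream redirection
    $startInfo.RedirectStandardOutput = $true
    $startInfo.StandardOutputEncoding = $Encoding

    $process = [System.Diagnostics.Process]::Start($startInfo)
    try {
        # Read everything before waiting so a full pipe buffer cannot block the child.
        $text = $process.StandardOutput.ReadToEnd()
        $process.WaitForExit()
        $text
    }
    finally {
        $process.Dispose()
    }
}

# Example (Windows with WSL installed):
# Get-NativeOutput -FilePath 'wsl.exe' -Arguments '--list -v' -Encoding ([System.Text.Encoding]::Unicode)
```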
Actually to match PS 5.1 behavior it should not use $OutputEncoding
Agreed, @0xd4d: I don't know how .StandardInput comes into play, but on the output side it should definitely be [Console]::OutputEncoding, because that is how it has always worked in Windows PowerShell, where it determines how PowerShell decodes stream output _from_ external programs.
$OutputEncoding controls what encoding is used to send data from PowerShell _to_ external programs, via a pipe. It defaults to UTF-8 in PSCore and to ASCII(!) in WinPS. In either edition it can differ from [Console]::OutputEncoding.
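To see the two settings side by side in your own session, something like the following can be run. The values printed depend on the PowerShell edition and the system's console code page, so no particular output is assumed here.

```powershell
# Encoding PowerShell uses when piping data TO an external program's stdin.
$toExternal = $OutputEncoding.WebName

# Encoding PowerShell uses to decode output received FROM an external program.
$fromExternal = [Console]::OutputEncoding.WebName

"To external programs:   $toExternal"
"From external programs: $fromExternal"
```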
Thanks for looking into this, everyone. Hopefully this can get fixed soon.
Yep. Note that C:\Windows\system32\sfc.exe in Windows 10 outputs utf-16. It's a powershell question that comes up occasionally.
@mklement0 I guess that would mean .StandardInputEncoding should match $OutputEncoding, then? On the assumption that we may be piping _into_ such a command as well.
@vexx32 I've only glanced at the code, and I see that the pipe that is connected to the child process' stdin explicitly uses $OutputEncoding.
I don't fully understand how that relates to the default .StandardInput encoding - it looks like it may override it.
@SvenGroot do you have any examples on the input side where we have issues with encoding?
@SteveL-MSFT No, I only use OutputEncoding in my scenario.
Here's my _guess_ as to what we should do:
* When piping data from PowerShell to an external process, it is $OutputEncoding that already drives the standard input encoding for the child process (no change there - this was never broken).
* When _not_ piping (starting an interactive console application, for instance), i.e. when stdin is _not_ redirected, we should set .StandardInput to [Console]::InputEncoding.
* Whether redirected or not, .StandardOutput should always be set to [Console]::OutputEncoding.
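The three points above can be summarized as a small decision sketch. This only restates the proposal and is not PowerShell's actual implementation. The function name Get-ProposedChildEncoding and the property names on the returned object are hypothetical.

```powershell
# Hypothetical helper that restates the proposal as code.
function Get-ProposedChildEncoding {
    param(
        # True when PowerShell pipes data into the child process.
        [bool] $StdinRedirected
    )
    [pscustomobject]@{
        # Piped input keeps using $OutputEncoding; otherwise use the console's input encoding.
        StandardInput  = if ($StdinRedirected) { $OutputEncoding } else { [Console]::InputEncoding }
        # Output is always decoded with the console's output encoding.
        StandardOutput = [Console]::OutputEncoding
    }
}

$piped = Get-ProposedChildEncoding -StdinRedirected $true
$interactive = Get-ProposedChildEncoding -StdinRedirected $false
```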
@mklement0 when not piping, what is the value of setting .StandardInput to any encoding? For your 3rd bullet, I believe you meant [Console]::OutputEncoding. For my PR, I'm focusing on output only unless someone brings a case where input encoding is a problem.
@SteveL-MSFT: Thanks for the correction re 3rd bullet point - I've fixed my previous comment.
> what is value of setting .StandardInput to any encoding?
My thinking is: An interactive console application that reads from stdin probably expects the _console's_ (terminal's) input encoding to be in effect (that's presumably how it works in Windows PowerShell).
Glad to see this was fixed for .StandardOutput.
As for setting .StandardInput to [Console]::InputEncoding: please see #10907, @SteveL-MSFT.
:tada: This issue was addressed in #10824, which has now been successfully released as v7.0.0-preview.6. :tada:
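With the fix in place (or in Windows PowerShell 5.1), a common pattern is to switch [Console]::OutputEncoding only for the one command that emits UTF-16 and restore it afterwards. A minimal sketch follows. The function name Invoke-WithConsoleEncoding is hypothetical and not part of PowerShell.

```powershell
# Hypothetical helper: run a script block with a temporary console output encoding.
function Invoke-WithConsoleEncoding {
    param(
        [System.Text.Encoding] $Encoding,
        [scriptblock] $ScriptBlock
    )
    $previous = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = $Encoding
        & $ScriptBlock
    }
    finally {
        # Always restore, even if the command fails.
        [Console]::OutputEncoding = $previous
    }
}

# Example (Windows with WSL installed):
# Invoke-WithConsoleEncoding ([System.Text.Encoding]::Unicode) { wsl.exe --list -v }
```

Restoring the previous value matters because most other console programs do not emit UTF-16, and their output would be mis-decoded if the setting were left at Unicode.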