Before dismissing this as a potential issue with git and not with powershell, please read to the end.
Create an empty git repository (git 2.21.0.windows.1) and put a file in it that contains a German umlaut, e.g. ä:
mkdir gitrep
cd gitrep
git init .
"ä" | Out-File -Encoding utf8 file.txt
git add file.txt
git diff --cached > output.patch
Get-Content output.patch
diff --git a/file.txt b/file.txt
new file mode 100644
index 0000000..8be8316
--- /dev/null
+++ b/file.txt
@@ -0,0 +1 @@
+ä
diff --git a/file.txt b/file.txt
new file mode 100644
index 0000000..8be8316
--- /dev/null
+++ b/file.txt
@@ -0,0 +1 @@
+ä
> $PSVersionTable
Name                           Value
----                           -----
PSVersion                      6.1.3
PSEdition                      Core
GitCommitId                    6.1.3
OS                             Microsoft Windows 10.0.17134
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0
The umlaut does display correctly in the terminal, see the screenshot:

But whenever the output of "git diff" is redirected to a file, the umlaut character becomes garbage. This works without problems in the Windows command line (cmd), which to me indicates that the problem lies within PowerShell.
I have posted a corresponding question on Stack Overflow, but I think this may be a bug that should be brought to attention: https://stackoverflow.com/questions/52205297/the-output-of-git-diff-is-not-handled-correctly-in-powershell
There are related Q&As, but I think this issue is much simpler and easier to reproduce:
https://stackoverflow.com/questions/13675782/git-shell-in-windows-patchs-default-character-encoding-is-ucs-2-little-endian/13751617#13751617
https://stackoverflow.com/questions/36494026/git-diff-does-not-handles-character-encoding-other-than-utf-8
I tested with version 6.1.3 of PowerShell Core and the latest git, and the problem is still exactly the same.
Redirection to file uses Out-File and doesn't allow you to specify the encoding. I believe it should be defaulting to UTF8 w/o BOM in 6.1+
Might I suggest in the meantime using | Set-Content instead of >?
But regardless, whatever's going on here should still be sorted out. 🙂
I tried git diff --cached | Set-Content output.patch with and without --no-pager but the result is the same as with output redirection.
In regular PowerShell 5.1 the results are even stranger: there it makes a difference whether output redirection (>) or Set-Content is used. The result is +ñ with Set-Content and ├ñ with output redirection; both are obviously wrong. In both PowerShell 5.1 and PowerShell Core 6.1, the result looks good when it is printed to the terminal (using --no-pager or setting $Env:LESSCHARSET="utf8").
It works in good old cmd.exe. Still, I noticed that after executing cmd /c "git --no-pager diff --cached > output.patch", viewing the file with Get-Content .\output.patch in the console window looks okay in PowerShell Core 6.1 but wrong in PowerShell 5.1 (ä)
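The ├ñ seen with output redirection is what you would expect if the UTF-8 bytes that git emits for ä were decoded as code page 850. The following is a minimal sketch of that mechanism only, not of what PowerShell does internally; it assumes code page 850 is available through GetEncoding (on PowerShell Core this depends on the code-page provider being registered, which to my knowledge happens at startup). The character is built from its code point so the result does not depend on how the script file itself is saved.

```powershell
# UTF-8 bytes for "ä" (U+00E4): 0xC3 0xA4.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes([string][char]0x00E4)

# Decode those bytes with code page 850 (assumption: the code page is available).
$cp850 = [System.Text.Encoding]::GetEncoding(850)
$misdecoded = $cp850.GetString($utf8Bytes)

# Decoding with the matching encoding round-trips correctly.
$correct = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)

# Show both results side by side.
'cp850 -> {0}' -f $misdecoded
'utf-8 -> {0}' -f $correct
```

If the first line prints ├ñ, the garbage in the patch file comes from decoding with the wrong code page and not from git itself.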
What are your settings for the following values?
[console]::OutputEncoding
[console]::InputEncoding
$OutputEncoding
InputEncoding
> [console]::OutputEncoding
Preamble :
BodyName :
EncodingName : Western European (DOS)
HeaderName :
WebName : ibm850
WindowsCodePage :
IsBrowserDisplay :
IsBrowserSave :
IsMailNewsDisplay :
IsMailNewsSave :
IsSingleByte : True
EncoderFallback : System.Text.InternalEncoderBestFitFallback
DecoderFallback : System.Text.InternalDecoderBestFitFallback
IsReadOnly : False
CodePage : 850
> [console]::InputEncoding
Preamble :
BodyName :
EncodingName : Western European (DOS)
HeaderName :
WebName : ibm850
WindowsCodePage :
IsBrowserDisplay :
IsBrowserSave :
IsMailNewsDisplay :
IsMailNewsSave :
IsSingleByte : True
EncoderFallback : System.Text.InternalEncoderBestFitFallback
DecoderFallback : System.Text.InternalDecoderBestFitFallback
IsReadOnly : True
CodePage : 850
> $OutputEncoding
Preamble :
BodyName : utf-8
EncodingName : Unicode (UTF-8)
HeaderName : utf-8
WebName : utf-8
WindowsCodePage : 1200
IsBrowserDisplay : True
IsBrowserSave : True
IsMailNewsDisplay : True
IsMailNewsSave : True
IsSingleByte : False
EncoderFallback : System.Text.EncoderReplacementFallback
DecoderFallback : System.Text.DecoderReplacementFallback
IsReadOnly : True
CodePage : 65001
InputEncoding - I don't know
Oops, forgot the $ on that last one, sorry. Should be $InputEncoding
$InputEncoding returns nothing
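For a quicker comparison than the full property dumps, the names of the relevant encodings can be printed side by side. This is only a sketch; the property names on the summary object are made up for illustration. With the settings reported above it should show ibm850 for both console encodings and utf-8 for $OutputEncoding, which is the mismatch under discussion.

```powershell
# Collect the encoding names in one object (property names are illustrative).
$summary = [pscustomobject]@{
    ConsoleOutputEncoding = [Console]::OutputEncoding.WebName
    ConsoleInputEncoding  = [Console]::InputEncoding.WebName
    OutputEncodingVar     = $OutputEncoding.WebName
}

$summary | Format-List
```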
The output comes from less, so try setting
$env:LESSCHARSET='UTF-8'
@powercode Could you reproduce the issue?
The problem does not seem to be related to pagers. You can always use the --no-pager option in git, which still shows the problem.
For me, the combination of setting LESSCHARSET and [console]::OutputEncoding to utf8 worked.
@powercode That did it!
Setting [console]::OutputEncoding = [System.Text.Encoding]::UTF8 solved the issue. LESSCHARSET is not needed. I would think that UTF8 should be the default these days, but okay.
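If changing the console encoding for the whole session is not wanted, the setting can be scoped to a single command and restored afterwards. The following is a minimal sketch; Invoke-WithUtf8Console is a hypothetical helper name, not something PowerShell or git provides.

```powershell
# Hypothetical helper: run a script block with [Console]::OutputEncoding
# set to UTF-8, then restore the previous value even if the block fails.
function Invoke-WithUtf8Console {
    param(
        [Parameter(Mandatory)]
        [scriptblock] $ScriptBlock
    )

    $previous = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = [System.Text.Encoding]::UTF8
        & $ScriptBlock
    }
    finally {
        [Console]::OutputEncoding = $previous
    }
}

# Usage, assuming git is on the PATH and the current directory is a repository:
# Invoke-WithUtf8Console { git --no-pager diff --cached } | Set-Content output.patch
```

This only affects how PowerShell decodes git's output. Which encoding > or Set-Content then uses to write the file is a separate setting and differs between Windows PowerShell 5.1 and PowerShell Core.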
This fixed it for me too. Thanks!
This has bugged me for YEARS, and setting $env:LESSCHARSET='UTF-8' fixed it in git log output for me (e.g. author name).
It's still not perfect, though. When using
git log -1 --show-signature
to show a GPG-signed commit, German umlauts (and line breaks, it seems) are not displayed correctly, but I can live with that. The line-break issue may come from using gpg4win ¯\_(ツ)_/¯
The behavior is the same for PowerShell 5, PowerShell 7, the VS Code integrated PowerShell terminal, and Git Bash for Windows (MINGW64).