Git: Japanese text displayed incorrectly when running `git diff` in version 2.11.0

Created on 5 Dec 2016  ·  16 Comments  ·  Source: git-for-windows/git

  • [x] I was not able to find an open or closed issue matching what I'm seeing

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
$ git --version --build-options
git version 2.11.0.windows.1
sizeof-long: 4
machine: x86_64

I'm using a portable release: PortableGit-2.11.0-64-bit.7z.exe

  • Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit?
$ cmd.exe /c ver

Microsoft Windows [Version 6.1.7601]

Windows 7 Professional, Service Pack 1, 64-bit

  • What options did you set as part of the installation? Or did you choose the
    defaults?

I have no install-options.txt file.

  • Any other interesting things about your environment that might be related
    to the issue you're seeing?

I'm using diff-highlight.
I downloaded the diff-highlight file from https://github.com/git/git/tree/master/contrib/diff-highlight and put it in the %PathToPortableGit%\usr\bin\ directory.

.gitconfig

[core]
    autocrlf = false
    quotepath = false
[pager]
    log = diff-highlight | less
    show = diff-highlight | less
    diff = diff-highlight | less
[diff]
    compactionHeuristic = true
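
For anyone who wants to try the same settings without touching their own .gitconfig, here is a minimal sketch that writes them to a scratch file with `git config --file`. The scratch file and the `cfg` variable are assumptions made for illustration; the keys and values are the ones listed above. Note that all three pager entries send output through `less`.

```shell
#!/bin/sh
# Write the settings to a scratch file so the real ~/.gitconfig is untouched.
# To apply them for real, replace `--file "$cfg"` with `--global`.
cfg=$(mktemp)

git config --file "$cfg" core.autocrlf false
git config --file "$cfg" core.quotepath false

# The same pager pipeline is used for log, show and diff.
for cmd in log show diff; do
    git config --file "$cfg" "pager.$cmd" 'diff-highlight | less'
done

git config --file "$cfg" diff.compactionHeuristic true

# Print the resulting key=value pairs for inspection.
git config --file "$cfg" --list
```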

Details

  • Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other

Bash

  1. Build a test repository.
$ mkdir gitrepo
$ cd gitrepo
$ git init
$ touch test.txt
  2. Rewrite test.txt as follows, and save it as UTF-8 without a BOM.
+日本語のテキスト
+Write in Japanese
+
  3. Add test.txt to the repository.
$ git add .
$ git commit -m "First commit"
  4. Rewrite test.txt as follows.
-日本語のテキスト
-Write in Japanese
+英語のテキスト
+Write in English

  5. Now run git diff
$ git diff head
  • What did you expect to occur after running these commands?
$ git diff head
diff --git a/test.txt b/test.txt
index 19ebe7c..e45d453 100644
--- a/test.txt
+++ b/test.txt
@@ -1,2 +1,2 @@
-日本語のテキスト
-Write in Japanese
+英語のテキスト
+Write in English

  • What actually happened instead?
$ git diff head
diff --git a/test.txt b/test.txt
index 19ebe7c..e45d453 100644
--- a/test.txt
+++ b/test.txt
@@ -1,2 +1,2 @@
-<E6><97><A5><E6><9C><AC><E8><AA><9E><E3><81><AE><E3><83><86><E3><82><AD><E3><82><B9><E3><83><88>
-Write in Japanese
+<E8><8B><B1><E8><AA><9E><E3><81><AE><E3><83><86><E3><82><AD><E3><82><B9><E3><83><88>
+Write in English
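
The `<E6><97><A5>...` notation looks like the way a pager such as `less` prints bytes it does not consider printable in the current character set, which would suggest that the file content is intact and only the display is affected. A minimal sketch for checking this, assuming a POSIX shell: the helper `decode_hex_escapes` is a hypothetical name, and it only handles strings made up entirely of `<XX>` groups.

```shell
#!/bin/sh
# Turn a run of <XX> hex escapes back into the raw bytes they stand for.
# Only handles strings that consist entirely of <XX> groups.
decode_hex_escapes() {
    # "<E6><97>" -> "E6>97>" -> one hex pair per line.
    printf '%s' "$1" | tr -d '<' | tr '>' '\n' | while read -r hex; do
        [ -n "$hex" ] || continue
        # Hex -> three-digit octal, then emit that byte via an octal escape.
        printf "\\$(printf '%03o' "0x$hex")"
    done
    printf '\n'
}

# The removed line from the "actual" output above, without the leading "-".
decoded=$(decode_hex_escapes '<E6><97><A5><E6><9C><AC><E8><AA><9E><E3><81><AE><E3><83><86><E3><82><AD><E3><82><B9><E3><83><88>')
printf '%s\n' "$decoded"   # prints 日本語のテキスト on a UTF-8 terminal
```

The decoded bytes are exactly the UTF-8 encoding of the original first line, so nothing in the repository itself appears to be corrupted.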

Additional Notes

  • It worked correctly when I used version 2.10.2.windows.1
  • I did not change any config when updating to version 2.11.0.windows.1 from 2.10.2.windows.1
Labels: bug, sdk-packages

All 16 comments

I see the same behavior for German umlauts like äöü in git log.

I have the same problem with Russian characters after upgrading from 2.10 to 2.11.0.windows.1.
It seems that 2.11.0.windows.1 does not honor the quotepath option anymore.

I encountered the same problem for Chinese characters after upgrading from 2.10.2 to 2.11.0.

@nyoro712 to ensure the BOM settings and encodings are correct, are you able to publish this repository somewhere so we're able to test the same bytes?

@Egor-Skriptunoff @Suchiman same thing, are you able to provide sample repositories - and repro steps for the commands you are running - to ensure we're testing the right encodings?

Published https://github.com/nyoro712/git-for-windows-issue-981

  1. Clone the repository.
$ git clone https://github.com/nyoro712/git-for-windows-issue-981 issue
$ cd issue
  2. Rewrite test.txt
-日本語のテキスト
-Write in Japanese
+英語のテキスト
+Write in English

  3. Run git diff
$ git diff head
  4. The problem will be reproduced.

.gitconfig and diff-highlight are not included in the repository.
They are global config (git config --global) in my environment.
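
For an offline variant of these steps, here is a sketch that rebuilds the same test file in a throwaway directory instead of cloning. The temp directory, the inline placeholder identity, and the `diff_output` variable are assumptions for illustration. It calls `git --no-pager`, so `diff-highlight` and `less` are bypassed and the raw diff is captured; comparing this with the paged output should help narrow down whether the pager pipeline is where the text gets mangled. It also spells the ref as `HEAD`, which works on case-sensitive file systems too.

```shell
#!/bin/sh
# Offline sketch of the reproduction; everything happens in a temp directory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# First version of test.txt, written as UTF-8 without a BOM.
printf '%s\n' '日本語のテキスト' 'Write in Japanese' > test.txt
git add .
# Identity is passed inline (placeholder values) so no global config is needed.
git -c user.name='Test User' -c user.email='test@example.invalid' \
    commit -q -m 'First commit'

# Second version of test.txt.
printf '%s\n' '英語のテキスト' 'Write in English' > test.txt

# --no-pager skips diff-highlight and less; --no-color keeps the text plain.
diff_output=$(git --no-pager diff --no-color HEAD)
printf '%s\n' "$diff_output"
```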

Similar problem with letter Õ in git log.

Could you create .minttyrc in your home directory with the single line Charset=UTF-8, like so:

echo Charset=UTF-8 >>~/.minttyrc

then restart Git Bash and try again?

I tried .minttyrc, but nothing changed.

@nyoro712 I settled on another solution.

If you want to work around it in the meantime, right-click on the windows icon in the top left of Git Bash's window, select Options, and make sure that the Text tab lists "UTF-8" as encoding.

@dscho It is still displayed incorrectly.

screenshot

The result of locale command on Git Bash with version 2.10.2

$ locale
LANG=ja_JP.UTF-8
LC_CTYPE="ja_JP.UTF-8"
LC_NUMERIC="ja_JP.UTF-8"
LC_TIME="ja_JP.UTF-8"
LC_COLLATE="ja_JP.UTF-8"
LC_MONETARY="ja_JP.UTF-8"
LC_MESSAGES="ja_JP.UTF-8"
LC_ALL=

In version 2.11.0

$ locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=C
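
The two listings differ mainly in `LC_ALL`: under the POSIX rules, a non-empty `LC_ALL` overrides every `LC_*` category and `LANG`, so `LC_ALL=C` alone is enough to put all categories in the `C` locale. A small sketch of that precedence for the `LC_CTYPE` category follows. The function name `effective_ctype` is hypothetical, and it takes the three values as arguments rather than reading the environment so it can be tried without changing the current shell.

```shell
#!/bin/sh
# Report which locale name decides LC_CTYPE, following POSIX precedence:
# LC_ALL first, then LC_CTYPE, then LANG. Empty values count as unset.
effective_ctype() {
    lc_all=$1
    lc_ctype=$2
    lang=$3
    if [ -n "$lc_all" ]; then
        printf '%s\n' "$lc_all"
    elif [ -n "$lc_ctype" ]; then
        printf '%s\n' "$lc_ctype"
    elif [ -n "$lang" ]; then
        printf '%s\n' "$lang"
    else
        # With nothing set, the default is up to the implementation.
        printf '%s\n' '(implementation default)'
    fi
}

# Environment as reported under 2.11.0: LC_ALL=C, LANG empty.
effective_ctype 'C' '' ''              # prints C
# Same call with LC_ALL cleared and LANG set to the 2.10.2 value.
effective_ctype '' '' 'ja_JP.UTF-8'    # prints ja_JP.UTF-8
```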

So I unset LC_ALL and set LANG, and it worked well.

  1. Unset LC_ALL
$ unset LC_ALL

The result of locale is

$ locale
LANG=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

Then git diff works fine, but Japanese filenames are still displayed incorrectly.

  2. Set LANG
$ export LANG=ja_JP.UTF-8

The result of locale is

$ locale
LANG=ja_JP.UTF-8
LC_CTYPE="ja_JP.UTF-8"
LC_NUMERIC="ja_JP.UTF-8"
LC_TIME="ja_JP.UTF-8"
LC_COLLATE="ja_JP.UTF-8"
LC_MONETARY="ja_JP.UTF-8"
LC_MESSAGES="ja_JP.UTF-8"
LC_ALL=

Everything works fine. (However, after restarting Git Bash, the settings revert.)
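
Until a fixed build is installed, one way to keep the two commands from being lost on restart might be to place them in a shell startup file. The sketch below assumes that Git Bash reads `~/.bashrc` at startup, which is not confirmed in this thread. The function name `persist_locale_workaround` and the marker comment are hypothetical, and `ja_JP.UTF-8` is simply the value used above.

```shell
#!/bin/sh
# Append the locale workaround to a shell startup file, at most once.
persist_locale_workaround() {
    rcfile=$1
    marker='# locale workaround for Git for Windows 2.11.0'
    # Skip if the marker is already present, so repeated runs do not pile up.
    if [ -f "$rcfile" ] && grep -qF "$marker" "$rcfile"; then
        return 0
    fi
    {
        printf '%s\n' "$marker"
        printf '%s\n' 'unset LC_ALL'
        printf '%s\n' 'export LANG=ja_JP.UTF-8'
    } >> "$rcfile"
}

# Example call (assumption: Git Bash sources ~/.bashrc on startup):
# persist_locale_workaround "$HOME/.bashrc"
```

The marker line makes the block easy to find and delete again once the workaround is no longer needed.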

The fix that closed this ticket is not yet in any published version. Hold on, though, I am still busy preparing a prerelease for you to test.

Bummer, I thought this ticket would be auto-tagged... https://github.com/git-for-windows/git/releases/tag/prerelease-v2.11.0.windows.1.1

Could you test that, please?

@dscho I tried prerelease-v2.11.0.windows.1.1.
It's fine! 🎉

Aside: I mistakenly thought there was a numbering error on the release page, as I'd expected an increment to the 2.11.0 part (i.e. just before the .rc0). In fact the increment is after the .windows. part of the version < feels like a fool >.

I have downloaded and installed the release.
The installer's title bar read 'Git 2.11.0.0.2 Setup'.
On the finish screen I selected 'start the Bash' and deselected the option to show the release notes.
$ git version
git version 2.10.2.windows.1.895.g8126884

Should this be the version displayed??

I tested it against Robert's remote.origin.url=https://github.com/rcdailey/test977 and out of the box it displayed the 'special' (above ASCII) UTF-8 chars just fine.

Philip

(pasted text has some line wrap..)

Philip@House-PC MINGW64 ~
$ git version
git version 2.10.2.windows.1.895.g8126884

Philip@House-PC MINGW64 ~
$ cd C:/Users/Philip/test977
Philip@House-PC MINGW64 ~/test977 (master)
$ git log --abbrev-commit --decorate --date=relative --format=format:'\
%C(bold blue)%h%C(reset) | %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset ) %C(dim white)%an%C(reset) - %C(white)%s%C(reset)' --graph

* \
| ce873bd | (4 days ago) (HEAD -> master, origin/master, origin/HEAD) Jesus Mont
año - asdf
* \
| 9c6f104 | (4 days ago) Jesus Montaño - More stuff
* \
  8c6b326 | (4 days ago) Jesus Montaño - test test test test test test test test
test test test test test test test test test test test test test test test test
test test test

Philip@House-PC MINGW64 ~/test977 (master)
$

@PhilipOakley it is I who feels like a fool. I assumed that prerelease-v2.11.0.windows.1.1 would be a good name for a release based on v2.11.0.windows.1.

I worked a bit on streamlining the prerelease engineering (basically, I want to start one command in the VM dedicated to building releases and forget about it; it should do everything from coming up with a good tag, to building, to uploading the prerelease and publishing it). The way this script works right now, it will take the base version (v2.11.0), increment the last digit, and then append an auto-incrementing prerelease suffix.

The prerelease which is building right now, as I write this, is labeled v2.11.1.windows-prerelease.1.
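
As an illustration of the naming rule described above (not the actual release script, which is not shown in this thread), here is a small sketch that derives such a label from a base version and a counter. The function name `next_prerelease_tag` and its two parameters are hypothetical.

```shell
#!/bin/sh
# Build a prerelease label from a base version and a prerelease counter:
# bump the last numeric component, then append ".windows-prerelease.<n>".
next_prerelease_tag() {
    base=$1      # base version, e.g. v2.11.0
    counter=$2   # auto-incrementing prerelease number, e.g. 1
    prefix=${base%.*}    # everything before the last dot: v2.11
    last=${base##*.}     # the last component: 0
    printf '%s.%s.windows-prerelease.%s\n' "$prefix" "$((last + 1))" "$counter"
}

next_prerelease_tag v2.11.0 1   # prints v2.11.1.windows-prerelease.1
```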

Thanks for helping this project!
