If there's a non-ASCII character in the path to the node directory, the build will break. This issue is only happening on Windows as far as I'm aware (there's a similar bug on POSIX systems, but it was already fixed). See the discussion on https://github.com/nodejs/node/pull/16735 and the errors below:

(https://gist.github.com/anonymous/875aac0ab3e4dfa219854b691f8d3431)
It's ironic that ICU doesn't support Unicode properly.
The issue here is that icutrim.py fails with a non-ASCII path:
C:\Users\Nikolai\WindĂ’o\node\tools\icu>python icutrim.py -P C:\Users\Nikolai\WindĂ’o\node\Debug\ -D ..\..\deps\icu-small\source\data\in\icudt60l.dat --delete-tmp -T C:\Users\Nikolai\WindĂ’o\node\Debug\obj\global_intermediate\icutmp -F icu_small.json -O icudt60l.dat -v -L en,root
Options: {'verbose': 1, 'filterfile': 'icu_small.json', 'toolpath': 'C:\\Users\\Nikolai\\Wind\xd2o\\node\\Debug\\', 'deltmpdir': 1, 'outfile': 'icudt60l.dat', 'datfile': '..\\..\\deps\\icu-small\\source\\data\\in\\icudt60l.dat', 'locales': 'en,root', 'endian': 'little', 'tmpdir': 'C:\\Users\\Nikolai\\Wind\xd2o\\node\\Debug\\obj\\global_intermediate\\icutmp'}
icu_small.json: icutrim.py config: Trim down ICU to just a certain locale set, needed for node.js use.
* converters: 0 items
* stringprep: 0 items
* translit: 2 items
* brkfiles: 0 items
* brkdict: 0 items
* confusables: 0 items
* brkitr: 0 items
* coll: 1 items
* curr: 1 items
* lang: 0 items
* rbnf: 0 items
* region: 0 items
* ROOT: 1 items
* unit: 0 items
* zone: 1 items
###WARNING: Encountered abnormal bytes while converting input stream to target encoding: U_ILLEGAL_CHAR_FOUND
Pre-context: \Users\Nikolai\Win
Context: ĂŠ
Post-context: o\node\Debug\iculs
An error occurred processing file C:\Users\Nikolai\WindĂŠo\node\Debug\obj\global_intermediate\icutmp\lang/\res_index.txt. Error: U_ILLEGAL_CHAR_FOUND
FAILED: C:\Users\Nikolai\WindĂŠo\node\Debug\genrb -d C:\Users\Nikolai\WindĂŠo\node\Debug\obj\global_intermediate\icutmp\lang/ -s C:\Users\Nikolai\WindĂŠo\node\Debug\obj\global_intermediate\icutmp\lang/ res_index.txt
Who maintains this file? It contains a link to http://bugs.icu-project.org/, but its history consists only of Node.js-local changes. cc @srl295
Welcome to the irony of my life. You can workaround with full-icu as an option. Otherwise will look at this next week when I am at a windows box. And sorry.
@seishun I can't even get this far. I get:
C:\Users\StevenLoomis\Documents\GitHub\NĂ’DE>vcbuild.bat
ERROR: The system was unable to find the specified registry key or value.
ERROR: The system was unable to find the specified registry key or value.
Cannot determine current version of Node.js
C:\Users\StevenLoomis\Documents\GitHub\NĂ’DE>
… and then an editor is popped up to _edit_ getnodeversion.py
(running node --version returns v8.9.4)
I ran this from in powershell and cmd.exe, in and out of the visual studio command tools for 2015 and 2017.
@mmarchini what is the actual path of your node dir ? I see both:
C:UsersNikolaiWindĂ’o
and
C:UsersNikolaiWindĂŠo
@srl295 That looks like you don't have Python installed (see https://github.com/nodejs/node/issues/16864).
what is the actual path of your node dir ? I see both:
I reproduced it by placing it in "WindĂ’o". I don't know why Python outputs "WindĂŠo".
Thx. Will retry when back at that box.
@srl295 do you think you'll have to time to look into it any time soon? I could try fixing it myself, but I might not have enough knowledge about it.
@seishun I'll have to see, I just reinstalled my windows env and have some things coming up. So not necessarily soon.
The issue is in iculslocs. When run from a non-ASCII path, it generates a file with invalid UTF-8, which genrb chokes on.
To reproduce, run:
C:\<non-ascii path>\iculslocs -i icudt60l.dat -N icudt60l -T lang -b res_index.txt
@srl295 how would you suggest to proceed with this? It seems iculslocs is a third-party project.
edit I wrote iculslocs for node, but thought that it might be used upstream in ICU. As things stand now, node is the only user and I have not maintained any changes upstream.
@seishun
how would you suggest to proceed with this
it can be fixed by a normal PR. Still did not get to investigate.
It looks like you posted an unfinished comment.
@seishun can you attach the res_index.txt file here? There are multiple files under ./out/Release/gen/icutmp/
You mean the file iculslocs generates? Here you go: res_index.txt
^ that would still be interesting, but I think I found the problem.
iculslocs writes out:
(BOM)// -*- Coding: utf-8; -*-
//
// Warning this file is automatically generated
// Updated by ../../out/Release/iculslocs based on icudt60l:res_index.txt
… but the path given is based on argv[0] which is in process encoding, not necessarily utf-8. The path doesn't really add any value here. I'll change it to the following (note that the ../.. path is gone)
// -*- Coding: utf-8; -*-
//
// NOTE: This file was generated during the build process.
// Generator: tools/icu/iculslocs.cc
// Input package-tree/item: icudt60l-lang/res_index.res
(package icudt60l, tree lang, item res_index.res - probably only of interest to someone debugging )
@seishun are you able to test the commit I made to https://github.com/srl295/node/tree/fix-iculslocs ?
@seishun perfect, that confirms what I thought. I think the above fix will solve the issue. Thanks for your investigation.
For the record, your file had broken utf-8 in the comment mentioned above.
// Updated by C:\Users\Nikolai\Wind“o\node\Debug\iculslocs based on icudt60l-lang:res_index.txt
That fixes the issue with iculslocs, but doesn't completely fix the build issue because there is another similar issue with "protocol_generated_sources" from V8.
In addition, fixing both issues would allow builds from Latin-1 paths, but not for all Unicode paths. Fixing it generally probably requires moving to a different build system (https://github.com/nodejs/node/issues/18560#issuecomment-363120101).
@seishun OK. this fix should at least make the ICU tooling subsystem not dependent on the path (whether latin1 or all unicode). I'll open a PR.