Node: Issues with `os.homedir()` on Windows when the username contains non-ASCII Unicode symbols

Created on 10 Dec 2017 · 18 Comments · Source: nodejs/node

  • Version: Node.js v8.9.3
  • Platform: Windows
  • Subsystem: os
> os.homedir()
'C:\\Users\\Björn'
> os.tmpdir()
'C:\\Users\\BJRN~1\\AppData\\Local\\Temp'

os.homedir() should match os.tmpdir() and return 'C:\\Users\\BJRN~1'. Not doing so results in errors when performing file operations, e.g. https://github.com/GoogleChromeLabs/jsvu/issues/11.

os windows

All 18 comments

AFAICT it's not that simple. os.tmpdir() returns the environment variable verbatim:

d:\code\node$ set "TEMP=C:\\Users\\Björn"

d:\code\node$ node.exe -p "const os=require('os');os.tmpdir()"
C:\\Users\\Björn

d:\code\node$ set "TEMP=c:\Program Files (x86)"

d:\code\node$ node.exe -p "const os=require('os');os.tmpdir()"
c:\Program Files (x86)

As does os.homedir():

d:\code\node$ set "USERPROFILE=C:\Users\BJRN~1\"

d:\code\node$ node.exe -p "const os=require('os');os.homedir()"
C:\Users\BJRN~1\
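The same behaviour can be checked from inside a script. The following is a minimal sketch, not anything from Node core: `withEnv` is a hypothetical helper, the paths are illustrative and need not exist, and it assumes the variables Node documents for each platform (USERPROFILE and TEMP on Windows, HOME and TMPDIR elsewhere).

```javascript
const os = require('os');

// Node consults different environment variables depending on the platform.
const isWindows = process.platform === 'win32';
const HOME_VAR = isWindows ? 'USERPROFILE' : 'HOME';
const TEMP_VAR = isWindows ? 'TEMP' : 'TMPDIR';

// Hypothetical helper: run fn with one environment variable temporarily
// overridden, then restore the previous state.
function withEnv(name, value, fn) {
  const saved = process.env[name];
  process.env[name] = value;
  try {
    return fn();
  } finally {
    if (saved === undefined) delete process.env[name];
    else process.env[name] = saved;
  }
}

// Illustrative paths only; nothing here touches the disk.
const fakeHome = isWindows ? 'C:\\Users\\BJRN~1' : '/home/bjrn';
const fakeTemp = isWindows
  ? 'C:\\Users\\BJRN~1\\AppData\\Local\\Temp'
  : '/home/bjrn/tmp';

const reportedHome = withEnv(HOME_VAR, fakeHome, () => os.homedir());
const reportedTemp = withEnv(TEMP_VAR, fakeTemp, () => os.tmpdir());

// Both comparisons are expected to hold if the values are passed through as-is.
console.log(reportedHome === fakeHome, reportedTemp === fakeTemp);
```

Neither call normalises the value between long and 8.3 form, which is why the two results can disagree when the underlying variables do.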

As requested on Twitter, here’s the output for set:

C:\Users\Björn>set
ALLUSERSPROFILE=C:\ProgramData
APPDATA=C:\Users\Björn\AppData\Roaming
CommonProgramFiles=C:\Program Files\Common Files
CommonProgramFiles(x86)=C:\Program Files (x86)\Common Files
CommonProgramW6432=C:\Program Files\Common Files
COMPUTERNAME=WIN7
ComSpec=C:\Windows\system32\cmd.exe
FP_NO_HOST_CHECK=NO
HOMEDRIVE=C:
HOMEPATH=\Users\Björn
LOCALAPPDATA=C:\Users\Björn\AppData\Local
LOGONSERVER=\\WIN7
NUMBER_OF_PROCESSORS=2
OS=Windows_NT
Path=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32
\WindowsPowerShell\v1.0\;C:\Program Files\nodejs\;C:\Users\Björn\AppData\Roaming
\npm
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
PROCESSOR_ARCHITECTURE=AMD64
PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 69 Stepping 1, GenuineIntel
PROCESSOR_LEVEL=6
PROCESSOR_REVISION=4501
ProgramData=C:\ProgramData
ProgramFiles=C:\Program Files
ProgramFiles(x86)=C:\Program Files (x86)
ProgramW6432=C:\Program Files
PROMPT=$P$G
PSModulePath=C:\Windows\system32\WindowsPowerShell\v1.0\Modules\
PUBLIC=C:\Users\Public
SESSIONNAME=Console
SystemDrive=C:
SystemRoot=C:\Windows
TEMP=C:\Users\BJRN~1\AppData\Local\Temp
TMP=C:\Users\BJRN~1\AppData\Local\Temp
USERDOMAIN=Win7
USERNAME=Björn
USERPROFILE=C:\Users\Björn
windir=C:\Windows
windows_tracing_flags=3
windows_tracing_logfile=C:\BVTBin\Tests\installpackage\csilogfile.log

C:\Users\Björn>

Right now, this is broken for both a) the common case where the user did not overwrite any environment variables as well as b) the less common case where they did overwrite env vars without explicitly using 8.3 names.

I get that it’s harder to fix it for the uncommon case. But fixing it for the common case would already help a great deal.

@mathiasbynens Maybe this is a naïve question, but how would Node go about ‘fixing’ this? I assume what you’re talking about is translating non-ASCII path segments like Björn → BJRN~1 transparently? Is that possible without disk access?

Win32 has an API for this, GetShortPathName, but it does I/O since the "short name" is stored in the FS (could be disk or network). The inverse also exists (8.3 name to full name).
But it gets more complicated, since 8.3 names can be disabled on NTFS (https://support.microsoft.com/en-us/help/121007/how-to-disable-8-3-file-name-creation-on-ntfs-partitions); we'd need to check what GetShortPathName returns in that case.

Shouldn't both formats be equivalent when it comes to reading files from them? It might help to quote the non-8.3 path (see windowsVerbatimArgs option in child_process).

@refack Yeah, that’s what I was afraid of – I don’t think people would welcome hidden synchronous I/O…

> Shouldn't both formats be equivalent when it comes to reading files from them? It might help to quote the non-8.3 path (see windowsVerbatimArgs option in child_process).

In this particular case, the Node.js script (jsvu) generates a batch file foo.cmd:

@echo off
"C:\Users\Björn\foo.exe" %*

The C:\Users\Björn part is the output of os.homedir().

Running this (with foo.exe at the expected location) results in:

The system cannot find the path specified.

Note that this example is _already_ using quotes as @silverwind suggested. Removing the quotes doesn’t help. The only way I found to make this work is to use the 8.3-formatted path instead, i.e.:

@echo off
"C:\Users\BJRN~1\foo.exe" %*

My PR to @sindresorhus’ untildify package lets me get that output in a very indirect way. This is why I felt like this should be addressed at the Node.js level somehow.
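For illustration, the batch-file generation described above can be sketched roughly as follows. This is not jsvu's actual code: `buildShim` and `containsNonAscii` are hypothetical helpers, and the two home directories are the example values from this thread. The check only flags paths that may trip up cmd.exe; it does not translate them.

```javascript
const path = require('path');

// True if the string contains any character outside the 7-bit ASCII range.
function containsNonAscii(text) {
  return /[^\x00-\x7F]/.test(text);
}

// Hypothetical helper: build the text of a .cmd shim that forwards all
// arguments to an executable located in `dir`. path.win32 is used so the
// result is the same on every platform.
function buildShim(dir, exeName) {
  const target = path.win32.join(dir, exeName);
  return `@echo off\r\n"${target}" %*\r\n`;
}

// Example values from this thread ('\u00F6' is the letter o with diaeresis).
const longHome = 'C:\\Users\\Bj\u00F6rn';
const shortHome = 'C:\\Users\\BJRN~1';

// A caller could warn, or fall back to an 8.3 path, when the check is true.
console.log(containsNonAscii(longHome), containsNonAscii(shortHome));
console.log(buildShim(shortHome, 'foo.exe'));
```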

This seems to stem from a known limitation of cmd.exe: it doesn't understand Unicode (or UTF-8).
More info and suggested workaround in https://github.com/GoogleChromeLabs/jsvu/issues/11

After reading around this and the related issue, IMHO what node (or libuv) does need is a wrapper for GetShortPathName. Otherwise there is no simple workaround for @mathiasbynens's issue.

You can run for %I in (directory name) do echo %~sI in cmd.exe; it will print the path in 8.3 format. I'm not sure if GetShortPathName belongs in node core (but I'm not against it).
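A script could wrap that cmd.exe trick along these lines. This is a rough sketch under several assumptions: `getShortPath` is a hypothetical name and not a Node API, the path is assumed to be free of quotes and percent signs, cmd.exe's output is assumed to decode cleanly as UTF-8 (short names are usually ASCII), and what comes back when 8.3 names are disabled has not been checked here. It spawns a process synchronously, so it does the kind of I/O discussed earlier in the thread.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: ask cmd.exe for the 8.3 form of a path.
// On non-Windows platforms the input is returned unchanged.
function getShortPath(longPath) {
  if (process.platform !== 'win32') return longPath;

  // Refuse characters that would break the quoting used below.
  if (/["%\r\n]/.test(longPath)) {
    throw new Error('Path contains characters this sketch cannot quote');
  }

  // With /s, cmd.exe strips the outer pair of quotes and runs the rest as-is.
  const command = `"for %I in ("${longPath}") do @echo %~sI"`;
  const result = spawnSync('cmd.exe', ['/d', '/s', '/c', command], {
    encoding: 'utf8',
    // Pass the arguments through without Node's own quoting or escaping.
    windowsVerbatimArguments: true,
  });

  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`cmd.exe exited with status ${result.status}`);
  }
  return result.stdout.trim();
}

console.log(getShortPath(require('os').homedir()));
```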

I see the patch already landed in jsvu. I'm closing this. Feel free to reopen this issue, or create another one for GetShortPathName to be implemented.

I think we should at least consider a fix. Requiring every user of homedir to apply a workaround surely can't be optimal.

I'm not sure what else can be done here. We cannot change os.homedir() to return the short path name, and it looks like the issue in jsvu is already fixed. We can discuss adding some sort of fs.getShortPath, but I think this should be done in another issue.

Maybe add a note to the docs that users who use non-ASCII paths should use PowerShell?

The fix in jsvu is a workaround, at best. I agree with @silverwind that it would still be nice to fix the underlying issue at the Node.js level, e.g. by exposing something like GetShortPathName as @refack suggested.

I'm not sure where such a note should be added. Anyhow, PRs are always welcome. If you feel that such a note should be added, feel free to add one yourself.

FWIW, I don't think this is for node to fix. You'd have the same issue in e.g. Python, where you don't have access to GetShortPathName() either (except through ctypes).
