Node: Discussion: File URLs in Node.js

Created on 24 Aug 2018  ·  4 Comments  ·  Source: nodejs/node

I wanted to open up a discussion on this topic, as previous work on it has perhaps lacked wider context (https://github.com/nodejs/node/pull/20950).

Basically, when we provide file:/// URLs to users (through import.meta.url, the Loader API for modules, which relies heavily on file URLs, or any other means), we are expecting users to understand a lot of intricacies of how file URLs work in Node.js, which it already seems 90% of people will miss.

For example, see - https://github.com/wasm-tool/node-loader/commit/e4f6b7d32355a464055e064b6b2bb33bde2b3c96.

There are two major problems most people will walk into blindly when converting from file URLs to paths in Node.js:

  1. _fs.readFile(url.pathname) works on Unix systems, but will break on Windows. This means Windows support will naturally hit a widespread, recurring point of friction as these workflows integrate into the npm ecosystem. This issue will keep coming up across many projects as they work with file URLs._

  2. _Non-Latin characters need to be percent decoded. import './你好.mjs' will be resolved into file:///.../%E4%BD%A0%E5%A5%BD.mjs, so that in order to support loading the native characters from the file system, a percent decode operation needs to be performed on the path, with some special cases (e.g. not percent decoding path separators)._

(1) is the immediate issue that will show up as one of the standard Windows compatibility issues (alongside path.replace(/\\/g, '/')), and (2) seems like a deeper, less visible Anglocentric bias that will continue to propagate here as well.

Since I've personally not been able to make any progress on this problem through https://github.com/nodejs/node/pull/20950, I'd be interested to hear what we might be able to do about this.

What I would like to suggest here is two new native functions:

fileUrlToPath(string | URL) -> Node.path path
pathToFileUrl(path) -> URL
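
For reference, here is a minimal sketch of how a pair of functions with these shapes can be used. It assumes the `url.fileURLToPath` and `url.pathToFileURL` exports available in current Node.js releases (note the casing differs from the names proposed above); the `data/report.txt` path is hypothetical.

```javascript
const path = require('path');
const { URL, fileURLToPath, pathToFileURL } = require('url');

// Hypothetical path, resolved against the current working directory.
const nativePath = path.resolve('data', 'report.txt');

// path -> URL object
const fileUrl = pathToFileURL(nativePath);

// URL object -> path, and URL string -> path
const fromUrlObject = fileURLToPath(fileUrl);
const fromUrlString = fileURLToPath(fileUrl.href);

console.log(fileUrl.href);
console.log(fromUrlObject === nativePath, fromUrlString === nativePath);
```

The point of the sketch is only the round trip: whatever the platform, converting a path to a file URL and back should yield the same native path.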

Let me know if that sounds like a good idea here, and I can see if we can get something into path or url... (suggestions on which is best are welcome too).

feature request fs whatwg-url

All 4 comments

I think utility methods like these would make sense to add along with some documentation around how and why they're useful.

those two functions already exist in internal/url. we would just need to expose them

I think some examples would explain more clearly why this is a great idea:

When converting a path to a URL, there are multiple ways to mess up:

new URL(__filename) // errors (needs scheme)

__filename = './foo#1';
// '/foo' instead of the correct '/foo%231'
new URL(__filename, 'file:///').pathname;

__filename = './foo?2';
// '/foo' instead of the correct '/foo%3F2'
new URL(__filename, 'file:///').pathname;

__filename =  '//nas/foo.txt';
// '/nas/foo.txt' instead of the correct '/foo.txt'
new URL(`file://${__filename}`).pathname;
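
As a hedged comparison, the sketch below runs the `#` and `?` cases through both the naive `new URL(...)` approach and `url.pathToFileURL` as exported by current Node.js releases. The file names are hypothetical, and `pathToFileURL` resolves them against the current working directory, so only the tail of its pathname is predictable.

```javascript
const { URL, pathToFileURL } = require('url');

// Hypothetical relative file names containing URL-reserved characters.
for (const name of ['./foo#1', './foo?2']) {
  // Naive: '#1' and '?2' are parsed as fragment and query, not as path.
  const naive = new URL(name, 'file:///');
  // pathToFileURL resolves against cwd and percent-encodes '#' and '?'.
  const safe = pathToFileURL(name);
  console.log(name, '| naive:', naive.pathname, '| pathToFileURL:', safe.pathname);
}
```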

When converting from a URL to a path, similar errors can occur (not just limited to other languages):

url = new URL('file://nas/foo.txt');
// reads '/foo.txt', but that is missing the remote host 😱
fs.readFile(url.pathname, () => {});

url = new URL('file:///你好.txt');
// reads '/%E4%BD%A0%E5%A5%BD.txt' instead of '/你好.txt'
fs.readFile(url.pathname, () => {});

url = new URL('file:///hello world.txt');
// reads '/hello%20world.txt' instead of '/hello world.txt'
fs.readFile(url.pathname, () => {});
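
To make the decoding problem concrete, here is a small sketch, again assuming the `url.fileURLToPath` and `url.pathToFileURL` exports of current Node.js releases. The file name is hypothetical and is resolved against the current working directory so the example behaves the same on Windows and POSIX.

```javascript
const path = require('path');
const { fileURLToPath, pathToFileURL } = require('url');

// Hypothetical file name with a space and non-ASCII characters.
const original = path.resolve('你好 world.txt');
const url = pathToFileURL(original);

// The URL's pathname is percent-encoded, so it is not a usable fs path.
console.log(url.pathname);

// fileURLToPath decodes it back into the native path.
const decoded = fileURLToPath(url);
console.log(decoded === original);
```

Passing `decoded` rather than `url.pathname` to `fs.readFile` is what avoids the failures listed above.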

Another concern that hasn't been mentioned here is that lots of Windows tooling expects \ as the path separator in CLI arguments. Having this properly convert to the native form with \, instead of people trying to manually manipulate /, would be great and would solve the following buggy code:

url = new URL('file:///c:/foo/data.csv');
// passes '/c:/foo/data.csv' instead of 'c:\\foo\\data.csv'
spawn('script.bat', [url.pathname]);
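
To illustrate what the conversion has to do for a Windows drive path, here is a rough sketch that runs on any host because it uses `path.win32` explicitly. `windowsPathFromFileUrl` is a hypothetical helper written for this example, not a Node.js API, and it deliberately ignores UNC hosts and percent-encoded separators.

```javascript
const path = require('path');
const { URL } = require('url');

// Hypothetical helper, illustrative only: handles drive-letter paths,
// ignores UNC hosts (file://nas/...) and encoded separators (%2F, %5C).
function windowsPathFromFileUrl(fileUrl) {
  // e.g. '/c:/foo/data.csv' after decoding
  const decoded = decodeURIComponent(fileUrl.pathname);
  // Drop the leading '/' that precedes the drive letter.
  const withoutLeadingSlash = decoded.replace(/^\/(?=[a-zA-Z]:)/, '');
  // path.win32.normalize converts '/' separators to '\'.
  return path.win32.normalize(withoutLeadingSlash);
}

const csvUrl = new URL('file:///c:/foo/data.csv');
console.log(windowsPathFromFileUrl(csvUrl)); // c:\foo\data.csv
```

A built-in conversion would need to cover the cases this sketch skips, which is part of the argument for exposing one.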