Node: Changes to `file:` URL formatting in Node v7 cause problems for npm

Created on 8 Nov 2016 · 13Comments · Source: nodejs/node

Version: v7.0.0
Platform: platform agnostic
Subsystem: url.format

IMPACT: Before I get into the specifics, I want to be clear about what the impact is for npm. It's actually pretty limited. As best as we can tell this only impacts when _saving_ a file-type specifier to a package.json with npm, not when installing from one. As such, it can be worked around by editing the package.json.

#7234 introduced a change to how url.format printed invalid file: type URLs. Specifically if someone gave it the invalid file:/absolute/path/to/file.txt it would translate that into the valid file:///absolute/path/to/file.txt. Previously invalid URLs were left unaltered.

npm's use of file URLs are usually to point at relative paths, which isn't something file URLs are really setup to support. The way npm does this is tack file: on the front, so npm -i -S ./foo/ becomes file:foo and npm -i -S ../foo becomes file:../foo.

As of v7, those become file://foo and file://../foo respectively. The first being a reference to / on the foo host, and the latter being a reference to /foo on the .. host.

Ok, so Node v7 is making legacy URLs "healthy" where it didn't used to and that bit our particular use case. We can update npm to work around that, but there is an additional wrinkle…

We don't actually call url.format('file:foo'). What we call is:

url.format({protocol: 'file', slashes: false, pathname: 'foo'})

It'd be legit to call this a doc bug and update there, imo, though ignoring properties in the object form feels odd to me. Either way, we'll be updating our code to not rely onurl.format.

url

Source

iarna

Most helpful comment

We shipped our own change for this in [email protected] which I'll be downstreaming to you all this Thursday, Dec 22nd. (Well, 4.1.1 actually, but yeah.)

iarna on 25 Dec 2016

❤1 🎉1

All 13 comments

Thanks for the use-case @iarna

@nodejs/ctc since we approved this we ought to consider the impact and whether and how we want to apply mitigation in this case. I see a few ways forward:

Accept that the new behaviour is what we intended and make our docs more clear about the non-absolute case (edit: also a test case to cover this!)
Revert our changes and embrace url.URL as the future to fix these kinds of mistakes and just leave url alone
Find a middle-ground that takes the relative path case into account differently to the absolute. If you look at #7234 you can see the intention was to deal with absolute paths, there's no explicit test included for relative paths. When the input is sufficiently vague, as is the case with url.format({protocol: 'file', slashes: false, pathname: 'foo'}), it's arguably bad form of us to make an assumption that it's an absolute path, even if file urls officially don't do relative.

rvagg on 9 Nov 2016

@iarna Just to make sure I'm understanding: npm can and will work around this, so even if we update the docs and that's it, things will be OK with npm (assuming people upgrade npm). In other words, you are informing us of a use case that we seem to have failed to consider and we should make a decision as to whether or not we should accommodate that use case. But either way, npm will be fine. Or am I misunderstanding?

Trott on 9 Nov 2016

I'm OK with any of the options @rvagg listed, but (assuming @iarna's answer to my question in the preceding comment is "yes, that is correct") I'd prefer the first one: retain the current behavior and update the docs.

And my second choice is the last one (basically, keep the current behavior generally, but add relative paths as a possibility even though relative paths are not permitted in file urls according to spec).

But I can see reasons to be for and against each of the possibilities.

And I would (of course) feel differently if this were causing widespread breakage in the ecosystem. If that were the case, I'd probably be all for reverting.

Trott on 9 Nov 2016

Actually...I just did a test in a browser, and Chrome, at least, understands relative file paths. Given that "do what the browser does" seems to be the general philosophy for url, I have to change my preference to: Support relative paths.

Trott on 9 Nov 2016

👍1

While I would generally prefer url.parse() to work correctly, the key reason I implemented url.URL separately was to avoid ecosystem breakage. I'd rather leave url.parse() as it is, warts and all, and move users over to url.URL as we can.

jasnell on 9 Nov 2016

I gotta say, while I understand that there are times when a strictly conformant URL implementation is useful, I have found url's easy going nature to be very useful, in both formatting and parsing, it allows it to be used for all kinds of non-internet standard encoding of url-like data, as npm found.

sam-github on 10 Nov 2016

I have found url's easy going nature to be very useful

At the risk of getting a little too far off topic: I mostly agree, but...there are things lurking... Well, here's an example: Given a FQDN with one component longer than the officially-allowed 63 characters, url.parse() will truncate it to 63 characters (which is acceptable IMO) and decide that the leftover part should be pushed over to the path (which is a problem IMO). Behold:

> url.parse('http://www.ThisIsAnAbsurdlyLongHostNameButSurelyNoneOfThisShouldEndUpAsPartOfThePathBecauseWhoCouldPossiblyWantThatAMIRITE.com/CorrectPathBeginning')
Url {
  protocol: 'http:',
  slashes: true,
  auth: null,
  host: 'www.thisisanabsurdlylonghostnamebutsurelynoneofthisshouldendupaspar',
  port: null,
  hostname: 'www.thisisanabsurdlylonghostnamebutsurelynoneofthisshouldendupaspar',
  hash: null,
  search: null,
  query: null,
  pathname: '/tOfThePathBecauseWhoCouldPossiblyWantThatAMIRITE.com/CorrectPathBeginning',
  path: '/tOfThePathBecauseWhoCouldPossiblyWantThatAMIRITE.com/CorrectPathBeginning',
  href: 'http://www.thisisanabsurdlylonghostnamebutsurelynoneofthisshouldendupaspar/tOfThePathBecauseWhoCouldPossiblyWantThatAMIRITE.com/CorrectPathBeginning' }
>

Maybe I just have failed to think of an appropriate use case (a 63-character TLD maybe?), but I cannot imagine a situation where the above behavior is desirable. It's easy to think of situations where it leads to bugs. (Basically: Every situation where it might happen.) And I would not be surprised to find that it has the potential for security issues in the right context, although I may lack the creativity to contrive a reasonable example of that myself.

In any event, I think we should do what browsers do, which appears to be to not bother validating the char length of components for the FQDN. Garbage in/garbage out, but at least make it sensible garbage that clearly maps to the input.

But that's not what's being discussed here. Just a long aside. Thanks for indulging me. (If that topic _is_ of interest, it's being discussed in https://github.com/nodejs/node/pull/9292.)

Trott on 10 Nov 2016

@Trott

Just to make sure I'm understanding: npm can and will work around this, so even if we update the docs and that's it, things will be OK with npm (assuming people upgrade npm).

Yes, that's right.

iarna on 11 Nov 2016

@jasnell It's not url.parse that's causing issues. It's url.format. Though I suppose your point still stands. =)

iarna on 11 Nov 2016

Looking at the docs right now, I don't think this is a doc omission (although the docs could be more clear). For url.format(), it says:

If either the urlObject.slashes property is true, urlObject.protocol
begins with one of http, https, ftp, gopher, or file, or
urlObject.protocol is undefined, the literal string // will be appended
to result.

That's not exactly the clearest prose expression of all time. But it does say that if the protocol property is set to file, the slashes are appended.

(FWIW, the word either should ideally be removed, as there are more than two options listed.)

So, unless I'm missing something, I think we're covered in the docs if it is determined that sticking with the current behavior is the most desirable course of action.

Trott on 22 Nov 2016

(I am going to file a PR to reword it for clarity, though.)

Trott on 22 Nov 2016

We shipped our own change for this in [email protected] which I'll be downstreaming to you all this Thursday, Dec 22nd. (Well, 4.1.1 actually, but yeah.)

iarna on 25 Dec 2016

❤1 🎉1

This issue has been inactive for sufficiently long that it seems like perhaps it should be closed. Feel free to re-open (or leave a comment requesting that it be re-opened) if you disagree. I'm just tidying up and not acting on a super-strong opinion or anything like that.

Trott on 15 Jul 2017

Was this page helpful?

0 / 5 - 0 ratings