Node: Hashbang not resolving `--loader` relative to entry

Created on 25 Oct 2018 · 28Comments · Source: nodejs/node

Version: 11.0
Platform: macOS 10.14 (18A389)
Subsystem: bootstrap, --experimental-modules (only)

When using shebangs to specify --loader specifically for --experimental-modules a file could want to use a relative path to it's own loader file.

In this specific case, the module file is it's own loader, as if things are not already complicated enough.

#!/usr/bin/env node --experimental-modules --loader ./index.mjs

export function resolve(... args) {
  // …
  if (resolved.url === import.meta.url) resolve.url += '#initialize';
  return resolved;
}

function bootstrap() {
  // … 
}

import.meta.url.endsWith('#initialize') && bootstrap();

The issue is not related to modules down the road, it is strictly related to the current way relative --loader paths are resolved.

Executing this file from other paths other than it's own directory looks for ./index.mjs in the cwd and not the file's own directory.

I believe that Loader being a variable aspect of a shebang is at least problematic, more so risky, and simply not practical (I could be wrong).

Can we fix (or justify) this bug (or feature) please?

feature request

Source

SMotaal

Most helpful comment

@SMotaal I've tagged this feature request. I'll be happy to review a contribution that add some capability to detect if Node.js is running with an hashbang. I don't think this is doable or easy to maintain, so I'll keep being very skeptical on the topic (I'm happy to be proven wrong).

mcollina on 20 Nov 2018

👍3 ❤1

All 28 comments

this isn't a bug. most OSes only use the first argument of a shebang. we can't put flags in it.

also your example code won't really work because the file with the loader isn't used to resolve itself because it doesn't exist yet at that point.

devsnek on 25 Oct 2018

It actually works for me fine, except if cwd is not the same.

In the folder, node works and the loader kicks in.

Outside the folder (anywhere else) calling node [relative or absolute]/index.mjs throws always indicating ./index.mjs not found, the path has no effect on the error message.

Edit: My use case is not most OSes.

SMotaal on 26 Oct 2018

I don't think we should try to encourage using a shebang like this because it won't work in most places, even in the same cwd. https://linux.die.net/man/2/execve

devsnek on 26 Oct 2018

Can we consider this not less lowest-common-denominator shebangs, but more from the general case of a shebang that supports this (my case and maybe more):

Command line

[cwd]» node --experimental-modules --loader ./index.mjs index.mjs

Shebang

#!/usr/bin/env node --experimental-modules --loader ./index.mjs

Can we agree that in the cases where they work (and the do for me) between those two, ./index.mjs in shebang from the perspective of the creator of such file is intuitive to say the least to not implicitly resolve cwd. Where as the command line, it is obviously intentional because it is right then and there.

Contrast this --loader case with other paths passed to node, you may be inclined to say that ARGVs are more ambiguous (again assume cases where it works). Some might think their shebang #!/usr/bin/env node -- ./package.json (obviously fictitious just to elaborate thought) is intended to refer to the package.json for the cwd and few might still thing it resolves their own folder's package.json, this case is beyond the scope of this issue. I would hesitate to even go there because I don't know those guys anyways and one of them seems a little weird, but to each their own.

How can we rationalize --loader a little deeper, is there similar precedence with --require, I really don't know.

Any thoughts?

Note: there is only one index.mjs

SMotaal on 26 Oct 2018

@smotaal i'm not sure what you're trying to say, but I think that the current behaviour is how it should be.

devsnek on 26 Oct 2018

Any other Collaborators want to chime in on this one? Seems like the flags-in-shebang issue/question/request could conceivably be especially relevant to @nodejs/tooling folks?

Trott on 19 Nov 2018

this isn't a bug, nor is it a "feature". hashbangs are an os level construct, not something node provides.

devsnek on 19 Nov 2018

If this is a problem _specific to_ shebangs, I don’t understand why. I _do_ understand the lowest-common-denominator shebang behavior, but not how this relates to Node’s handling of relative paths.

If the desired behavior is limited to certain OSes and would break in others, it may be difficult to rally much support.

boneskull on 19 Nov 2018

my understanding of this issue was that @SMotaal wanted to use hashbangs for esm and something about it wasn't working. if the issue isn't specific to hashbangs can it be rephrased?

devsnek on 19 Nov 2018

@boneskull I think it depends on how you look at it and I am finding it tricky to rephrase/reframe this as @devsnek pointed out but I will try.

One way to look at it is to look at it relative to the current implementation. Essentially, the fact that it is a hashbang does not factor into the behaviour. There is no distinction between running node --experimental-modules --loader ./index.mjs from the command line and having it execute by calling the actual file.

Another way to look at it is to consider the practical uses of the different aspects. A hashbang is a static header of the main entry point of the node process. A custom --loader is specified to be loaded before an entry module which requires the loader specifically in order for it to be loaded and executed. Between those two concepts, it is easy to appreciate the edge case where an entry point and a loader are lumped together in a folder or within a prescribed relative proximity so that the entry point will include a relative path so that only when the arguments are passed via a hasbang would this particular path be resolved relative to the entry point in which the hashbang resides.

I am not ignoring the technical considerations that this would require. At the same time I am considering the current limitations on the future usability of this flag.

SMotaal on 19 Nov 2018

@smotaal you're saying that paths in the hashbang should be resolved from the file location, not the cwd?

devsnek on 19 Nov 2018

I am not sure I am prepared to make this generalization on all paths, but for loader specifically and cases where the path points to a resource that bootstraps the runtime needed for the entry point to actually function.

… paths in the hashbang … — @devsnek

This part is really where I see a divergence where I imagine that node will need a mechanism to distinguish when it is executing from a hashbang. If not offered by platforms, then a very raw idea to do this is to ensure that the stringified arguments (spaces notwithstanding) occurs at index 0 of the main entry before it even tries to locate the loader — obviously this needs work and just intended to inspire a better and cleaner solution.

SMotaal on 19 Nov 2018

For reference, --require works the same way:

. means "current working directory", so changing the meaning of . to "directory of the module to be required" (or loaded, etc) seems incorrect/inappropriate.

What could be done is add a new node flag, e.g., node --cwd /some/path or node --cwd .., which would switch directories before other flags were interpreted. I don't know if this is a good idea.

I don't know if this works for --loader, but a workaround would be to use NODE_PATH, e.g.:

$ NODE_PATH=..:${NODE_PATH} ../bar.js
foo

boneskull on 19 Nov 2018

❤2

@SMotaal ok thanks for rephrasing.

i would still say that the current behaviour is "working as intended":

1st, hashbangs are "above" node, we don't define their behavior or interpret them.

2nd, node exec flags (such as --loader) are purposely separated from the entry point. they effect the entire application and as such are tied to the environment, not the specific script.

devsnek on 19 Nov 2018

hashbangs are "above" node, we don't define their behavior or interpret them

I absolutely agree, but I also think that now that Hashbangs are part of the spec, and they are specifically relevant for node modules because they are executables, I am inclined to see them more within the domain of node moving forward. Not to imply that we define their behaviour, just to say that we define our behaviour in response to their intent which we interpret to maximize their utility and minimize their semantic dissonance.

The 2nd point is not an exclusive thing, just consider my own module as an example, it is the entry and loader in a single file, and once the loader loads itself, it will use the loader to resolve files otherwise not resolved. And while this loader is application bound, it can never be used except when it is the entry point (crazy 👿 right?)

SMotaal on 20 Nov 2018

@Trott Can we please hold off on unlabeling until we had a chance to hash out the different point of views. I honestly see this as a feature worth discussing.

SMotaal on 20 Nov 2018

I absolutely agree, but I also think that now that Hashbangs are part of the spec, and they are specifically relevant for node modules because they are executables, I am inclined to see them more within the domain of node moving forward. Not to imply that we define their behaviour, just to say that we define our behaviour in response to their intent which we interpret to maximize their utility and minimize their semantic dissonance.

@SMotaal I don't see how hashbangs could be in the domain of Node.js. What are you proposing specifically?

Also cc @nodejs/modules

mcollina on 20 Nov 2018

Let me try to phrase this through the example in this issue:

Calling the index.mjs from the command line executes the shebang.
OS calls node with the path for the entry and (on my mac, possibly) the arguments.
node creates Loader instance to resolve "--loader" instance (but first)
node already knows entry location, creates stream and reads 1st line to compare args with shebang (index 0)
Loader then resolves relative to entry (not base) and ~loaders~ __loads__ loader (same as entry in this case)
Loader installs hooks and then loads entry (can use same stream)

Please appreciate however that I am at a disadvantage on implementation specifics. My argument comes from a user-centric consideration. So even if technically we need to work on a solution, it is fair to also try to appreciate the user's side, one does not negate the other.

SMotaal on 20 Nov 2018

@mcollina So in terms of the subset of hashbangs that fall inside the domain of node, my position is that when node can determine that it is running via a shebang, it can decide how it chooses to handle it's own behaviour to deal with the arguments. There is work to actually do the first part, I merely speak in an abstract sense.

But suppose that we are able to determine that we're bootstrapping in shebang mode, don't you think that some behaviours can be better handled to yield a more pragmatic experience from the end user's perspective?

SMotaal on 20 Nov 2018

@mcollina So in terms of the subset of hashbangs that fall inside the domain of node, my position is that when node can determine that it is running via a shebang, it can decide how it chooses to handle it's own behaviour to deal with the arguments. There is work to actually do the first part, I merely speak in an abstract sense.

Can you please articulate _how_ Node.js can determine if it is running via an hashbang? Could you maybe send a PR to add this capability to Node.js itself?
There has been quite a few comments already that states that it is not possible to detect this.

mcollina on 20 Nov 2018

I agree with @mcollina. We first need confirmation that Node can know it is running via a hashbang, then only it makes sense to use our time to discuss how we want to handle this case.

targos on 20 Nov 2018

First, and not to sound discouraging, but I must restate that I have limited experience on that front.

So let's try to paraphrase the idea I had in mind as a first step to allow us and specifically the more experienced on that front to see if it may have merit or is worth exploring.

During bootstrap, generally, node synthesizes the argv and execArgv arrays. So it is fair to assume that if the OS passes the arguments of shebang (in my case) then node can compare it against the first line of the entry point with some allowances for whitespace or other trivia.

But it is not without complexities, specifically if the OS does not pass arguments of the shebang. In this case, node will not be able to use this approach to detect that it was shebang.

Finding a graceful way to handle this divide is best left to the creative problem solving efforts of those with more experience on the topic.

SMotaal on 20 Nov 2018

mcollina on 20 Nov 2018

👍3 ❤1

@targos I think what is missing is input from those with experience to weigh in on my very raw suggestion first. While I normally didn't mind taking a few days or weeks to catch up when problem solving, I think that this one is one too deep in my 1.5 yr distraction which initial was just to be able to use modules back in 2017 😄.

Then, if we see that there is a possible solution that has due confidence, I am happy to do my part.

SMotaal on 20 Nov 2018

@boneskull I absolutely agree on the meaning of . for which a different meaning should be very carefully thought out.

I wanted to explore ideas like dropping the . or somehow wrapping the argument to denote it's special meaning, which can be a separate avenue from all together and avoids shebanging our heads on the more complicated shebang idea if it is not a favourable improvement or favourable.

SMotaal on 20 Nov 2018

@SMotaal Sure. I didn't want to say that you must be the one who does the investigation, sorry if I wasn't clear on that point. I just want to make sure we answer the questions in the right order :)

targos on 20 Nov 2018

❤1

Interpreter directives (Hashbangs / Shebangs) do not parse consistently across shells. I do not believe we should attempt to specify behavior within node itself due to this as node's behavior could differ from shell behavior.

bmeck on 3 Feb 2020

👍1

I'm a bit confused by this, there seem to be a mix of issues discussed, but note that most scripting languages use posix option syntax, and have short-option versions of any long options, particularly any long options that are credible candidates for use in #! lines, so while