I wanted to propose registering a "provisional" URI scheme "node" which reflects a custom resolution extension based on the transport scheme. I don't think it is time to lock in any particular specifications aside from the fact that it depends upon known transport protocols like 'file' and is intended to provide practical URI mappings and improved interoperability with other applications like web browsers.
Let's try to focus this thread on the minimal provisional RFC and as I am sure many have thought about this and want to get into details, I recommend creating a parallel issue mentioning this issue so we can all converge.
This seems like it would conflict a bit with the idea that many believe that node shouldn't by default add support for importing from URLs. What's the use case you have in mind?
I am not sure that there is a concrete use case in mind, aside from the fact that I find myself (and surely others do too) often relying on the fact that module specifiers are simply derivations of URLs.
As far as URI Schemes go, they are obviously well defined for browser contexts in the form of HTTP* WS* *FTP and FILE, ie the transport & resolution schemes, but start getting a little messy for other contexts (like GIT... etc.)
For node, one can almost say that node specifiers are node+file: because they adopt file:'s transport and relative resolution behaviours but also introduce node: resolutions behaviours that relate to bare specifiers (and maybe scopes) that take precedence over file.
Since such resolutions only take place when modules are being resolved by node, the node+file: prefix is omitted: the entry point is node+file:, and all modules that are resolved to a file: URL are also within the node+file: scheme. Some modules, like "modules" or "path", are also resolved by node: but do not rely on any transport protocol (like, for instance, blob:), as such modules are aliases to resources that are simply referenced internally.
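To make the split above concrete, here is a minimal sketch of how a specifier could be labelled with the implicit scheme it falls under. The function name `describeSpecifier` and the returned field names are hypothetical, and the labels `node:` and `node+file:` are only the ones proposed in this thread; node itself does not expose anything like this.

```javascript
// Sketch only: labels a specifier with the implicit scheme discussed above.
// `describeSpecifier` and its return shape are made up for illustration.
const { builtinModules } = require('module');

function describeSpecifier(specifier) {
  // Builtins such as "path" are aliases to internal resources: no transport.
  if (builtinModules.includes(specifier)) {
    return { scheme: 'node:', resolvedBy: 'node', transport: null };
  }
  // Relative and absolute specifiers follow file:-style resolution.
  if (/^(\.\.?\/|\/)/.test(specifier)) {
    return { scheme: 'node+file:', resolvedBy: 'file', transport: 'file:' };
  }
  // Anything else is treated as a bare specifier: node's lookup rules
  // take precedence, but the result is still loaded over file:.
  return { scheme: 'node+file:', resolvedBy: 'node', transport: 'file:' };
}

console.log(describeSpecifier('path'));
console.log(describeSpecifier('./foo.js'));
console.log(describeSpecifier('some-package'));
```

This deliberately ignores specifiers that already carry an explicit scheme (and Windows-style paths); it is only meant to show where the node: behaviour takes over from file:.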
Minor Edits
i wouldn't imagine node: going to anywhere except maybe the builtin modules (which the current loader actually already does internally)
I am working on implementing code coverage for ESM using V8's native profiler. One of the issues is to differentiate user's scripts from internal ones. At the moment, user scripts have absolute paths (file:// for ESM, simple FS path for CJS) and internal scripts have relative paths. Using node:// for internal scripts would be a stronger signal that the scripts cannot be just read using fs.readFile.
Note: V8 reports the effective URL of the script, after all resolution was performed so it's either immediately a path to a file or an internal module.
@devsnek 👍 this is most likely the seed that inspires this thread. At times when I am experimenting with NW.js but more so with Electron (since it has a more open schemes API) it becomes very easy to see how one needs to assume node resources are always 'node:' if only implicitly and outside the scope of the actual resources themselves.
@demurgos that's probably the URL parameter that is passed to V8 via the ModuleWrap constructor
@SMotaal
Exactly. I've looked into the related source code and relative/abs path may be enough to differentiate them, but I am not sure if it's something that can be relied on long term (I did not find it documented).
Would a protocol like commonjs:// be more clear? The disadvantage of a protocol solution (last time I tried it) was that it becomes weird for non-absolute things. commonjs://./foo.node looks really weird and isn't quite how "normal" URLs would work.
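A quick way to see why the relative form looks odd is to feed it to the WHATWG `URL` class that ships with node. This is only a sketch: `commonjs:` is the hypothetical scheme from the comment above, and the `file:///project/index.js` base is made up for the comparison.

```javascript
// Sketch: how a URL parser sees a "relative" specifier behind a custom scheme.
// `commonjs:` is hypothetical; it is not a scheme node registers.
const weird = new URL('commonjs://./foo.node');

// After "//" the parser expects an authority, so the leading "." does not
// end up as a relative path segment.
console.log(weird.protocol);
console.log(weird.host, weird.pathname);

// "Normal" relative resolution needs a base URL instead.
const relative = new URL('./foo.node', 'file:///project/index.js');
console.log(relative.href); // file:///project/foo.node
```

In other words, the `./` survives only when it is resolved against a base, which is the part that does not fit naturally into an absolute `scheme://` form.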
Let's get into the details over at #169 (Discussion about an implicit (or explicit) node scheme)
You can already use isCore from https://npmjs.com/resolve, for example, to determine if it's a builtin module or not - the implementation would be able to be simpler if the specifier contained that info (like, for example, if it was under a scope) but it's basically a solved problem.
@ljharb I'm thinking less about implementation, or at the very least, assuming NPM packages are not an accessible luxury, from the premise that a loader that circularly depends on what it loads (unless ESM will always depend on CJS) is paradoxical at best.
A loader might always have to depend on what’s loadable in its absence, otherwise it wouldn’t be able to have any external modularity. If this is running in node, installed npm packages will always be accessible because require/import will be, of course but never necessary if installation is unavailable - just like now.
That is true if the loader has a loader, which for all intents and purposes is almost always true in node, but at some point, we have to consider the loader is an abstract platform (today it's handled by ModuleWrap and NativeModule.require)
At which point? node wouldn’t be node if it had no base loaders, and we’re concerned with how loaders work in node.
Not necessarily though, just not node as we know it to be right now.
Such assumed premises should not be considered set in stone, or at least, we should make a document to keep track of any such assumptions. Without some clarity on what is assumed and what just happens to be true about one or more of the many implementations, it is very difficult for anyone to try to make any reasonable contributions especially if they are less seasoned or familiar with such implementation details. I think it really discourages constructive debates and ultimately input from others in the community.
I apologize if you feel discouraged; that wasn't my intent.
The way node works now (and has effectively always worked), unrelated to implementation details, is that you can write a program using zero requires, or only requiring core/builtin modules, or requiring probably-npm-installed things from node_modules on disk. Adding ESM should not change that - you should be able to write an ESM program that uses zero imports, or that only imports core/builtin modules, or that imports probably-npm-installed things from node_modules on disk. If we need to document that, then while I would find it surprising if someone participating in discussions about an implementation in node itself was unfamiliar with these concepts, let's document it!
Something a bit more subtle and definitely not set in stone is that were I writing a loader in node, i would expect to have the full expressive power of any program written in node - hence, i'd expect to be able to require or import other things if i desired.
You can already use isCore from https://npmjs.com/resolve, for example, to determine if it's a builtin module or not - the implementation would be able to be simpler if the specifier contained that info (like, for example, if it was under a scope) but it's basically a solved problem.
Hi,
isCore does not solve the problem I was describing. isCore checks if a package name is a Node package.
What I am trying to do is differentiate modules that exist on disk from those that are part of the internal modules bundled in the Node lib.
Here is an example hello-world using ESM. When I execute it with V8 coverage, I get something like this. I also exported the list of URLs. V8 exposes a way to know the parse goal (not included in the example) so the main issue I have right now is that I am not sure if I can rely on the value of the URLs to differentiate user modules from Node's internal modules.
The current heuristic is to assume that relative paths correspond to internal modules and can always be filtered out for the purpose of coverage. CJS modules correspond to absolute paths (no protocol, first URL in the list). ES modules use URLs starting with either file:// or cjs-facade:.
It would be nice if the node:// protocol could be used for those internal modules bundled with Node (the ones in lib). Instead of relative paths, they would start with node:// (example: node://_stream_duplex.js). It would only be visible to tools involved with V8 (or eventually loader hooks) to mean that this file is not really available on the file system. It would be a bit more reliable than checking if the file is relative.
(Just to be extra clear, I am not talking about an eventual namespace for the specifiers of Node packages, I'm talking about a protocol used to identify the resolved module as "internal").
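The heuristic described above can be sketched roughly as follows. The function name `classifyScriptUrl` and the category labels are made up for illustration, and the `node://` branch covers only the proposed prefix; nothing in node emits it today.

```javascript
// Sketch of the coverage heuristic: classify a script URL reported by V8.
// Names and labels are illustrative, not part of any node or V8 API.
const path = require('path');

function classifyScriptUrl(url) {
  if (url.startsWith('file://')) return 'esm';            // user ES module
  if (url.startsWith('cjs-facade:')) return 'esm-facade'; // ESM wrapper around CJS
  if (url.startsWith('node://')) return 'internal';       // proposed prefix only
  if (path.isAbsolute(url)) return 'cjs';                 // plain FS path, no protocol
  return 'internal';                                      // relative path, e.g. events.js
}

const urls = [
  '/home/user/app/main.js',
  'file:///home/user/app/main.mjs',
  'events.js',
];
// Keep only the scripts worth reporting coverage for.
console.log(urls.filter((u) => classifyScriptUrl(u) !== 'internal'));
```

The last branch is the fragile part: it treats "not absolute and no known protocol" as internal, which is exactly the assumption an explicit prefix would make unnecessary.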
@demurgos those names (events.js, etc) actually come from the older NativeModule loader, which has no concept of URLs, so I don't think people would be very happy with us changing it to use them. It's not actually interacting with ESM at all at that point.
@devsnek
Okay, thanks for the info! It was just an idea I had where node:// might be useful. It could still be used for the ESM interfaces of these modules (if they have one). I was also thinking about having some protocol to handle the CJS modules (they're almost like the file on disk, but in a wrapper). It would have allowed treating everything as URLs.
In my case, I already have a good-enough solution so it's probably not worth changing these URLs. It could maybe be possible to have a distinct value for the V8 url and the __filename or cache key, but it's probably not worth the risk of messing with the CJS implementation.
@demurgos right, isCore returns true only for built-in core modules; and false for anything else (whether it's actually on disk or not). It does not check if something is requireable (ie, core or in node_modules); that's what resolve would check.
If you're trying to check internal node implementation details - like which builtin modules are embedded vs on disk - that's not something i'd hope userland is really capable of determining, since then it would become a breaking change for node to alter that implementation detail about a module.
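The difference between "is a core module" and "is requireable" can be sketched without any npm dependency. Note that this swaps in node's own `module.builtinModules` and `require.resolve` in place of the `resolve` package mentioned above; the helper names `isBuiltin` and `isRequireable` are hypothetical.

```javascript
// Sketch: "is it core?" versus "can it be required?", using node built-ins
// rather than the `resolve` package. Helper names are illustrative.
const { builtinModules } = require('module');

// True only for built-in core modules, regardless of what is on disk.
function isBuiltin(name) {
  return builtinModules.includes(name);
}

// True for anything require() could load from here: core or node_modules.
function isRequireable(name) {
  try {
    require.resolve(name);
    return true;
  } catch {
    return false;
  }
}

console.log(isBuiltin('fs'), isRequireable('fs'));
console.log(isBuiltin('not-a-real-package-xyz'), isRequireable('not-a-real-package-xyz'));
```

Neither check says anything about whether a builtin's source is embedded in the binary or read from disk, which is the implementation detail discussed above.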
If you're trying to check internal node implementation details - like which builtin modules are embedded vs on disk - that's not something i'd hope userland is really capable of determining, since then it would become a breaking change for node to alter that implementation detail about a module.
Well, if I'm in userland it's near the border, I'm using the V8 inspector and get full access to the internals. Determining the source of the module is exactly what I'm doing (and I want to find it to drop these internal embedded modules as soon as possible, to avoid exposing Node's internals further down).
@ljharb not at all, I am just trying to gain as much insight as I can to better understand your position, which I am sure comes off as somewhat annoying (apologies for that). I guess I too make some assumptions, especially when it comes to packages. Because I have so much trouble with the dreaded micro-packages problem, I reached a point where if I don't understand exactly how my dependencies interact across the runtime, I am no longer willing to use such packages. This is an assumption on my part, and I apologize for my overgeneralization.
Withdrawn for now
My module has already implemented URL imports: https://github.com/voxsoftware/kawix-core
I see that my module is compatible with many of the proposals in this repo
I think this relates to the js: prefix recently added to the JavaScript Standard Library Proposal, which seems to align with a URL scheme notation (worth noting that it originally used a less URL-like form, js::, in its first appearance).