Modules: Divergent specifier hazard

Created on 14 Aug 2019  Â·  64Comments  Â·  Source: nodejs/modules

@jkrems and I created a repo to demonstrate the hazard mentioned by @MylesBorins in https://github.com/nodejs/modules/issues/273#issuecomment-492355098 and https://github.com/nodejs/modules/issues/273#issuecomment-492408041 where a single specifier resolves to separate things in ESM and CommonJS environments within the same runtime. This hazard is currently blocking dual packages, and logically would also block --es-module-specifier-resolution=node from becoming the default behavior.

Hopefully the repo illustrates the issue clearly enough that everyone can get a solid grasp on it. It seems to me that we have three options:

  1. Accept that the hazard is a serious concern and that it therefore rules out supporting dual packages and automatic extension resolution. The current modules implementation is what ships with regard to those two features, and presumably we would remove the --es-module-specifier-resolution flag (since arguably the flag’s existence invites users to stumble into the hazard).

  2. Acknowledge the hazard but decide that user education can overcome it. @MylesBorins and others concerned about the hazard would need to be persuaded as to why it isn’t a blocker, especially considering that it’s already come up in the real world. If the persuasion effort succeeds, we would then need to decide how far to lean into enabling features that invite the hazard (such as automatic extension resolution and dual packages).

  3. Find a technical solution to prevent the hazard, such as making require of ESM work in a way that everyone is comfortable with (cc @weswigham). As with the previous option, we would then need to decide whether or not to support dual packages and/or automatic extension resolution.

Deciding on any of these options would lead to a resolution of the dual packages item in Phase 3 of our roadmap.

cjs discussion esm modules-agenda ngo? roadmap

Most helpful comment

It sounds like what you really want is require('pkg') and import 'pkg' to not just be permitted, but to also return the same singleton.

I want the documentation to specify conditions for which the hazard exists, leave it to the module maintainers to not publish modules with bugs. You've given examples of major projects which published broken versions but this happened under a lack of proper documentation about the hazard. I trust properly informed module maintainers to generally avoid creating broken releases. Being a known hazard means that it won't be as difficult to identify in the future.

Some solutions for maintainers that I can think of:

  1. Do not use singletons or instanceof (not always possible but worth suggesting first)
  2. Publish pure CJS or pure ESM (no dual mode)
  3. Publish CJS code with an ESM wrapper (to control named vs default export for ESM)
  4. Tag classes with a hidden Symbol.for('module-v1-specific-id') and check for it instead of using instanceof
  5. Use globalThis[Symbol.for('module-v1-specific-id')] for storage of shared singleton data

If Node someday supports require of ESM, then you could write 'pkg' as all ESM and it would be importable into either module system

If this were possible it could be interesting. I'm unclear if/how it's possible, specifically my concern is that loading ESM is async and the require function is sync. How does require deal with async loader hooks and in the future loading modules with top-level await?

All 64 comments

I, for one, would _really_ like to just resolve the same specifier to the same module in both resolvers - that would go a long way to making esm feel like a gradually adoptable improvement, rather than a hard break - I really hate the idea of leaving cjs users in the cold with respect to new esm libraries (and no, dynamic import isn't practical in the real world because the promise wrapper forces an unrealistic code structure compared to how code is written and consumed today).

It's worth noting that a similar mechanism to what I've looked at to support require of esm could in theory be used to support top level await _in cjs_ without creating excessive event loop turns (just like how the TLA spec specifies esm TLA should behave, with the same caveats about possible deadlocks). I started looking at that, but am suuuuper unfamiliar with how the cjs wrapper is formed and, from what I can tell, there's currently no equivalent of the v8 create function call that supports marking the function as async, which would mean degrading to the older style wrapper (as is done when the wrapper is patched).

It's worth noting that I also think the same-specifier-resolves-to-the-same-thing-in-both-resolvers also isn't actually predicated on require in esm - it's just a matter of the cjs loader prioritizing esm files in the same way as the esm loader and _throwing_ if an esm file is found first (barring actually being able to import it). In effect, the cjs loader should be a superset of the esm loader in this scenario, except it throws when the module that is resolved to is esm.

In the pkg example, that means if pkg has a main of ./x-core in and both ./x-core.js and ./x-core.mjs exist, both loaders must always pick the _same_ resolution. So if .js is higher priority (as would be backwards compatible and wouldn't risk picking up accidental new mains), then in both resolvers it resolves to the commonjs file. If .mjs were higher priority, then the esm entrypoint would be all both the esm and cjs loaders saw (and would, therefore, allow a silent fallback to the cjs entrypoint on older node versions which don't recognize the mjs extension).

It just comes down to both loaders need to resolve a specifier to the same thing, whatever tweaks that takes.

I've mentioned this in another issue, but I think it should be mentioned here.

If two separate packages require/import a third package, they should not assume they will both get the same singleton. This is a dubious assumption, which can already lead to issues even without getting into ESM vs CJS.

With NPM, version conflicts are resolved by giving each package a version it requires, even if that means installing two different versions. With a current version of NPM, a different copy will be given to one of the packages to import, and I believe older versions would simply have installed two copies regardless of version.

That means that if package-a has dependency [email protected] and does:

const {SomeClass} = require('third-package');

And package-b has dependency [email protected] and does:

const {SomeClass} = require('third-package');

Both are going to get a different SomeClass object, and attempts to use instanceof for objects created outside the package will not work out. Likewise, attempting to modify the class like a pseudo-global object also won't work out.

This isn't a new issue, and I don't think that it's a compelling reason to not support dual packages. If a package wants to be a dual package, it should move to not assuming there will only be one version of a package... actually, all packages should do this anyway.

@AlexanderOMara the issue is not only across package boundaries but within the same package if you are moving between CJS and ESM (which is likely to happen as teams transition code bases).

Packages like react which have global state that is shared will break in very mysterious and hard to debug ways if you end up with a different singleton in ESM + CJS.

@MylesBorins that's true, but that's already the case with duplicates or different versions. The solution is peer deps for "across package boundaries"; there's no solution node can provide that would be any different (npm or another package manager could perhaps provide one; bower for example's solution was "pick which one you want and that's the only one you'll get", but that was untenable for many use cases).

Within the same package, I don't see why that's a hazard we're concerned with; that seems like something that package needs to test for?

As you said in the other thread - without a package, the issue can be fixed with peer dependencies - _within_ a package, it's pretty much impossible as long as two side by side "seperate but equivalent" builds of the package are there (currently). Forcing only one implementation to be active (for all usages) would be my prefered solution~

It’s worth noting that I also think the same-specifier-resolves-to-the-same-thing-in-both-resolvers also isn’t actually predicated on require in esm - it’s just a matter of the cjs loader prioritizing esm files in the same way as the esm loader and _throwing_ if an esm file is found first (barring actually being able to import it). In effect, the cjs loader should be a superset of the esm loader in this scenario, except it throws when the module that is resolved to is esm.

I remember asking @guybedford about this when I was struggling to get dual packages to work, and trying to find a workaround for this hazard. My suggestion was basically “well once 'pkg' is loaded as one module type, when 'pkg' tries to get loaded as the other module type can’t we just throw an exception?” And the answer was basically no, not if we want Node to remain deterministic, which apparently we do. I argued that since ESM is statically analyzable, wouldn’t the CommonJS versions of any specifier always be loaded second? But he said no, because of dynamic import().

He can surely explain it better than I can, but assuming that Node remaining deterministic is a requirement of the project, then I think throwing on this hazard isn’t a solution.

It just comes down to both loaders need to resolve a specifier to the same thing, whatever tweaks that takes.

This is the current behavior of the --experimental-modules implementation, at least under the default --es-module-specifier-resolution=explicit. And if that “same thing” is a CommonJS thing, then it works great; it’s only when you want to make that “same thing” ESM that everything gets tricky, since both require and import support CommonJS sources while only import supports ESM. This is perhaps one of the strongest arguments for supporting require of ESM, so that ESM has parity with CommonJS in terms of support within Node’s module systems.

@weswigham would it be possible to produce a PR that implements require of ESM, at least for relative files with explicit extensions, e.g. require('./file.mjs')? Just to demonstrate that it actually works, and let people see what the tradeoffs are. Then we’d need to figure out how to define separate package entry points for CommonJS versus ESM, but that’s obviously a solvable problem. I don’t know if the group will ever go for optional extensions, since there were objections to that besides this hazard, but I think this hazard was the only objection raised against dual packages.

I remember asking @guybedford about this when I was struggling to get dual packages to work, and trying to find a workaround for this hazard. My suggestion was basically “well once 'pkg' is loaded as one module type, when 'pkg' tries to get loaded as the other module type can’t we just throw an exception?” And the answer was basically no, not if we want Node to remain deterministic, which apparently we do.

You seem to misunderstand - basing which you get on how it's first imported is the issue there. Just _dont do that_. (None of the logic in the esm resolver at least is dependent on the parent module kind, and neither is the cjs core algorithm, nor should it be). The key is just to make it resolve _to the same thing every time_, _regardless of how it is accessed_.

@weswigham would it be possible to produce a PR that implements require of ESM, at least for relative files with explicit extensions, e.g. require('./file.mjs')? Just to demonstrate that it actually works, and let people see what the tradeoffs are.

There was a branch with that already linked in the proposal issue. :S

The key is just to make it resolve to the same thing every time, regardless of how it is accessed.

Wouldn’t require of ESM be necessary for this, then? If both require('pkg') and import 'pkg' always resolve to the same thing, then currently they’ll only both work if that same thing is CommonJS; but the whole point is to let people use ESM syntax. If I want to publish pkg as a dual package where both require('pkg') and import 'pkg' work in current Node and require('pkg') works in old Node, don’t we need require of ESM to work?

There was a branch with that already linked in the proposal issue. :S

So then why don’t you rebase it off the latest master, simplify it to only work with relative file paths that have explicit extensions, and open a PR against core? I don’t see that branch getting much attention.

As per the updated spec for top-level await, synchronous graphs can stay synchronous, so require('./esm.js') is feasible by supporting a .linkSync() and .evaluateSync().

In order to evaluate an esm graph synchronously one would just need to check [[Async]] (as per here) is false for all modules that are in the graph to be synchronously evaluated.

Wouldn’t require of ESM be necessary for this, then?

Nah, throwing with an error like Module cannot be required with this construct or Module cannot be inported with this construct rather than returning a module is also workable. So long as it doesn't return a different module identity, it's fine.

As per the updated spec for top-level await, synchronous graphs can stay synchronous, so require('./esm.js') is feasible by supporting a .linkSync() and .evaluateSync().

We don't even need to do that - the built-in ones can all be sync in practice (even if they're marked async right now), and any user provided loaders can be waited for synchronously akin to how child_process.execSync works (since they should be executing in a different realm/on a different event loop), even when they're defined with promises and async.

Wouldn’t require of ESM be necessary for this, then?

Nah, throwing with an error like Module cannot be required with this construct or Module cannot be inported with this construct rather than returning a module is also workable.

I meant, isn't it required if we want something other than an error to happen. require of ESM already causes an error.

I meant, isn't it required if we want something other than an error to happen

Well.... Yeah. But an error is, imo, a fine way to enforce a module only having one relative identity. Yes, we could do better, but that's the baseline.

require of ESM already causes an error.

Yes but no - it means that both esm and cjs need to interpret main for example, in the same way, rather than following different rules for resolving that inner specifier. It means once you require("foo") or import("foo") the _same_ steps are taken to determine to what that refers, and the two algorithms (should be one unified algorithm, tbh) are in agreement. After that determination, either are free to throw or continue with that result, as implementation permits. Yes "esm doesn't permit extension searching" (maybe), but _the package.json main field isn't esm_. It's silly to try to reinterpret it as anything other than what it means to the cjs resolve algorithm, imo.

@weswigham The current --experimental-modules doesn’t have this hazard, unless you opt into it via --es-module-specifier-resolution=node. It sounds like you’re trying to solve the problem of how to avoid this hazard in --es-module-specifier-resolution=node mode. As I wrote above, there are several other objections to --es-module-specifier-resolution=node besides this hazard (see #268). So even if you succeed in finding a way to avoid the hazard for automatic extension resolution, many members of the group would still object to making optional extensions the default behavior in Node.

Whereas with regard to dual packages, this hazard is the _only_ objection. So getting require of ESM to work in --es-module-specifier-resolution=explicit mode might provide a way for us to ship support for dual packages. Would you be interested in pursuing that?

Fwiw, I wouldn't call it "dual packages" anymore at that point, as there's nothing dual about it (which is the point).

I wouldn't call it "dual packages" anymore at that point, as there's nothing dual about it (which is the point).

The packages are still dual in that they still contain both CommonJS and ESM sources, even if the CommonJS files are only used in old Node.

Are you interested in pursuing this or is your interest in require of ESM only in support of optional extensions?

I'd still consider that better. FWIW, I think there's room for both extension searching and non-extension searching resolver modes on a per-package boundary, tbh - that way a package author can opt for either one (or possibly in the browser compat package mode side even stricter checks on browser compatibility, like ensuring no usage of fs or process within the package).

The packages are still dual in that they still contain both CommonJS and ESM sources, even if the CommonJS files are only used in old Node.

I mean.... sure. Technically. But using dual meanings for "dual packages" is exactly the kind of doublespeak that causes us to talk past one another so often, so I'd not attribute the same term to that.

Okay, so @weswigham do you mind please opening a PR against core that enables require('./file.mjs')? Just that by itself, for now, leaving support for packages or optional extensions in ESM for later.

So far, I haven't heard any explanation of a hazard that:

  • doesn't also apply to CJS already
  • isn't solved by peer deps
  • won't still exist with singletons in ESM regardless of what we choose

I look forward to that PR being opened :-)

So far, I haven’t heard any explanation of a hazard that:

  • doesn’t also apply to CJS already
  • isn’t solved by peer deps
  • won’t still exist with singletons in ESM regardless of what we choose

The problem illustrated by the extensionless-imports example in the demo repo doesn’t happen in CommonJS, isn’t solved by peer dependencies, and doesn’t exist in ESM unless extensions are optional. Like I wrote in https://github.com/nodejs/modules/issues/352#issuecomment-521407565, the example is a bit contrived within one’s own app or package, but is very plausible as a deep import within a public package (like aws-sdk/s3) and that’s not solved by peer dependencies.

I also don’t see how peer dependencies would solve the dual-esm-commonjs-package example from the demo repo. Do you care to post another repo showing the solution?

@GeoffreyBooth as has been stated elsewhere tho, that's because the singleton is being created in two places. Instead, it should be created in one place, and re-exported from the other. This is a package design issue. It could still happen if you had both x-core-esm and x-core-cjs, and one part of the tree required cjs and the other imported esm. The only solution is still, the author has to define the singleton in one place and ensure it's shared across both module systems.

Prohibiting dual-mode packages doesn't solve the issue, it just means an author has to create two packages, and still has the same hazard if they architect their code in the same incorrect way.

The only solution is still, the author has to define the singleton in one place and ensure it’s shared across both module systems.

This is currently only possible for CommonJS files and packages, since CommonJS can be required into CommonJS and imported into ESM; but since ESM can’t be required into CommonJS, there’s no way to use ESM syntax to create a singleton that can be used in both systems. Hence the discussion above of seeing if we can make require of ESM happen.

Prohibiting dual-mode packages doesn’t solve the issue, it just means an author has to create two packages, and still has the same hazard if they architect their code in the same incorrect way.

As I understand things, prohibiting the same specifier from resolving to different things in each module system solves the issue. See the readme in the example repo. Perhaps you could produce an example showing how the hazard is still present if an author creates two packages?

right, but without require of ESM, that's just the way it'd have to be - you'd have to define the singleton in CJS, not ESM, for as long as you wanted to retain compatibility with both module systems.

i don't have time to create an example, but since you could have thousands of deps in your graph, and one could be importing the ESM, and another could be requiring the CJS, and the two could interoperate, it still exists. It's the same problem as having two copies of React in your tree, just without the ESM wrinkle thrown in, and it is solved with peer deps.

For the record: I still consider the specifier hazard and the ability to fix it (or greatly reduce it) completely independent of the ability to require(esm).

Prohibiting dual-mode packages doesn't solve the issue, it just means an author has to create two packages, and still has the same hazard if they architect their code in the same incorrect way.

I think most of the hazard comes with the convention that esm and cjs implementations are shipped side by side today, which makes it really easy to mess up unintentionally.

I think most of the hazard comes with the convention that esm and cjs implementations are shipped side by side today, which makes it really easy to mess up unintentionally.

This still seems like strong motivation to be able to require() esm, with it packages could simply ship a single version of the package without the risk of two copies being run.

In the short term (aka while Node 10 is still LTS) people would still probably publish dual but after that with require(esm) there'd be no reason to dual publish.

I have still yet to see an implementation of require(esm) that does not introduce the hazard of zebra striping (moving from async / sync). I am also having a hard time thinking of a solution for that space that would not be semver major (e.g. sharing a single cache between systems).

@weswigham you mention opening a PR... do you have an implementation that is capable of introducing require(esm) into CJS without the zebra striping hazard that is also non semver-major to cjs?

@MylesBorins it's not a hazard because all the code can still be executed in exactly the order given - the same trick used to make sync esm graphs without TLA still execute in a sync order can be used and improved upon to not only support an async resolver (though it really isn't right now), but _also_ support TLA in CJS (at least it should, in theory - some vm module and v8 changes are needed to get full support of non-strict async contexts without using require.wrapper and heuristics). That's a bit of a nice-to-have stretch goal, though.

you mention opening a PR... do you have an implementation that is capable of introducing require(esm) into CJS without the zebra striping hazard that is also non semver-major to cjs?

Yes, since months ago when I opened #299... though that's admittedly slightly incomplete - it only sync's the resolver and instantiation processes - async module execution isn't yet handled (as which TLA variant was moving forward wasn't clear at the time), however I don't see that as likely being much of an issue - it's just a matter of backporting some esm resolver behaviors (unwrapping completed promises, waiting on promise completion by spinning the event loop before continuing execution of the containing module, issuing a deadlock error (since the main thread is waiting on each promise, if in the process of resolving that promise, it calls back into the module waiting on it, that module won't be resolved, and thus the import is a deadlock if not marked as an error (or zebra striped, but whatever)) on reentrance into a module which is already waiting on a promise) into the cjs-resolver-for-esm.

Like, all the change needs to do is, in effect, allow require to call out to a sync(import(specifier)) where sync is a C++ provided function that either immediately unwraps a promise or spins the event loop until the promise is resolved and can be unwrapped (which it then does). In the end, the system behaves pretty similarly to the esm resolver, if one assumed every import was an awaited dynamic one.

@weswigham this sounds great in theory, but I am skeptical that this can be done to our commonjs loader in a way that is non-breaking and even more skeptical that it can be done in a way that is non-observable.

To land this in 12 it would need to be both. If we can't do that we could land in 13 if we could prove that any observable changes would not break the ecosystem.

Top-Level await is also not yet implemented in the V8 engine and won't support CJS. AFAIK we will be unable to update 12.x to a version of V8 that will support TLA.

Time continues to pass us by as we get closer to the 12.x LTS... and I'm still not seeing a plan that would allow us to support require(esm) at any time in the near future. As a goal for constant improvement of the runtime it is absolutely something that we should explore, it would significantly improve the esm experience and give an excellent reason to upgrade to 14 from 12... but I really don't see how we can rely on this for the first iteration of unflagged ESM

Alright, in that case the branch I linked is fairly complete, if we don't need to worry about TLA support for awhile. In it's current demo state it's also trivially nonbreaking, since it just overrides the current error behavior for require of .mjs with actual loading of the module.

it’s also trivially nonbreaking, since it just overrides the current error behavior for require of .mjs with actual loading of the module.

Wouldn’t it also be a breaking change for require('package-with-type-module/file.js') to start loading file.js as ESM?

Given that that only happens when opted in to via a type flag in the package.json (which I don't know weather or not we've added recognition of into the cjs loader yet), I'd hope not. type: "module" is opt in new behavior - it's a textbook example of a backwards compatible addition to the resolver, imo.

I would like to note that the resolver in our current implementation is not actually sync, it is just the default implementation of resolve that is sync which was done after some benchmarking in the long past (could probably retry async with the new microtask optimizations), custom loaders are async and both are wrapped in in a Promise.

See: https://github.com/nodejs/node/blob/317fa3a7574e90e83bbd09acc9b9ab61defccf3d/lib/internal/modules/esm/loader.js#L73

Whatever happens, I'd prefer we don't force custom loaders to only use sync APIs.

Whatever happens, I'd prefer we don't force custom loaders to only use sync APIs.

Which we wouldn't (force sync API usage) - we have the technology to allow async APIs in synchronous-appearing codepaths, after all.

Like, all the change needs to do is, in effect, allow require to call out to a sync(import(specifier)) where sync is a C++ provided function that either immediately unwraps a promise or spins the event loop until the promise is resolved and can be unwrapped

That sounds kinda hellish for bundlers that encounter a require('./esm-with-tla.js'), and would also probably encourage hacks like import { createRequire } to do synchronous dynamic import.

I think it'd be better to just reject require('./esm.js') that transitively has an async module dependency (unless all async parts of the graph have already been evaluated maybe?).

I don't think it's _particularly_ anymore hellish for bundlers than any other kind of async dependency (and certainly not more hellish than supporting top-level await which relies on the same promise unwrapping mechanism) - and amd and system already support async module bodies and are bundled.

@weswigham Do you have an estimate on when you might open a PR against core for the bare minimum require of a relative ESM file (e.g. require('./file.js'))?

Per the 2019-08-28 meeting (subject to edits, as the meeting was a bit confusing):

  • Extension resolution order should be the same in CommonJS and ESM (regardless of whether extensions are optional or explicit in ESM). The CommonJS resolver should understand type: module and resolve the same files as the ESM resolver would in optional-extensions mode.

  • We should write a migration guide explaining to package authors how to write a package that can be used simultaneously in pre-ESM Node and post-ESM Node; and ideally simultaneously with require and import in post-ESM Node.

  • Based on the outcome of the migration guide, we probably choose option 1 above and dual packages are tabled unless/until anyone can come up with a solution that avoids this hazard (e.g. require of ESM or something else).

"dual packages are tabled" isn't something I agreed to; I think that unflagging is tabled unless we can come up with a solution.

I think that unflagging is tabled unless we can come up with a solution.

@ljharb If your desire for dual packages is about easing the migration, would you be OK with waiting until we have the migration guide before concluding that we can't unflag without dual packages?

It seems pretty clear to me that unflagging will happen with LTS in October, regardless of any individual objections. The Node core team won't allow the modules saga to drag on for another release cycle.

I think a more useful approach at this point would be “here are some ideas for improvements” rather than “I'm going to block progress unless these concerns are addressed.” If you want a better story for dual packages, now's the time to work on a new proposal or PR or encourage others to do so. That's what I was attempting above with require of ESM, but it appears there isn't enough interest in getting that to a PR (though please prove me wrong). If you have other solutions, by all means work on them or encourage others to do so.

@robpalme sure, but if there’s conclusions about tabling this soon then there’ll be conclusions about unflagging this soon :-)

@geoffreybooth I’m not so sure the wider node core team would prefer shipping something with a broken experience over waiting longer, but either way nothing is a foregone conclusion til it’s released.

I completely agree that blocking - or deciding on tabling things - is an unproductive way forward. I’d certainly like to have suggestions, but lacking them doesn’t magically make my concerns vanish nor does it make the use cases they represent unimportant. Progress for progress’ sake isn’t progress. My pushback is the encouragement to others to work on it, in the absence of myself knowing what to work on.

Would anyone be interested in championing require of ESM? Here are the next few steps, as I see them:

  1. Create a new branch off of Node master.
  2. Pull enough code from @weswigham’s branch (per https://github.com/nodejs/modules/issues/299) to enable require('./file.mjs') to work. As in, don’t worry about dual packages just yet, and don’t change anything involving extension resolution.
  3. Open that as a PR against core, and respond to feedback from the core team.

Based on how that goes, we can build on this to try to tackle dual packages.

I don't think it's up to node.js to prevent this, it is the module maintainers responsibility (publishing a dual mode package would always be opt-in by the maintainer). The current plan will block the vast majority of the ecosystem from offering ES modules any time soon. When maintainers eventually do publish ES modules it will be in a semver-major release. That semver-major release is likely to cause the singleton issue by forcing npm to install multiple copies of the module.

I've seen the suggestion about require('pkg') vs import from 'pkg/module'. If pkg uses a singleton it will cause the exact same divergent specifier hazard only with a less desirable UX. Please let me know if I've missed a way this somehow prevents the singleton hazard.

Just adding this here, in https://github.com/nodejs/modules/issues/401#issuecomment-542525078 @MylesBorins cited three examples of where the hazard surfaced in real projects during the Node 7-11 days:

I find all of these decent examples of why this rare "hazard" perhaps should require explicit opt-in, but none of them are examples of why it should be utterly prohibited for packages to replicate what dep duplication allows already.

I’ve seen the suggestion about require('pkg') vs import from 'pkg/module'. If pkg uses a singleton it will cause the exact same divergent specifier hazard only with a less desirable UX. Please let me know if I’ve missed a way this somehow prevents the singleton hazard.

This prevents the singleton hazard in the same way that require('underscore') and require('lodash') lack the singleton hazard. Those are two separate specifiers, so you wouldn’t expect one to instanceof the other, just as you wouldn’t expect require('pkg') to instanceof require('pkg/module').

I find all of these decent examples of why this rare “hazard” perhaps should require explicit opt-in, but none of them are examples of why it should be utterly prohibited for packages to replicate what dep duplication allows already.

What difference does it make that something is opt in? All bugs are opt in by the authors who wrote the bugs. Allowing the hazard is enabling a new class of bugs; and clearly some very competent teams (graphql, webpack, and facebook) fell into it. It doesn’t matter how explicit we make the opt-in; package authors will add allowDivergentSpecifierHazard: true to their package.jsons if that’s what we make them do, and they’ll still ship the bug and their package’s consumers will be the worse for it. Consumers can’t opt out of bugs, aside from uninstalling a package or particular version of a package.

Deduplication could theoretically solve this if we could somehow tell via static analysis that both the CommonJS and ESM versions of 'pkg' are used within an application, and therefore supply the CommonJS versions to both require and import consumers (since we can’t supply the ESM version to both until require of ESM is possible). But unfortunately such static analysis is impossible because both require and import() can take variables, and therefore we can’t determine the full module graph ahead of time before evaluating user code. (Trust me, I went down this road for a while, trying to make it work.) This also has the issue that the CommonJS and ESM versions of a package might not be equivalent, and therefore your application could change unexpectedly if the supplied version of a package suddenly changes from CommonJS to ESM or vice versa (for example, because one of your application’s dependencies updates to ESM and now all places in your application consuming a package are now using import instead of previously some using import and others using require).

I also think the premise is wrong here: the burden should be on the folks who want to enable something unsafe to demonstrate why it’s safe enough to allow (and why the benefit is worth the risk) not the other way around. The section “Publishing Dual CommonJS / ES Module Packages” in https://github.com/nodejs/node/compare/273d38b
guybedford:91bcf79?short_path=8e67f40#diff-8e67f407bc32a0569e25d7ecaff6e494 comes closest to this in my opinion, as it actually demonstrates a pattern that allows divergent specifiers while avoiding the hazard. The issue is making sure people use it, when shipping the bug is much easier.

I’ve seen the suggestion about require('pkg') vs import from 'pkg/module'. If pkg uses a singleton it will cause the exact same divergent specifier hazard only with a less desirable UX. Please let me know if I’ve missed a way this somehow prevents the singleton hazard.

This prevents the singleton hazard in the same way that require('underscore') and require('lodash') lack the singleton hazard. Those are two separate specifiers, so you wouldn’t expect one to instanceof the other, just as you wouldn’t expect require('pkg') to instanceof require('pkg/module').

underscore vs lodash is not a valid comparison. Lets say I have an ESM only project that depends on pkg-plugin1 and pkg-plugin2. pkg-plugin1 is CJS only and uses require('pkg'). pkg-plugin2 is ESM only and uses import from 'pkg/module'. Verifying that pkg-plugin1 and pkg-plugin2 both depend on the same semver-major of pkg should be the end of my check for compatibility but in this case these two packages could be using different singletons. This returns me to the point that exporting CJS and ESM under separate specifiers will do almost nothing to avoid the singleton hazard.

TSC meeting did not result in any consensus. Folks did not have enough context to make informed decisions. I'm going to send an email to the TSC that give more context around this and we'll invite TSC members who are interested in this to attend the Modules Team. The general sentiment from the TSC was that we should hold off on unflagging for now while individuals get ramped up.

Verifying that pkg-plugin1 and pkg-plugin2 both depend on the same semver-major of pkg should be the end of my check for compatibility but in this case these two packages could be using different singletons. This returns me to the point that exporting CJS and ESM under separate specifiers will do almost nothing to avoid the singleton hazard.

Permitting the hazard and allowing both require('pkg') and import pkg from 'pkg' would _still produce two singletons._ That wouldn’t be the case if you transpiled via Babel or esm, because that would convert the ESM code into require('pkg') and therefore the CommonJS singleton would be used everywhere; but that doesn’t happen when ESM is being used natively.

See https://github.com/jkrems/singleton-issue/tree/master/dual-esm-commonjs-package. That example is running under --es-module-specifier-resolution=node, where require('pkg')/import 'pkg' is permissible and therefore the hazard is present. That example throws an error because the CommonJS and ESM singletons imported are different; one doesn’t instanceof the other. Changing 'pkg' to 'pkg/module' doesn’t change that; the CommonJS and ESM singletons are still (and will always be) different. The only difference with 'pkg/module' is that this behavior is expected, and _also_ occurs in transpiled all-CommonJS environments.

It sounds like what you really want is require('pkg') and import 'pkg' to not just be permitted, but to also return the same singleton. That’s only possible when 'pkg' references the same files in each module system. Currently CommonJS can be imported in either, so at the moment it’s only possible when 'pkg' is all CommonJS. If Node someday supports require of ESM, then you could write 'pkg' as all ESM and it would be importable into either module system; and a transpiled CommonJS version of 'pkg' would be used in older Node environments. Separate singletons in separate Node versions isn’t a hazard.

Not all commonJS - just the single file that contains the singleton. The rest can be dual, and can import/require the shared singleton from the same place. Using "exports", that internal file can even be kept internal.

It sounds like what you really want is require('pkg') and import 'pkg' to not just be permitted, but to also return the same singleton.

I want the documentation to specify conditions for which the hazard exists, leave it to the module maintainers to not publish modules with bugs. You've given examples of major projects which published broken versions but this happened under a lack of proper documentation about the hazard. I trust properly informed module maintainers to generally avoid creating broken releases. Being a known hazard means that it won't be as difficult to identify in the future.

Some solutions for maintainers that I can think of:

  1. Do not use singletons or instanceof (not always possible but worth suggesting first)
  2. Publish pure CJS or pure ESM (no dual mode)
  3. Publish CJS code with an ESM wrapper (to control named vs default export for ESM)
  4. Tag classes with a hidden Symbol.for('module-v1-specific-id') and check for it instead of using instanceof
  5. Use globalThis[Symbol.for('module-v1-specific-id')] for storage of shared singleton data

If Node someday supports require of ESM, then you could write 'pkg' as all ESM and it would be importable into either module system

If this were possible it could be interesting. I'm unclear if/how it's possible, specifically my concern is that loading ESM is async and the require function is sync. How does require deal with async loader hooks and in the future loading modules with top-level await?

If this were possible it could be interesting. I’m unclear if/how it’s possible, specifically my concern is that loading ESM is async and the require function is sync. How does require deal with async loader hooks and in the future loading modules with top-level await?

Well that’s why some people think it can’t be done 😄 This thread is probably the place where it’s been discussed the most; see https://github.com/nodejs/modules/issues/371#issuecomment-526740456 or farther up.

I want the documentation to specify conditions for which the hazard exists, leave it to the module maintainers to not publish modules with bugs.

Certainly if we permit the hazard, the docs will try to guide people as clearly as possible to avoiding it. The issue though is that it’s not just an author thing; consumers also need to be aware of it. Consider the graphql case. In that one, graphql was a dual package and there was this other package graphql-yoga that imported one of the graphql singletons (I think the CommonJS version) and extended it. So if you’re a developer using graphql, and you import it, you’re getting the ESM version and that’s the only copy in your module graph; but then you add graphql-yoga to your project and now both the ESM and CommonJS versions of graphql are floating around in your module graph. The graphql-yoga plugin attaches itself to the graphql CommonJS singleton, so if you want to use graphql-yoga and your code is ESM, you need to know to require('graphql') rather than import it in order to get the CommonJS singleton that graphql-yoga is attached to. How would we explain this to users?

The same way as one explains peer dep issues. Anything that depends on identity - === comparisons, collection keys, instanceof, or, as you point out, specific mutations - must not be duplicated.

This is solved with separate packages by ensuring that the singleton-containing-package is always a peer dep of everything that depends on it - this is how Babel, React, eslint, etc, all solve this issue already, and it's widely understood.

The new hazard is the same, just within a package - it's that if you have anything that depends on identity, it must not be duplicated (as in, not duplicated between a CJS and an ESM file). This piece would have to remain CJS to be usable in both module systems; or it could be ESM to be usable in only ESM.


In the graphql case, graphql should be a peer dep of everything in its ecosystem already, since it's the core of its ecosystem, and the onus remains on the graphql developers to ensure that there's only one file that conceptually represents a given stateful singleton (altho that doesn't even apply here).

Their bug is the identical bug as what would have happened if graphql was duplicated in someone's dep graph, it's just in a new place. By duplicating their files (presumably automated, with a build process), the developers opted in to having to know this hazard.


Also, given that import() works in CJS, this same hazard would happen even without dual mode modules unless graphql-yoga explicitly mutated both the CJS and the ESM versions - something they could have done here, too, but failed to do.

Also, given that import() works in CJS, this same hazard would happen even without dual mode modules unless graphql-yoga explicitly mutated both the CJS and the ESM versions - something they could have done here, too, but failed to do.

How so? If require('graphql') throws because it's an esm entrypoint, import('graphql') produces the only possible copy.

How so?

I think we should careful separate three scenarios:

  1. Every published package is exactly CJS or ESM. Nobody ships any version that contains both. Pre-ESM and ESM are considered 100% incompatible in practice.
  2. People ship packages that contain both CJS and ESM. Consumers use one specifier, package authors need to make sure there's no issues from that (e.g. by using shared files between implementations as necessary).
  3. People ship packages that contain both CJS and ESM. Consumers use different specifiers to access each implementation (e.g. graphql/esm) and are responsible for dealing with any potential hazards. Package authors aren't expected to handle apps using both implementations at once.

Your comment seems to assume we're talking about (1) or it doesn't address the issue with (3): It requires perfect coordination between all packages in the dependency tree. For example:

  • graphql ships both CJS and ESM. ESM as exposed as graphql/esm and contains 2nd instances of all types (not safe to combine with each other).
  • graphql-yoga ships both CJS and ESM, the latter as graphql-yoga/esm. It uses existing transpilers, so both implementations require/import the CJS version of graphql. It may even have been published before graphql supported ESM. It has passing tests and 100% test coverage. :tada:
  • my-graphql-schema ships only CJS.
  • some-3rd-party-schema ships only ESM. It's super new and used graphql/esm.

In a world where an app owner should now figure out what to do, this isn't great. Should they use graphql/esm since all their code is written as modules? Well, no. That would break both graphql-yoga and my-graphql-schema. So... just graphql then. But then they can't use some-3rd-party-schema.

The situation would be a lot easier if the onus would be on graphql to provide a safe way to handle the hazard (scenario 2, e.g. via require guard in exports). It could share class identities (likely in individual CJS files) but have ESM wiring to provide a cleaner interface and named exports.

@weswigham if the entry points are different, i mean, but there’s still a conceptual duplication. I’m saying that the bug is only avoidable by eschewing back compat, or properly understanding how to avoid it - in the former case, it’s hostile to many users; in the latter, there’s no hazard regardless.

The issue though is that it’s not just an author thing; consumers also need to be aware of it.

@GeoffreyBooth this is a point of disagreement we have. IMV the maintainers of each module are responsible for not producing releases which cause the singleton issue. I mentioned a few suggestions above - https://github.com/nodejs/modules/issues/371#issuecomment-542829995. I'm sure other ways to mitigate exist. This is something which consumers generally should not need to be concerned about.

The only thing end-users should need to be concerned about is how to write an import vs require. Take const _ = require('lodash'); as an example. If an ESM wrapper is provided does it still support import _ from 'lodash' or do you instead need import * as _ from 'lodash'? You can use const {uniq} = require('lodash'); but without an ESM wrapper import {uniq} from 'lodash' fails. What exports are provided would need to be documented by the module author so end-users could know what to expect.

consumers also need to be aware of it.

@GeoffreyBooth this is a point of disagreement we have.

I didn’t mean that consumers _should_ need to be aware of it, just that I don’t see how they wouldn’t need to be. Just look at @jkrems’ examples and how many of them involve the consumer needing to be aware of the module format of the various modules they’re installing. Now some of that awareness will remain even in a /module approach, but the knowledge needs to be much deeper (and the debugging harder) for the “allow the hazard” approach. Your suggestions are good ones, and if we do allow the hazard I would recommend a whole section on the docs explaining the hazard and offering guidance on how to avoid it, including those approaches.

The only thing end-users should need to be concerned about is how to write an import vs require.

Agreed, but unfortunately I don’t see how this is possible in any form of dual packages. See the various examples above.

If an ESM wrapper is provided does it still support import _ from 'lodash'

You can always import the default export, regardless of wrapper. There’s no situation where a named export { uniq } from would be supported but the default _ from wouldn’t. The wrapper would also define the exported names ('uniq', etc.) so that { uniq } from would also work, which we don’t have for CommonJS packages loaded via import today. It almost goes without saying (but the docs should probably mention it) that the author of the package should create an ESM wrapper that exports the same names into ESM as are available in CommonJS.

There seemed to be some confusion about my point above: I believe that we currently have a dual copy hazard within the pkg/pkg/esm pattern. Once an ecosystem of packages is involved, the solution requires that every package in the whole ecosystem supports the pattern and uses it consistently. I don't think this is a realistic outcome. To me it feels much more likely that the package owner of the identity-aware package could handle the issue locally than to depend on global cooperation to solve it.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

SMotaal picture SMotaal  Â·  5Comments

mhdawson picture mhdawson  Â·  4Comments

MylesBorins picture MylesBorins  Â·  4Comments

vejja picture vejja  Â·  5Comments

GeoffreyBooth picture GeoffreyBooth  Â·  5Comments