modules 🚀 - import assertions RFC

I’m not sure it’s worth meeting time until it’s been presented to tc39.

ljharb on 11 Nov 2019

It looks like Daniel is looking for signals from the node side for the December meeting:

The plan is to collect feedback in issues, and present for Stage 1 in the December 2019 TC39 meeting.

So it seems like it should be discussed before the next TC39 meeting?

jkrems on 11 Nov 2019

👍2

Our group should provide feedback, surely, but that can be done on GitHub - it doesn’t require consensus, so it doesn’t seem like it should take meeting time.

If folks want to talk about it, ofc we can.

ljharb on 11 Nov 2019

@ljharb i'd expect the meeting time currently to only be devoted towards gaining agreement to move discussion to w/e location is apt. If the WG wants to not be involved we can avoid spending time on it as a group, but currently our model is to use meetings as our consensus building point. Without consensus discussion on a per individual level is somewhat awkward when acting on behalf of the WG.

bmeck on 11 Nov 2019

Fair point.

ljharb on 11 Nov 2019

I'll be happy to get more feedback from you all here. @ljharb has commented on integration with Node, e.g., in this comment. Does this comment reflect the consensus position of the modules team?

So far, the biggest concerns I've heard about this proposal have been from participants in this group. Feedback from some other people I talked with so far (including folks who didn't file issues) was generally positive/working through details, though of course we'll get broader feedback as time goes on.

If you all have time, I'd really appreciate your feedback here. I'm really trying to understand if this is a viable approach from a Node perspective.

littledan on 17 Nov 2019

👍1

I wouldn’t say that @ljharb’s statements necessarily reflect a consensus of the modules team, no. We haven’t discussed the issue as a group yet.

In particular, the quote that you have in your readme:

"The author is the only true arbiter of the parse goal of content" -- Jordan H.

Is not a universally held opinion. It isn’t clear to me whether you presented this as attributed to Jordan meaning that it’s just his opinion, or rather a pearl of wisdom brought forth from him, so I’m not sure what you mean by including it. But there have been many long debates within this group regarding author-defined versus consumer-defined disambiguation, and there is no consensus on that point.

GeoffreyBooth on 18 Nov 2019

👍1

@GeoffreyBooth the disambiguation argument is quite different from the quote at hand. I disagree with your statement of us having discussed that specific quote at length regarding it in isolation. However, removing the quote seems fine.

bmeck on 18 Nov 2019

👍1

I've removed the quote; thanks for the context.

littledan on 18 Nov 2019

I wasn't saying that we've discussed the quote, just that we've discussed disambiguation; and that discussion hasn't led to any firm conclusion.

GeoffreyBooth on 18 Nov 2019

👍1

We discussed this in the last meeting https://github.com/nodejs/modules/blob/master/doc/meetings/2019-11-06.md . It seems there is a variety of opinions for/against; of note, one potential implementation concern that was brought up is how CJS based loading could be a bit problematic.

Looking at our code this affects our current loading of JSON, since JSON Modules are still experimental in Node, how do people @here feel about decoupling the JSON translator from the CJS cache?

I personally feel it is fine to separate them even if it could lead to loading the same location in 2 different ways and thus resulting in multiple load operations/JSON values.

bmeck on 21 Nov 2019

👍1

I've made a PR to at least kick start discussion of removing the CJS cache synchronization for JSON modules.

bmeck on 22 Nov 2019

👍1

dropping agenda label as we discussed this last week

MylesBorins on 25 Nov 2019

Module Attributes has achieved Stage 1

bmeck on 6 Dec 2019

🎉2

FYI there is a discussion happening in the openjs standards working group about this proposal. If folks could chime in it would be helpful

https://github.com/openjs-foundation/standards/issues/91

MylesBorins on 27 May 2020

in the latest spec for this proposal (now called "import assertions") the cache key computed for a module takes assertions as components, this needs to be integrated somehow with our loader designs.

bmeck on 4 Aug 2020

the cache key computed for a module takes assertions as components, this needs to be integrated somehow with our loader designs.

Are we still trying to prevent multiple entries with the same URL but different assertions? Or are we accepting potential duplication based on assertions? If we want to ensure a single entry per URL, I assume that for our implementation using just the URL as the key would be valuable?

jkrems on 4 Aug 2020

@jkrems examples in TC39 plenary in July showed web is looking towards duplicated URL but differing backing resource varied based upon assertions. I think matching that would be prudent personally even if it is... unpleasant. In particular being able to replicate web workflow for content-negotiation is what they have some desire for.

bmeck on 4 Aug 2020

So import assertions are still called import assertions but they affect how URLs are interpreted (content negotiation, not assertion)? 🤷‍♂️ Definitely agreed that we need to be compatible with the web here. But makes it effectively less convincing that we should care about dual module hazards if they are first-class citizens in the platform?

jkrems on 4 Aug 2020

@jkrems the resources are still distinct but the URLs are the same so it isn't quite the same as dual module hazard

bmeck on 4 Aug 2020

This feels roughly equivalent to "content negotiation based on assertion":

import 'foo' if { type: 'commonjs' };

If the URL can serve different content based on headers etc. that could be triggered by the assertion, the same URL may return different contents or at least contents that could be interpreted as either but do different things?

jkrems on 4 Aug 2020

@jkrems correct, these are similar to how we are using "conditions". However, the backing entry is still distinct. I think we might need to aid rephrasing the proposal here around cache key rather than try to justify how our system works against the current phrasing. URL must be the same according to the current proposal text, but not the same HTTP cache entry/body.

bmeck on 4 Aug 2020

To be clear I do not think we should allow the same file: URL to point to different file system entries.

bmeck on 4 Aug 2020

To be clear I do not think we should allow the same file: URL to point to different file system entries.

But I assume https://registry.github.com/react/v73.2.4 could point to different contents, depending on conditions? E.g. a URL may choose to send either web assembly or JS based on attributes? So at least for loaders, we would want to expose that info to allow web-compatible implementations.

jkrems on 4 Aug 2020

@jkrems correct, and in theory this also applies to bare specifiers affected by "exports" or "imports"

bmeck on 4 Aug 2020

Bare specifiers were always contextual (e.g. even in the olden days the exact file: location they resolved to depended on where the code was running). Afaict the assertion applies after resolution. So in the example above (import 'foo' if { type: 'commonjs' };) it would resolve using the import condition in exports but then load and verify the URL in the commonjs type assertion context. The latter is new: Each import site can influence how the resource is loaded and interpreted. We can try to prevent it for file: URLs for the built-in behavior (since we can argue that there is no "content negotiation" on disk) and I think that's a good idea. :)

jkrems on 4 Aug 2020

Each import site can influence how the resource is loaded and interpreted

The wording of the current proposal is problematic regarding this as it does not allow altering how the resource is interpreted. I think this is due to the browser deferring to a 3rd party (server) to determine the interpretation. I think we have 2 issues with the proposed new wording:

if we use exports or imports and intercept a specifier, varying the final URL is banned which doesn't integrate with those designs.
it cannot alter how the resource is interpreted (since the resource is provided by an external entity), but for allowing conversion of http to file via exports/imports we do need to vary on them (but not the interpretation mechanism of the resource, just what it will eventually be interpreted as).

bmeck on 4 Aug 2020

@bmeck the restriction in the spec is very explicitly intended to only allow leeway to browsers; node is expected to implement the stronger restriction. If node isn’t planning to do that, then import assertions won’t reach consensus to move to stage 3.

ljharb on 4 Aug 2020

@ljharb I would not want node to follow the restriction as it is currently worded, correct. I don't see how that prevents consensus though given my understanding of the parties involved.

bmeck on 4 Aug 2020

The wording is certainly negotiable; my intention in asking for it was to prevent non-browsers from ever returning a different Module record for the same specifier, regardless of which import assertions are present, or their absence. If the current wording isn’t sufficient for node, what wording would you suggest to ensure that guarantee for non-browsers?

ljharb on 4 Aug 2020

@ljharb that was not my understanding and I would not want to prevent node from returning a different module record but do want to prevent node from being forced to create a new module record. CC: @dandclark @littledan @MylesBorins @xtuc

bmeck on 4 Aug 2020

I’m pretty sure the current proposed spec merely allows creating a new module record for a new specifier/assertions combination, and never requires it.

ljharb on 4 Aug 2020

Isn't the only security-important attribute something like {declarative: true} (to indicate that the import should not trigger execution)? Can node just... not support type? Since node doesn't load non-code imports (except json), doing so would allow package authors to still have transparent (ie, non breaking) migrations of implementations?

weswigham on 4 Aug 2020

The intention, i believe, is that everyone is required to support type json, at least (but not required to restrict json imports to those with the type)

ljharb on 4 Aug 2020

Added to agenda since it could block upstream

bmeck on 4 Aug 2020

👍1

Errr, but, eg non-script html modules, or xml or css documents may also be declarative. I feel like hitching the security aspect to specific content encodings is maybe not sufficient. Minimally, if the runtime supports, say, yaml modules in addition to json (which is somewhat reasonable in many places), one could reasonably expect a package author to safely migrate from json to yaml or vice-versa, and it'd be _really odd_, if under the hood import x from "mod/conf" if {type: "json"} failed because it was a loadable yaml document instead, when all you really wanted to imply was import x from "mod/conf" if {declarative: true}. Asserting the invariants you actually want upheld just seems way more straightforward than content type malarkey. Who's to say json8 in some hypothetical future doesn't implement some kind of execution-like behavior (looking at you xml) that should be similarly limited? Minimally, asserting things like the executability of a thing are much easier to derive author intent from.

weswigham on 4 Aug 2020

Hopefully the proposal champions can weigh in on why a “no execute” bit was insufficient.

ljharb on 5 Aug 2020

👍1

@ljharb no execute is a vastly more complex and ambiguous task than asserting a content type (consider a module that only has function declarations but no executed code for example). You have to take into account nesting sub resources locked into being limited, content type is a single layer deep. I feel strongly that "no execute" is not viable in any short term and requires a massive undertaking and is more than what is needed to unblock the main driving feature of import assertions.

bmeck on 5 Aug 2020

👍1

I agree with @bmeck's comment regarding noexecute. @rniwa also pointed out here that:

Execution vs. no execution is an important distinction but changing the parser mode based on MIME type isn't great either. We've definitely had security attacks which used an existing content by reinterpreting it in a different text encoding & MIME type.

dandclark on 5 Aug 2020

Regarding the other discussion in this thread:

I’m pretty sure the current proposed spec merely _allows_ creating a new module record for a new specifier/assertions combination, and never requires it.

That is correct. The proposal leaves it to hosts to decide whether assertions are part of the module cache key (see https://tc39.es/proposal-import-assertions/#sec-hostresolveimportedmodule), although it recommends that assertions are not part of the key. An earlier version required that they were not part of the cache key, but this restriction was loosened based on feedback from proposal's integration with HTML. HTML is going to consider assertions part of the module cache key.

However, it is still a requirement that assertions do not affect the interpretation of the module (other than potentially causing it to fail to load). So, things like changing the HTTP headers depending on the assertions would not be permitted.

Thus, given these two statements it is up to the host whether or not these result in two module instances (assuming neither set of assertions results in failure to load):

import thing from "./url" assert {foo: "bar"};
import thing2 from "./url" assert {foo2: "bar2"};

However, if the host chooses to treat these as two module instances, then the instances must be interpreted exactly the same. They must have the same HTTP request headers, they must be parsed and executed the same, etc.

dandclark on 5 Aug 2020

In other words, they must be the same conceptual module, although nobody has figured out how to better word that normative requirement in the spec yet.

The easiest way to fulfill both the letter and the intent of the spec is to make import * as one from './url' assert {foo: "bar"}; import * as two from './url' assert {foo2: "bar2"}; one === two result in true. The only environments where this isn't possible are web browsers (and deno) because they fetch content from an untrusted location (the internet), and the web server can't be properly constrained in this way. node does not have this security/usability/certainty hole, and as such, it shouldn't need to consider the assertions as part of the cache key.

ljharb on 5 Aug 2020

node does not have this security/usability/certainty hole, and as such, it shouldn't need to consider the assertions as part of the cache key.

But that means that code using assertions wouldn't be portable between browsers and node - assuming unknown assertions are ignored[1]. It would be valid to write otherwise portable code that uses assertion to "bust the cache" in the browser - but that code would fail in node (and vice versa). I would argue strongly for following the execution semantics of the browser in node. And if adding an "nonsense assertion" ({thisWillNeverBeActually: "checked"}) causes an additional evaluation of the imported module in one but not in the other, that would be a change in execution semantics to me.

[1] Are unknown assertions ignored? I assume they must be because of future compat.

jkrems on 5 Aug 2020

@jkrems import assertions are explicitly not designed, intended, or permitted to "cache bust", so any code written with them to do so is already incorrect/broken.

ljharb on 5 Aug 2020

👍1

[1] Are unknown assertions ignored? I assume they must be because of future compat.

This is also left up to hosts. The current status of the HTML integration is to fail on unrecognized assertions.

This seems important from a security perspective. If I'm asserting some security-related restriction on the thing I'm importing, if the implementation can't guarantee that the assertion is true because the assertion is not yet implemented, then I'd rather it cause a failure than just be ignored.

dandclark on 5 Aug 2020

no execute is a vastly more complex and ambiguous task than asserting a content type (consider a module that only has function declarations but no executed code for example).

That's a huge broadening of what I'd expected a "declarative" assertion to mean - it shouldn't introsoect specific code file contents; it should just be banning all potentially executable formats; which seems _much_ easier. (And also isn't "deep" in any way)

weswigham on 5 Aug 2020

@weswigham it would have to be deep, a module that lacks permissions to execute shouldn't be able to pull in malicious code that _can_ execute, directly or transitively.

ljharb on 5 Aug 2020

@ljharb You can define the requirements also as "only things that are necessarily leaf nodes" (like JSON is) are allowed. The argument here is that you can "easily" define a set of importable files that by their nature don't lead to execution, without looking at the file contents. And any file where you _would_ have to look at the contents, would always fail that assertion. That seems to be what {type: "json"} is meant to express, just in a somewhat roundabout way.

jkrems on 5 Aug 2020

👍1

it would have to be deep, a module that lacks permissions to execute shouldn't be able to pull in malicious code that can execute, directly or transitively.

What _nonexecutable_ formats can pull in _executable_ dependencies? I'd struggle to call the format "nonexecutable" if that were the case.

weswigham on 5 Aug 2020

@weswigham HTML Modules.

ljharb on 6 Aug 2020

I'd be inclined to say that if they _can_ include executable content (even via import), then they should just be considered executable themselves.

weswigham on 6 Aug 2020

About

import 'foo' if { type: 'commonjs' };

I don't think we should encourage hosts to define subsets of JavaScript, even if they are allowed to. That will make the module less portable across environments.
For this specific example, I think this could be a evaluator/transformation attribute. The host could translate the module to the user's preference.

Wasm is another instance of module that a user can mark as non-executable but that could import executable JavaScript on its own.

xtuc on 8 Aug 2020

The Import Assertions champions are seeking to ask for Stage 3 this coming TC39 meeting, we should try and iron out if there are any actionable feedback items we wish to ask for. In particular the web browser restriction on cache key seems something I would remain uncomfortable with keeping in the import assertions proposals since node may wish to make use of matching the web.

bmeck on 8 Sep 2020

👍1

closing as i believe there is nothing else needed here

bmeck on 21 Jan 2021

Modules: import assertions RFC

Most helpful comment

All 53 comments

Related issues