Chakracore: Suggestion: Internalise Module record map

Created on 24 Feb 2018  Â·  14Comments  Â·  Source: chakra-core/ChakraCore

Currently for any embedder to implement ESModules with CC they have to create a map (key = normalised specifier, value = module record) of module records which they have to hold.

Whenever a module is requested the host must always perform the following steps:

  1. Normalise the specifier
  2. Check if a module record with this normalised specifier is in the map
  3. If yes return existing module record
  4. If no:
    a) Initialise a new module record (providing the normalised specifier as one of the parameters)
    b) Schedule a call to ParseModuleSource for the new record
    c) add the module record to map
    d) return the record

Suggestion
Considering that the module record map will be functionally identical for all hosts I propose that it should be internalised into ChakraCore.

Steps this would entail:

  1. Add a module record map (key = normalised specifier, value = module record) as an internal property of a SourceContext
  2. When JsInitializeModuleRecord is called check the internal map for the record - if it exists return it instead of creating a new record
  3. When JsParseModuleSource is called do an early return of JsNoError if it's already Parsed rather than JsErrorModuleParsed (optional change as this point could just be documented instead) - point being to allow the host to always schedule a JsParseModuleSource job when asked for a module and not need to check first simply relying on CC to only do it if it's needed

Impact

  1. Face of the API would remain the same
  2. If used in the current "correct" way would still function the same way -i.e. this is not a breaking change
  3. This would remove the requirement for boilerplate code from host implementations AND make the API simpler to understand
  4. A host would now be able to perform the following simpler steps when a module is requested:

    1. Normalise the specifier

    2. Initialise a module record (no existence check required as CC would now do it)

    3. Schedule a call to JsParseModuleSource for the record (no already-parsed check required as CC can do that)

    4. return the record

Suggestion

Most helpful comment

Ah, I misunderstood what you were suggesting; that makes more sense now.

All 14 comments

I think that part of the difficulty with this is the notion of "normalizing", since I think that is a host concept? E.g. in something like node, the paths ./foo and ../src/foo might normalize to /some/path/to/src/foo, but chakracore has no way to tell. And for a more complex example, something like node needs to convert somePackage and ./node_modules/somePackage/startFile.js to the same normalized representation (noting that somePackage imported from a different script might normalize to a different file as well, if multiple versions of a package exists).

@MSLaguana Normalising does indeed need to remain a host responsibility as how to do it needs to be different for different hosts - not suggesting changing that - only the step of checking if you've already loaded a module after you normalise the specifier.

Indeed, ChakraCore already expects you to pass in a “normalizedSpecifier” when initializing the module record, so there’s no reason we can’t just do the caching based on that.

Ah, I misunderstood what you were suggesting; that makes more sense now.

hello

I think it will make the default behavior easier to implement, but at the meantime I have a few concerns,

  1. Specifier is a really host thing and not something for VMs to keep track on (at least not required in Abstract Module Records and its subclass Source Text Module Records in ES spec). VMs on the other hand track [[Status]] for each module record, which helps prevent double parsing or evaluation over the same module.
  2. Do node-chakracore need to keep a map/tree of module records for cjs/esm interop anyhow @digitalinfinity @MSLaguana ? Am not really an expert on this.
  3. I am little torn on silent no-ops on JsParseModule, this makes code shorter but less explicit.

Couple of notes on those points:

  1. This function does not take a "specifier" parameter it takes a "normalised specifier" - the host is meant to pre-normalise it before passing it in.
  1. On the node-chakracore piece; I obviously don't know what the final answer will be for how modules will be supported. Currently node keeps a map of module records itself - however it does this in javascript and it's not visible from the node-cc layer.

    (I was having a look at how to implement support in node-chakracore and found that among other things I think it would need a second map of records managed within chakrashim as node normalises specifiers later in the process than CC would want them (due to differences between the v8 and CC APIs))

  2. Basically my thought there was that it allows an implementer to just leave a lot more of the logic to CC rather than reproducing the same repeatedly - alternatively could still have the error but document that if you understand what you're doing it's safe to ignore it.

Normalized specifier is still host business though. VM wants to get a moduleRecord back from HostResolveImportedModule, but doesn't really care what the normalized specifier for that is. Host (however weird it may sound) is free to use the same specifier for two different modules or give VMs two moduleRecords of the same module.

I can be wrong on this, but the normalized specifier Chakra takes from JsInitializeModuleRecord might just be a debugging string for us?

@liminzhu The string used for debugging is the module URL, i.e. JsModuleHostInfoKind_Url. The normalizedSpecifier passed to JsInitializeModuleRecord() doesn’t seem to be used for anything at all at present; given that the documentation used to state (and maybe still does) that passing the same specifier twice returns the same module record, I tend to assume the feature request here is the exact purpose it was originally designed for.

I think we are just printing a bunch of stuff with this - see GetSpecifierSz in SourceTextModuleRecord.cpp.

I'm not the original designer of module APIs, but if I have to guess the passing the same specifier twice returns the same module record comment is to help with This operation must be idempotent if it completes normally. Each time it is called with a specific referencingModule, specifier pair as arguments it must return the same Module Record instance. from HostResolveImportedModule in the spec (normalizedSpecifier is similar to a referencingModule+specifier pair). I'm still under the impression from reading the ES spec though that it's the host's responsibility to keep track of module sources and feed the VM the right module upon requests.

Yeah, the spec requires the host to return the same module record if passed the same specifier and referencingModule (this is not negotiable, based on the spec text) - what I get out of this proposal is that since CC is already asking the host to normalize the specifier anyway, why not have the engine help enforce the spec requirement?

I mean we could but that will deviate slightly from the workflow defined in the spec.

Since JsParseModuleSource also needs source text as a in-param, the host needs a map or some data structure storing existing module records as well unless it wants to re-do I/Os every time even for no-ops.

@liminzhu That's a very good point on the host needing to track whether it's already loaded the source - I hadn't thought of that - IMO my suggestion as stands above is invalidated by that point - the suggestion will need to be revised for that or scrapped.

I'll give it more thought and may raise a similar but different suggestion at a later stage but I'm going to close this for now.

Thanks for the feedback.

Thanks for the original suggestion, this is a good discussion to have. We're always open to new suggestions as well :).

Was this page helpful?
0 / 5 - 0 ratings