Hi,
Just closed my PR to Node.js core about this feature. I'd like to propose a loader that could handle it.
Example
import sum from 'https://gitsomething.com/origin/v1.1.0/main.js'
console.log(sum(1, 1)) // 2
When an https URL is reached, the module is downloaded and stored under its content hash (e.g. 70b56bcaff85f7836cc4688021cfbfe385550fbbeb6e5e801eb16bd9e66d3134) in /tmp/, or in the path specified by the NODE_HTTP_MODULES env variable. Essential is that it is some base dir, like node_modules (main.js in this case).
How to resolve package dependencies?
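A minimal sketch of the cache-path scheme described above. The NODE_HTTP_MODULES env variable comes from the draft PR; the choice of sha256 content-addressing and the fallback directory name are assumptions for illustration:

```javascript
import { createHash } from 'crypto';
import { tmpdir } from 'os';
import { join } from 'path';

// Map an https module URL to a deterministic on-disk cache path.
// Repeated imports of the same URL always map to the same file.
function cachePathFor(moduleUrl) {
  // Base dir: NODE_HTTP_MODULES if set, else a dir under the OS temp dir
  // (the directory name here is a placeholder, not from the PR).
  const baseDir =
    process.env.NODE_HTTP_MODULES || join(tmpdir(), 'node-http-modules');
  const hash = createHash('sha256').update(moduleUrl).digest('hex');
  return join(baseDir, hash + '.mjs');
}
```

The point of hashing the URL rather than using the URL's path directly is that the result is a flat, filesystem-safe name regardless of query strings or nested paths.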
I'm curious to know what you think about allowing URLs in module specifiers in WASM: https://github.com/WebAssembly/esm-integration/issues/11
once again I think we need to do this the standardised way: https://html.spec.whatwg.org/multipage/webappapis.html#fetching-scripts
there are 1000000000000 different things that servers do with compression, transport security, content types, etc and node trying to figure all that out itself won't help anyone.
also why do you want to cache files?
@devsnek Yes, that's right; I just wanted to bring that up because I think it's important that we all agree on it.
also why do you want to cache files?
Sorry where's mentioned caching?
@xtuc steps 3-5 are working on files cached in /tmp/ (which won't work on plenty of OSes)
i'm unclear as to why source text is leaving node's memory
@devsnek presumably for the same reason ry caches files in deno - so it’s fast and consistent between loads and doesn’t delay startup time.
@benjamingr wouldn't you just download the files in that case? i would expect an https import to hit the server every time.
It must, in case there are redirects or updated content. It might get a 304, of course.
The reason behind caching is to avoid going to the internet if you already have the module downloaded. Why would you expect node to re-download the module every time?
If you point at master on an un-versioned url then you will have this kind of issue, but then we can do something similar to ry's idea with a --reload command.
About tmp:
Guys, it is a draft. I am not gonna leave it as is. We simply need some convenient place to store these modules. Perhaps we can specify this place with a flag or env (like in the PR).
So you specify env variable and everything goes there.
There’s nothing to stop this from being implemented in a loader already today.
Basically a resolve hook can detect http requests, check if it is in the cache, downloading if necessary, then resolve to the file url of the stored cache entry through the resolve promise.
Please do give it a go and share your findings, and just ask if you have any questions on the implementation or feedback on the loader hooks in the process too.
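The hook flow described above could be sketched roughly as follows. The hook signature mirrors Node's experimental --loader API of the time; the downloader is stubbed out (a real loader would stream the HTTP response body to disk), and the cache location is an assumption:

```javascript
import { createHash } from 'crypto';
import { existsSync, mkdirSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { dirname, join } from 'path';
import { pathToFileURL } from 'url';

// Stub downloader: a real implementation would fetch the URL over
// HTTPS and write the response body; stubbed so the sketch runs offline.
async function downloadTo(url, dest) {
  mkdirSync(dirname(dest), { recursive: true });
  writeFileSync(dest, `// downloaded from ${url}\n`);
}

// Resolve hook: pass non-http specifiers through to the default
// resolver, and map http(s) specifiers to the file: URL of a
// content-addressed cache entry, downloading on a cache miss.
export async function resolve(specifier, parentModuleURL, defaultResolve) {
  if (!/^https?:\/\//.test(specifier)) {
    return defaultResolve(specifier, parentModuleURL);
  }
  const hash = createHash('sha256').update(specifier).digest('hex');
  const cached = join(tmpdir(), 'http-modules', hash + '.mjs');
  if (!existsSync(cached)) {
    await downloadTo(specifier, cached);
  }
  return { url: pathToFileURL(cached).href, format: 'esm' };
}
```

Run with something like `node --experimental-modules --loader ./http-loader.mjs main.mjs`; the exact flag set depends on the Node version.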
@guybedford I don't know how to create a loader at the moment, but I will start researching that.
A url is designed to have its contents cached, but urls always still hit the network (service workers aside). I would expect the same in node.
@ljharb so you will use --reload a lot I guess :keyboard:
No, I’d expect that to be the default.
Guys, it is a draft.
Prefer folks, people etc.
@kuroljov see https://nodejs.org/api/esm.html#esm_loader_hooks
Prefer folks, people etc.
@benjamingr Okay. I will try to remember that. Bad habits I guess
@ljharb on the topic of service workers, could abstracting the https implementation away into a similar mechanism make sense?
Say I have xyz origins; I create and register a worklet for those (I can use my imaginary favourite npm package to do so), and when a URL is requested, all node has to do is delegate to the worker and expect a Response-like object with a content-type and body. It will involve a bit of origin trials, etc., so that node delegates to the right worklet. It could also be statically defined for packages to use specific worklet entry-point specifiers.
I think this feature would be dangerous to implement.
If people would like to hot-load code from a network we should not make it easy for them. Node.js is too un-sandboxed to make this a viable default or built-in option.
@bmeck One thing I keep hitting with the 20+ isolated experiments I did with loaders is that sometimes the resolved URL to a module's actual file on disk is not the same as the URL of the module it intends to resolve as. For instance, if a loader transpiles and caches a module in a temporary file, that URL is the resolved URL; however, it should not be the import.meta.url, nor should it be the parentURL for its static linking calls to Loader#<resolve, import, …>(specifier, parentURL).
I believe this will be common beyond experimental, which imho applies to this kind of loader.
@viktor-ku did you explore this further with --loader, and if so, were there any other pain-points you came across?
@SMotaal for now you can insert import.meta.url = ${JSON.stringify(desiredURL)} at the top of your transpiled script.
However this is not enough; I will explain why in https://github.com/nodejs/modules/issues/140
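The workaround above could be sketched as a small helper that prepends the override to the transpiled source before the module system sees it. The helper name is hypothetical, and whether the host actually allows assigning to import.meta.url depends on the runtime:

```javascript
// Prepend an import.meta.url override so the module reports the URL it
// logically represents (desiredURL) rather than the temp-file URL it
// actually loads from. JSON.stringify safely quotes/escapes the URL.
function withPatchedMetaUrl(transpiledSource, desiredURL) {
  return `import.meta.url = ${JSON.stringify(desiredURL)};\n${transpiledSource}`;
}
```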
Note this contextual handling is exactly why a translate hook is useful.
@guybedford I don't think this implies we need a separate hook / translate, if you could explain that a bit it would help here.
Translation allows setting the JS source of a resource at a given path, thereby maintaining the correct referrer context for that given path.
The alternatives involve large-scale re-resolving of all modules that depend on that module into a new ID space, which will then not match the file system.
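The translate idea above could be sketched as a hook that swaps in replacement source for a given URL while leaving the URL itself, and so the referrer context of every importer, untouched. The hook shape here is illustrative, not a real Node API:

```javascript
// Registry of URL -> replacement source. The module keeps its original
// URL as its identity; only its body changes, so nothing needs to be
// re-resolved into a new ID space.
const translations = new Map();

function registerTranslation(url, source) {
  translations.set(url, source);
}

// Translate hook: same key in, possibly different body out.
function translate(url, originalSource) {
  return translations.has(url) ? translations.get(url) : originalSource;
}
```

Contrast this with the Blob URL approach, where the replacement source gets a brand-new URL and every importer must be taught the mapping from the original file URL to the blob.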
@guybedford couldn't that be rolled into the same hook? I'm asking because that matches the JS spec closer and seems to alleviate a series of issues.
@bmeck do you mean combining with resolve or combining with load? A load hook can certainly work as an alternative to any translate hook.
there is too much tribal knowledge in those terms of "load" vs "translate" for me to understand; I've been looking at having:
resolve(referrer : string, specifier : string, hostDefined: *) => Promise<{
key? : string, // insertion point into cache
body? : B, // the body, unclear on if this actually needs to be loaded ahead of return, can be on demand?
data? : * // any message passing between loaders
}>
Could you not just return a Blob URL directly in such a case?
Blobs can work certainly, but are a bit more work as any importers need to be able to look up the mapping from the file URL to the blob for that resource, which is avoided with a translate.
What I mean by load is basically a fetch hook that can choose to do translation (load effectively covering both concepts).
Also the issue with resolvers getting data as you've written is that it's a many to one function, so precedence on body / data would apply.
@guybedford blob: URLs are insufficient for cycles: https://github.com/w3c/FileAPI/issues/97 . We want full Blobs with the ability to separate the reservation of the key so that it can be done ahead of content creation.
Also the issue with resolvers getting data as you've written is that it's a many to one function, so precedence on body / data would apply.
I'm unclear on this; data is purely used as a message-passing system for loaders, so it wouldn't affect any default loader, I would think.
@guybedford I am +1 for a translation hook as long as chaining it is handled appropriately (which in my opinion is different from resolve's chaining).
Could you not just return a Blob URL directly in such a case?
My biggest concern in all my testing so far has been the unchecked potential for redundant "stringing" garbage, and mishandling outdated references to objects, because we all do that sometimes. My second is determinism of the resulting URL, i.e. URL.createObjectURL does not allow for deterministic mapping for deadlock dependencies. So maybe the open web requires this kind of after-the-fact security; however, Node's Loader specifically should meter both security and performance overhead.