Node: Feature request: node --install

Created on 13 Mar 2017  ·  90 Comments  ·  Source: nodejs/node

I'm going to try, as best I can, to distill more than a year of conversations about this topic.

This is a touchy subject. There have been numerous threads with various people advocating significant changes to npm or to replace it entirely. This is neither of those things.

  • I am not suggesting that Node.js stop shipping with npm by default.
  • I am not suggesting that the Node.js project write a full replacement of npm.

With that out of the way, and the frame of debate set within those boundaries, I think we can have a productive conversation.

As it has matured, npm has become a large and significant tool for software development. It includes features for multiple development workflows and optimizes itself for developer ergonomics. I don't think we could ask for a better tool for developers.

The problem is, not every Node.js install is used by a developer. Many installs happen in infrastructure. These installs run an application and are never touched by anything but infrastructure automation. Yet, these installs still include npm and, in fact, require npm in many cases because it is the best mechanism we have for installing the dependencies the application needs.

node --install

  • Installs the dependencies defined in a local package.json.
  • Follows all the standard logic found today in npm install.
  • Defaults to a "production" mode of package installation (similar to npm install --production).

Because the use cases for this are much narrower than npm's, you can see a future in which additional features are added that are in high demand by production users but would make developer ergonomics more difficult (multiple registry endpoints, for instance).

I'd like to use this thread to reach a consensus about the scope of this feature, potential pitfalls, and whether or not this is something we agree should be added. From there I can work on a proper Enhancement Proposal.

feature request


All 90 comments

I'm definitely +1 on having this conversation but if we decide to move forward we need to scope this very carefully and specifically.

Some questions:

  1. What sources would this install from? Everything currently supported by the npm client (registry, github repo, local, etc.)?

  2. When a module is installed using this mechanism, would life cycle scripts be included? run automatically? all of them or just a subset?

  3. Lock file or shrinkwrap? Would version selection of dependencies match current npm? Would it match yarn?

  4. Would this dedupe?

  5. Would this do things like outdated module checking?

It's likely easier to ask it this way: what aspects of npm's scope (and possibly yarn's) would this not include?

The pitfalls are much more straightforward: registry clients are complex, and matching user expectations even more so. What additional amount of work would be required to develop and maintain a reasonably feature-complete install client that tracks well with what npm and yarn currently do? If either of those clients moves in a different direction on things like dedupe or version selection, which is the "source of truth" for whether the one included in core should change or not? We need to be certain about what amount of work would be expected here.

Would this dedupe?

Thinking about this from an "optimized for production" use case you might be able to cut a lot of corners here. For instance, you can expect that install is only ever run once, and you can blow away the prior install's deps as a result. Does dedupe still make sense in that context?

What sources would this install from? Everything currently supported by the npm client (registry, github repo, local, etc)

If we are optimizing for production, we may not need local. How do people feel about git based installs in production?

@iarna should probably comment on this.

This is a bad idea. Not an _obviously_ bad idea, but one that very quickly spirals out into a rather large pile of complexity once you start pulling on the thread.

Here's a non-exhaustive shortlist of things that you'll likely need to support in order to handle the enterprise use cases y'all probably care about:

  1. Shrinkwrap files
  2. Git deps
  3. Dependencies spread across multiple registries
  4. Registries mapped to namespaces
  5. Scoped package names that require logging in to a registry
  6. Logging into a registry
  7. Per-project configuration files (often included in a project repo)
  8. Installing binary dependencies, and other modules that require some install-time setup (which may or may not use node-gyp as the build tool)

Additionally, you need to be aware of security implications of pulling code down from the internet (checking the checksums, etc.) and any cache has to be resilient against being run in multiple parallel processes (which is a real thing that real people do a surprising amount of the time).

If installs are going to be kept in a reasonable time frame, you probably also want to build up the dep tree, then dedupe it (without causing any package to get the wrong version of any of its dependencies), and then lay it out on disk.

Is this really a thing that y'all wanna take on? You'll end up re-implementing the majority of the hard parts of npm. (Publishing is, by comparison, extremely simple.) The worst case scenario will be developing a package installer that is not compatible with npm, effectively splitting the community with rival authoritative package installers.

If you want something different than npm, and can articulate what differences you'd like to see, we're all ears. Maybe it makes sense to have yet another package installer tool, but I doubt it. If you _can't_ articulate the differences that you'd like to see, then this is an even worse idea.

Thinking about this from an "optimized for production" use case you might be able to cut a lot of corners here. For instance, you can expect that install is only ever run once, and you can blow away the prior install's deps as a result. Does dedupe still make sense in that context?

Dedupe _primarily_ makes sense in a production environment, because most people who use node and npm use it to build assets for the front-end, and shipping more than one copy of something in your webpack bundle is actually costly.

Per-project configuration files (often included in a project repo)

These are .npmrc files in the project root, which brings up another question: will this feature support NPM configuration paths and environment variables or will it respond to its own?

Just everything that @isaacs said, plus all of these in combination.

Bundled dependencies and shrinkwraps in the same module.

Bundled dependencies with versions that don't match those in a shrinkwrap.

Deep-in-the-tree shrinkwraps.

Shrinkwraps are not advisory. They're a production artifact, so you can't skip them. (Well, you can, but will what you produce work? ¯\_(ツ)_/¯ Skipping them kind of defies the point of having them in the first place.)

Would this dedupe?

Thinking about this from an "optimized for production" use case you might be able to cut a lot of corners here. For instance, you can expect that install is only ever run once, and you can blow away the prior installs deps as a result. Does dedupe still make sense in that context?

You get 95% of deduping for free just by producing a flat tree, and flat trees are necessary for Windows installs, so I'd say yes.
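The flat-tree idea can be illustrated with a deliberately naive hoisting pass (the real npm algorithm handles many more constraints, such as peer deps, shrinkwraps, and install ordering; this toy version is only an illustration):

```javascript
// Deliberately naive sketch of flat-tree hoisting: each package goes to the
// top-level node_modules unless a conflicting version of the same name is
// already there, in which case it stays nested under its dependent.
function flatten(tree) {
  const hoisted = new Map(); // name -> version at top-level node_modules
  const nested = [];         // [dependent, name, version] kept nested
  for (const [dependent, deps] of Object.entries(tree)) {
    for (const [name, version] of Object.entries(deps)) {
      if (!hoisted.has(name)) {
        hoisted.set(name, version);
      } else if (hoisted.get(name) !== version) {
        nested.push([dependent, name, version]);
      }
      // same name and version already hoisted: deduped for free
    }
  }
  return { hoisted, nested };
}
```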

What sources would this install from? Everything currently supported by the npm client (registry, github repo, local, etc)

If we are optimizing for production, we may not need local. How do people feel about git based installs in production?

local deps are regularly used in real-life production, as are git repos. A LOT of users use git repos ONLY as their private module store.

👋
So, this seems pretty neutral to me, personally, as-is. If I understand correctly, you'd just be shelling out to npm, or using some subset of it. The main comment I have is that it seems like a very small allowance for what is actually a larger difficulty surrounding the app deployment story for npm-based apps. I think this may in fact cover only a very small corner of the use cases where users want to take their (non-registry!) npm apps, put them in production, and then get all their (registry and other) production deps installed in there.

It's been an ongoing discussion at least on my team about how to best support stuff like this so it's definitely an interesting suggestion.

As far as how node could implement this, my main inclination right now is to mention that it's probably not-a-good-idea to write your own installer, because of how fragile compatibility across the registry can be for stuff like this. It sounds like you're still intending to just call out to npm proper for this, though.

For the sake of mentioning near-future things that have the potential to change this discussion, though, here's some stuff about the current projects at npm:

  • pacote is going to be replacing the entire package manifest and download substrate of the npm client when npm@5 comes out "soon". This is a first step towards extricating major components out of the monolithic CLI. That means if you use this package, you will get package manifests and tarballs the exact same way the npm CLI does. It's also got things like local mirrors in the pipeline.
  • We're planning on yanking out the installer itself, along with the authentication subsystem, into a separate project, as well. That's probably the point at which you'd be able to just use our stuff to build a "minimal" npm of your own.
  • It's not exactly farfetched for npm to support yarn.lock in some way. We're going to have the capabilities to do the exact same things yarn does with it pretty soon (such as fetching from the cache based on sha, verifying that, etc).

The biggest thing I'd call out is that compatibility is hard, and like @jasnell says, writing an npm-compatible client is a lot of work to get right. In the end, npm-proper (or tools that npm-proper itself uses) is the only way to ensure maximum ecosystem compatibility. Other tools often do a great job, but they necessarily require users to deal with compatibility issues they simply didn't have to before.

You can't have a "minimal npm install" without all the things rebecca and isaac mentioned. But you can have an npm that doesn't include the capability to publish, login, manage permissions, manually manage the cache, check outdated deps, or update dependencies. I assume these are the "extra bits" you're talking about. Ignoring shrinkwrap, bundleDeps, etc. is a good way of breaking random packages, and both are heavily used in non-registry "production" apps.

Looking at pacote, we'd have to be crazy not to build on top of this :)

p.s. if this means that I can get the node project to help us out with building these things I would be super thrilled about that. These projects are meant to be easier for the community to participate in, because the npm CLI is so bloody huge it takes a year for anyone to onboard with it. I want to fix that too :<

@mikeal

  • I am not suggesting that Node.js stop shipping with npm by default.

Yet, these installs still include npm and, in fact, require npm in many cases because it is the best mechanism we have for installing the dependencies the application needs.

To clarify: Is the suggestion, then, that this would be in an alternative build (maybe called something like "minimal"/"production"/"infrastructure"), with the standard build, including npm, being the default? Or would both installers be shipped in the standard build?

@bengl correct, this would open up the option of producing a build that didn't include all of npm. However, all builds would include node --install.

Also, we should keep in mind that in the near future we'll probably have a lot more build types than we have now as we start to support additional VMs.

@mikeal would this build extricate parts of node core itself, too? It's hard to think of what you'd gain from that (or, frankly, what you'd gain from not including the full CLI sources as-is)

@zkat it opens it up as an option, but it may not end up being something we want to support. In the wild I've heard of people removing the npm binary from their docker images after preparing them, so something like this is already happening whether we produce a build like this or not.

A minimal production install of Node.js could likely benefit from not having any publish capabilities (because it's unnecessary code in production), and as @mikeal indicates, there are scenarios where the npm client is removed. I wouldn't say that's a majority of cases by any stretch of the imagination.

removing the npm binary from their docker images after preparing them

WOW RUDE

Though that makes sense. One thing about this sort of thing is that it's often better to understand the usecase-in-general before tackling a single solution like this. It sounds like "better embedded support" is the one in this case?

@jasnell the thing about that is that literally everything other than the installer is a fairly small chunk of npm. I don't know how much smaller right now, but most dependencies and code in npm itself are tied up with installation itself. Yanking those secondary tools out isn't going to help. Publish is a good example: probably the biggest part of the publish codebase involves npm pack itself, which is what generates tarballs for publication. But this code is necessary for installers themselves because we need to run what amounts to an npm pack inside the source code in git dependencies. So, you can yank out publish.js, but literally all the rest of the code will stay. Because dependencies actually need it.

But wait there's more! We've been talking about running lifecycle scripts inside git dependencies, so people can rely on artifact builds from git dependencies (this means that if someone is using a registry dep, then forks the dep's repo with their own mods, and points their local dep at their own fork, they'll be able to rely on the regular build that would be done, rather than having to publish under a different name). If we do that, that means we would be installing a dependency's devDeps, which this proposal, as-is, does not at all take into account.

What I'm getting at is you basically have a choice between keeping the bulk of npm (with all the necessary bells and whistles the installer itself needs), or having to put a ton of your own effort into building Yet Another Installer™ that will only be a greenspunned version of the existing installer, and potentially make life harder for users that expect the command to actually work.

If deployment is the concern, perhaps it's a better approach to provide users (either through npm or through node) with a way to bundle/treeshake a single node bundle that includes only the things they need for deploying that application. That would solve the embedded system issue (assuming we also support cross-compilation of gyp deps and/or universal binaries of sorts), the minimal deployment issue, and you'd get rid of "those parts we don't need". The same goes for any of the javascript source code in Node Core that users won't need for their production apps.

Node.js continues to be used in ways we never anticipated. I don't know how long we can continue to produce a single artifact that works for all of these use cases given the expansion we've seen in the last few years.

What I would hate to see happen is for these ever widening use cases to start to impact the default experience, which I believe is already happening indirectly. I strongly believe that the most important constituency for Node.js installs is the developer community that builds applications and publishes modules. Catering to this constituency and continuing to grow it is the most important thing we do because these are the people that continue to build and mature the Node.js ecosystem.

Thinking about how we might be able to produce builds specific to other use cases is a good way to reduce the pressure we put on the default build and give us more of an opportunity to grow the list of developer niceties that Node.js ships with by default. As we've grown I've been in more and more conversations with people who don't see why we can't just ship, by default, without a publish command, or without a debugger. There are a lot of people out there that think developers should just jump through some extra hoops in order to participate at that level. As we grow this sentiment will also grow if we don't produce something that addresses the concerns of these other use cases.

It's encouraging to see that npm is already breaking off the components that would allow us to cater to this "install only" production use case without actually writing the logic all over again. I don't think it's very valuable for us to re-produce that logic and there's a big advantage to standardizing it. But make no mistake, developers need npm. If we produce a build without it, that's not a build for developers to use directly.

Follows all the standard logic found today in npm install.

Isn't this an awful large amount of logic?

Also, regardless of whether we say we don't want to write a replacement for npm there will undoubtedly be plenty of people who would want it to become that and so the scope creep here is immense.

_Edit: Mikeal does make a very good point above, though._

Perhaps we can start with a smaller, more general problem set. _(Smaller, not small.)_ To draft node --install we would need a clearly defined understanding of what Node.js expects its final dependency tree to look like. Today we have "like npm" or "like yarn", both of which are moving targets. With a clearly defined spec, node can assert which of the registry installers is spec compliant and more effectively manage user expectations.

With that baseline established, we can much better assess Node.js' options and the registry clients also can concisely indicate which versions of Node.js (and perhaps even JavaScript) they are compatible with.

What would be the value of having the installer in the binary (i.e. node --install)? Considering we'd probably be talking about including libs like pacote and other modularized parts of npm, it would probably not be ideal if these were included in the binary (or if not in the binary, like the lib folder is, then where?). Could it be packaged alongside, the way npm is today? Something like node-install.

@dshaw The _absolute minimal and smallest_ that you can make this general problem, to satisfy the installer contract, is this:

Given a folder with a package.json file containing dependencies and devDependencies objects that map moduleName to specifier, guarantee that when that program calls require(moduleName) it gets a version of the module that is compatible with specifier, and that this same constraint is upheld for all dependencies thus loaded. If at any point an npm-shrinkwrap.json file is encountered, then its stricter constraints on the packages returned by require(moduleName) must instead be satisfied, for that entire branch of the tree.

Since shrinkwraps can be encountered at any point along the package traversal, and dependencies can be bundled in packages, satisfying the user expectation is extremely non-trivial. Also, specifier is not just a semver range, and package namespaces can be mapped to different registries by a config file. And without building upon the work already done by npm et al, it's going to be a pretty bad user experience, even if it does produce a technically correct logical tree.
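To make the contract concrete, here is a toy checker, assuming a drastically simplified specifier grammar (exact versions and caret ranges only; real specifiers also include version ranges, dist-tags, git refs, URLs, and file paths, which is much of why this is non-trivial):

```javascript
// Toy version of the installer contract check: does each installed version
// satisfy its specifier? Handles only exact versions and caret ranges;
// real npm specifiers are far richer (ranges, tags, git refs, file paths).
function satisfies(version, specifier) {
  const v = version.split('.').map(Number);
  if (specifier.startsWith('^')) {
    const s = specifier.slice(1).split('.').map(Number);
    // ^x.y.z: same major version, and at least y.z (ignoring 0.x caveats)
    return v[0] === s[0] && (v[1] > s[1] || (v[1] === s[1] && v[2] >= s[2]));
  }
  return version === specifier;
}

function contractHolds(dependencies, installed) {
  // dependencies: { name: specifier }, installed: { name: resolved version }
  return Object.entries(dependencies).every(
    ([name, spec]) => name in installed && satisfies(installed[name], spec)
  );
}
```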

All,

Can we drive towards what y'all would actually want to see from this? Because I just have so many questions.

  1. How does the imagined node --install differ from npm install?
  2. Is that imagined node --install better than npm? How so?
  3. Can we modify npm to satisfy those requirements?
  4. _Should_ we modify npm to satisfy those requirements?
  5. If !(3) or !(4), then are those requirements even possible?
  6. If (5), then is there anyone willing and able to do this work?
  7. If (6), then what's the best way to go about doing this work?

I've already said I think it's a bad idea (or at least, extremely wasteful and expensive), but that's largely because I know very intimately the significant cost involved in doing this, and I anticipate that it's a bad use of Node.js project resources.

Maybe I'm wrong! But until someone can clearly answer at least the first two questions in that list, there's really nothing to discuss here. Most of the conversation in this thread so far is jumping to try to answer (7). That is very premature.

@isaacs don't forget both ends of the "files"/ignores logic which is full of stuff that can randomly break people that I wish so hard I could change.

It also leaves the question open of what sorts of errors this would provide? Does it check for invalid deps? missing peerDeps? Private packages? Private git repos?

Like, do y'all actually get what you're asking folks implementing this to get into? This is literally my full time job and I barely have time for it, and I'm working primarily on stuff that would be directly used by this proposal. I am also terrified of breaking compatibility with 450k+ packages. Most people who use our stuff, we never even talk to. We'll just ruin their day. Because we wanted to save a few kb of data (yes, it's probably less than a couple hundred kb that would be saved by yanking out everything not installer-related).

You're better off making node --install literally an alias for npm install --production.

How does the imagined node --install differ from npm install?
Is that imagined node --install better than npm? How so?

For one thing, it doesn't take an argument for the package name :)

I think the primary motivator is simple: reduce the surface area to what is needed for a production use case. I think the size of all of npm is a bit of a red herring; it's mostly about reducing what can be done to what needs to be done. In terms of security this is good practice, and not one that I can easily argue against. It's a pretty natural inclination for people building large production systems to reduce what is available in those systems to the necessities.

Is it better? No, it's just less.

Like, do y'all actually get what you're asking folks implementing this to get into? This is literally my full time job and I barely have time for it, and I'm working primarily on stuff directly used by this proposal. I am also terrified of breaking compatibility with 450k+ packages. Most people who use our stuff, we never even talk to. We'll just ruin their day. Because we wanted to save a few kb of data (yes, it's probably less than a couple hundred kb that would be saved by yanking out everything not installer-related).

I think we're slightly over-estimating the impact of this feature. It's new behavior and has much less impact than existing behavior millions of people are dependent on already like npm install. People would have to migrate to using this feature and along the way would expose what is wrong with it and what its inadequacies are and at the end of the day always have a safe exit of just falling back to npm. It's actually much safer than making changes to npm install but does introduce additional maintenance burden in the future that we need to keep in mind and weigh against.

@mikeal What I'm getting at is I think you're far underestimating the "additional maintenance burden", and the ongoing cost of potentially spreading registry incompatibilities if the main clients don't sync up well enough or introduce separate bugs.

And doing all this just because of a small fraction of the download of node. There's not even any guarantee that your end product of a full re-implementation would _actually_ be smaller, code-wise. And your sugar syntax is trivially achievable _right now_. I'm trying to understand what the difference between "several kloc of installer code" and "several kloc of installer code with a couple hundred unused loc" really is, to you. And if it's the interface you care about, why would a bash script containing literally just npm i --prod be insufficient?

Like, you now have two teams of multiple people being paid full time and receiving further community support who have actual expertise in doing it, who are each maintaining two different registry clients, and there's already a bunch of excellent effort that's been put into making those two alone be compatible. Where is this third team coming from? And _what's even the point_ at that point?

And doing all this just because of a small fraction of the download of node.

By most estimates infrastructure accounts for far more of our downloads than developers do.

@mikeal I literally just deleted everything not having to do with the installer from the published version of [email protected]:

$ du npm-full
...
20392   npm-full

$ du npm-install-only
...
15548   npm-install-only

$ du yarn-dist
# just for good measure
26852 yarn-dist

The install-only version reflects removal of all docs, AUTHORS, manpages, build scripts, and all dependencies and subcommands that, off the top of my head, are not critical for the installer alone to work. Only about half of that drop came from removing code.

The installer is the vast majority of the code for the CLI as-is, and that's what you're talking about rewriting. npm isn't some fat tool that has literally everything you could ever want. It is, for the most part, just an installer with a couple of allowances. You'd get a bigger boost by taking the current codebase and minifying it tbh. Have you considered that? I mean, are we trying to conserve disk space, or achieve some standard of purity "untouched" by "unneeded" things?

By most estimates...

I meant the total distribution size. Full-fat npm currently uses 3.2M tarred. The .pkg for OS X Node 7.7.2 clocks in at ~17M.

Let's keep in mind that the original intent of this discussion is to discuss a hypothetical option we could take, what the scope of that action would need to be, and whether it would make sense to keep exploring it. So far we've had a great discussion from a small group of people whose points of view are pretty well established in this space. Let's make sure we don't rabbit-hole too much right out of the gate and end up discouraging others from weighing in also. I'd really like to get input from the larger group of @nodejs/collaborators on this.

I'm just going to point out that if I wanted a stripped-down "production only" node, I wouldn't want install stuff either because I would be supplying the tree from an outside build process. imo if you're running an install on a prod cloud box, you're doing it wrong. I would consider this cruft.

@jasnell It's worth noting that three of the people with the most experience in this particular aspect have given their informed technical assessments here. I feel "point of view" is a bit dismissive of how familiar some of us are with the scope and technical concerns of the solution that was proposed. I'm not speaking here from some political leaning about the purity and supremacy of https://github.com/npm/npm. I'm speaking as someone who has internalized just how big the scope of this really is.

I am super interested in hearing more from folks like @jfhbrook about what their deployment processes and woes are. There is, in my opinion, a really interesting problem space there that hasn't quite been tackled yet.

And as I said above, I'm all for looking for (and helping with!) solutions that solve the distribution issues users have run into. And that does include having a standalone installer. I'm just trying to make a strong point that, from my perspective, the benefits for the stated use-case are minimal, and the cost is basically astronomical, if the solution is "write our own thing again".

I am super interested in hearing more from folks like @jfhbrook about what their deployment processes and woes are.

Not trying to distract from this issue too much, but maybe this will be helpful. I don't claim to speak for everyone, and I'm sure there are a lot of ways to deploy node. My experiences are pretty specific to cloud infra.

What I like to do is make a build job that runs npm install, test, etc., on my project, then tarballs the whole shebang and shoves it onto s3. The webserver should basically never ~~install~~ run npm (edit: I don't really care about a few megs in the grand scheme so I wouldn't bother stripping npm out of my webserver). This is kinda annoying from an infrastructure standpoint (you need a build server with the same arch as the webserver, bare minimum) but it's nice because (a) the webserver can be really tiny, too tiny for npm to run (I don't know npm's memory requirements, but I know that in the 1.0 days it could easily blow over 256mb), and because (b) I know the same thing will get deployed no matter what.

"why not shrinkwrap," you might say? Well for one, it doesn't solve (a), but for another, I've found it unworkable because they're OS-specific. As soon as someone using osx generates a shrinkwrap that includes osx-only optional dependencies (possibly for non-prod dependencies, cause a developer workspace necessarily has dev tools), it's game over. Maybe I'm doing it wrong, dunno. But that's been the best way I've found so far to deploy node code.

EDIT: Oh also, my approach works if npm is down but s3 is up. <_<;

Unless the --install functionality can be uninstalled, or the addition is a negligible size (which seems unlikely), it seems the better option to optimize for size would be continuing to use npm/yarn, at least in the docker example. I can currently install npm, install project dependencies, and uninstall npm when creating my immutable artifact. I'm then bundled without npm, whereas with this feature I'm left with the code necessary to support it.

I would be interested to get an idea of how many node users fit into the following buckets (understanding that some people will fall into multiple):

  1. developers, installing node for development (no change needed)
  2. people who solely need production-only installs on specific projects, and actually do installs in production (who this issue seems aimed at addressing)
  3. people like @jfhbrook (and airbnb and twitter and i assume most enterprise users) who create deployable artifacts prior to production, and thus don't need npm in production at all (it's already possible to compile node without npm) and don't care about the extra download size in their build environment

My suspicion is that the vast majority falls into the first and third categories, and that the second is vanishingly small, but I'd love to see more data on that to confirm or deny. The 5k savings in the second scenario implied by @zkat in https://github.com/nodejs/node/issues/11835#issuecomment-286303772 also seems negligible, even though it obviously adds up over time.

Is there any way to get concrete numbers that would give credence to the need to address this problem?

I've skimmed the thread and haven't read every post in-depth yet, but this point seems to not have been mentioned yet: __making this core functionality would make it impossible to version it separately.__

It's not uncommon today to manually install a newer version of NPM that doesn't match the version shipped by the Node.js version that's installed. By moving things into core (even if it uses the same code), this would become impossible. While it's still possible by installing a newer NPM in addition, this would rapidly start to look unappealing because it requires what people perceive as "extra work", and there would be no guarantee of it integrating with node --install since there may be API incompatibilities.

Additionally, since users tend to gravitate towards built-in features (even if there is no good technical reason for it), this would likely significantly reduce ecosystem support for using "NPM proper", with all the weird contraptions that result from that - "I can't publish this library because I don't want to learn and use NPM, because Node already does installing packages for me" is just one of the scenarios I can easily see happening, based on my experiences with helping people in #Node.js.

While I have my issues with NPM, as well as with the fact that Node.js is currently shipping a package manager that's outside of the control of the Node.js Foundation and operated commercially... there is significant ecosystem value in having a single integrated package manager that handles everything from installation to publication, and that is available and recommended by default.

Therefore, I don't believe that adding a 'stripped down' version or alias of NPM to Node.js core would be a good move to make.


Finally, I think that this solution doesn't address the real problem in the first place; the real problem that I constantly see people having issues with isn't that NPM ships with Node.js by default or that it's needed separately (since it's still just a single command to install an application anyway). Rather, the following two problems are commonly mentioned:

  • NPM needs unreasonable amounts of memory when installing packages (sometimes >1GB), making it impossible to install dependencies directly on smaller VPSes.
  • NPM requires a network connection to work, and there is no obvious way to just hand it a pile of tarballs and have it install those, on a network-isolated system.

The first point can be solved simply by fixing NPM (or even replacing it), since it's purely an implementation issue. The second point is already sort of solved (you can simply tar up or rsync your entire project including node_modules and do an npm rebuild on the server), but it's not obvious to people that this works, so that's a documentation issue. Neither of these really require anything to be changed in core.

The problem is, not every Node.js install is used by a developer. Many installs happen in infrastructure. These installs run an application and are never touched by anything but infrastructure automation. Yet, these installs still include npm and, in fact, require npm in many cases because it is the best mechanism we have for installing the dependencies the application needs.

We are not in agreement on the above sentence. I believe that the industry is moving towards immutable infrastructure, where _servers do not install software_. However, I'm very open towards a bare binary/build for servers that does not ship npm.

Let me preface all of my thoughts: I don't think the goal here should be to provide a package manager.

What I do think we should ship is a bundling / installation mechanism. One of the biggest features of npm not yet talked about in this thread is that it has built-in support for extracting the .tar.gz files that are piped from the registry, even on Windows machines. This is very valuable. It standardizes a distribution format and has a builtin way to handle extracting that distribution format. Also, the ability to default to a sane installation location for global installs across environments is very valuable.

I think we should provide a way to generate such an archive, similar to simply archiving the application directory including all dependencies. I think we can discuss cache mechanisms only after we discuss signing.

I think that lifecycle scripts automatically running should probably be left out at least for the initial implementation. People can provide lifecycle scripts in any CLI or package.json scripts they ship that request privilege escalation if they need it.

I do believe there is a valuable use case in being able to run package.json "scripts" via node directly. For machines without npm you sometimes end up having to generate scripts by hand or with some tool like npm-runscript. The scope of this, such as which env vars are set, is certainly open to discussion.
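A crude stand-in for running a package.json "script" on a box without npm might look like the sketch below. Real tooling would parse the JSON properly (node itself can, via something like `node -p 'require("./package.json").scripts.greet'`); the `sed` line here is a deliberate simplification, and note that none of the `npm_package_*` env vars npm normally injects are set, which is exactly the open question above.

```shell
set -e
mkdir -p /tmp/scriptdemo && cd /tmp/scriptdemo
# Toy package.json with a single script entry
cat > package.json <<'EOF'
{
  "name": "scriptdemo",
  "scripts": {
    "greet": "echo hello-from-scripts"
  }
}
EOF
# Extract the "greet" entry and run it (no npm_* env vars are set)
CMD=$(sed -n 's/.*"greet": *"\([^"]*\)".*/\1/p' package.json)
sh -c "$CMD"
```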

With signed packages it would be easy enough to declare shared-library-like dependencies. Similarly, things such as declaring which keystores/policies to use when running or installing become valuable. However, I think these are not needed at this time since we don't have signed packages.

In summary, I think that in the case where we don't want npm on an environment we need:

  • to be able to install a standard file format across all environments
  • a way to install to a sane global location
  • a way to run package.json scripts

All of that in mind, I do not think .tar.gz is my main choice for what kind of standard file format to use if we want to be able to share things with web targets or creation of self extracting binaries. webpackage is starting to take shape as a preferred route for me. I previously endorsed .zip since it could also be used for self extracting binaries.

So, I've done a bit more thinking about my points I expressed above concerning ecosystem consequences, and I think I can generalize the argument more: __aliases or alternative ways of doing things are undesirable.__

In pretty much every case I've seen where two ways were provided to do something - usually one simplified way for the 80% of easy cases and one trickier way for the remaining 20% - it has led to a number of issues.

Essentially, these kinds of design decisions introduce a larger API surface without actually providing more functionality, leading to 1) a larger API to maintain, support and version, 2) a larger API for developers to have to learn, 3) an incentive against learning the full, original API.

Some practical examples of this occurring:

1.) In __Express.js__, one can directly call things like app.get instead of creating an express.Router first. Internally, app.get is just an alias for router.get on an internal router - it doesn't do anything special.

The result is that people learn to work with app.get first, and become very hesitant to ever use routers, even when they are obviously the right solution, and even when it's beneficial to robust error handling (example here). All the while the existence of app.get doesn't really yield any benefits.

2.) In __Knex__, there are a number of different syntaxes for doing a SELECT * query:

  • knex("table").where({foo: "bar"})
  • knex("table").select().where({foo: "bar"})
  • knex("table").select("*").where({foo: "bar"})
  • knex.from("table").select("*").where({foo: "bar"})

All of these are valid, and do effectively the same thing. A user has to understand all of these syntaxes to be able to read a piece of code that uses Knex, and not misinterpret its workings. Some people even mix two or more of these styles throughout their codebase!

It also introduces complexity for the Knex maintainers - how should the implementation deal with things like knex("table").select().from("othertable")? Should the second table specification override the first? Should it throw an error? Which of these make sense from a "preventing footguns" and "increasing usability" perspective?

3.) In __Bluebird__, there are a number of aliases:

  • .caught -> .catch
  • .nodeify -> .asCallback
  • .lastly -> .finally
  • ... possibly more...

While there are good reasons for some of these (namely, browser compatibility), and the aliases are not very commonly used, it still means that users need to learn about a bigger API surface to reliably understand Bluebird-based code - if they're reading code that uses the aliases, they still need to understand what they are aliases for. This increases the learning curve.


Finally, for all three of these example cases, it means that the aliases cannot be removed easily; they need to be supported essentially in perpetuity (even between major bumps), because otherwise there's the risk of ticking off users. The only winning move here is not to play; the aliases shouldn't have been added in the first place.

Hence, I'd argue against any kind of "aliasing" or "multiple ways to do a thing" implementation. It may seem like it makes things easier on paper, but in the end it'll just complicate things, fragment the ecosystem, and likely teach people bad habits and remove the incentive for them to learn about the 'proper' way of doing things.

@alfiepates can you clarify what's problematic about my ideas?

@bmeck I do not appreciate being put on the spot like that. I generally react to things when reading a thread, but I'll only make a comment if I have something to say that isn't already being said better by somebody else. Please don't distract from the thread topic with personal callouts.


That said, my opinions on this topic are as follows:

Optimise and improve npm, do not attempt to work around npm's shortcomings by building "npm-lite".

If we need a method to deploy node applications standalone without installing npm on a client, this should be npm functionality.

@alfiepates I saw a thumbs down w/ no clarification. I wanted to know why. Thanks for explaining :).

If we need a method to deploy node applications standalone without installing npm on a client, this should be npm functionality.

I am not sure I understand this. How would npm deploy things if it isn't on the target machine?

I saw a thumbs down w/ no clarification. I wanted to know why. Thanks for explaining :).

Pay attention to what I thumbs up'd :wink:

I am not sure I understand this. How would npm deploy things if it isn't on the target machine?

I have a wonderful (okay, it's a hacky mess) little script that runs npm install --production locally and spits out a tar'd node_modules that I can rsync over to a remote machine and untar. About 25% of the time I can even get away without running npm rebuild.

I'm sure it wouldn't be out of the question for npm to incorporate similar functionality with a slightly more polished interface and no requirement for npm rebuild on the remote?

Pay attention to what I thumbs up'd 😉

I did, but also saw only mine was thumbs down'd. Seemed like mine in particular was a problem.

What problem is this trying to solve? The binary size of npm?

By not bundling npm with node, there is at least the option of removing npm after deployment. Bundling adds to the binary size regardless of how lightweight you make it.

I think we should provide a way to generate such an archive, similar to simply archiving the application directory including all dependencies.

If this can be boiled down to making and running/extracting some sort of compressed bundled package I'd be cool with that. I'm not exactly in favor of webpackage though, as given what I know about it it addresses usecases in different environments and with different constraints than a default node install as we are talking about here. I'd prefer to go with something simpler.

I'm not exactly in favor of webpackage though, as given what I know about it it addresses usecases in different environments and with different constraints than a default node install as we are talking about here. I'd prefer to go with something simpler.

As long as we can re-use w/e format for creating single file binaries I'm fine w/ whatever. Compat with web would be nice to have, definitely not mandatory.

Great writeup @mikeal! I wanted to share a few thoughts here from my perspective.

  1. I don't have a clear understanding of the problems with npm in node.js in its current state, or the benefits we get moving it out. In other words - what problem are we solving? Short of reducing the size of the install required to effectively install dependencies, I'm not sure what we gain.

  2. I think a lot of this comes down to best practices with deployment. A bunch of platforms taught us that npm install --production during deployment time was a fine model (my PaaS included). But... is this actually a good practice? In other ecosystems, I've generally been an advocate of producing a single binary artifact as part of my build system, and then deploying that artifact (with no changes) to production. In my world, this artifact is a docker image that's generated at build time. I like this for a few reasons:

    • The npm service shouldn't gate my deployment. I need to be able to deploy, even if npm is down.
    • I want binary verification of the image/build I'm running in production.
    • I don't actually trust shrinkwrap to always download the same version of a module (paranoid).

I like the general guidance of using npm during artifact build time, but not as part of a deployment or production workflow. Given that - I don't see a lot of value (for me) in having a separate node focused command in installing dependencies.

My opinion on this has shifted a bit over the past year or so. While npm cli was the only option available to users, I favored the exploration of an alternative, lightweight client because, frankly, competition in implementation is always a very good thing to have. Since then, however, yarn has emerged and has filled that gap. Given the existence of multiple clients, I do not believe that core should be in the business of creating and maintaining its own registry client. Rather, core should be in the business of facilitating competition, interoperability, and consistency, while focusing on other equally (if not more) important aspects that get into some of the thoughts that @dshaw, @mcollina and @bmeck have been expressing.

Specifically:

  1. Core should own the specification for the absolute minimum install layout for modules. That is, what should be the standard layout of modules on disk post install, including the minimal steps necessary to produce that layout. This does not mean taking ownership of the code implementation, just the specification itself. The idea is that multiple registry client or packaging implementations could be written to such a specification to produce a reliably predictable layout of modules on disk. Essentially, the process for an implementer should not be "Copy what npm currently does right now and change as they change", it should be, "Write the code to this codified specification". With cooperation from the various current implementers it should not be too difficult to come up with such a specification.

  2. Right now, there is no reliable and independent source of truth for the identity of a module publisher. I can install the exact same code from the npm registry and github and have absolutely no reliable way of knowing if the publisher is the same for both. I can examine SHA's, yes, but there is no cryptographic assurance that the code, or the publisher, or the registry itself, is a source that I can trust. Further, ownership of modules can change in undetectable (or not immediately detectable) ways that could have extremely detrimental impacts on the security of my code. Signing of modules needs to be a real thing so that I can reliably determine whether the source of some bit of code is trusted regardless of whether it came from a registry, a GitHub repo, a webpackage, etc. Personally, I view this as being the much more critical issue that needs to be solved.

  3. Install options are important. Once options like yarn are a bit more mature, I would like to see Core move away from having the npm client vendored in within the core repo as a dependency and have an installer option that allows users to choose either (a) no registry client, (b) a choice of registry clients, and have the installer handle the details of where and how to grab that. In other words, the actual core releases should not bundle the client at all -- tho any specific installer might.
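To make the first point concrete, such a specification would pin down things like the on-disk tree any compliant installer must produce. A purely hypothetical example (packages `a` and `b` are placeholders) of the kind of layout that would need to be codified:

```
app/
├── package.json
└── node_modules/
    ├── a/            # a@1.x, hoisted to the top level
    │   └── package.json
    └── b/
        ├── package.json
        └── node_modules/
            └── a/    # a@2.x, nested due to the version conflict
```

Today this layout is effectively "whatever npm happens to do"; the proposal is to write it down so other implementers have a target beyond copying npm's current behavior.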

Core should own the specification for the absolute minimum install layout for modules. That is, what should be the standard layout of modules on disk post install, including the minimal steps necessary to produce that layout.

I agree with this, but I'd like to add that standardizing version specification in package.json should be considered as well. Not having a consistent way of dealing with versions could easily lead to a very fragmented ecosystem with very different conventions and guarantees, especially once the door is opened towards "anybody can build a package manager and stand a chance of people using it".

I would say that NPM's current approach to versioning (semantic versioning, etc.) is pretty sane, but that especially in the area of "locking specific versions" some work remains to be done. NPM and Yarn, for example, take a fundamentally different approach to this, and various alternative package managers do not support shrinkwrap at all.

Something else to carefully consider would be what the default should be for working with dependencies in Node.js, regardless of package manager; defaulting to 'locked versions' makes sense for specific organizations that have the manpower to actively track dependency updates, but would cause ecosystem-wide security problems for everybody else - I've explained more about that here.

At the same time, there absolutely must exist a standard way to specify and use locked dependencies for those who wish to use them, and it should interoperate between package managers (which, as I understand it, is not the case now).

Leaving every package manager to set its own versioning defaults and implementations is likely to cause significant fragmentation in the ecosystem, which will especially confuse users that are new to Node.js. It will likely also considerably reduce semantic versioning usage.

This can already be seen on a smaller scale with Git dependencies in NPM, which behave very differently from how people expect (and I've taken to just explicitly recommending against using them at all, in favour of something like Sinopia).

So, uh, I think creating an installer spec is kind of a bad idea. The way I see it, there's only a few ways this can go:

  • The spec is identical to npm as it stands, at which point yarn is non-compliant and nobody really cares about the spec anyway except to encode npm's current behavior (and what happens when npm wants to change behavior?). This also applies if node core chooses yarn as the standard.
  • The spec only encodes behavior that's common to both npm and yarn, in which case it encodes very little.
  • The spec is compliant with none of the existing package installers, at which point either nobody implements it and it's useless, or core creates yet another packaging implementation and we're even more fragmented

I also smell a little bit of a power grab here. I think some elements of core resent the idea that they don't own npm, and this would be a political move more than something that's actually useful.

I'm also going to float the idea that y'all can safely close this issue. idk about you but from my reading nobody actually wants node --install, even if/when they do want things to be different somehow (deploy artifact tooling, which should probably live outside core and possibly outside npm to start with; signed package artifacts; a package spec; source tarballs sans package manager).

I like the idea of a basic installer coming with node. It would allow installing node alone in production. However, I don't think it should be an npm-lite. Instead it would be used to install a production bundle that npm/yarn/whatever produces.

I'm also going to float the idea that y'all can safely close this issue. idk about you but from my reading nobody actually wants node --install, even if/when they do want things to be different somehow (deploy artifact tooling, which should probably live outside core and possibly outside npm to start with; signed package artifacts; a package spec; source tarballs sans package manager).

I think we can discuss this, but I do find value in having that kind of tooling inside of core itself.

On the topic of naming, I am not entirely sure what kind of nomenclature should be used, but "install" seemed apt, if we are taking an archive and unbundling it to someplace. Open to bikeshedding.

@bmeck I assume you're specifically talking about deploy tooling here. Correct me if I'm wrong.

I think we can discuss this, but I do find value in having that kind of tooling inside of core itself.

I would make the argument that outside core would be the best place to experiment with what deploy tooling would look like. If a tool became a de facto standard like npm did, then bundling it with node could be a reasonable move. If the tooling was developed by npm, then it would get bundled, I suppose, as a matter of course.

It seems like a pretty fine line in terms of deciding what should get bundled inside a node installer, but in general I think people agree that it's easier to innovate outside core and easier to keep something stable and available from inside core. node-gyp and nan are afaik core-sponsored so that would be a reasonable model to work off.

On the topic of naming, I am not entirely sure what kind of nomenclature should be used, but "install" seemed apt.

Normally I'd agree, but my expectation for install is largely to do what npm does. I'd expect deploy tooling to do something similar to npm pack except specific to deploys rather than specific to shipping packages. So, uh, I'd call it "bundle" personally.

In general, I'd be interested in someone writing a v1 of this tool, though again, I'd suggest that initial cuts should live on the registry, with the option to bundle a la npm if it turns out to be a generally agreed upon good idea in retrospect.

(by deploy tooling, I mean something you would run on your build server to generate a deploy artifact, not something you would run on the application server)

In general, I'd be interested in someone writing a v1 of this tool, though again, I'd suggest that initial cuts should live on the registry, with the option to bundle a la npm if it turns out to be a generally agreed upon good idea in retrospect.

This seems reasonable but we need to agree on a target specification for whatever such a tool does. I don't think this tool would be shipped without some experimentation and implementation feedback. I would like to keep this issue open to create such a specification, specify use cases, discuss scope, and see if it is reasonable.

airbnb and twitter and i assume most enterprise users, who create deployable artifacts prior to production, and thus don't need npm in production at all.

What tools are people using to bundle their apps up this way? I'd like to look through them and see what we might be able to do to help facilitate this workflow.

@jasnell

I do not believe that core should be in the business of creating and maintaining its own registry client. Rather, core should be in the business of facilitating competition, interoperability, and consistency.

FWIW the version-management discussion group reached a similar conclusion on nvm or other version managers in the nodejs org (see https://github.com/nodejs/TSC/issues/96#issuecomment-277307261). We decided that the Foundation's job first and foremost is to spec behavior and enable multiple compatible implementations.

While not a strict requirement, implementing node --install _somewhat suggests_ adding tar support to core.

I'm not sure whether that's an argument for including tar in core or an argument against node --install, but I think it's an important example of the kinds of side effects not already mentioned.

i assume most enterprise users, who create deployable artifacts prior to production, and thus don't need npm in production at all.

What tools are people using to bundle their apps up this way?

For what it's worth, we absolutely don't use NPM to install anything in-situ on a production machine. For one thing, we want to deploy the same artefact everywhere, and running npm install repeatedly has never (in practice) really guaranteed an identical result. Also, it puts a number of third party network services into the build pipeline, which haven't always been particularly reliable.

Instead, we build container images (in Jenkins jobs) which include a pre-built version of all the modules that make up a particular service, some installed via NPM and some via other means (e.g. pkgsrc, Makefiles, etc), and then just deploy new instances of that sealed container image. We generally don't even include NPM in the image. This way, even if the build tools are unreliable, those issues are at least isolated to producing _new_ builds, not deploying new instances of things already built.

I would recommend that anybody seriously deploying Node-based software in production do the same thing: pre-build (via some CI process) your tree of artefacts and store the result as an immutable artefact, whether using something like Docker or even just a tar archive, and then deploy that. When you need a new version, build a new image and swap out the old one.
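One way to express the "pre-build, then deploy an immutable image" pattern is a multi-stage Dockerfile (a feature of newer Docker releases). The sketch below is illustrative only; the image tags and paths are placeholders:

```dockerfile
# Build stage: full tooling, npm available here and only here
FROM node:7 AS build
WORKDIR /app
COPY package.json .
RUN npm install --production
COPY . .

# Runtime stage: just node and the pre-built tree, no npm
FROM node:7-slim
COPY --from=build /app /app
CMD ["node", "/app/server.js"]
```

Build failures then stay isolated to producing new images; deploying an instance of an already-built image never touches npm or the registry.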

Core should own the specification for the absolute minimum install layout for modules. That is, what should be the standard layout of modules on disk post install, including the minimal steps necessary to produce that layout.

Doesn't it already? Or are you saying that core should own the contract described in my previous comment as well? These are things that are already written down, and there is a _tremendous_ social cost to changing them, so if you want em, they're yours.

As far as multiple competing implementations, I find it kind of annoying that y'all give yarn so much credit here when ied, pnpm, npmd, npm-install and a bunch of others have been around for a lot longer. I know that yarn has more PR and a big corporate brand behind them, and it's a fine client, but really? It's not like competition in npm registry clients started the day Facebook got involved. It's kind of rude to the developers of those other programs to act like they haven't been maintaining these things all along. npmd predates the Node Foundation by 2 years. When people complain that the NF only cares about big-name enterprises, stuff like this is why.

The npm cli team are (as @zkat has described) actively working on splitting up the npm client to make it more easy to build interesting things that talk to the npm registry in novel ways. We would love to see more clients pop up. In fact, I'm hoping that yarn takes advantage of pacote and cacache to go even faster and be more reliable. OSS seas lifting all boats, etc.

We're working on package signing, which is extremely difficult to do in a way that is both useful and usable, and if the history of GPG artifact signing has taught us anything, it's that it's useless if no one uses it. It's not easy! And getting a web of trust that is actually trustworthy is also challenging. I'm not sure yet, but I'd bet that the next step here will involve Keybase, who've already done the unthinkable task of making GPG somewhat usable. (I'm already talking with them about this, no idea what the final thing will look like.) Also, this has nothing to do with the installer, except that, once packages are getting signed at publish time, installers ought to verify those signatures. (Signed and unverified is worse than unsigned.)

I see a lot of people asking about a static/frozen artifact deployment tool. We've been sketching out how to do that as well. This is partly because I think there's an interesting business there, and partly because we kind of _have_ to build most of the pieces of this to stay ahead of npm registry growth, and partly because we already have the technical expertise and moving parts to do it well. I don't have an ETA on when we'd ship something in this space, and it'll certainly come in pieces, but (a) it's off-topic and probably not what node --install would be anyway, and (b) super exciting to see so much interest in something we've been sketching. I agree that "run npm install --prod on your production server" is a bad pattern, but it's a common one, because it's extremely easy. We need to deliver something easier that is also better. (The long term goal is to productize what we use to deploy the npm website and registry services.)

Lastly, I hate to break it to y'all, but if there was a spec that nails down exactly what npm et al do today, _we fully plan to break that spec_. If we can build a new registry API and a client to access it, which speeds up installs by a significant amount, we're going to do that. When we added deduping in npm3, it would have violated the specification had it been cast around npm2. I feel a moral obligation to do this, because improvements in npm benefit our community.

I'll re-iterate what I said above: node --install is a bad idea. Not _obviously_ bad, but a ball of complexity that no one wants in their node binary, and no one is willing to make the tradeoffs required to make it work. It's a leap to an implementation detail where no one has any idea what problem it addresses. "Just the install bits" of npm is 90% of npm, and the resounding sentiment seems to be that bundling a tar parser (and the rest of npm install) as core modules is a Bad Idea. (Personally, I think that'd be kinda cool, if only because it'll force the creation of nodecc, but that's also way out of scope of this issue.)

Can we drive towards what y'all would actually want to see from this? Because I just have so many questions.

  1. How does the imagined node --install differ from npm install?
  2. Is that imagined node --install better than npm? Howso?
  3. Can we modify npm to satisfy those requirements?
  4. _Should_ we modify npm to satisfy those requirements?
  5. If !(3) or !(4), then are those requirements even possible?
  6. If (5), then is there anyone willing and able to do this work?
  7. If (6), then what's the best way to go about doing this work?

We're working on package signing, which is extremely difficult to do in a way that is both useful and usable, and if the history of GPG artifact signing has taught us anything, it's that it's useless if no one uses it. It's not easy! And getting a web of trust that is actually trustworthy is also challenging. I'm not sure yet, but I'd bet that the next step here will involve Keybase, who've already done the unthinkable task of making GPG somewhat usable. (I'm already talking with them about this, no idea what the final thing will look like.) Also, this has nothing to do with the installer, except that, once packages are getting signed at publish time, installers ought to verify those signatures. (Signed and unverified is worse than unsigned.)

This should be talked about for runtime startup verification as well, not just installation. It would be problematic if Node core and npm were not in sync in how these verification steps occur. I personally am not at all fond of GPG and prefer x.509.

I see a lot of people asking about a static/frozen artifact deployment tool. We've been sketching out how to do that as well. This is partly because I think there's an interesting business there, and partly because we kind of have to build most of the pieces of this to stay ahead of npm registry growth, and partly because we already have the technical expertise and moving parts to do it well. I don't have an ETA on when we'd ship something in this space, and it'll certainly come in pieces, but (a) it's off-topic and probably not what node --install would be anyway, and (b) super exciting to see so much interest in something we've been sketching. I agree that "run npm install --prod on your production server" is a bad pattern, but it's a common one, because it's extremely easy. We need to deliver something easier that is also better. (The long term goal is to productize what we use to deploy the npm website and registry services.)

This also could be important if, instead of shipping around an extracted archive, you just ship the archive itself, like how .jar files work for Java. This, I think, is part of why I want a standard setup around these artifacts in Node core. Many scenarios around this get complex as you add things such as shared-library-like behavior. It is also complicated by the fact that most existing formats have poor support for things like internal symlinking and/or signing.

Lastly, I hate to break it to y'all, but if there was a spec that nails down exactly what npm et al do today, we fully plan to break that spec. If we can build a new registry API and a client to access it, which speeds up installs by a significant amount, we're going to do that. When we added deduping in npm3, it would have violated the specification had it been cast around npm2. I feel a moral obligation to do this, because improvements in npm benefit our community.

I personally don't want to mandate anything about the registry/client, as that is something I don't think should belong in core due to all the already competing standards. However, stating that you plan to break a non-existent spec only leads to concerns about hostile lock in on my part.

the resounding sentiment seems to be that bundling a tar parser ... as core modules is a Bad Idea.

I fully expect something like this to ship if Node ever wants to have a deployment system as nice as .jar.

With all this in mind, I continue to state that this issue is about discussing scope and use cases. The name node --install may be misleading if those use cases are different from what people expect.

I can only speak to personal points on your questions, but I will answer them to the best of my ability, in line with the goals of my previous comment.

  1. How does the imagined node --install differ from npm install?
  • It can be run on machines without network access
  • It intentionally has a more limited set of functionality including not running lifecycle scripts by default
  • It does not have publication / login capabilities
  • It may not extract an archive to a dir structure
  2. Is that imagined node --install better than npm? How so?

Slightly different purpose, not better or worse. It is meant as a way to install a deployable artifact whereas npm downloads source code for development/build purposes.

  3. Can we modify npm to satisfy those requirements?

Probably. However, as you and others in this issue have stated, we have many competing clients, so it would also seem prudent to set the standard in core somehow. This would remove the concern of incompatible packages by providing either an implementation or a standard for how a package must be built. It sounded above like a standard might be broken on purpose, so I would lean towards a minimal reference implementation in light of that.

  4. Should we modify npm to satisfy those requirements?

Unclear. This needs more discussion about what the needs are. I think decoupling some of npm's workflows may in many ways already satisfy this, if/when we finish discussing the scope of the end goal.

  5. If !(3) or !(4), then are those requirements even possible?

Yes. However, the implication that a requirement might not even be possible if npm cannot satisfy it is itself part of the concern, given the evolving ecosystem of how people handle packages and deployables for Node.

  6. If (5), then is there anyone willing and able to do this work?

After ESM, this is next for me. Still out in the year+ time range before anything would ship LTS.

  7. If (6), then what's the best way to go about doing this work?

Probably before we get started we should specify the exact goal and get feedback from the relevant parties (npm, pnpm, yarn, large deployment PaaS, etc.).

Specific topics require some research:

  • Effort on package signing both by webpackage and npm. Need to sync on that.
  • Discussion of file formats involved.
  • Discussion of eventual goals such as shared library behavior, archive loading like .jar, single file binaries, etc.

With all of that still to be discussed I can't answer this question.

What you would need to accomplish this at a minimum is a URI and a configuration instruction.

The URI could be an npm package, a GitHub repo, or anything else. The wonderful thing about a URI is that it provides a unique identifier even if it doesn't resolve outside the local network or file system, or doesn't resolve at all.

The configuration instruction could be a string that executes as a child process. Executing it as a child process has the benefit of allowing a step that may not even be related to Node, such as running a local C++ utility. This configuration step could bring in dependencies in an npm-like fashion, or not.

From the perspective of Node core that is it. Too simple. Let's not run away with imaginary requirements that exceed the scope of this project. If packages require things like shrink-wrap or other advanced configuration requirements then the package maintainers will include access to these advanced features in their packages.

In my understanding, we're facing the following two problems:

  • the core size can be too large
  • npm doesn't match some cases:

    • someone wants to have a "minimum" install, which is not easy to define in detail

    • someone wants to run on a low memory machine

As a solution to the first problem, I think we can have a build option to not bundle npm, similar to --with-intl=none.

For the second problem, we could build a new install command ourselves, but that doesn't mean we should replace npm with it right now. Since the definition of a "minimum" install can be unclear, it's better to build a core that doesn't bundle npm and let users install it themselves. At the least, NodeSchool would find it hard if core didn't ship npm by default, because the workshoppers depend on the current install ecosystem. I'd like to be mindful of the community while we're figuring out what the minimum install is.

@bmeck Thanks for taking a crack at answering those questions.

So, from your point of view, it sounds like node --install isn't about installing deps at all, but rather about unpacking a pre-built binary artifact of some sort? That's interesting, but I'd be surprised if many others on this thread shared that expectation.

I personally don't want to mandate anything about the registry/client, as that is something I don't think should belong in core due to all the already competing standards. However, stating that you plan to break a non-existent spec only leads to concerns about hostile lock in on my part.

Well, we don't "plan to break a non-existent spec", because you can't break specs until they exist ;) My point was that, if core decided "we need to own this spec!", my point of view is, "ok, great, you own a spec, but why should npm follow it?" In other words, no, core shouldn't own that spec, because that isn't better and doesn't make sense. npm will continue to define npm behavior, because that process works. It has clearly not prevented innovation or collaboration from other parties.

As always, we'll be developing in the open, maintaining frankly ridiculous backwards compatibility support, covering npm's behavior in documentation and release notes and blog posts, and splitting out functionality into modules with the hope that others use them. What does "hostile lock in" mean?

I'm not sure what you mean by "competing standards". What standards are competing?

Well, we don't "plan to break a non-existent spec", because you can't break specs until they exist ;) My point was that, if core decided "we need to own this spec!", my point of view is, "ok, great, you own a spec, but why should npm follow it?" In other words, no, core shouldn't own that spec, because that isn't better and doesn't make sense. npm will continue to define npm behavior, because that process works. It has clearly not prevented innovation or collaboration from other parties.

As always, we'll be developing in the open, maintaining frankly ridiculous backwards compatibility support, covering npm's behavior in documentation and release notes and blog posts, and splitting out functionality into modules with the hope that others use them. What does "hostile lock in" mean?

There's currently an extremely high bar to clear to build a package manager that doesn't break existing NPM-installed setups, because there's no clear specification on what is expected from an NPM-compatible package manager (and the NPM CLI codebase has a lot of shared mutable state, making it hard to understand).

I'm not aware of any comprehensive, complete, unambiguous reference for NPM behaviour. If one exists, I would much appreciate a pointer towards it.

What does "hostile lock in" mean?

I mean that we are locked into dependencies like npm, V8, etc. due to the sheer complexity of attempting to support their backwards compatibility. Seeing statements about preventing new features/standardization, and assertions that the creation of those would be ignored, makes that lock in seem problematic considering that other systems like Chakra, pnpm, Yarn, etc. exist. Backwards compatibility must be taken on with great care, but the creation of new standards that do not affect backwards compatibility should also be considered. It sounded like any creation of a new standard would have been a problem, but after reading the most recent comment with "In other words, no, core shouldn't own that spec, because that isn't better and doesn't make sense", we really need to discuss what such a spec looks like and what it affects.

What standards are competing?

How clients place things from package.json on disk, usage of symlinks, different caches, mutating NODE_PATH via shims, etc. I don't think there are competing [specifications] in the registry, but there are definitely differing implementations. Those differing implementations such as artifactory and npm Enterprise do have differences that I have to work around while doing ops work, particularly around authentication. I was stating the client and registry differences as "standards" even though they do not have formal specification.

@isaacs:

Doesn't it already? Or are you saying that core should own the contract described in my previous comment as well?

To an extent, yes. But documented better and more completely so that other implementations can produce a valid result on disk that node can use successfully without having to reverse engineer what the npm client is doing.

As far as multiple competing implementations...

This entire paragraph about things that annoy you is off topic and is not constructive to the conversation.

We're working on package signing, which is extremely difficult to do in a way that is both useful and usable

Ok. Good to hear that it's on the radar. That said, this is definitely something that should not be done unilaterally then thrown over the wall at some point. There are many approaches to this and several paths need to be explored collaboratively and openly within the ecosystem to determine which is best. I'll be very happy when something practical emerges here. (I will also say that I share @bmeck's concerns around using GPG, but that's a different conversation for a different thread.)

Lastly, I hate to break it to y'all, but if there was a spec that nails down exactly what npm et al do today, we fully plan to break that spec.

No one has suggested such a thing. What has been suggested is a spec that more completely and strictly describes the minimal footprint of an installed module so that tools can be implemented to meet those requirements without being required to reverse engineer the npm client. That's quite different than nailing "down exactly what npm et al do today".

I'll re-iterate what I said above: node --install is a bad idea.

Assuming you mean node --install as a lightweight alternative npm client, I fully agree. In fact, this appears to be the overwhelming sentiment throughout this entire conversation so far. I've yet to see anyone step up here to really assert that node --install as a lightweight npm replacement is something they would want. Given that, I believe we can put that particular question to bed. As you have noted, however, several individuals (@bmeck for instance) have a much different notion of what a node --install could be, and I'd really like to explore those ideas further.

This entire paragraph about things that annoy you is off topic and is not constructive to the conversation.

Yeah, but that paragraph also made what I think is a fair criticism of the node foundation, which I'll also quote because I think it's important that you at least meditate on this offline:

When people complain that the NF only cares about big-name enterprises, stuff like this is why.

It would be a mistake to ignore that merely because it's "off topic".

As you have noted, however, several individuals (@bmeck for instance) have a much different notion of what a node --install could be and I'd really like to explore those ideas further.

Can we please at least call this something else, then? OP was very clearly talking about a very different feature.

@jasnell as @jfhbrook says, this thread is deviating toward a very different feature.

I think you're pretty much right that this feature request, as stated, can be safely considered dead in the water. Here's some other stuff that got brought up in the thread, some of which is good stuff to keep thinking about:

  • Specifying the contents of package.json is, for at least some fields, not a terrible idea: breaking {devD,peerD,optionalD,d}ependencies is a serious risk for the ecosystem as more competitors enter the race, and those are good examples of places where people keep wanting to expand/extend things and the answer just has to be categorically "no". Specifying the trees generated by those dependencies, though, is a terrible idea, because tree layout is literally at the core of how the different clients structure their work, for different performance/reliability benefits. We already have a standard for what these dep trees need to achieve: the module loading algorithm, which my team treats with the utmost care. I think standardising things like npm-shrinkwrap.json and bundleDependencies is premature at best.

  • Distribution should be prototyped in at least one or more existing installers before we even talk about standardizing on it. This is something massively dependent on user experience, and we need good exemplars in production that prove their strategy before node even considers doing automagic bundle.tarjs loading or anything at all surrounding that space. Go ahead and encourage people to do it. Don't standardize prematurely because you will get it so wrong it'll make your head spin.

  • Ditto for package signing. You can have any opinion you want about GPG vs X.509. In the end, this is something that requires a proven implementation that balances user experience and security. I don't remotely trust anyone to hand me a spec before a popular implementation is in place. If you want to prove that something works, write a prototype and prove it, rather than trying to clamp down on the possibilities based on your own ideas. If you want help with those prototypes, it's in our interest to help make them successful. (this one and the previous one are mainly directed at @bmeck)

To summarize, I think this feature request should be closed because:

  • It is not clear what node --install was meant to achieve, and the reasoning of it being about deployment concerns fell flat with pretty much everyone.
  • This thread was not created with the intention of talking about the larger standardization of package stuff. This needs to be a more precise conversation, and it needs to lead with the problems that need to be solved, rather than the solution.
  • @mikeal is right in assuming that issues like these are more likely to cause contention than productive conversation. "install without npm" means very many different things to many different people and you'll get a damn wide range of reactions. Issues like these aren't even good as honeypots for people's frustrations because they drag in everyone, not just the ones that keep trying to find places to vent their frustrations.
  • Even if it were a good idea to have that command because there was some value, it would be an utter waste of Node Core time and resources, and frankly just a generally bizarre waste of time, to not rely on existing implementations by existing installer devs. We've looked into the void of despair that is implementing a reliable, compatible package manager. How much do you think the resources poured into writing yet another implementation would help the existing projects do their job better? All of us are heavily resource-constrained, and I can speak with some certainty about both npm and Yarn on this front. Please stop trying to stretch us thinner.

@zkat

Distribution should be prototyped in at least one or more existing installers before we even talk about standardizing on it. This is something massively dependent on user experience, and we need good exemplars in production that prove their strategy before node even considers doing automagic bundle.tarjs loading or anything at all surrounding that space. Go ahead and encourage people to do it. Don't standardize prematurely because you will get it so wrong it'll make your head spin.

Ditto for package signing. You can have any opinion you want about GPG vs X.509. In the end, this is something that requires a proven implementation that balances user experience and security. I don't remotely trust anyone to hand me a spec before a popular implementation is in place. If you want to prove that something works, write a prototype and prove it, rather than trying to clamp down on the possibilities based on your own ideas. If you want help with those prototypes, it's in our interest to help make them successful. (this one and the previous one are mainly directed at @bmeck)

I take the approach that the standard needs to be made and implementation feedback needs to come in at the same time; that's why I have been spending time doing some work on webpackage.

Moving all the work into the ecosystem doesn't allow people to express core use cases. Often, when we create implementations of features, we are only thinking about our own internal use cases. That is why developing a specification and an implementation together is important, instead of retrofitting a design to work for a single use case.

it is not clear what node --install was meant to achieve, and the reasoning of it being about deployment concern fell flat on pretty much everyone.

I don't fully agree with the conclusion that it has fallen flat. I do agree that, if it is seen as an npm replacement, it doesn't have much support.

This thread was not created with the intention of talking about the larger standardization of package stuff. This needs to be a more precise conversation, and it needs to lead with the problems that need to be solved, rather than the solution.

@mikeal is right in assuming that issues like these are more likely to cause contention than productive conversation. "install without npm" means very many different things to many different people and you'll get a damn wide range of reactions. Issues like these aren't even good as honeypots for people's frustrations because they drag in everyone, not just the ones that keep trying to find places to vent their frustrations.

Having been through the nightmare of talking about ESM, I can fairly safely say that when developing pretty much anything involving modules, we need to have a central place to talk. As stated in the original issue text by @mikeal:

I'd like to use this thread to reach a consensus about the scope of this feature, potential pitfalls, and whether or not this is something we agree should be added. From there I can work on a proper Enhancement Proposal.

This issue is open to change scope and attempt to reach an agreement about what should be done.

Even if it were a good idea to have that command because there was some value, it would be an utter waste of Node Core time and resources, and frankly just a generally bizarre waste of time, to not rely on existing implementations by existing installer devs. We've looked into the void of despair that is implementing a reliable, compatible package manager. How much do you think the resources poured into writing yet another implementation would help the existing projects do their job better? All of us are heavily resource-constrained, and I can speak with some certainty about both npm and Yarn on this front. Please stop trying to stretch us thinner.

As I have stated earlier, I personally have no desire for this to do package management. I am still hoping to talk about the scope and future concerns as listed in above comments.

This thread is full of "solutions" without any problem statements, many of which aren't obviously related to the original text. For example, package signing and deployment both seem way off topic if we're just discussing @mikeal's feature request. As best as I can tell they came up because people were trying to backfill what problem the feature request was supposed to be solving and then proposed additional new solutions. While I'm sure the intent wasn't to derail the issue, that's all this discussion is doing.

This thread would do well to be started over with a problem summary, rather than a feature request, and then proposals that address that problem summary can actually be weighed on how well they solve that problem. Without that, I'm unlikely to involve myself any more, it being a poor use of any of our time.

to throw my hat in the ring i'd like to take a few steps back. it sounds like the problem here is defining a problem. @mikeal tries to define the problem like this:

The problem is, not every Node.js install is used by a developer. Many installs happen in infrastructure. These installs run an application and are never touched by anything but infrastructure automation. Yet, these installs still include npm and, in fact, require npm in many cases because it is the best mechanism we have for installing the dependencies the application needs.

in order to even think about what the solution should be, we should be asking the community to respond with statements that say "npm is not the ideal tool for the job of running on my infrastructure because X, it would be great if it could do Y, or not do Z. maybe a new thing A that just does B would also work. here's an example of my workflow where this is a pain point: blahblah".

i would like to suggest that we open up an issue that asks "what are your pain points?" and see what we get. like many open source issues, this one is distracted by the suggestion of a potential solution and an insistence on the owner of such a solution (Node Core).

this is a tricky question to ask because it is such a large and shared community but we can only serve the community's interests if we talk to them. the data that @mikeal is basing this thread on is completely elided. i have no idea where this idea is coming from! i've been around for a second and no one has come close to asking for something like this, so i'm definitely intrigued about this idea's origin.

let's do some research and talk to our users before we have a massive thread of basically only npm and node core developers. without that data we'll fail to build anything that helps anyone. talking to users will help us know what problems we are trying to solve so that we can best decide on what a solution should be and who should own it.

shameless plug: we have a Community Committee who might be an AMAZING resource for soliciting and guiding community feedback.

OMG JINX @iarna 😆 😅

Maybe we can just move this to https://github.com/nodejs/node-eps 😏

@zkat @ashleygwilliams @iarna as stated in the original issue text.

From there I can work on a proper Enhancement Proposal.

EPs are supposed to have a fully fleshed out feature in mind. This issue was created with the intention to discuss scope.

@bmeck

I am still hoping to talk about the scope and future concerns as listed in above comments.

So, yes, let's do that. I'm reasonably sure that by now we know what is out of scope, let's talk about what is.

Given: node --install {artifact}, some questions:

  1. What is {artifact}? What I imagine is that it is a file path to some form of deployment thing (tar, directory, webpackage, whatever).

  2. What is {artifact} not? What I imagine is that it is not a module identifier that would need a registry endpoint to resolve.

  3. What exactly does --install do? What I imagine is that it takes {artifact} as input and results in something that allows require('some-identifier-for-artifact') to just work. What exactly is that?

  4. What does --install not do? It definitely should not do any kind of registry resolution (it won't have to, the artifact is already local and known). I'd also say that it shouldn't do really any dependency resolution. The dependencies should already be resolved in {artifact}.

I have a few answers of my own in mind for these but I'd like to see what direction your mind is heading on it.

(fwiw my thumbs up of the eps move was a bit of sarcasm, which i am sorry for)

to reiterate: we can't scope a feature if we don't know the problem it solves and we don't have any evidence that it is something that users actually want.

worse: if we make the feature we have no way to tell if it is successful since we don't know what failure looks like.

EDIT: also if we don't know what problem this solves how do we explain it to users so they can make informed decisions about using it?

Without registry resolution, why would anyone bother with this versus a simple wget + untar in a build/deploy script?

What I think might actually have some value for deployment is a node --rebuild-dependencies which walks the dep tree and does node-gyp rebuild where applicable. With how much extra stuff is in node-gyp it doesn't seem all that different from just shipping with npm though.
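
A userland approximation of that walk might look like the sketch below. This is illustrative only; it assumes node-gyp is on PATH, and the build command is parameterized purely so the discovery logic can be exercised independently of a compiler toolchain:

```shell
# Hypothetical sketch of a --rebuild-dependencies walk: find every
# installed package carrying native code (a binding.gyp) and rebuild it.
# $1: root directory to scan (defaults to node_modules)
# $2: build command (defaults to "node-gyp rebuild")
rebuild_native_deps() {
  root="${1:-node_modules}"
  cmd="${2:-node-gyp rebuild}"
  find "$root" -name binding.gyp | while read -r gyp; do
    # Run the build from the package directory, as node-gyp expects.
    ( cd "$(dirname "$gyp")" && $cmd ) || exit 1
  done
}
```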

As several of the npm folks have expressed though: the problem trying to be solved here seems rather undefined so far. Can anyone share how they'd actually use this feature and not just what it should look like?

fwiw, if node were going to have a "help pull down prod artifacts" command (please don't call this --install) that would currently look like curl $ARTIFACT_URL | tar -xz, MAYBE some mkdir -p and cd action to put the artifact in the right place. I could see this being extended to support things like, idk, signature verification, making it actually valuable (since I'd just as soon use the shell one-liner today). I'd of course expect "run a jar-but-for-node" to be a different command. This would have to change the behavior of node --pull-down-that-artifact, naturally.
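
That one-liner plus the mkdir/cd action, wrapped up for illustration. The fetch command is parameterized so that `curl -fsSL "$ARTIFACT_URL"` could be swapped for anything that writes a gzipped tarball to stdout; all names here are placeholders:

```shell
# Illustrative version of the deploy one-liner discussed above:
# stream a gzipped tarball and unpack it into a destination directory.
# $1: destination dir; remaining args: a command that writes the
# artifact to stdout (e.g. curl -fsSL "$ARTIFACT_URL", or cat app.tgz).
deploy_artifact() {
  dest="$1"; shift
  mkdir -p "$dest"
  "$@" | tar -xz -C "$dest"
}
```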

But if it's literally the same as a curl command, I'd just as soon use curl, and I think that's true of most people running node in production. I'd like to +1 @ashleygwilliams's suggestion that you do user research on this. I'm basically volunteering to fill out surveys about deploying node right now!

Uhh, but I honestly don't know what problem @jasnell's proposal is trying to solve. I don't mean that as a dig. I just never had a problem that could be solved with a glorified tar -xzf. I'd just as soon package a tar implementation for windows as a separate binary on npm and not named anything related to install.

Lol.. I haven't made any proposal. I've asked questions so I can understand what others are thinking.

Also: I'd find run-scripts support in core a little more useful.

A lot of PaaS's use 'npm start' as the command for starting the node app--or at least nodejitsu did. I think heroku as well? So there's some precedence for using run-scripts as the command for kicking off their services. I've usually seen people implement this directly in systemd or upstart in practice, since it kinda sucks when the process your init system is controlling isn't actually the node app.

There's some parallels here with foreman--or at least, a way you can use foreman. I've been doing python for my day job lately, and we use this tool called 'honcho' (a foreman port) to source an env file before running some command. I feel like npm fills a similar niche, in that it sets up an environment. A tool that could reliably source an env file (good luck parsing it, both docker and honcho--edit: AND systemd-- behave differently than bash, and you can't use bash because the export keyword isn't in there) and set up npm-like environment variables (NODE_ENV, add ./node_modules/.bin to path) and then exec a command... would be useful. But also a pretty big project that's going to be hilariously broken in subtle ways, not one I'm sure I want integrated with my runtime.

Again, in practice people use templates to generate systemd configs based on the env file and some configured start command (which may or may not be inside the package.json). I feel like this isn't worth trying to implement (in core) because it's either integrated with chef/puppet/ansible/etc, the person doing it has very specific opinions about how this thing should look, or both.

Lol.. I haven't made any proposal. I've asked questions so I can understand what others are thinking.

Maybe "proposal" was the wrong word. you can call them "things I imagined" if you like. But what I'm referring to is things like:

it is either a file path to some form of deployment thing (tar, directory, webpackage, whatever).

it is not a module identifier that would need a registry endpoint to resolve.

it takes {artifact} as input and results in something that allows require('some-identifier-for-artifact') to just work.

It definitely should not do any kind of registry resolution (it won't have to, the artifact is already local and known). I'd also say that it shouldn't do really any dependency resolution. The dependencies should already be resolved in {artifact}.

I see this as one of the more concrete suggestions in this thread. So, I guess I was charitable? But my understanding is that this suggests a command that maybe has a subset of curl's capabilities and, in today's use cases, some of tar's capabilities, and basically just dumps the contents of the module (which I hope has all its deps bundled; this almost never happens, nobody afaik does this) into node_modules.

Which again, is honestly just weird. I would never want to do that, and I don't see who this caters to.

No need to be charitable at all. At this point I'm wanting to tease out more details of what @bmeck may have in mind because if it's limited to just what I am imagining, I'm not seeing a lot of value with having it built in to core. Others have much better imaginations than I, however.

@jasnell

As you have noted, however, several individuals (@bmeck for instance) have a much different notion of what a node --install could be and I'd really like to explore those ideas further.

That is not what I noted. What I noted was that @bmeck has an understanding of this feature that seems to be very _different_ from everyone else. I don't see "several individuals" suggesting this. What I _do_ see is clear consensus that "npm lite in core" is a bad idea, and the degree of badness that each person sees in that idea seems closely correlated to their degree of understanding of the problem. What I _also_ see is a lot of assumption that node --install would be a "npm lite in the core" from each person (except @bmeck) who shows up to this thread.

That has led me (and a few others) to conclude, (a) "npm lite in core" is an implementation detail no one wants, addressing a problem that is not well specified, and (b) that's what most people seem to assume node --install would be.

So, it is clear that the OP feature here is either ill-conceived or spelled wrong. I suggest the following:

  1. Close and lock this issue.
  2. Open a new issue to discuss @bmeck's idea of a no-network-touching webarchive (or other jar-like) unpacker, which I believe can drive towards userland experimentation very quickly.
  3. Open a feature request over on https://github.com/npm/ somewhere requesting updated documentation on what parts of npm could be better specified.

Node's module loading behavior is well documented, and that specification is properly owned by the CTC. npm's dependency contract could certainly be more concisely articulated (right now it exists, but is spread out across a few different places).

@jasnell

if it's limited to just what I am imagining, I'm not seeing a lot of value with having it built in to core

Could not agree more. (My not-so-secret agenda in driving towards userland experimentation is that this usually means core doesn't have to add anything ;)

I don't think what @bmeck is talking about is necessarily the same, but I do think that it ends up covering some of the package install points here that would be more suited for us to solve, on top of likely solving other use cases. (That is to say, I don't think he is the _only_ one.)

I'd really love to have a discussion around self-extracting and/or egg-like archives (that can be run with node via a shebang or node foo.nodearchive), and using archives like a filesystem (like how electron does with asar).

@Fishrock123 what package install points here do you think are suited for node core to solve?

Is it worth revisiting this discussion considering the current state of npm and npmjs.com?

https://mobile.twitter.com/palmerj3/status/1141797296004325376

NPM cli hasn't had a commit since March and you can't file any issues. This is fine.

I feel it would be a positive step forward to have a node --install that is just powerful enough to install the user's choice of npm, yarn, entropy, etc., without making the difficult-to-reverse decision of bundling them with Node.js.

You file issues on the Discourse, which has been the case for a while. Separately, a lack of commits doesn't necessarily indicate any problem.

I agree that a lack of changes to a codebase is an insufficient metric to warrant a conversation about change, but the relationship between an application and its means of transmission is continuously worthy of discussion.
