Yarn: Shrinkwrap file from added packages are ignored

Created on 12 Oct 2016 · 27Comments · Source: yarnpkg/yarn

Do you want to request a _feature_ or report a _bug_?

Bug.

What is the current behavior?

Yarn doesn't respect the npm-shrinkwrap.json contained inside added dependencies, like hapi, and chooses to install the latest semver compatible versions declared in its package.json instead.

If the current behavior is a bug, please provide the steps to reproduce.

yarn add <package-with-shrinkwrap>

What is the expected behavior?

Installation of the exact versions declared in npm-shrinkwrap.json, including sub-dependencies. Alternatively, a visible warning that the file has been ignored, and the install might be broken due to this.

Note that it is an important feature that a package can specify the exact dependency version requirements.

Please mention your node.js, yarn and operating system version.

yarn v0.15.1

Source

kanongil

👍3 👎2

Most helpful comment

@kanongil I wanted to add a little bit to what @kittens and @Turbo87 said above.

I've thought a lot about recursively using lockfiles (shrinkwrap or otherwise) when resolving an application. The first time I remember writing about it was in 2010, pretty soon after we first released a version of Bundler with lockfiles generated by default: Clarifying the Roles of the .gemspec and Gemfile.

At the time, my TLDR was:

Even if the precision could be enforced, you wouldn't want it, since it would prevent people from using your library with versions of its dependencies that are different from the ones you used to develop the gem.

This blog post had a pretty clear directive (that the Ruby community has more-or-less followed) but the reasoning was a little muddled by accidental aspects of the implementation at the time.

I had a chance to revisit this question when designing Cargo, which doesn't separate the application manifest from package manifests.

For context:

| Package Manager | Application Manifest | Lockfile | Package Manifest |
| --- | --- | --- | --- |
| Bundler | Gemfile | Gemfile.lock | pkg.gemspec |
| Cargo | Cargo.toml | Cargo.lock | Cargo.toml |
| Yarn | package.json | yarn.lock | package.json |

We had an opportunity to think through the problem again from scratch, and control exactly packages would be possible on the registry, and arrived at the same conclusion (described more articulately).

The purpose of a Cargo.lock is to describe the state of the world at the time of a successful build. It is then used to provide deterministic builds across whatever machine is building the project by ensuring that the exact same dependencies are being compiled.

This property is most desirable from applications and projects which are at the very end of the dependency chain (binaries). As a result, it is recommended that all binaries check in their Cargo.lock.

For libraries the situation is somewhat different. A library is not only used by the library developers, but also any downstream consumers of the library. Users dependent on the library will not inspect the library’s Cargo.lock (even if it exists). This is precisely because a library should not be deterministically recompiled for all users of the library.

If a library ends up being used transitively by several dependencies, it’s likely that just a single copy of the library is desired (based on semver compatibility). If all libraries were to check in their Cargo.lock, then multiple copies of the library would be used, and perhaps even a version conflict.

In other words, libraries specify semver requirements for their dependencies but cannot see the full picture. Only end products like binaries have a full picture to decide what versions of dependencies should be used.

In short, if yarn allowed the yarn.lock or shrinkwrap.json file in a package to control the exact version of a dependency, it would create unnecessary duplicates.

The package.json describes the intended versions desired by the original author, while yarn.lock describes the last-known-good configuration for a given application. In other words, the package.json describes the semantic requirements, which the yarn.lock maintains determinism across builds.

If a package wants to insist on a specific version of a dependency because it needs that version, it should say so in the package.json (and both npm and yarn will happily duplicate the dependency). But if a package specifies a wider range for a dependency in the package.json, that doesn't mean that the existence of a yarn.lock or shrinkwrap.json should force dependent applications to duplicate.

The nice thing about this philosophy is that it gives packages a very simple way to express either intent:

if a package needs a particular version of a dependency and wants to be sure that all dependent apps use it, it can specify that in package.json.
if a package wants to check in yarn.lock for convenience (I don't personally recommend it, but sometimes packages double as apps in some contexts), they can do that by checking in their yarn.lock for their own use without forcing unnecessary dupes on dependent packages or apps.

wycats on 13 Oct 2016

👍68 ❤4

All 27 comments

We investigated this in #41. Unfortunately the npm shrinkwrap format is poor and very lossy so we can't accurately reconstruct the dependency graph with just the shrinkwrap so we can't support this without crippling other behaviour.

sebmck on 12 Oct 2016

👎5 👍2

So you prefer to install broken packages instead? No warning given?

If you can't support it, I'd suggest you detect this, and abort the install since it will otherwise be broken.

Also, do you have a yarn compatible solution for package authors that desire to specify exact dependencies versions?

kanongil on 12 Oct 2016

Note that this behaviour seems at odds with the install compatibility intent stated by @wycats in #375.

kanongil on 12 Oct 2016

👍1

So you prefer to install broken packages instead?

Why would they be broken? If the library wants to pin dependencies to a specific version it could just declare those in the package.json file. Am I missing something?

Turbo87 on 13 Oct 2016

@kanongil I wanted to add a little bit to what @kittens and @Turbo87 said above.

At the time, my TLDR was:

Even if the precision could be enforced, you wouldn't want it, since it would prevent people from using your library with versions of its dependencies that are different from the ones you used to develop the gem.

This blog post had a pretty clear directive (that the Ruby community has more-or-less followed) but the reasoning was a little muddled by accidental aspects of the implementation at the time.

I had a chance to revisit this question when designing Cargo, which doesn't separate the application manifest from package manifests.

For context:

The purpose of a Cargo.lock is to describe the state of the world at the time of a successful build. It is then used to provide deterministic builds across whatever machine is building the project by ensuring that the exact same dependencies are being compiled.

This property is most desirable from applications and projects which are at the very end of the dependency chain (binaries). As a result, it is recommended that all binaries check in their Cargo.lock.

For libraries the situation is somewhat different. A library is not only used by the library developers, but also any downstream consumers of the library. Users dependent on the library will not inspect the library’s Cargo.lock (even if it exists). This is precisely because a library should not be deterministically recompiled for all users of the library.

If a library ends up being used transitively by several dependencies, it’s likely that just a single copy of the library is desired (based on semver compatibility). If all libraries were to check in their Cargo.lock, then multiple copies of the library would be used, and perhaps even a version conflict.

In other words, libraries specify semver requirements for their dependencies but cannot see the full picture. Only end products like binaries have a full picture to decide what versions of dependencies should be used.

In short, if yarn allowed the yarn.lock or shrinkwrap.json file in a package to control the exact version of a dependency, it would create unnecessary duplicates.

The nice thing about this philosophy is that it gives packages a very simple way to express either intent:

if a package needs a particular version of a dependency and wants to be sure that all dependent apps use it, it can specify that in package.json.
if a package wants to check in yarn.lock for convenience (I don't personally recommend it, but sometimes packages double as apps in some contexts), they can do that by checking in their yarn.lock for their own use without forcing unnecessary dupes on dependent packages or apps.

wycats on 13 Oct 2016

👍68 ❤4

@Turbo87

Suppose Package A depends on v1.0.0 of Package B. The author of Package A wants to pin dependencies, so they declare a specific version in package.json:

{
  "name": "A",
  "dependencies": {
    "B": "=1.0.0"
  }

But now suppose Package B has a loose dependency on Package C:

{
  "name": "B",
  "version": "1.0.0",
  "dependencies": {
    "C": ">2.0.0"
  }
}

Suppose version 2.0.0 of C is published. When the author of A reinstalls their packages, they will install 1.0.0 of B and 2.0.0 of C.

Then version 2.1.0 of C is published. When the author of A reinstalls their packages, they will install 1.0.0 of B and 2.1.0 of C, i.e. something different from what they had before. This is bad for package A; _their node_modules folder wasn't completely locked even though they declared specific versions of all their dependencies._

not-an-aardvark on 13 Oct 2016

😄2 👍2

Then version 2.1.0 of C is published. When the author of A reinstalls their packages, they will install 1.0.0 of B and 2.1.0 of C, i.e. something different from what they had before. This is bad for package A; their node_modules folder wasn't completely locked even though they declared specific versions of all their dependencies.

This problem is exactly the problem that we solved in Bundler and Cargo by using "conservative updating", which avoids updating dependencies unnecessarily across the entire graph.

I mentioned it briefly in #579:

The reason I think the current status isn't an urgent priority is that a list of seen patterns and associated packages is still deterministic, which is the number one priority of the lockfile. Lower priorities, like conservative updating (yarn update some-pkg changes the minimal necessary subgraph) depend on a more traditional lockfile, but we can work on it over time.

The TLDR is that the current structure of the lockfile deterministically finds the same package for a given pattern, but isn't a graph of dependencies themselves. I'm working on #579, which should make it possible to address the scenario you described in the canonical way 😄

wycats on 13 Oct 2016

👍1

@wycats thanks for the clarification!

just to make sure: your recommendation is to use lockfiles only for "binaries", but not necessarily for libraries, correct?

Turbo87 on 13 Oct 2016

@Turbo87 I think it's fine to use lockfiles in libraries if what you're trying to accomplish is increasing the determinism of the library's test suite, for example. That said, in my view you should only do that if you have a matrix of a bunch of different dependencies and lockfiles associated with them.

wycats on 13 Oct 2016

👍3

As an example, the gem that the Skylight product I work on uses has this travis.yml (and we use cache: bundler to keep things deterministic across builds):

gemfile:
  - gemfiles/Gemfile.rails-3.0.x
  - gemfiles/Gemfile.rails-3.2.x
  - gemfiles/Gemfile.rails-4.0.x
  - gemfiles/Gemfile.rails-4.1.x
  - gemfiles/Gemfile.rails-4.2.x
  - gemfiles/Gemfile.rails-5.x
  - gemfiles/Gemfile.sinatra-1.3.x
  - gemfiles/Gemfile.sinatra-1.4.x
  - gemfiles/Gemfile.grape

wycats on 13 Oct 2016

Not to overcomplicate things for you @wycats, but I'd be remiss if I didn't at least mention preferred versions as an approach to considering information from deps' lockfiles.

sdboyer on 13 Oct 2016

Wow, thanks for taking the time to look into this, and I generally agree that libraries should not lock their dependencies.

The problem is something like hapi (in top 1000), which controls most of the internal dependencies, would like each release to always install the same exact dependencies _and_ sub-dependencies. See https://github.com/hapijs/contrib/issues/26#issuecomment-94552345 for the reasoning, which includes a security aspect.

Note that since the exact versions of sub-dependencies are desired, it can not be expressed in package.json and currently relies on npm-shrinkwrap.json for this.

If you don't want to support this feature, you should at the very least explicitly warn users that are adding / updating a package that contains an ignored npm-shrinkwrap.json, that it can produce a broken install.

kanongil on 13 Oct 2016

👍2

@kanongil since npm@3 and yarn are flattening the dependencies hapi could just list and pin the subdependencies in their package.json too if they want control over the subdeps. I don't see how else this could work properly.

Turbo87 on 13 Oct 2016

👍2

@Turbo87 Hmm, you might be right. I will have to look into this.

Still, yarn should warn whenever it ignores a shrinkwrap file.

kanongil on 13 Oct 2016

@Turbo87 ~~After looking into it, declaring sub-dependencies in the package.json can work for some scenarios, but can't handle a case where multiple versions of the same sub-dependency is desired.~~

Edit: This does not work at all, as yarn will install newer sub-dependencies into each declared dependency. See #681. It does sorta work for npm@3 though.

kanongil on 13 Oct 2016

@kittens I checked out #41 and it seems the problem is that the shrinkwrap file is computed after dependencies are hoisted. I don't quite grok, however, why this precludes its being used in lieu of a yarn lockfile (I don't doubt it, I just want to take a moment to learn more about yarn and how it flattens its deps and calculates the dependency closure and used lock files and stuff). To my untrained mind, it would seem that you could count on the shrinkwrap to describe a valid closure for a library or app, and if yarn followed that pattern you would get a correct, deterministic dependency closure, with the downside being that you might not get an optimally flat structure (since it might have been better to hoist some deps).

doug-wade on 15 Oct 2016

Can this be reopened please?

kanongil on 19 Oct 2016

👍2

At the very least, I need to be able to generate my lockfile from the contents of my node_modules directory (which is currently managed by npm-shrinkwrap.json but my company is trying to move away from shrinkwrap to yarn. we can't until there's a reasonable transition from one to the other).

bstro on 20 Oct 2016

Yarn is useless to me when I can't trust the install command to fetch the right modules, or at least warn that it has done something dubious. @kittens can you reconsider this issue?

kanongil on 17 Nov 2016

👍1

Thank you for raising this issue, @kanongil. I am publishing a security-sensitive library and need to provide reproducible builds to my users.

I first looked at npm shrinkwrap and thought it might be what I needed until I read the following caveat:

If you wish to lock down the specific bytes included in a package, for example to have 100% confidence in being able to reproduce a deployment or build, then you ought to check your dependencies into source control, or pursue some other mechanism that can verify contents rather than versions.

I then found Yarn, read many reviews comparing it favorably to shrinkwrap, and saw that it incorporates hashes in the lockfile. I assumed it would give me the reproducible builds I require _for the users of my library_. My rationale is the same as that expressed by the maintainer of Hapi.

I drank the Kool-Aid after reading the following points and want to provide all of these benefits for my users as well as my developers.

If you don’t store which version you ended up installing, someone could be installing the same set of dependencies and end up with different versions depending on when they installed. This can lead to “Works On My Machine” problems and should be avoided.

Also, since package authors are people and they can make mistake, it’s possible for them to publish an accidental breaking change in a minor or patch version. If you install this breaking change when you don’t intend to it could have bad consequences like breaking your app in production.

Lockfiles lock the versions for every single dependency you have installed. This prevents “Works On My Machine” problems, and ensures that you don’t accidentally get a bad dependency.

It also works as a security precaution: If the package author is either malicious or is attacked by someone malicious and a bad version is published, you do not want that code to end up running without you knowing about it.

I'm now warning my users not to use Yarn to install my library until this is addressed.

NodeGuy on 21 Dec 2016

Here's someone else who feels the same way and wrote an essay about it:

https://medium.com/@boennemann/avoid-yarn-for-packages-and-fully-enjoy-its-benefits-for-application-development-8bdd4deb33cf#.s5hiksr2o

NodeGuy on 21 Dec 2016

What I especially don't get is how this issue has been handled, compared to the stated goals of the project.

Closing it here doesn't make it go away, but instead actively discourages contributions and adoption. Though, I guess you also gain some immediate adoption from the the fake trust you instill into outsiders?

kanongil on 21 Dec 2016

😕1

@kanongil FWIW I think this issue was handled quite well. @wycats wrote a long response to your request and explained why this request does not fit in the way yarn works. If you don't agree with the way yarn work then you don't have to use it.

Turbo87 on 21 Dec 2016

😕1 👍1

@Turbo87 I appreciate the time @wycats took to write a response but, while lengthy and interesting, it doesn't actually attempt to solve the issue. Right now I'm just interested in the project acknowledging that there is an issue.

Once it has been acknowledged, work might start on a solution which could be any of these:

Clearly note that the yarn philosophy robs package authors of an existing ability to manage the install.
Warn when shrink-wrapped dependencies of an added / updated package doesn't match.
Warn when an added / updated package contains an ignored shrinkwrap.
Somehow find a technical solution that installs the expected dependencies.

kanongil on 21 Dec 2016

I respect the authors' decisions to design Yarn however they like and I won't use it if it doesn't address my needs. I'm grateful for the work they've done and the conversations they've inspired (including this one).

I respectfully point out that there may be a useful distinction between:

Install this package as the maintainer published it, thus enabling the maintainer to make a meaningful promise about quality and to apply tools such as security audits.

vs.

Update the package's dependencies automatically according to an algorithm while acknowledging that this increases the attack surface and releases the publisher from any liability for using a library that behaves differently than the one he tested and published.

I see these as two different activities and I would love it if Yarn would present them as explicit options rather than conflating them into a single command.

I also agree with @kanongil that for Yarn to be a well-behaved part of the npm community it should display a warning that it is disregarding npm-shrinkwrap.json files.

NodeGuy on 23 Dec 2016

👍1

Yarn does warn that npm-shrinkwrap.json is being ignored ( https://github.com/yarnpkg/yarn/commit/87f3da96909db1eeaaa597426655baf79af03151 ).