Yarn: package resolver is sensitive to order

Created on 24 May 2017  路  12Comments  路  Source: yarnpkg/yarn

Do you want to request a feature or report a bug?
bug

What is the current behavior?

From https://github.com/yarnpkg/yarn/pull/3477#issuecomment-303537660:

With 2 patterns, [email protected] (A) and a@^1.0.0 (B), current version 1.0.2, the following can happen:
Order 1

A resolves, adds version 1.0.1
B resolves, finds matching version 1.0.1, delays
final resolve for B, to version 1.0.1
Outcome: [email protected] -> 1.0.1, a@^1.0.0 -> 1.0.1
Order 2

B resolves, adds version 1.0.2
A resolves, adds version 1.0.1
Outcome: [email protected] -> 1.0.1, a@^1.0.0 -> 1.0.2
The committed code will still mitigate problems, but it won't completely solve them...

I'm afraid a better algorithm is needed to get fully deterministic resolves.

Please mention your node.js, yarn and operating system version.

Yarn 0.26

high-priority triaged

Most helpful comment

@blexrob, you've done a great fix for pattern resolver in https://github.com/yarnpkg/yarn/pull/3477, would you have time to solve this one?

All 12 comments

@blexrob, you've done a great fix for pattern resolver in https://github.com/yarnpkg/yarn/pull/3477, would you have time to solve this one?

On top of it, I think we should disable this lockfile optimization if --pure-lockfile is passed otherwise does it mean that we may have different trees resolved with the same lockfile?

Another idea to discuss, should we make this dependency optimization an opt-in feature?

I think I've hit this one too - I've found a case where upgrading a package then running a yarn install results in a changed lockfile (even with yarn 0.26).

I've put a script together that reproduces the issue here https://github.com/hmarr/yarn-lock-inconsistency

Thanks for repro, @hmarr.

This is a first try for a deterministic algorithm. No early outs based on earlier registry resolves,
so resolving order won't influence the final outcome.

  bestversion = map of pattern->version
  resolvedversion = map of pattern->version

  # determine the best matching version for pattern, recurse through dependencies of those versions
  # note - no early out if a matching older version was already found! This will cause more lookups...
  recurse, start with root patterns
    { version, depencies } = getregistrybestversion(pattern)
    bestversion[pattern] = version
    recurse depencies

  useversions = map pattern -> Array<version>

  # determine a set of versions to use (relatively minimal in size)
  currpatterns = allpatterns
  while currpatterns.length
    # get the lowest best version for the remaining patterns, that one is the next to use
    lowestbestversion = min(currpatterns.map(pattern => bestversion[pattern]))
    useversions[pattern].push(lowestbestversion);

    # filter out all patterns satisfied by that version
    currpatterns = currpatterns.filter(pattern => !ismatch(pattern, lowestbestversion))


  # resolve pattern to best matching version from the useversions list of that pattern
  # so we'll tend to use the best version available if we'll have multiple possibilities
  for pattern of allpatterns:
    resolvedversion[pattern] = max(useversions[pattern].filter(version => ismatch(pattern, version)))

  # we'll probably have resolved patterns that aren't going to be used (because we won't early-out when
  # resolving from registry), make sure those don't end up in the lockfile
  recurse, start with root patterns
    mark pattern as actually used
    recurse over deps patterns of resolved version

This algorithm has a 2 drawbacks:

  • it iterates dependencies of versions that won't be used later on, so lower perf than current algorithm
  • its decisions won't always be clear from the resulting tree. Some dependencies can introduce used versions that influence the final resolve, even though those dependencies ultimately aren't used at all. (couldn't construct an actual scenario yet, though)

I think it is also important to determine what the main priority for this algorithm should be. Minimizing the number of installed versions? Install the best version? Version stability for currently already installed packages?

I don't have a lot of time to work on this for now, so I can't promise to work on this.

After thinking a bit more about this today I think we should step back actually.

Yarn guarantees that node_modules will be installed the same way for the same lockfile and developers won't get surprisingly different results every time install command runs.

That is why Yarn should not optimize lockfile while resolving dependencies during install.
It can offer deduping but that should be a voluntary action from a developer, e.g. if running via TTY Yarn should ask if deduping should happen.
Or if you add a new dependency Yarn may do an optimization but still we probably want users to be aware about the optimization and allow them to opt out from it.

@blexrob, thanks for offering your insight!
And no pressure if you can't work on it now.

I have an example of this occurring here: https://github.com/timkelty/yarn-order-bug

package.jsons are identical except for swapping order of 2 deps. Resulting in very different node_modules.

I may have a fix, will post a PR soon

Should be fixed in 0.27.2 馃憤

Was this page helpful?
0 / 5 - 0 ratings