Describe the bug
npm and yarn v1 (can't verify, see #187) are faster than yarn v2 when installing the following repo:
To Reproduce
yarn
Additional context
Cached installs are quite quick. First installs are not. Using hyperfine, I ran a very basic benchmark of yarn:
olingern@olingern-UX370UAR:~/code/ink-select-input$ hyperfine yarn
Benchmark #1: yarn
Time (mean ± σ): 8.520 s ± 19.140 s [User: 8.866 s, System: 0.328 s]
Range (min … max): 2.411 s … 62.995 s 10 runs
Unfortunately, the npm CLI doesn't play well with hyperfine, but I generally observe 9-12 seconds on a fresh install.
It may be nice to have benchmarks as a separate test suite and run them against different repository types and compare run times against yarn v1.
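As a rough sanity check on the hyperfine figures above, here is a minimal Python sketch of the arithmetic. It assumes that exactly one of the ten runs (the 62.995 s maximum) was a cold install and that the other nine ran against a populated cache; that split is an assumption, not something the output states.

```python
import math

# Figures copied from the hyperfine output above.
runs = 10
mean = 8.520
slowest = 62.995

# Assumption: exactly one run (the slowest) was a cold install and the
# other nine ran against an already populated cache.
warm_runs = runs - 1
warm_mean = (mean * runs - slowest) / warm_runs

# Sample standard deviation if all nine warm runs sat exactly at warm_mean.
squared_dev = (slowest - mean) ** 2 + warm_runs * (warm_mean - mean) ** 2
sigma = math.sqrt(squared_dev / (runs - 1))

print(f"implied warm-run mean: {warm_mean:.3f} s")
print(f"implied sigma: {sigma:.3f} s")
```

Under that assumption the warm runs average roughly 2.47 s, close to the reported 2.411 s minimum, and the implied σ lands near the reported 19.140 s. In other words, the 8.5 s mean mostly reflects one cold run being averaged with nine cached ones, so cold and warm installs are better benchmarked separately.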
Cold installs are indeed quite a bit slower, but that's to be expected. There are two reasons for that:
Since we now store the cache as zip archives, we need to convert those we download from the npm registry. It has a significant cost, and that's where most of the time is spent.
By default the cache is now configured to be unique to each project. This is done in order to increase "offline mirror" awareness, as this very useful feature went mostly unnoticed in the v1. The consequence however is that projects don't benefit by default from the packages downloaded by the other projects.
For the first issue, I'm afraid we can't do much. The ideal solution would be for the registry to provide zip archives out of the box (as a reminder, we use zip because it has better random access properties, which improves runtime performance), but it doesn't seem likely to happen anytime soon. Maybe it could lead someone to build a zip mirror for npm?
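To make the conversion step and the random-access point concrete, here is a minimal Python sketch using only the standard library. It is not Yarn's implementation; `tgz_to_zip` and `make_sample_tgz` are hypothetical helpers, and the sample package contents are made up for illustration.

```python
import io
import tarfile
import zipfile


def tgz_to_zip(tgz_bytes: bytes) -> bytes:
    """Repack a gzipped tarball into a zip archive, entirely in memory."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as tar, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        # A .tgz is a single compressed stream, so members are read in order
        # and every file has to be decompressed and recompressed.
        for member in tar:
            if not member.isfile():
                continue
            zf.writestr(member.name, tar.extractfile(member).read())
    return out.getvalue()


def make_sample_tgz() -> bytes:
    """Build a tiny stand-in for a registry tarball (hypothetical contents)."""
    files = [
        ("package/package.json", b'{"name": "demo"}'),
        ("package/index.js", b"module.exports = 42;\n"),
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, body in files:
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    return buf.getvalue()


zip_bytes = tgz_to_zip(make_sample_tgz())
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    # The zip central directory lets us jump straight to a single entry
    # without decompressing the entries stored before it.
    index_js = zf.read("package/index.js")
print(index_js.decode(), end="")
```

The repacking loop touches every file once, which is the one-time cost paid on a cold install; the final read shows the payoff, since a single entry can be fetched from the zip without scanning the whole archive.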
For the second issue, users can manually set the enableGlobalCache setting, which instructs Yarn to ignore the cacheFolder setting and use a system-specific directory instead. All projects that enable this setting will share a common pool of packages. Note that yarn dlx installs use the global cache by default, so in that case you won't pay a particular price when calling it twice in a row.
Imo, while it will likely continue to be used as a comparison factor by a bunch of our users (and I admit that's kinda our fault, given the initial communication back when Yarn launched), with v2, installs aren't as valuable a metric as before.
With Zero-Installs we spend some extra time up front, but we don't have to pay it again later. In this context, the command I think we should really be careful about is yarn run, as it's the primary command people use to manage their project.
> Maybe it could lead someone to build a zip mirror for npm?
That could be interesting. For GitHub, it's actually not necessary to convert the tarballs, as they already provide each branch as both a tarball and a zip.
https://github.com/yarnpkg/berry/archive/master.zip
> Imo, while it will likely continue to be used as a comparison factor by a bunch of our users (and I admit that's kinda our fault, given the initial communication back when Yarn launched), with v2, installs aren't as valuable a metric as before.
Here, I would have to disagree. In a node project, I typically have start, dev, lint and test commands, so my needs (I think) are pretty basic, whereas a project like create-react-app has much more script usage. One of the reasons that I personally switched to yarn was CI builds. Yarn saved both time and _money_ because many CIs measure their plans in minutes. Each time I push a new build, it spins a docker container up and does everything from scratch. That process being fast is a metric I'm _very_ interested in.
I think somewhere in the Yarn v1 repo there was a comment asking what differentiates Yarn from npm et al. A lot of v2 features (_especially plugins_) are solid differentiators; however, as an end user, it would be tough to convince me that upgrading from v1 to v2 is worth it with a performance degradation of up to 4x on initial installs.
Not in any way trying to be abrasive, I just believe that dismissing performance on cold installs in exchange for features will possibly result in poor adoption of Yarn v2 or people using other package managers.
> Yarn saved both time and money because many CI's measure their plans in minutes. Each time I push a new build, it spins a docker container up, and does everything from scratch. That process being fast is a metric I'm very interested in.
I understand that, and that's precisely one of the reasons why Zero-Installs are a superior system. It's faster to clone a repository than to clone + yarn install it, even if it contains extra versions of the packages in the history (and I'm not entirely sure it's true if you make a shallow clone).
Even disregarding Zero Installs, the v2 cache is much easier to persist between two CI instances. You previously might have been doing it already, but you needed to work with a node_modules (much larger), or with a Yarn cache (which required yarn install). You now only have to copy the cache folder (or better: just reference it from your CI configuration) and it's done.
> Not in any way trying to be abrasive, I just believe that dismissing performance on cold installs in exchange for features will possibly result in poor adoption of Yarn v2 or people using other package managers.
No worries, I understand what you mean. However, I also want to make sure that my point is correctly understood: cold installs are less important because there shouldn't be cold installs.
Of course there's some tuning to do, and we're still in the phase of the project where features have a slightly higher priority than performance, but I'm not overly concerned by the cold install. What I'm concerned about is how to explain in simple terms why it doesn't matter as much as before, which is why this thread is great 🙂
> I understand that, and that's precisely one of the reasons why Zero-Installs are a superior system

I see. Even though I read through the Zero-Installs docs a couple of times, I think "zero installs" and their usage didn't really sink in for me.
> What I'm concerned about is how to explain in simple terms why it doesn't matter as much as before
Maybe having a Yarn v1 vs Yarn v2 workflow graphic or table could help. I think a key difference that I missed was that you only need to check in dependencies once / when adding new ones and every clone & install thereafter will be quite fast.
I think some easy points in the docs to make would be:
I'm sure there are more advantages as well. Having a caveats section explaining why the new workflow, plus the gain in security / speed, is an acceptable trade-off for the slower cold install could be helpful, so that people (_like me_ 🙂) don't open issues about this during migration.
Closing as a non-issue. Thanks for the explanation @arcanis! Maybe we can open a separate issue about documenting the different v2 workflow in the "GETTING STARTED" area, under a "MIGRATING FROM V1" tab.
Yup, good idea 👍