yarn is deleting symlinks in node_modules that it doesn't own

Created on 21 Apr 2018  ·  26 Comments  ·  Source: yarnpkg/yarn

I have some symlinks in node_modules that come from another custom tool. Yarn likes to blow them away:

> ll | grep "\.js"
gen-await.js -> ../modules/node-utils/build/gen-await.js

> yarn
...

> ll | grep "\.js"

yarn deletes all files and directories under node_modules that don't belong to the currently installed packages. This is part of the design. It does work differently than npm's pruning/cleaning of extraneous files.

For comparison, npm will only delete extraneous directories in node_modules if they have a package.json file. It won't remove other non-package directories. It also just never cleans anything out of node_modules/.bin. Even npm prune doesn't remove them.

My personal opinion is that npm is incorrect. node_modules is the directory for the package manager to manage and no manual modifications should be made. I could be convinced otherwise though.

@yarnpkg/core think this should be "closed / as designed" or a bug due to "npm compatibility"?

This isn't quite accurate, yarn doesn't touch symlinked directories regardless of whether or not they're in package.json. I'm guessing this has to do with the fact that yarn link doesn't touch package.json, but I'm not sure.

node_modules is the directory for the package manager to manage and no manual modifications should be made

I personally disagree here. node (and tooling) is set up so that node_modules is the only way to include modules, so if you want to do anything out of the ordinary, modifying the contents of this directory is the only way to do it. But even should I concede that point, package.json should be the source of truth, not the individual package manager. In an attempt to placate yarn we tried adding references in dependencies, e.g. "gen-await": "../modules/node-utils/build/gen-await.js", but yarn then failed, complaining that this wasn't a directory, and there doesn't seem to be a way to tell yarn to just bugger off and ignore a dependency.

AFAIK yarn doesn't actually have a use-case for symlinking individual files in node_modules so any symlinked files were clearly created by a different tool. yarn cleaning up is one thing, but I don't believe it should be touching things it doesn't understand as it leads to a specific package manager being the source of truth. You could, for instance, argue that npm would be correct in blowing away links or modules created by yarn workspaces since it doesn't understand where they came from (which it does, but only via npm prune, not aggressively on actions like install)

I did find this comment in the code:

    // If an Extraneous is an entry created via "yarn link", we prevent it from being overwritten.
    // Unfortunately, the only way we can know if they have been created this way is to check if they
    // are symlinks - problem is that it then conflicts with the newly introduced "link:" protocol,
    // which also creates symlinks :( a somewhat weak fix is to check if the symlink target is registered
    // inside the linkFolder, in which case we assume it has been created via "yarn link". Otherwise, we
    // assume it's a link:-managed dependency, and overwrite it as usual.
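
One possible reading of that comment, as a rough shell sketch rather than yarn's actual code: an extraneous symlink is kept only if an entry of the same name is registered in the link folder and both resolve to the same real directory. The function name `is_registered_link` and its three arguments are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: succeeds when "$entry" is a symlink that resolves to the
# same directory as the entry named "$name" registered in "$link_folder".
is_registered_link() {
  local entry=$1 link_folder=$2 name=$3
  local entry_real registered_real

  # Only symlinks can have been created by "yarn link".
  [ -L "$entry" ] || return 1
  # Assumption: the link folder holds one symlink per registered package.
  [ -L "$link_folder/$name" ] || return 1

  # Resolve both symlink chains to physical paths before comparing them.
  entry_real=$(CDPATH= cd -- "$entry" 2>/dev/null && pwd -P) || return 1
  registered_real=$(CDPATH= cd -- "$link_folder/$name" 2>/dev/null && pwd -P) || return 1

  [ "$entry_real" = "$registered_real" ]
}
```

Under this reading, a symlink created by any other tool fails the check and is treated like a link:-managed dependency, which matches the deletions reported in this issue.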

I suspect it works the way it does because what if:

1) you make a package.json that contains

"dependencies": {
  "foo": "link:../foo"
}

2) run yarn install
3) manually edit package.json and remove that dependency
4) rm yarn.lock
5) run yarn install

The symlink node_modules/foo should be deleted because it used to be a yarn thing, but there is no way of knowing that. yarn just continually tries to make node_modules "clean".

You could, for instance, argue that npm would be correct in blowing away links or modules created by yarn workspaces since it doesn't understand where they came from (which it does, but only via npm prune, not aggressively on actions like install)

If so, then this is a recent change. npm used to auto-prune on install.

https://github.com/npm/npm/issues/16853
https://github.com/npm/npm/issues/17379#issuecomment-345042377

maybe npm just doesn't auto prune symlinks? that might make sense, since npm doesn't have a link: dependency type. yarn likely was designed to remove them due to having that type. I didn't implement that feature so it's hard to tell just from the source.

I see your point about yarn keeping things tidy and agree, but it's frustrating that there isn't a way to tell it that it doesn't own something. For instance if I put:

"dependencies": {
  "foo": "fuse:/db",
  "bar": "modulemap:*"
}

I'd hope that yarn would ignore both of these since it doesn't understand them. Instead it treats these as just unresolved versions against the npm registry and either throws an error if npm doesn't have the package or prompts for a version # from npm. Maybe this is the more relevant discussion.

Nit about symlinks -- since yarn doesn't have symlinks to files, only package folders, I don't see why it should prune them. But we also create directories in node_modules and then throw symlinks in those, so just fixing that wouldn't help much. Indicating to yarn that it doesn't own certain children of node_modules would.

You could, for instance, argue that npm would be correct in blowing away links or modules created by yarn workspaces since it doesn't understand where they came from

Yes it would, and probably should. If something doesn't match the package manager expected output, it should be removed, otherwise the guarantee that yarn install is all that's needed to get the exact same node_modules layout anywhere doesn't hold.

I'd hope that yarn would ignore both of these since it doesn't understand them.

No, it should fail the installation. If something causes a package not to be installed (such as an unknown protocol), the whole installation is bogus, since the requested dependencies won't be there, breaking the contract.

If you want to use custom protocols, then put them in another key than dependencies.

I don't see why it should prune them

They shouldn't exist in the first place 🙂

@arcanis -- This goes back to my point that node owns node_modules, not the package manager, or at least not any one package manager. There's absolutely no reason or benefit for yarn to aggressively prune things from this folder that it clearly doesn't own since it has no way to create them.

If you want to use custom protocols, then put them in another key than dependencies.

Sure, except that yarn would still delete the files generated. Not having an escape hatch seems unnecessarily controlling.

@arcanis -- This goes back to my point that node owns node_modules, not the package manager, or at least not any one package manager. There's absolutely no reason or benefit for yarn to aggressively prune things from this folder that it clearly doesn't own since it has no way to create them.

@MarkKahn - apparently you and the Yarn team have clearly differing opinions there. For the strong consistency guarantees that we want Yarn to deliver, we have chosen to implement this behavior. The node_modules folder is a generated folder, and it is and should be governed by a package manager. Relying on implicit, undefined behavior is not safe, as demonstrated in your issue.

Sure, except that yarn would still delete the files generated. Not having an escape hatch seems unnecessarily controlling.

Is yarn link not a proper escape hatch for this?

Maybe .yarnrc should have an option that disables pruning, or makes it less aggressive?

This would allow for better interop between tooling.

@rally25rs I'm not arguing with the point that yarn install cleans all symlinks and directories inside node_modules; I have a different problem here.

If there is a symlink in node_modules that leads to an external directory, yarn install follows it and purges the external content as well. Don't you think the behavior should be to just unlink the symlink instead of following it?
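
To make the suggestion concrete, here is a minimal shell sketch of "unlink instead of follow". It is not how yarn implements its cleanup; `remove_entry` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: remove one extraneous node_modules entry.
# A symlink is unlinked without touching whatever it points to.
remove_entry() {
  local entry=$1
  if [ -L "$entry" ]; then
    # Symlink: delete only the link itself and never descend into its target.
    rm -f -- "$entry"
  elif [ -e "$entry" ]; then
    # Real file or directory inside node_modules: remove it recursively.
    rm -rf -- "$entry"
  fi
}
```

The symlink test comes first because a symlink to a directory also passes an ordinary "is this a directory" check; a cleanup that tests for a directory first and then walks its children ends up deleting files that live outside node_modules.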

Facing the same problem. I have a ./scripts/postinstall shell script that does the following:

postinstall() {
    # determine_repo_root is not shown in this snippet
    local repo_root
    repo_root="$(determine_repo_root)"
    ln -sf "$repo_root" "$repo_root/node_modules/@"
}

which leads to:

ls -la | head
lrwxr-xr-x     1 farewell  staff      51 Oct  1 13:57 @@ -> /Users/farewell/Workflow/Projects/Dash/corp-dash-ui

Running yarn install --production afterwards totally destroys my repository (even the .git folder is deleted). This is undesirable and unexpected behaviour.

Can you make a repro project that we can just check out to try? Thanks 🙂

@arcanis, I created a repo for my case and found an interesting thing.

I wrote up the steps to reproduce in the README, but in short, the problem can be observed if and only if the name of the symlink starts with @. Sounds like some internal policy that we missed? 😕

Oooh! That totally makes sense - if something starts with an @ we assume
it's an npm scope, in which case we have to recurse inside to find out the
actual package folders. So from Yarn's point of view, @/sources is a package
called "sources" from the scope "@".

Three solutions:

  • special case "@" in Yarn
  • don't recurse if the scope folder is a symlink
  • ask users not to create such symlinks
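
As an illustration of the second option only, the following shell sketch lists the package folders under a node_modules directory and recurses into an @-prefixed folder only when it is a real directory. It is a simplified model and not yarn's code; `list_package_dirs` is a hypothetical name, and dot-entries such as .bin are ignored because the glob does not match them.

```shell
# Hypothetical helper: print one line per top-level package folder in "$root".
# "@scope" folders are expanded to their children, unless the scope folder is
# itself a symlink, in which case it is reported as a single opaque entry.
list_package_dirs() {
  local root=$1 entry pkg
  for entry in "$root"/*; do
    # Skip the unexpanded pattern when the directory is empty.
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    case "${entry##*/}" in
      @*)
        if [ -L "$entry" ]; then
          # Symlinked "scope": do not recurse into a folder yarn did not create.
          printf '%s\n' "$entry"
        else
          for pkg in "$entry"/*; do
            [ -e "$pkg" ] || [ -L "$pkg" ] || continue
            printf '%s\n' "$pkg"
          done
        fi
        ;;
      *)
        printf '%s\n' "$entry"
        ;;
    esac
  done
}
```

With a symlink named @ pointing at a source directory, this reports the symlink itself as one entry instead of the files behind it, so a later cleanup step would at worst unlink it.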

@arcanis, if I could vote, I would choose "don't recurse if the scope folder is a symlink".

I like the option "ask users not to create such symlinks", but if @farewell7117 and I both hit this within a week, there will be other users with the same problem.

Also, I just thought of another use case. What would happen if somebody used yarn link for a scoped package?

Using @/ as a prefix for your ./src directory seems to be a common pattern now. Maybe the logic @arcanis mentioned could be fixed by only considering it a scope if there are characters between @ and the first /?

For example, testing the path against the regex \@([A-z]+)\/ and not treating it as a scope when the test returns false would result in the following behaviour:
@/components/Dock -> not a scope, do not modify contents
@cyclejs/core -> is a scope, modify

As far as I know, @/x can't possibly exist on the npm registry or yarnpkg: it would refer to the package called x under the scope of a user/org called `` (empty string), which AFAIK can't exist on the registries.

So by this logic yarn should assume packages starting with @/ are not scoped since it isn't actually possible for packages to be scoped under @/.
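
A small shell sketch of that rule, for illustration only: a path counts as scoped when at least one name character sits between @ and the first /. The character class here is an assumption and is deliberately wider than the [A-z] in the proposal above, which rejects digits and hyphens and also matches the few punctuation characters that fall between Z and a in ASCII. `is_scoped_path` is a hypothetical name.

```shell
# Hypothetical helper: succeeds when "$1" starts with "@<name>/" where <name>
# is non-empty. The allowed name characters are an assumption for this sketch.
is_scoped_path() {
  printf '%s\n' "$1" | grep -Eq '^@[A-Za-z0-9._~-]+/'
}

# Example: classify the two paths from the comment above.
for p in "@/components/Dock" "@cyclejs/core"; do
  if is_scoped_path "$p"; then
    echo "$p: scope"
  else
    echo "$p: not a scope"
  fi
done
```

With this check, @/components/Dock is classified as not a scope and @cyclejs/core as a scope, in line with the behaviour described above.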

@ having two different meanings seems a bit confusing 😕 Is that a common thing?

I just got hit by this bug as well - and more precisely the user of a package I maintain https://github.com/Rush/link-module-alias. The reporter had all of their source files removed by yarn as indicated here https://github.com/Rush/link-module-alias/issues/3 :-(

In my opinion yarn should never remove files from a 3rd party symlink unless.

In order to reproduce:

git clone git@github.com:Rush/delete-bug.git
cd delete-bug
yarn
ln -rs components/ node_modules/@components
yarn add fuse.js
# components/index.js has been deleted by yarn :-(

I haven't had time to work on this yet, but would be interested to merge a fix - would someone be interested to contribute it? I could review and merge it in time for the next release 🙂

FYI, just got a report that a user lost several hours of work:
https://github.com/Rush/link-module-alias/issues/3#issuecomment-479632353

@arcanis What will be the solution?

I use symlinks to avoid relative paths in require calls.

app/
view/
node_modules/
  app -> ../app
  view -> ../view

After yarn install|add|remove my symlinks are removed.

This issue also manifests when using yarn 1.16.0 in combination with lerna, which is a monorepo manager that symlinks packages to each other. yarn panics with An unexpected error occurred: "ENOENT: no such file or directory, copyfile ..." errors when trying to manually install dependencies inside a package folder, and then deletes the symlinks.

Incidentally, yarn check --integrity also has odd interactions with symlinked dependencies; because those dependencies don't appear in yarn.lock, the integrity check fails.

I'm currently hard at work on the v2 trunk, but if you open a PR that implements option 2 ("don't recurse if the scope folder is a symlink") I'd be happy to review it 🙂

I'm currently hard at work on the v2 trunk, but if you open a PR that implements option 2 ("don't recurse if the scope folder is a symlink") I'd be happy to review it 🙂

What about regular symlinks like I said before? I don't use @.

I'm in this same situation -- I need to have a symlink in my node_modules and I don't want yarn to delete it. My use case is I have a private copy of a package (pkgA) checked out as a git submodule in $TOP/pkgA; I'm working on both my main app and pkgA, so I want my build (and hot reload) to pick up changes in both. I've tried link: and workspaces but neither is helpful. Simple symlink node_modules/pkgA -> ./pkgA works perfectly, except that yarn deletes it! Could yarn just have an option for "known non-Yarn files in node_modules" that it just wouldn't delete?

Same here, I would like to add a package during development and keep the symlinks in node_modules, rather than having to recreate them if yarn add <package> is executed.

same here...
