Pnpm: EPERM: operation not permitted, rename

Created on 24 Jan 2018  Â·  27Comments  Â·  Source: pnpm/pnpm

pnpm version: 1.30.2

Code to reproduce the issue:

Install a very large number of dependencies (like 2,084).

Expected behavior:

Everything is always installed and unpacked.

Actual behavior:

Fairly often we get an error like:

EPERM: operation not permitted, rename 'Z:\sp-client\2\common\temp\pnpm-store\2\onedrive.pkgs.visualstudio.com\highcharts\5.0.14_stage' -> 'Z:\sp-client\2\common\temp\pnpm-store\2\onedrive.pkgs.visualstudio.com\highcharts\5.0.14\node_modules\highcharts'

This usually happens when the store is empty to start with. It seems like it may be an issue with file locking (possibly related to the Windows Defender holding locks on newly created files). This also occurs with different packages each time. It seems to happen more often with larger packages. My naive assumption is that the larger packages just got extracted to disk and we immediately try to rename them while MsMpEng.exe is also trying to scan them.

Rush runs pnpm in a loop, attempting installations up to 5 times, which helps mitigate the issue, but it definitely slows down installs (and confuses people when they see errors).

Suggestion is to run the rename operation in a retry loop, with an exponential backoff.

Additional information:

  • node -v prints: 8.9.4
  • Windows, OS X, or Linux?: Windows
bug

Most helpful comment

Yes, we are good. We've done over 1,000 installations since then, and so far nobody has reported any problems. :-) Thanks again!

All 27 comments

This code is used to rename the unpacked folder: https://github.com/zkochan/rename-overwrite/blob/master/index.js. So now it waits only one time for 200ms. We can make it smarter

This has been happening to me as well, but while trying to install any global package.
Should I reproduce & give you logs or is it likely the same issue?

@vaaski seems like you have the same issue, so lets' see if the fix will help in your case as well

It is weird because rename-overwrite uses rimraf and graceful-fs both of which retry on EPERM.

I am changing to 8 retries during 7.2 seconds. Or should the delay be longer?

[email protected] will now retry renaming. Lets' see if the errors will stop happening

Rush has similar code:

https://github.com/Microsoft/web-build-tools/blob/7930ea56efe8f8d9a91aebf359025432aed44d94/apps/rush-lib/src/utilities/Utilities.ts#L110

FYI I encountered this issue mostly when trying to delete a file/folder and immediately recreate a new file/folder with the same name. Whereas if you reuse and existing folder (or overwrite an existing file), the problem did not seem to occur. This is better because it avoids the delay entirely.

I wonder if there is an officially documented best practice somewhere for coexisting with virus scanners.

I think there must be a regression in this code somewhere. As soon as we upgraded to 1.31.2, I began seeing other errors, such as:

 WARN  There's a directory in node_modules without package.json: R:\sp-client\common\temp\node_modules\.onedrive.pkgs.visualstudio.com\request\2.79 .0\node_modules\caseless

It looks like the particular version of caseless this symlink is pointing to doesn't exist. I wonder if we somehow swallowed an error while doing the rename?

The retry that was added, only retries EPERM errors https://github.com/zkochan/rename-overwrite/blob/master/index.js#L37

There were lots of other changes as well since 1.30 but I haven't noticed such errors yet

Okay, I'll keep testing out to see if I can find any more info.

I think there must be a regression in this code somewhere. As soon as we upgraded to 1.31.2, I began seeing other errors, such as:

 WARN  There's a directory in node_modules without package.json: R:sp-client\common\tempnode_modules.onedrive.pkgs.visualstudio.com\request\2.79 .0node_modules\caseless

It looks like the particular version of caseless this symlink is pointing to doesn't exist. I wonder if we somehow swallowed an error while doing the rename?

@nickpape-msft is it possible that a previous installation to the same project was interrupted? or that another installation is running in the same directory at the same time?

I looked at the code that creates the node_modules folder for the dependencies, it is currently not an atomic operation, so it can get corrupted https://github.com/pnpm/supi/blob/master/src/link/index.ts#L258-L262

We have to link dependencies to a _stage folder and then rename the folder. It will prevent corruption of R:\sp-client\common\temp\node_modules\.onedrive.pkgs.visualstudio.com\request\2.79.0\node_modules\

@zkochan I think that error is really a part of the EPERM issue. Typically, Rush attempts to retry installations up to 5 times and does not clean out the node_modules and pnpm-store in in between retries. When I changed the NUMBER_OF_INSTALL_ATTEMPTS to 1, I almost always saw the EPERM error (probably since we do clean out the node_modules and pnpm-store if rush install fails.

We also did more investigation, and do not think that this is a locking issue with Windows defender. In fact, when we looked at which processes were holding locks in that folder, it looked like pnpm was holding locks to random files inside the folder. This makes me think there could be some sort of resource leak in the tarball_fetcher or unpack_stream.

We only see this with large packages (>20 or 30 MB, such as ckeditor which are the ones that are hosted on the blob cache). You might want to try spinning up your own Verdaccio server, publishing a couple large packages, and trying to install them with PNPM.

@iclanton @pgonzal FYI

Another person who is trying to build this branch is also hitting this error (intermittently with the EPERM issue).

Error: sha1-pW3tByoPy08JoASnkf1LYB6+RP8= integrity checksum failed when using sha1: wanted sha1-pW3tByoPy08JoASnkf1LYB6+RP8= but got sha1-wyrIActJY1wfyqaqnDQ5udzsw54=. (1244601 bytes)<br>     at Transform.on (C:\Users\ianc\.rush\pnpm-1.30.2\node_modules\pnpm\node_modules\@pnpm\tarball-fetcher\node_modules\ssri\index.js:275:19)<br>     at emitNone (events.js:91:20)<br>     at Transform.emit (events.js:185:7)<br>     at endReadableNT (_stream_readable.js:974:12)<br>     at _combinedTickCallback (internal/process/next_tick.js:80:11)<br>     at process._tickCallback (internal/process/next_tick.js:104:9)

Which is interestingly also part of the tarball-fetcher.

hm, maybe the issue is caused by a lock on the stage folder, not the target folder.

This part of tarball-fetcher makes me think that

  • when a tarball is bad, pnpm doesn't stop unpacking the bad tarball
  • pnpm retries the download before the bad tarball has finished unpacking
  • also, during the retry, pnpm unpacks to the same stage folder again, not even cleaning it up first

That sounds about right...

image

Here is an example file. Currently, PNPM is process 44104, but it also failing to rename the staging directory.

image
image

Apologies for the consfusing logging. Essentially, the first picture is the initial call to _renameOverwrite() which runs "rename". The second picture is the first .error() handler, which calls rimraf() then rename(). What you notice is that it is not rimraf that is erroring, it is rename. The next "delay" logging line says it is now delaying and then recursively calling _renameOverwrite.

Even when I raised the retry limit to 2,000,000 we still encountered the error.

We can also show that the target folder is empty/does not exist.

image

Unfortunately, I could not reproduce the issue (I tried with this package.json). However, your logs confirm my assumption, so I'm going to try and update the fetchers to use a new random stage folder every time during unpack

@nickpape-msft I published [email protected]. @pnpm/tarball-fetcher will unpack to a stage folder and if the unpacking fails, it will unpack to a different stage folder. I believe in this case there should be no process hanging on the stage folder

This should work now in [email protected]

also, I reverted the initial fix in rename-overwrite, which did not help

Thanks a ton for the quick fix! 😃 I just tested 1.31.6 and all operations appear to be working fine. We'll have some other people do more in depth testing on Monday, but so far it looks good! I really appreciate all your help investigating these weird edge cases.

Cool, I'll have to optimize the code though because I added some redundant renamings. It was the fastest way to fix it

as far as I understand, the fix worked. With [email protected] I optimized the fetchers. I removed the redundant file operations that were added during the "fast fix"

Yes, we are good. We've done over 1,000 installations since then, and so far nobody has reported any problems. :-) Thanks again!

And it somehow broke again with stream-events:
ERROR  EPERM: operation not permitted, rename 'C:\Users\ExE Boss\.pnpm-store\2\registry.npmjs.org\stream-events\1.0.4\_tmp_27292_610b691e00196398c6949a3351f214d8' -> 'C:\Users\ExE Boss\.pnpm-store\2\registry.npmjs.org\stream-events\1.0.4\node_modules\stream-events'

See also #1090.

Happening to me as well, after it says

 WARN  Moving @typescript-eslint/eslint-plugin that was installed by a different package manager to "node_modules/.ignored

next line is:

 ERROR  EPERM: operation not permitted, rename 'C:\Users\Dionysusnu\Programming\TSBlox\node_modules\@typescript-eslint\eslint-plugin' -> 'C:\Users\Dionysusnu\Programming\TSBlox\node_modules\.ignored\@typescript-eslint\eslint-plugin'
at ignored\

Should I create a new issue?

ok, create one

FYI @typescript-eslint/eslint-plugin introduced an optional peer dependency for the first time this month. Might be related.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

KSXGitHub picture KSXGitHub  Â·  37Comments

zkochan picture zkochan  Â·  49Comments

nickpape picture nickpape  Â·  40Comments

zkochan picture zkochan  Â·  35Comments

aecz picture aecz  Â·  28Comments