Lint-staged: windows chunking git lock race condition when run through husky

Created on 9 Apr 2020  Â·  20Comments  Â·  Source: okonet/lint-staged

Description

When the chunked files are split up into enough groups the chance of a race condition on calling git add increases. If chunk a calls git add while chunk b is adding there will be a git.lock file which will make the process fail.

tested on latest version of husky and lint-staged but still occurs.
Originally experienced with
lint-staged: 10.0.9
husky: 3.0.9

Tested with
lint-staged: 10.1.2
husky: 4.2.3

Issue appears to be in gitWorkflow.js

await Promise.all(
      // stagedFileChunks includes staged files that lint-staged originally detected.
      // Add only these files so any 3rd-party edits to other files won't be included in the commit.
      this.stagedFileChunks.map(stagedFiles => this.execGit(['add', '--', ...stagedFiles]))
    )

There is no way to make sure that the git add commands are concurrent for chunks to prevent having a git lock scenario.

Steps to reproduce

Merged 1k+ files where there are lint changes to be applied call through husky on windows

configuration:
"husky": {
"hooks": {
"pre-commit": "lint-staged"
}
},
"lint-staged": {
".{ts,tsx}": "eslint --quiet --fix --cache",
"
.scss": "stylelint --syntax=scss"
},

Debug Logs

expand to view

Applying modifications... [failed]
→ fatal: Unable to create '{repo}/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
Restoring unstaged changes to partially staged files... [started]
Restoring unstaged changes to partially staged files... [completed]
Cleaning up... [started]
Cleaning up... [completed]

Environment

OS: Windows 10
Node.js v10.16.1
lint-staged: latest 10.x

Previously there was a bug pre-10 where windows would fail due to command-line args length
Upgraded to 10 to fix this but keep hitting async issues with git when merging multiple chunks.
The issue at the time of raising was a merge of 47 chunks.

released

All 20 comments

Thanks for the issue.

I think safest would be to not do a Promise.all in the point you indicated, bur rather run them one at a time.

Will open a PR!

Would you mind testing if the linked PR/branch works better for you?

https://github.com/okonet/lint-staged/pull/837

The workflow should be that all tasks are run, and then the git add step runs after them; chunks should not ever contain the same file, unless there's a bug in that code.

Code looks like it should fix the issue.
Downloading now

As only a single file was changed I copied this code file into my node_modules and re-ran the test of lint-staged.
The git issue is fixed but it now fails in cleanup with

  × Cleaning up...
    → git.dropBackup is not a function
git.stashBackup is not a function
git.dropBackup is not a function

Those method names are from before v10.1.0, maybe you copy-pasted something wrong? Might be safe to delete node_modules/lint-staged and then install it again to get to a clean state.

Ah apologies... My current version is 10.0.9
I didn't take this into account when only copying the changed file

Copied over all files and ran npm install in case your own packages had updated.
re-running the test, sorry for the mistake (I have other deliverables so rushing)

This branch does indeed successfully run linters on all files without hitting the issue this defect was raised for. Thank you so much for a quick response.
I look forward to the notification that the code is merged

No problem, thanks for the issue. I personally don't have a Windows environment so testing is limited to CI.

Regarding that... when you're free, maybe you would like to try help us debug why our test suite doesn't pass on GitHub CI Windows runners? Currently we use Appveyor, but it's slower than GH Actions. There's a merge request open at https://github.com/okonet/lint-staged/pull/833.

Please don't take any pressure, the current setup does work sufficiently for our needs. 😄

Not a defect but wonder if I should raise a new suggestion related to the same thing.
lint-staged could perform better in some scenarios by filtering out files that there is no configured linter for first.
In the case where a small amount of the staged files actually have linters configured it would reduce the amount of work to be done.

It depends on how heavy the parsing/checking would be as to whether this makes sense for the package as a whole

I will add a note to return to this task later and see what I can do.
Right now (as stated) I have pressing commitments.

Thanks for the suggestion. This particular place is the only one where the entire stagedFileChunks is used after generating the tasks, which need all files for matching.

It would certainly be possible to create a (smaller) list of matched files, and then chunk that for the git add part. I haven't done it because it has not been a priority, and to be honest I didn't think about it.

I appreciate your help!

I'm not sure a consumer would expect any file to be touched by lint-staged if they had not configured a linter.
i.e. if your configuration is like mine

"lint-staged": {
".{ts,tsx}": "eslint --quiet --fix --cache",
".scss": "stylelint --syntax=scss"
}

And you staged a .pdf file or .js file or anything other than those configured.
Personally I would not expect this package to touch it at all.
If it does then there is the risk that a bug in this package could damage the files

I guess this is a fairly large refactor though as you just take ALL staged files and run the tasks without parsing the tasks up front.
If the key were regex this may be an easier job as you could test all staged files against the regex keys and only handle the ones that match.
Anyway, unrelated to this fix.
Thank you again for the fast reactions

Running git add on the staged files not matched by tasks doesn't matter, because either:

  1. They have only staged changes, and no unstaged changes, so nothing happens
  2. They have both staged and unstaged changes (partially staged). In this case the unstaged changes are hidden before running tasks, so git add again does nothing (because at this point there are no unstaged changes). The unstaged changes are restored only after running this step.

In any case, you are correct that git add call is totally unnecessary for those files, and removing them would at least be a performance gain.

There was a bug when we were using 10.0.0-7beta that caused a .pdf file to be corrupted.
Upgraded to 10.0.9 and this stopped happening.
Never tracked this down it's root cause as it was already fixed.
However, it illustrates the point that if you only ever handled matching files then issues like that should not occur.

Unfortunately, I don't have the resource to dig in and find the cause of this old issue now.
I think it was something to do with prettier trying to run on the file but this should never have happened when .pdf is not configured.

I would need to look at your task matching code to see how trivial that would actually be to do.
Either way, it is not a discussion for now. Thanks again

I actually managed to implement this feature in the same PR!

Since we already go through all (chunked) tasks in runAll, we can add all matched files to a Set, which will by definition contain unique items.

Then I just pass this set instead of the full chunkedFilepaths to GitWorkflow, chunk it, and then run git add "serially" for all the chunks.

Let me know if there's any unexpected issues with the other change!

:tada: This issue has been resolved in version 10.1.3 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

Was this page helpful?
0 / 5 - 0 ratings