For some time, running RuboCop in my app has been slow, and the problem has gotten worse over time. I (wrongly) attributed this to the growth of my app. Then yesterday, RuboCop stopped honoring exclusions in my config, showing hundreds of offenses that should have been ignored.
After some searching, I discovered the -d flag, and I found that RuboCop was trying and failing to process hundreds of files that it should not.
Unprocessable file /path/to/my/app//tmp/cache/bootsnap-compile-cache/33/0b5df563415472: ArgumentError, invalid byte sequence in UTF-8
The files in question were things like Bootsnap cache, Rails FileStore cache, and ActiveStorage. None of them are tracked in my git repo.
It could be argued that I should have excluded these files in my config in the first place. It could also be argued that RuboCop should ignore certain files by default, so as not to set a trap for unwary users like me. Hundreds of silent errors were making it difficult, and then impossible, to use RuboCop.
I'd like RuboCop to ignore certain files by default, or via an opt-in. Rather than try to cherry-pick exclusions to fit popular frameworks, I propose that RuboCop add an option for ignoring files ignored by version control.
Maybe something like:
AllCops:
  ExcludeGitIgnoredFiles: true
Alternatively, RuboCop could just fail noisily on these types of errors. This only became a problem because I was unaware of it for so long.
I added the following to my config to resolve this problem and restore execution speed:
AllCops:
  Exclude:
    - tmp/**/*
    - storage/**/*
    - '{}/**/*'
The /tmp directory is obvious.
/storage is specific to ActiveStorage in local development.
The third directory is the cache directory when setting the Rails cache to FileStore in development mode.
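To sanity-check which paths a set of Exclude patterns would cover, here is a minimal sketch. It assumes the patterns behave like Ruby's File.fnmatch? globs with FNM_PATHNAME, which approximates RuboCop's matching rather than reproducing its code. The excluded? helper is hypothetical and not part of RuboCop.

```ruby
# Hypothetical helper, not part of RuboCop: reports whether a relative path
# is covered by any of the given Exclude-style glob patterns.
# Assumption: File.fnmatch? with FNM_PATHNAME approximates RuboCop's matching.
def excluded?(path, patterns)
  flags = File::FNM_PATHNAME | File::FNM_EXTGLOB
  patterns.any? { |pattern| File.fnmatch?(pattern, path, flags) }
end

patterns = ['tmp/**/*', 'storage/**/*']

# A Bootsnap cache file under tmp/ versus an ordinary source file.
puts excluded?('tmp/cache/bootsnap-compile-cache/33/0b5df563415472', patterns)
puts excluded?('app/models/user.rb', patterns)
```

Running rubocop -d afterwards is still the real check, since it shows what RuboCop actually tries to process.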
It's hard to imagine a valid use case where it makes sense to check files in tmp/, or in .git (#7819). They should probably always be excluded unless specified explicitly.
I like the idea of relying on .gitignore if present.
In general I've always refrained from adding any form of VC integration, as that's a slippery slope: does Git get preferential handling while we ignore other VCSs? How far do we want to go with Git integration (e.g. do we go as far as shelling out to git for things like obtaining the list of files in the project)? How do we handle projects with submodules? If we just do the simplest thing, parsing the .gitignore in the folder RuboCop was started in, that'd be fine by me, though.
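As a rough illustration of that simplest option, here is a minimal sketch that turns the text of a single .gitignore into Exclude-style globs. The gitignore_to_globs helper is hypothetical and not RuboCop API. It handles only a simple subset of the format: comments and blank lines are skipped, negated entries (!) are ignored rather than honored, and nested .gitignore files and submodules are out of scope.

```ruby
# Hypothetical helper, not RuboCop API: converts .gitignore text into
# Exclude-style glob patterns. Only a simple subset of the format is handled.
def gitignore_to_globs(text)
  text.each_line.flat_map do |line|
    entry = line.strip
    # Skip blanks, comments, and negations (negations are NOT supported here).
    next [] if entry.empty? || entry.start_with?('#', '!')

    dir_only = entry.end_with?('/')   # "tmp/" only ever names a directory
    anchored = entry.start_with?('/') # "/storage" is relative to the root
    entry = entry.sub(%r{\A/}, '').sub(%r{/\z}, '')

    # Entries without a leading or inner slash may match at any depth.
    prefix = (anchored || entry.include?('/')) ? '' : '**/'

    globs = ["#{prefix}#{entry}/**/*"]                  # contents, if a directory
    globs.unshift("#{prefix}#{entry}") unless dir_only  # the entry itself
    globs
  end
end

sample = "# caches\n/storage\ntmp/\n*.log\n!keep.log\n"
puts gitignore_to_globs(sample)
```

A real implementation would also have to decide how negated patterns and character escapes are treated, which is part of the slippery slope mentioned above.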
Thanks for the fast, thoughtful responses! I suggested .gitignore because git is the only VC I'm familiar with, but I'm not attached to any specific implementation of this idea.
I had essentially the same problem. For some reason it kept trying to access files in my storage/ directory (used by Rails' ActiveStorage, so most of the stuff in there is image files).
Unprocessable file /Users/connorshea/Programming/vglist/storage/va/ri/variants/udh6wp5cqgt693hi92xxpb0agn4b/546e89d3666ab1f82227cd2c592566e3c9dce1e5862e97b6d09e3bf05b8fb4bd
My AllCops config looks like this:
AllCops:
  TargetRubyVersion: 2.6
  Exclude:
    - "bin/*"
    - "db/schema.rb"
    - "node_modules/**/*"
    - "vendor/bundle/**/*"
    - "sorbet/**/*"
I didn't realize it was even trying to access files in storage/ until today, because I happened to use the --debug flag to see if I could figure out why RuboCop was taking so long. Adding storage/**/* to my exclude list cuts the time it takes to run RuboCop from 1 minute 15 seconds to 8 seconds. I've just dealt with RuboCop being this slow for months without realizing there was a problem 😬
Idk why it's even trying to access the files in that directory, they don't have a ruby extension or anything. :/
EDIT: Looking at it more, tmp/ also has a bunch of files that it tries and fails to process. storage/ just had so many that I didn't even notice tmp/.
One thing I might suggest is to warn the user when RuboCop sees dozens or hundreds of unprocessable files, to make them aware that it's trying to do something it shouldn't.
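A minimal sketch of what such a warning could look like, working from captured debug output instead of RuboCop internals. The unprocessable_warning helper, the threshold of 10, and the message wording are all assumptions made for illustration. Only the "Unprocessable file" line prefix comes from the output quoted in this thread.

```ruby
# Hypothetical sketch, not RuboCop code: counts "Unprocessable file" lines in
# debug output and returns a warning once an assumed threshold is reached.
UNPROCESSABLE_THRESHOLD = 10 # assumed cutoff, chosen only for illustration

def unprocessable_warning(debug_output, threshold: UNPROCESSABLE_THRESHOLD)
  count = debug_output.each_line.count do |line|
    line.start_with?('Unprocessable file')
  end
  return nil if count < threshold

  "#{count} files could not be processed; consider adding them to AllCops/Exclude."
end

# Simulated debug output with twelve failures under storage/.
output = (1..12).map { |i| "Unprocessable file storage/blob#{i}\n" }.join
puts unprocessable_warning(output)
```

Returning nil below the threshold keeps the occasional unreadable file quiet while still surfacing the hundreds-of-files case described above.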