Pylint: lint all files in a directory

Created on 29 Sep 2014  ยท  37Comments  ยท  Source: PyCQA/pylint

Originally reported by: Buck Evan (BitBucket: bukzor, GitHub: @bukzor?)


It would make things easier for me to use pylint in CI if I could simply say pylint . in order to tell pylint to look at all my files.

flake8/pep8 already have this behavior. They make it configurable via a filepatterns config (default *.py) and an exclude config (default .svn,CVS,.bzr,.hg,.git,__pycache__).

Discussion from #pylint:

buck1: jcristau: in flake8, I can run `flake8 .` and it has a set of configured match/exclude patterns to check everything in the directory
buck1: currently pylint assumes that any directory is a package, and fails if not
buck1: how would you feel about including the flake8 behavior
buck1: would make my life 100% easier when trying to run pylint against everything during CI
jcristau: doesn't sound crazy, though there might be backwards compat concerns...  please float the idea in a ticket or on the list, i'm not the maintainer :)
buck1: jcristau: i'll make a feature-request ticket. Sounds like something I might be able to implement. Just wanted to see your thoughts first.

enhancement topic-command-line

Most helpful comment

I would like to add that I'm keen on this proposed functionality - pylint .

All 37 comments

_Original comment by_ Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


Sounds good. I also encountered the need for this feature when running pylint over a large corpora of scripts. Nevertheless, Pylint shouldn't emit every time when it finds a folder without an init file, as in the case of the namespace packages from Python 3.3+.

_Original comment by_ Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


Issue #512 was marked as a duplicate of this issue.

_Original comment by_ BitBucket: rneu31, GitHub: @rneu31?:


Would the entire directory be scanned in a serial manner (in one thread/process)?

For the CI process where I work we wrote a wrapper script that uses multiprocessing and consolidates the outputs of pylint (and various other tools) to avoid the large time hit of running Pylint across a large set of files.

_Original comment by_ Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


The scanning should be serial, that is finding the files. The actual processing can already be parallelized with --jobs. So, as long as this https://bitbucket.org/logilab/pylint/src/33e334be064caf5b5d251929fc2f74bdf400f6c4/pylint/lint.py?at=default#cl-882 returns all the files from a directory that should be analyzed, then the parallelization comes freely with --jobs.

_Original comment by_ Florian Bruhin (BitBucket: The-Compiler, GitHub: @The-Compiler?):


@PCManticore mentioned someone worked on this on the EuroPython sprints but never uploaded a PR - if you're that person, can you maybe upload a PR (even if work-in-progress), so I could look at this when I have some time?

I am looking into this enhancement. Will provide the update once code is ready.

Hi,
Can anyone help me here to understand why CI might have stalled or failed?

Thanks,
Rohit

Is there any update on this / will this be a part of the 2.0 release?

No one continued work on it. Unless someone takes charge of it, it will probably not be part of the 2.0 release. I will probably not have time to tackle it, since I have other priorities to bring to an end for 2.0.

Note that in 2.0 we support implicit namespace packages. which have no init.py inside and supporting generic directories with scripts should not be that hard after all.

I am unfortunately stuck using 2.7. I have been following some of the discussions on work being done for 2.0, but not the code directly. Is there any work in this area that makes sense to do?

Yes, implementing this should be pretty straightforward once you figure out the details.

I will provide some high level details, in case you might be interested in tackling this (or if someone else see the comment as well):

  • this behavior should be triggered through a flag. It can either default to False, which means the current behavior of not processing directories without __init__.py or we can default it to true, so it would act more in line with the other tools, but in any way, it has to be controllable through a flag.
  • We are collecting the files for analysis using this function https://github.com/PyCQA/pylint/blob/master/pylint/utils.py#L800. It uses inside another function https://github.com/PyCQA/astroid/blob/master/astroid/modutils.py#L460. The list_all flag of modutils.get_module_files can be used for listing all the files, indifferent of having or not __init__.py in that directory.
  • From here, tying these things together should be straightforward, including having tests and whatnot.

I would like to add that I'm keen on this proposed functionality - pylint .

Even better-- just pylint should have the same behavior as pylint ..

Would be really awesome to get this fixed

Hey folks, just wanted to say that I left a couple of suggestions before on how this feature could be implemented in pylint. As previously mentioned, we'll get to it someday, but as far as it concerns me, I don't have enough time on my plate for implementing this feature. A PR from someone else could definitely speed this up!

As we do depend on this a little bit it would be awesome to get it tackled and I would even enable our gsoc student to have a look at it. What would be helpful would be any insight or mentorship he could get, as he obviously doesn't know the codebase and it seems to be quiet a deep change?

Hey @Razzeee That sounds like a great first step. I can offer mentorship on this issue if someone wants to write the code for it. I suggested an initial step in this comment https://github.com/PyCQA/pylint/issues/352#issuecomment-229818309 but I would be happy to write a more in-depth comment on what exactly needs to be changed.

@PCManticore After looking into this and following the comment I was able to figure it out.

In astroid/modutils.py - L472 we are stoping the get_module_files function on 2 conditions

  • If list_all is not True
  • If there is no __init__.py file present in the given directory.
E.g
Example
โ”œโ”€โ”€ File.py
โ”œโ”€โ”€ file.txt
โ””โ”€โ”€ Folder
    โ””โ”€โ”€ new.txt

Here src_directory = "Example"
Now doing get_module_files(src_directory, blacklist) would return an empty list
and doing get_module_files(src_directory, blacklist, True) would return a list having path to all the files i.e ['Example/file.txt', 'Example/File.py', 'Example/Folder/new.txt']

Solution:
We can't just remove the __init__.py condition because we actually want this to be a command line flag so for that we can modify the code.

python def get_module_files(src_directory, blacklist, list_all=False, init=False): files = [] for directory, dirnames, filenames in os.walk(src_directory): if directory in blacklist: continue _handle_blacklist(blacklist, dirnames, filenames) if not list_all and not init: dirnames[:] = () continue for filename in filenames: if _is_python_file(filename): src = os.path.join(directory, filename) files.append(src) return files
Here init=False means behavior of not processing directories without __init__.py

After this modification

  • get_module_files(src_directory, blacklist) would again return an empty list
  • get_module_files(src_directory, blacklist, True) or get_module_files(src_directory, blacklist,False, True) would give list of all the files.

Please tell me whether I am on right track or not.

Now the thing I was not able to understand was your refrence to pylint/utils - L800. What will I have to do there ?

Also can you tell me which file will I have to change to make this a command line flag

Hey @mzfr That sounds about right! Thanks for looking into this! Regarding the reference from utils.py:800, I have no idea! The code has changed since I last linked it. Regarding the file to change, you can take a look into lint.py, if you look for options you should find the place where it could go, Out of the top of my head, I'd say that in either Runner or PyLinter. Let me know if this helps!

@PCManticore ok so I am going to ignore utils:800 now.
For options I think this is where I have to add the options.
So following the format the flag would look like:

(('init',
  {'default' : 'True','type': 'yn' 'metavar' : '<y_or_n>',
   'help' : 'When set to False - not processing directories without __init__.py'
            'when set to True  - processing directories without __init__.py'}),

This is what I think let me know whether should I add any short or group in the option.

This is what @PCManticore linked to back when he referenced to that line https://github.com/PyCQA/pylint/blob/5d40a07684be7ed6dbaa54e49c3d436005a533bf/pylint/utils.py#L800

@mzfr Sounds about right! One thing though, init as a name doesn't suggest the intention but rather the semantics. Maybe lint-all or something along these lines?

@PCManticore I think your reference to /pylint/utils.py wasn't supposed to be ignored.
On line 946 we have:

has_init = (not (modname.endswith('.__init__') or modname == '__init__')
                    and basename(filepath) == '__init__.py')

which assign boolean value to has_init

then from line number 949 - 951 we have a condition to call get_module_filesfrom astroid/modutild.py.
The condition is:

        if has_init or is_namespace or is_directory:
            for subfilepath in modutils.get_module_files(dirname(filepath),black_list,list_all=is_namespace):

Now what I can think of is that we have to link our option i.e --lint-all y/n to this function and then simply pass the param to function call, something like:

       if has_init or is_namespace or is_directory:
            for subfilepath in modutils.get_module_files(dirname(filepath)black_list,list_all=is_namespace,lint_all):

Let me know if I am thinking something wrong here.

Now the problem I am having here is how can I link the option(lint_all) with any of the function I want.

@mzfr If I'm understanding right, you need to pass the result of lint-all option to expand_files, which you can do like we're already doing here (https://github.com/PyCQA/pylint/blob/740906eeacc75049f1f85a955f5f23173db8b36f/pylint/lint.py#L930) for other options (self.config.black_list for instance is --ignore from here https://github.com/PyCQA/pylint/blob/740906eeacc75049f1f85a955f5f23173db8b36f/pylint/lint.py#L293).

Let me know if this works for you! Also you might send what you have as a PR, it will be easier to review and play with it in case there are still some issues to be fleshed out.

I think I'll make a PR for the work I have right now and then it will be
easier for everyone. Now just one thing the changes I have made in pylint
will go in pyint repo and changes in astroid will go in astroid repo that
means in total I'll have to make 2 PR right ?

On Thu, Jun 14, 2018 at 12:57 PM, Claudiu Popa notifications@github.com
wrote:

@mzfr https://github.com/mzfr If I'm understanding right, you need to
pass the result of lint-all option to expand_files, which you can do like
we're already doing here (https://github.com/PyCQA/pylint/blob/
740906eeacc75049f1f85a955f5f23173db8b36f/pylint/lint.py#L930) for other
options (self.config.black_list for instance is --ignore from here
https://github.com/PyCQA/pylint/blob/740906eeacc75049f1f85a955f5f23
173db8b36f/pylint/lint.py#L293).

Let me know if this works for you! Also you might send what you have as a
PR, it will be easier to review and play with it in case there are still
some issues to be fleshed out.

โ€”
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/PyCQA/pylint/issues/352#issuecomment-397198813, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AP2pP0VJaQBugWd2-Zs66_g7GU1oMlG1ks5t8hBmgaJpZM4Hei-Z
.

Yes, that will be two PRs! Thank you for working on this!

Is this feature targeted for release in the near future? pylint . would be very useful.

Hi All,

I just came across this on SO and I believe I found a plausible work around.

I posted details to my answer on StackOverflow

Update

I got this to work by jumping into the directory and using the * :

$ cd /app
$ pylint *.py

or try:

$ pylint /path/to/folder/*.py

Update added .py from @darshakkakkad thanks for the catch!

Using the asterisk * instead of the . worked. I hope this helps some of you! =)

Just to add to @JayRizzo 's reply, if you would like to run pylint on python files only, use

$ pylint /path/to/folder/*.py

If you do not use that, it will run on all the files in the directory (e.g: README.md). You would not want to run pylint on non-python files.

One workaround is to explicitly pass all of the .py files:

find . -type f -name "*.py" | xargs python -m pylint 

Unfortunately I don't think that this is as simple as recursing into directories and finding everything.

pylint relies on what's it's linting to exist in sys.path [1] so that astroid can "import" them when they're referred to by another module. The additions to sys.path happen all the way out in the Run object at the moment. So recursing into directories needs to happen before this sys.path modification so that discovered directories and modules can be put into sys.path.

sys.path manipulation: https://github.com/PyCQA/pylint/blob/86d281103cfac49f7ca9df7c4b23baf70ac8d439/pylint/lint.py#L1744
Where module expansion and subsequent filtering happens: https://github.com/PyCQA/pylint/blob/86d281103cfac49f7ca9df7c4b23baf70ac8d439/pylint/lint.py#L1053

There's a bit of a chicken and egg problem here because expand_modules currently depends on sys.path already being modified so that it can figure out the module name of the module or package that has been passed in [2]. We should be able to work around this by making the modifications to sys.path as we loop through arguments and as we recurse into directories.

Going back to where expansion happens, we can move directory expansion out into the Run object. It's where it should be anyway and it's not something that the Linter should be responsible for. However we would need to make use of Linter.should_analyze_file because we need to filter out non-python files while we have the additional information of what was passed in by the user and what wasn't (because it came from recursing into directories). We can't simply move should_analyze_file from the Linter without also moving module expansion outside of the Linter because we need to be able to filter out files from module expansion as well.
I think it makes sense to do all of this, but it does mean that we would be making a breaking change because should_analyze_file would no longer be be overridable (atleast not on the Linter but we _could_ make it overridable on the Run object). However you can still configure some filtering using the black_list and black_list_re options.

[1] We should really get rid of this. It should be possible to avoid doing it by modifying it into a new list and passing that to wherever sys.path is used at the moment. But this is a whole other issue and the underlying problem of where the path is modified still applies.
[2] Side note: This doesn't work for implicit namespaces at the moment because we have no way of differentiating between a directory and an implicit namespace.

Wanted to throw in another use case -- I just found a bunch of my unit tests in my Django project were being skipped by pylint because Django doesn't require an __init__.py to run the tests.

I don't mind adding an __init__.py ... but I want an automated way to tell me one is missing -- or to just iterate over the files.

I also want to praise pylint --

I had a couple of unit tests that were indented wrong and being skipped -- pylint pointed this out and I found a major bug once I actually enabled the unit test again! Thank you!!!!

Hello, I'm affected by this and might be interested in fixing it. Is there anyone working on it right now? It also seems like a lot of work was done already. Is there something to know or a branch to rebase before trying to do this ticket?

@PCManticore is a breaking change like "should_analyze_file is no longer overridable" a real problem?

Thanks in advance.

I'm also interested in if anything is being done about this.

The suggestion above by @bcoughlan is great. But it doesn't into account directories that have been ignored by my pylintrc.

Is there a way of linting everything in my project, excluding the directories ignored in pylintrc?

I've also tried using pylint-runner, but it produces the same results as @bcoughlan 's solution.

@jaoxford it's been 27 days since I asked we can safely say that no one is working on it right now. Do you want to handle this ticket?

It would be nicer if pylint adopted pytest's way of globbing files/dirs, defined somewhere around:
https://github.com/pytest-dev/pytest/blob/d79179a239ac89b318be47c99ff965376d177257/src/_pytest/config/argparsing.py#L121

I would also find this feature useful. Also, I think it would be nice to have an option to receive the directory name as input and then check all its python files.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

pylint-bot picture pylint-bot  ยท  3Comments

z4y4ts picture z4y4ts  ยท  3Comments

DGalt picture DGalt  ยท  3Comments

mrginglymus picture mrginglymus  ยท  3Comments

GergelyKalmar picture GergelyKalmar  ยท  3Comments