Pytest: Configuring discovery for paths with namespace packages

Created on 19 Apr 2019 · 18Comments · Source: pytest-dev/pytest

The structure of Python namespace packages cannot be determined just by looking at the filesystem. Consider the following directory tree:

src/
    core.py
    aaa/
        test_core.py
    bbb/
        core.py
        test_core.py

If src/ (but nothing underneath it) is in sys.path, we have the following modules:

core
aaa
aaa.test_core
bbb
bbb.core
bbb.test_core

where aaa and bbb are namespace modules. We might also have other parts of the namespace modules outside of this tree, e.g. by adding othersrc/ to sys.path so that import bbb.other will work if othersrc/bbb/other.py exists.)

It's easy to configure pytest to have src in the path, but when discovering and loading src/aaa/test_core.py, as described here, it adds src/aaa to sys.path and imports src/aaa/test_core.py using a different module structure than the programmer may have intended: import test_core. Further, the same thing happens again with src/bbb/test_core.py and now we get a conflict because there's already a test_core module.

It would be nice to be able to configure pytest so that we could tell it that src/ may contain namespace packages and, if it does, they are all intended to be rooted at src/. That is, when discovering src/{aaa,bbb}/test_core.py do not add anything to sys.path but instead generate the fully qualified module name based on the file's position in the hierarchy relative to src/.

Another, more aggressive, option might be to see if the file is under any of the paths already in sys.path and determine the fully qualified module name by the hierarchy relative to the first path in sys.path under which the file falls.

The Stack Overflow question How do I Pytest a project using PEP 420 namespace packages? also discusses the issue.

The way I put these here makes this sound pretty simple (at least as far as description, if not implementation), but there may be other subtleties I'm missing here, about which I welcome discussion.

collection

Source

0cjs

Most helpful comment

I'd like to see pytest work better in this situation as well. I think once we can leave python2 behind this becomes even easier (and more common at the same time) as __init__.py files become entirely optional.

I'm not quite sure the best way to get to that state without breaking everything though 🤔

In my opinion ideally the "namespace package" situation would be the default, and the class of errors around test filename collision would disappear. Though I haven't put any thought or effort into seeing how this would work in code, @nicoddemus would know better

asottile on 20 Apr 2019

👍2

All 18 comments

I'm not quite sure the best way to get to that state without breaking everything though 🤔

asottile on 20 Apr 2019

👍2

I've done a test implementation based on hoefling's answer to the SO question above except instead of having an independently-specified list of namespace package dirs I simply use sys.path, returning the top-level package or dir under the first entry from sys.path that is above the package in question. If no entry in sys,path is above the package we're trying to load (say, becuase someone's asked pytest to discover tests in a directory outside of any in sys.path, I simply pass the path on to the old code and let it do its thing.

I've only given it brief testing, but it seems to be working pretty well. I will report back again after further testing.

I don't know why I didn't think of this before, because it actually seems to be the most obvious and sensible solution (or at least partial solution) to the issue since, when the module is under sys.path we're generating a name that would load that exact module if pytest weren't running. In somewhat pathological situations it might not be the name that the developer would use, but it will definitely not be a name that could ever be used by another module. And it also, when the module can be found in the path, avoids the confusing and shadowing created by adding more locations to the path. (I use --import-mode=append, but even I've been bitten by a module unexpectedly loading in tests when it wouldn't in production.)

My current thinking on the matter is that py.path.LocalPath.pypkgpath() should take an optional second argument (defaulting to an empty list when not provided) that would be the list of directories to consider potential roots of namespace modules. You could pass in nothing to preserve the current behaviour, sys.path to get the (I think sensible) behaviour described here, or any other paths you care to configure. Getting that argument down to it probably means also adding it to pyimport() in the same module, but after that we're out of py-land and in pytest code so the rest is a matter of adding new configuration.

While using sys.path in this way actually seems like sensible default behaviour to me, it might be something we'd want either to phase in gradually (add the option first, and make it the default a few versions later) or just leave with a seemingly-poor default in the name of backward compatibility.

0cjs on 21 Apr 2019

👍1

I just ran into the same thing. Something like a command line option that could be added to pytest.ini to force package imports from rootdir sounds good to me.

girtsf on 6 May 2019

I don't think that using rootdir for this is a good idea: rootdir is about configuration, not import path, and can even change from run to run depending on your current working directory and what paths you give to pytest for discovery.

BTW, for those wondering where I'm at with all of this, my hack above seems to have worked well over the initial week since I implemented it, but the last ten days have been a long holiday in Japan (Golden Week) and so all work on the project using this has stopped.

Assuming my change goes for another week or so without any issues, I will look at trying to find the time to code up a PR for this. (It's unfortunately not trivial in part because it requires changes to both py and pytest.) If anybody's got any special thoughts on how to go about doing this beyond what any decent developer would do, let me know.

0cjs on 7 May 2019

I would love to help get a PR out if somebody pointed me in the right direction on what the recommended approach is.

girtsf on 22 Nov 2019

we have a concept of "collection roots" floating around, however its not yet formalized

for me personally thats something i'd like to tackle sometime after sorting out the collection tree structure

many of the details are currently unclear to me

RonnyPfannschmidt on 22 Nov 2019

Hi!
I've recently found out about pytest not respecting PEP420 namespaces.

To circumvent the issue I've patched the package discovery to treat paths that are already included in sys.path as namespaces. (essentially forcing import with fully qualified module name from a sys.path entry)
You can read my initial thoughts on it in detail here.

In my opinion it would be favorable to have valid PEP420 namespaces working as expected by default. With my patch, this would (I think) always be the case unless the sys.path is not set up correctly. By correctly, I mean that all places where one would import such packages are added to PYTHONPATH or sys.path directly.
One improvement on my patch would be to check if the directories in question have no files in them at all indeed. (make the PEP420 namespace check a little more correct)

Traktormaster on 1 Apr 2020

Using existing paths in sys.path seems a good solution for Python ≤3.4, where modules can be loaded only "through" a path. However, for 3.5 and above I would prefer to see discovery use the newer loading mechanisms that allow discovered test files to be loaded as "free-floating" modules (i.e., not entered at all in sys.modules) without them needing to go through a path to generate a module name for use in sys.modules, as I did in the code in #6966. This should avoid all conflicts and preserve the path that the tests were started with, rather than runnning tests in a system with modified path. (If tests were relying on the path being changed, the test system should be changed to make explicit that those paths are part of the test environment.)

0cjs on 1 Apr 2020

@0cjs
FWIW: it is also possible to restore sys.path after importing - but I agree that it would be better to skip/replace pylib's method here completely maybe.

blueyed on 2 Apr 2020

FWIW: it is also possible to restore sys.path after importing....

@blueyed That hadn't occurred to me! I wouldn't want to do that for Python ≥3.5, where we have better methods of dealing with that, but it seems to me a perfectly reasonable idea to consider when running earlier versions of Python, in the hope of maintaining more compatibility between pytest on ≤3.4 and ≥3.5.

I suppose this would also involve removing the module from sys.modules after loading it as well, to also emulate the behaviour of "nameless" modules, and avoid accidental "'loaded' from cache" for subsequent (but different) files that map to the same module name.

This would require a bit of thought, but possibly might work quite well.

0cjs on 2 Apr 2020

@0cjs
As for sys.path (from one of my stashes):

diff --git a/src/_pytest/python.py b/src/_pytest/python.py
index 372631f10..265bf0689 100644
--- a/src/_pytest/python.py
+++ b/src/_pytest/python.py
@@ -515,6 +515,7 @@ def _importtestmodule(self) -> ModuleType:
         # we assume we are only called once per module
         importmode = self.config.getoption("--import-mode")
         fspath = self.fspath
+        old_sys_path = sys.path[:] if importmode != "importlib" else None
         try:
             mod = fspath.pyimport(ensuresyspath=importmode)
         except SyntaxError:
@@ -554,6 +555,9 @@ def _importtestmodule(self) -> ModuleType:
                 "or @pytest.mark.skipif decorators instead, and to skip a "
                 "module use `pytestmark = pytest.mark.{skip,skipif}."
             )
+        finally:
+            if old_sys_path:
+                sys.path = old_sys_path
         self.config.pluginmanager.consider_module(mod)
         return mod

Feel free to pick it up into a PR.
Tested in https://github.com/blueyed/pytest/pull/338 already (with some required test adjustments).

blueyed on 2 Apr 2020

As for pytest you do not have to consider anything before Python 3.5 btw.

blueyed on 2 Apr 2020

@blueyed Is the above code for pytest 4.x? Because, as you pointed out to me also above (thanks!), pytest 5.x no longer supports Python 3.4 or 2.7.

For more modern versions of Python, wouldn't it make sense to use the more modern important mechanisms that neither require nor touch sys.path or sys.modules, as I mention above and implement here (along with some other stuff that may not be needed or wanted, as discussed in #6966). I'd appreciate your comments on the use of importlib in that code.

0cjs on 2 Apr 2020

I'm fairly certain you need to touch sys.modules at least as tests can and do import from other tests

asottile on 2 Apr 2020

@asottile writes:

I'm fairly certain you need to touch sys.modules at least as tests can and do import from other tests

Importing other tests as modules isn't a problem if you do it in the normal way. Say you have src/foo/bar_test.py in your tree, and you do _not_ have src/foo/__init__.py. Note that bar_test.py is _not_ a module (and has no module name), it's just a file. (It becomes a module and is assigned a module name only when loaded into a Python interpreter through an import statement or similar import mechanism.)

So with src/ in your sys.path, the normal way of importing and using a property value assigned as a global in that imported namespace module (with value = 1 or whatever at the top level in the file) is:

import foo.bar_test
assert 1 == foo.bar_test.value

This works fine currently, and will continue to work with all variants my proposed changes.

__EDIT:__ The following is not correctly expressing what I'm trying to say, nor is the code correct; see further discussion below for details.

What will not work in my proposal is relying on src/foo/bar_test.py having been automatically imported by discovery as (top-level) module bar_test:

# No import of bar_test up to this point.
assert 1 == bar_test.value    # Fails with: NameError: name 'bar_test' is not defined

I (and I think most sensible people) would argue that that's absolutely fine. If you want to use it, import it explicitly, rather than relying on magic that breaks other things. (E.g., in the current code the above will—sometimes!—break (and often in mysterious ways) if you also have an src/bar/bar_test.py file.

0cjs on 3 Apr 2020

I (and I think most sensible people)

please don't go after intelligence to try and make your point, it's not productive to conversation

right, but it's a major breaking change to adjust the module hierarchy here

that is currently importable as from bar_test import ... (all pytest tests are floated up to "top-level" namespace unless the directories are flushed with __init__.py)

your proposal would break existing imports. large breaking changes need very good reason to move and a migration / deprecation period.

your example where bar_test is a NameError is (as far as I know) already the case today (you'd have to stub into builtins to make that not the case) so I'm not sure what's being demonstrated there ?

asottile on 3 Apr 2020

@asottile I'm not sure we have any real disagreement here; I think that perhaps taking a more charitable interpretation of my remarks when there's ambiguity or potential errors may help keep conversations working well. You can start with taking my "most sensible people" comment as, ""people who prefer well-defined behaviour that's in line with what the Python interpreter does."

As I've mentioned several times previously, both here and in #6966, I'm well aware that what I'm proposing breaks certain kinds of changes that pytest makes to the module hierarchy, such that code is different when it runs under pytest as opposed to when it's not.

that is currently importable as from bar_test import ... (all pytest tests are floated up to "top-level" namespace unless the directories are flushed with __init__.py)

Exactly. I argue that this is behaviour that's worth changing. I believe that it's not well-defined (in particular, selecting different subsets of tests to collect can change what module is made available under a given name), and I see no indication that this was ever desired behaviour, rather than an unfortunate side-effect of the implementation (in large part due to what was available in earlier Python APIs).

your proposal would break existing imports.

_Some_ existing imports, yes. I think that the long-term (and even short-term) benefits to the community outwigh the cost of the change.

large breaking changes need very good reason to move and a migration / deprecation period.

Yes, we are in full agreement here. (I had hoped that would be clear from my comments about backward compatibility here and in #6966, but perhaps it was not.)

your example where bar_test is a NameError is (as far as I know) already the case today (you'd have to stub into builtins to make that not the case) so I'm not sure what's being demonstrated there ?

Sorry, I got my explanation wrong in the post above. That second example would require an import bar_test. The problem I'm trying to deal with is where import bar_test would fail in the production environment, but (whether in test code or code under test) imports something in pytest.

0cjs on 5 Apr 2020

Today I updated pytest to 6.0.1 and it broke my testing. After a small dig I've found the culprit. The new method _pytest.pathlib.resolve_package_path does not respect the PEP420 namespaces.

Quick example:

# These are the import paths initially setup by environment and IDE, with packages added in comments.
sys.path = [
  "/myproj/component_a",
    # myproj (PEP420)
    #   abc (normal package)
    # tests (PEP420)
    #   myproj (PEP420)
    #     abc (normal package with conftest inside)
  "/python_venv/site-packages",
    # abc (installed dependency)
]

I have imports in my code like from abc import x which is supposed to import from the dependency package. When pytest imports the conftest it adds its directory to the import paths:
sys.path = [ "/myproj/component_a/tests/myproj", "/myproj/component_a", "/python_venv/site-packages", ]
Now the same import from abc import x will find my package called abc instead of the dependency.

As a workaround I've patched the mentioned function. It's very similar to how I've fixed PEP420 before pytest 6.0. https://github.com/pytest-dev/pytest/issues/6966#issuecomment-606772533

Traktormaster on 31 Jul 2020

Was this page helpful?

0 / 5 - 0 ratings