Bazel: Python rules are not hermetic

Created on 13 Feb 2016 · 4 Comments · Source: bazelbuild/bazel

The code seems to be run by the system Python, so the result depends on what is installed on the machine.

It would be nice to only require Bazel.

P3 team-Rules-Python feature request

All 4 comments

I found a particularly difficult effect of this. If the environment has google-auth (or possibly other packages that use setuptools's namespaces) installed as a system or user site- or dist-package, then Python 2.7 will load a .pth file that sets up the google namespace in a way that is incompatible with Bazel. In particular, google.__path__ is set to ['/path/to/site-packages/google'] and does not include the google directories in the py_binary's runfiles tree. This results in an error like:

Traceback (most recent call last):
  File "test.py", line 31, in <module>
    from google.cloud import pubsub
ImportError: No module named cloud

The only workaround I could find is to either uninstall the site package or delete the .pth file. I believe it is also possible to work around this by activating an empty virtualenv before invoking Bazel, but I haven't tried.
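To check whether a machine is affected, something like the following could be used. It is only a rough diagnostic sketch, not anything Bazel provides: the helper names candidate_site_dirs and find_pth_files_mentioning are made up for illustration, and the plain substring match on the namespace name is a crude heuristic that may flag unrelated .pth files.

```python
import glob
import io
import os
import site


def candidate_site_dirs():
    """Return the system and user site-packages directories that exist."""
    dirs = []
    # Some virtualenv-generated site modules lack these functions,
    # so look them up defensively instead of calling them directly.
    get_site = getattr(site, "getsitepackages", None)
    if get_site is not None:
        dirs.extend(get_site())
    get_user = getattr(site, "getusersitepackages", None)
    if get_user is not None:
        dirs.append(get_user())
    return [d for d in dirs if os.path.isdir(d)]


def find_pth_files_mentioning(namespace, site_dirs):
    """Return sorted paths of .pth files whose text mentions `namespace`.

    Heuristic only: this is a substring search, not a parse of the file.
    """
    hits = []
    for directory in site_dirs:
        for path in glob.glob(os.path.join(directory, "*.pth")):
            try:
                with io.open(path, encoding="utf-8", errors="replace") as f:
                    text = f.read()
            except (IOError, OSError):
                continue  # unreadable file; skip it
            if namespace in text:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for pth in find_pth_files_mentioning("google", candidate_site_dirs()):
        print("suspicious .pth file: " + pth)
    try:
        import google
    except ImportError:
        print("no top-level 'google' package is importable here")
    else:
        # A single site-packages entry here matches the symptom described above.
        print("google.__path__ = %r" % (list(getattr(google, "__path__", [])),))
```

Running it with the same interpreter that Bazel picks up should list the .pth files worth inspecting before uninstalling or deleting anything.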

I've heard about issues with the google namespace package before. Not sure where the canonical thread for that is.

About hermetic python: In order to not rely on the system python interpreter, the interpreter would have to somehow be part of the workspace, either vendored into the main repo or else referred to as an external repo from the WORKSPACE file. The former option would be undertaken by the user. I suppose the latter option would imply that we either upstream BUILD files into CPython, or else maintain a canonical Python-in-Bazel repo somewhere.
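As a quick way to see which case a given build is in, a script run from a py_binary could report where its interpreter lives. This is a minimal sketch under assumptions: interpreter_is_under is a hypothetical helper, and the workspace or runfiles root has to be supplied by the caller, since nothing here discovers it.

```python
import os
import sys


def interpreter_is_under(root, executable=None):
    """Return True if the interpreter path lies inside `root`.

    `executable` defaults to sys.executable. Paths are made absolute but
    symlinks are deliberately not resolved, because a runfiles tree is
    typically made of symlinks pointing elsewhere.
    """
    if executable is None:
        executable = sys.executable
    if not executable:
        return False  # sys.executable can be empty in embedded setups
    exe = os.path.abspath(executable)
    root = os.path.abspath(root)
    return exe.startswith(root.rstrip(os.sep) + os.sep)


if __name__ == "__main__":
    # The current directory stands in for the workspace root here.
    print("interpreter: %s" % sys.executable)
    print("inside cwd:  %s" % interpreter_is_under(os.getcwd()))
```

If this reports something like /usr/bin/python, the target is running on the system interpreter and therefore depends on whatever is installed there.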

Re: namespace packages: https://github.com/bazelbuild/rules_python/issues/14 but if I remember correctly, that's about a different implementation of namespace packages from the .pth files mentioned above.

This thread discusses two separate issues that can be duped against other bugs, so I'll go ahead and close it.

For the hermeticity issue: It's standard practice that Bazel can invoke either system tools (non-hermetically) or tools vendored into the workspace. Follow #7375 for the feature work needed to give better control over what runtime the Python rules use.

For the imports issue: Name clashes happen in Python, and I don't think Bazel is in a position to avoid them when two separate projects insist on being called the same thing. But it sounds like the issue here is that a system library is conflicting with the name of a top-level package //google in some repo. There are two points to note: First, when conflicts happen between system libraries and user libraries, the best practice seems to be to resolve in favor of the system; see #5899. Second, top-level packages are currently directly importable because repo roots are put on the PYTHONPATH, but this is an anti-pattern and will be prohibited with #7067. Taken together, these two fixes should mean that you no longer have clashes between system libs and Bazel-built libs, provided that no one names their repo the same thing as a Python lib, and that no one adds a conflicting name to a py_library's imports.
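To illustrate the kind of clash described above, here is a minimal sketch that reports top-level names appearing under more than one import root. It is not something Bazel does, the function names are hypothetical, and it assumes any identifier-named directory could be imported as a package, which over-approximates on Python 2.

```python
import os
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def top_level_names(root):
    """Return names directly under `root` that look importable.

    Counts identifier-named directories and .py files; missing or
    unreadable roots yield an empty set.
    """
    names = set()
    try:
        entries = os.listdir(root)
    except OSError:
        return names
    for entry in entries:
        full = os.path.join(root, entry)
        if os.path.isdir(full):
            candidate = entry
        elif entry.endswith(".py"):
            candidate = entry[:-3]
        else:
            continue
        if candidate != "__pycache__" and _IDENTIFIER.match(candidate):
            names.add(candidate)
    return names


def find_name_clashes(roots):
    """Map each name found under more than one root to those roots, in order."""
    seen = {}
    for root in roots:
        for name in sorted(top_level_names(root)):
            seen.setdefault(name, []).append(root)
    return {name: where for name, where in seen.items() if len(where) > 1}


if __name__ == "__main__":
    import sys

    # sys.path order decides which copy wins for regular packages.
    for name, where in sorted(find_name_clashes(sys.path).items()):
        print("%s: %s" % (name, ", ".join(where)))
```

Passing a runfiles repo root together with the site-packages directories would show whether a name such as google exists in both places.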
