Nixpkgs: Python: strip embedded file name

Created on 1 Nov 2017  路  13Comments  路  Source: NixOS/nixpkgs

Issue description

Python compile .pyc files with source filename built into them. For example, here is result of one decompilation:

$ uncompyle6 $b1/__init__.pyc
# uncompyle6 version 2.13.2
# Python bytecode 2.7 (62211)
# Decompiled from: Python 2.7.13 (default, Dec 17 2016, 20:05:07)
# [GCC 6.4.0]
# Embedded file name: /tmp/nix-build-python2.7-backports.shutil_get_terminal_size-1.0.0.drv-0/backports.shutil_get_terminal_size-1.0.0/dist/tmpbuild/backports.shutil-get-terminal-size/backports/__init__.py
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
# okay decompiling /nix/store/rqwkwjv1jkm4dykymcvvmni0zy2i472y-python2.7-backports.shutil_get_terminal_size-1.0.0/lib/python2.7/site-packages/backports/__init__.pyc

$ uncompyle6 $b2/__init__.pyc
# uncompyle6 version 2.13.2
# Python bytecode 2.7 (62211)
# Decompiled from: Python 2.7.13 (default, Dec 17 2016, 20:05:07)
# [GCC 6.4.0]
# Embedded file name: /build/configparser-3.5.0/dist/tmpbuild/configparser/backports/__init__.py
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
# okay decompiling /nix/store/glny1ajfrqwyvjpn1rgnjqdjsndkc3vl-python2.7-configparser-3.5.0/lib/python2.7/site-packages/backports/__init__.pyc

Both files were produced from same source code and same nixpkgs version, but differ in source file name. We should strip this embedded file name either during compilation (patch Python) or in post-fixup phase.

There is an example of stripping done in https://github.com/codewarrior0/pyinstaller/commit/a77005811b2e036515ee6d5900bb8b7c6a456c67

Debian overcomes this issue by compiling .pyc on a user machine.

Steps to reproduce

with import <nixpkgs> { };
pythonFull.withPackages (ps: with ps; [
    ipython
    pylint
])

This results into a collision. Source .py files differ indeed, but only in comments.

Technical details

  • Nixpkgs version: 18.03.pre.799435b
  • Sandboxing enabled: no

cc @FRidh

stale python

Most helpful comment

The collision is indeed due to https://github.com/NixOS/nixpkgs/issues/2412#issuecomment-340356859.

If we would want to strip this filename then I think we should patch the interpreter like we did for SOURCE_DATE_EPOCH. We may also choose to rewrite it considering we know $out. That would not solve the collision.

All 13 comments

My fix is to remove offending .pyc file. It also does some magic related to backports namespace

diff --git a/pkgs/top-level/python-packages.nix b/pkgs/top-level/python-packages.nix
index 0be1ff17ee..b584c90316 100644
--- a/pkgs/top-level/python-packages.nix
+++ b/pkgs/top-level/python-packages.nix
@@ -80,6 +80,19 @@ let
         else throw "Unsupported kind ${kind}");
     in fetcher (builtins.removeAttrs attrs ["format"]) );

+  backportsCommon = ''
+    ## See "absolutely essential rule" in https://pypi.python.org/pypi/backports
+    cat > "$(find $out -regex ".*backports/__init__.py")" <<EOF
+    # A Python "namespace package" http://www.python.org/dev/peps/pep-0382/
+    # This always goes inside of a namespace package's __init__.py
+
+    from pkgutil import extend_path
+    __path__ = extend_path(__path__, __name__)
+    EOF
+    ## .pyc collision fix
+    find $out -regex ".*backports/__init__.pyc" -delete
+  '';
+
 in {

   inherit python bootstrapped-pip pythonAtLeast pythonOlder isPy26 isPy27 isPy33 isPy34 isPy35 isPy36 isPyPy isPy3k mkPythonDerivation buildPythonPackage buildPythonApplication;
@@ -1116,6 +1129,8 @@ in {
       sha256 = "713e7a8228ae80341c70586d1cc0a8caa5207346927e23d09dcbcaf18eadec80";
     };

+    postFixup = backportsCommon;
+
     meta = {
       description = "A backport of the get_terminal_size function from Python 3.3鈥檚 shutil.";
       homepage = https://github.com/chrippa/backports.shutil_get_terminal_size;

But still would be great of generic stripping is implemented. Also, ignoreCollisions = true is still another pragmatic solution.

Thank you for the investigation of the source of the collision. However, embedded file names are necessary for a lot of things, such as displaying source code in the backtraces and finding resources near files. The only issue is with __init__.py files of namespace packages, which is solved by ignoreCollisions = true.

@danbst How does python use these filenames in .pyc files?

@orivej

However, embedded file names are necessary for a lot of things, such as displaying source code in the backtraces and finding resources near files.

Hm. I've never seen these embedded names in callstacks. Those refer to build-time files, which don't even belong to /nix/store, essentially dangling links.

This is more of reproducibility. Hydra-built packages and locally-built should result into same derivation. .pyc files from Hydra contain /build... references, .pyc files locally contain /tmp/nix-build-... references.

How does python use these filenames in .pyc files?

I don't know, haven't investigated that deep.

The only issue is with __init__.py files of namespace packages, which is solved by ignoreCollisions = true

Actually, what I tried was to remove the ignoreCollisions = true here. Also, python.withPackages doesn't offer this parameter, but still suffers (see example in the OP post)

BTW, now I think my patch for backports is a good compromise. I should insert that postFixup phase into every backports package to mitigate collision for __init_.py (not .pyc) file.

This is more of reproducibility. Hydra-built packages and locally-built should result into same derivation. .pyc files from Hydra contain /build... references, .pyc files locally contain /tmp/nix-build-... references.

I believe this is because Hydra use Nix 1.12 (unstable) which builds in "/build", whereas Nix 1.11 builds in "/tmp/nix-build-*".

The collision is indeed due to https://github.com/NixOS/nixpkgs/issues/2412#issuecomment-340356859.

If we would want to strip this filename then I think we should patch the interpreter like we did for SOURCE_DATE_EPOCH. We may also choose to rewrite it considering we know $out. That would not solve the collision.

Great to know I wasn't first who discovered this issue! So, there are 2 issues here:

  1. Patch .pyc to refer to /nix/store paths instead of build paths. As far as I see co_filename is manipulated in modulefinder library. I haven't found other usecases, still it's good as a general cleanup.

  2. Collision for namepsace modules. There is a trivial fix, which I've posted above, that patches namespace package so it a) doesn't provide .pyc for namespace __init__ and b) ensures __init__.py won't collide with other namespace packages. We better not provide .pyc file here, but rather generate all missing .pyc in buildEnv.postBuild.

@FRidh but I see you marked problem 2 as "wontfix" in a linked issue. What do you think about my solution? This is kind of collision that can be solved at distro level.

The second point is OK, could you submit a pull request?

As for the first point, it should be simpler to compile .pyc after copying .py files into the store.

@orivej I guess there are reasons why files aren't compiled in store directly. For example, it's not possible to hook into setup.py to do exactly that.

Probably remove all .pyc in postInstall phase and then run compiler for whole $out? @FRidh you may know why that doesn't happen.

Probably remove all .pyc in postInstall phase and then run compiler for whole $out?

Yes, this is what I imagine (for all outputs).

@FRidh @orivej I've tried to do what I've described, but got into trouble. Some backported packages do declare namespace, which makes NOT creating __init__.py/pyc in first place. I'm a bit WTF on this, because absent __init__.py should declare namespace only in Python3... So investigations still required...

Regarding the namespace packages, there are two solutions I could think of:

  1. have an attribute, e.g. namespaces = [ "backports" ];. buildPythonPackage could remove the __init__.py, and buildEnv uses the info to create a __init__.py when it does the integration;
  2. create a hook that removes __init__.py and provides a __init__.py itself.

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.
Was this page helpful?
0 / 5 - 0 ratings

Related issues

domenkozar picture domenkozar  路  3Comments

grahamc picture grahamc  路  3Comments

copumpkin picture copumpkin  路  3Comments

ayyess picture ayyess  路  3Comments

yawnt picture yawnt  路  3Comments