(https://groups.google.com/forum/#!topic/bazel-discuss/5r_Ajw_j-ZI for context)
The current pkg_* rules don't package runfiles along with C++ binaries. This behavior makes it difficult to deploy an entire C++ application to machines that don't have source access.
Can we update the packaging rules to include runfiles along with binaries?
Operating System:
MacOS 10.13.2
Bazel version (output of bazel info release):
release 0.8.1-homebrew
cc @lberki @ventrescadeatun
Lukacs, do you know if this is by design? It looks broken to me. I would expect pkg_tar to tar everything that's needed to execute the cc_binary. Am I wrong? :)
Yep, this is by design because the runfiles are not always necessary.
We internally have a pkg_runfiles rule that packages the runfiles of a binary, which should simply be open sourced.
Major +1 for this
The same is true for Python. At area17, we want to use this for deployment, but that is currently difficult because there isn't a good way to package everything we need together (for both py and cc).
For example, for our python targets, we often depend on pip targets (which are installed within bazel via the pip_import rule). Those pip imports aren't packaged.
This would be a great feature!
@laszlocsomor Is this something your work on runfiles will make simple(r)? :)
@mhlopko : The runfiles library will make it very easy to ~_use_~ _look up_ runfiles, once they are available one way or another. _Packaging_ the runfiles is a separate issue; my work is unrelated to that.
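To make "look up" concrete, here is a rough Python sketch of what such a lookup amounts to. It is not the actual runfiles library: the function name rlocation and the simple "key, space, path" manifest format are assumptions for illustration. The environment variable names RUNFILES_MANIFEST_FILE and RUNFILES_DIR are the ones Bazel's runfiles libraries use.

```python
import os


def rlocation(path, env=None):
    """Resolve a runfiles-relative path to a path on disk, or None.

    Illustrative only: prefers a manifest file if one is advertised,
    otherwise falls back to a runfiles directory.
    """
    env = os.environ if env is None else env

    manifest = env.get("RUNFILES_MANIFEST_FILE")
    if manifest:
        with open(manifest) as f:
            for line in f:
                # Assumed format: "<runfiles path> <path on disk>"
                key, _, value = line.rstrip("\n").partition(" ")
                if key == path:
                    return value
        return None

    runfiles_dir = env.get("RUNFILES_DIR")
    if runfiles_dir:
        return os.path.join(runfiles_dir, path)

    return None
```

Either way, the lookup only works if the runfiles were shipped in the first place, which is the packaging problem this issue is about.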
Ok, thanks for the info!
Thanks for looking into this, @mhlopko and @lberki.
Does pkg_runfiles include the associated binary as well? If not, this feature request is for a slightly different behavior: to provide a packaging rule that produces a self-contained package of a binary and its runfiles.
Sorry for the silence. I just looked into the code and it looks like you should be able to write a skylark rule that will provide this behavior. https://docs.bazel.build/versions/master/skylark/lib/runfiles.html and https://docs.bazel.build/versions/master/skylark/rules.html#runfiles should help you get started.
It looks like some commits have made it into 0.15 that fully resolve this issue for C++ for me (https://github.com/bazelbuild/bazel/commit/f90ed652e223fffdf3f64cf1d9f49663be540b18#diff-73cc3e84377e7c63ef4406039e060016), but they don't necessarily fix it for Python.
The new include_runfiles parameter to the pkg_tar rule copies over all the required files for both C++ and Python, but it doesn't update the Python runfiles paths to match. The C++ paths work fine.
I created a repository that demonstrates the issue here https://github.com/curtismuntz/bazel_pkg_tar
After building via bazel build src:foo_tar and extracting the produced tarball, the following tree structure exists:
$ tree -L 3
.
├── opt
│   ├── foo
│   └── foo.py
└── pypi__numpy_1_13_1
    ├── numpy
    │   ├── add_newdocs.py
    │   ├── compat
    │   ├── __config__.py
    │   ├── core
    │   ├── ctypeslib.py
    │   ├── _distributor_init.py
    │   ├── distutils
    │   ├── doc
    │   ├── dual.py
    │   ├── f2py
    │   ├── fft
    │   ├── _globals.py
    │   ├── _import_tools.py
    │   ├── __init__.py
    │   ├── lib
    │   ├── linalg
    │   ├── ma
    │   ├── matlib.py
    │   ├── matrixlib
    │   ├── polynomial
    │   ├── random
    │   ├── setup.py
    │   ├── testing
    │   ├── tests
    │   └── version.py
    ├── numpy-1.13.1.data
    │   └── scripts
    └── numpy-1.13.1.dist-info
        ├── DESCRIPTION.rst
        ├── METADATA
        ├── metadata.json
        ├── RECORD
        ├── top_level.txt
        └── WHEEL
Attempting to run opt/foo produces:
$ ./opt/foo
Traceback (most recent call last):
  File "./foo", line 203, in <module>
    Main()
  File "./foo", line 139, in Main
    module_space = FindModuleSpace()
  File "./foo", line 86, in FindModuleSpace
    raise AssertionError('Cannot find .runfiles directory for %s' % sys.argv[0])
AssertionError: Cannot find .runfiles directory for ./foo
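The assertion is raised by the launcher script generated for the py_binary, which expects a runfiles tree near itself. A simplified Python sketch of that kind of lookup follows; the function name find_runfiles_dir and the exact search order are illustrative assumptions, not the actual stub code.

```python
import os


def find_runfiles_dir(argv0):
    """Return the runfiles directory for a launcher path, or None.

    Simplified assumption: the tree is either a sibling directory named
    '<launcher>.runfiles', or the launcher itself lives inside some
    '<name>.runfiles' directory.
    """
    launcher = os.path.abspath(argv0)

    # Case 1: sibling directory, e.g. opt/foo -> opt/foo.runfiles
    candidate = launcher + ".runfiles"
    if os.path.isdir(candidate):
        return candidate

    # Case 2: the launcher is already inside a '<name>.runfiles' tree.
    marker = ".runfiles" + os.sep
    index = launcher.find(marker)
    if index != -1:
        return launcher[: index + len(".runfiles")]

    return None
```

In the extracted tarball above there is no opt/foo.runfiles directory, and opt/foo is not inside one, so a lookup like this finds nothing.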
And trying to call opt/foo.py directly:
$ ./opt/foo.py
Traceback (most recent call last):
  File "./foo.py", line 2, in <module>
    import numpy as np
ImportError: No module named 'numpy'
Pretty sure this is a bazel core py_binary issue, but is the plan for pkg_tar to provide this interface via include_runfiles? Or is it best practice to implement a skylark rule for this functionality?
CC @c4urself
@lberki : how hard would it be to opensource pkg_runfiles? The question came up again: https://stackoverflow.com/questions/52823983
Actually, it should be easy to implement pkg_runfiles in Starlark:
- DefaultInfo.data_runfiles (or DefaultInfo.default_runfiles?) contains the File objects for the runfiles
- FilesToRunProvider.runfiles_manifest contains File object for the runfiles manifest
- pkg_runfiles rule could create a ctx.actions.run / ctx.actions.run_shell with the binary/script, pass all the File objects for the runfiles and the manifest as inputs, and expect just a tar or zip as the output.
WDYT?
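For illustration, the script that such an action runs could look roughly like the Python sketch below. It assumes the rule hands over the binary plus (runfiles-relative path, file on disk) pairs; build_runfiles_tar and its arguments are hypothetical names, and the `<binary>.runfiles/` layout is an assumption about what the launcher expects.

```python
import os
import tarfile


def build_runfiles_tar(output, binary, runfiles, prefix=""):
    """Write `binary` and its runfiles into a tar archive at `output`.

    binary:   path of the executable on disk.
    runfiles: iterable of (runfiles_relative_path, path_on_disk) pairs.
    prefix:   directory inside the archive to place everything under.
    """
    name = os.path.basename(binary)
    binary_arcname = os.path.join(prefix, name)
    runfiles_root = binary_arcname + ".runfiles"

    # dereference=True stores file contents instead of symlinks, so the
    # archive does not point back into the build tree.
    with tarfile.open(output, "w", dereference=True) as tar:
        tar.add(binary, arcname=binary_arcname)
        # Sort for a stable member order between runs.
        for rel_path, src in sorted(runfiles):
            tar.add(src, arcname=os.path.join(runfiles_root, rel_path))
```

This sketch leaves out empty placeholder files and timestamp normalization, both of which a real rule would need to handle.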
FilesToRunProvider.runfiles_manifest contains File object for the runfiles manifest
I think this attribute should be removed from Starlark. The manifest contains absolute paths, so depending on this artifact is a good way to make your rule non-reproducible across machines. I raised this point on the mailing list, but it didn't generate much interest.
What would be helpful would be extending the Starlark runfiles API as I have proposed to allow complete introspection of runfiles objects. With my CL merged, it should be possible for Starlark to construct a runfiles tree identical to the one build-runfiles makes in whatever package is desired.
That's a good point.
Your proposal sounds good to me in general. My only question is how to distinguish runfiles types -- normal symlinks vs. empty files (__init__.py) vs. whatever else there could be? Implementing a pkg_runfiles rule would need that ability.
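One way to picture the distinction is a small data model in which each runfiles entry is either a symlink to a real file or an empty placeholder. The Python sketch below uses hypothetical names (RunfileEntry, materialize); it is not an existing Bazel or Starlark API, only an illustration of what a pkg_runfiles implementation would need to tell apart.

```python
import os
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RunfileEntry:
    path: str                     # location inside the runfiles tree
    target: Optional[str] = None  # real file to link to; None = empty file

    @property
    def is_empty_file(self) -> bool:
        return self.target is None


def materialize(entries: Iterable[RunfileEntry], root: str) -> None:
    """Create a runfiles tree under `root` from the given entries."""
    for entry in entries:
        dest = os.path.join(root, entry.path)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        if entry.is_empty_file:
            # e.g. an __init__.py that exists only to mark a package
            open(dest, "w").close()
        else:
            os.symlink(entry.target, dest)
```

Any further entry kinds would need their own case here, which is why the API has to expose the kind and not only the path.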
@benjaminp : let's continue this discussion on the thread: https://groups.google.com/d/msg/bazel-dev/uCfpNnVLJa4/Uomy07-iBgAJ
@laszlocsomor : depends on what you want to do -- we seem to have an API so that one can look into runfiles trees to see which symlinks there are and where they point. There are all sorts of little wrinkles, though, that need to be considered. They are mostly for Google-internal awful hacks we had the sense not to contaminate Bazel with, but still, some auditing will be required.