When building tarballs with pkg_tar
, the output tarball has a 0555 mode. But tarballs are generally not going to be executable. I think the default mode should be 0666 or, if you want to be very conservative 0444. In both those cases, it would be nice to respect the user's umask, as long as it did not interfere with reproducible builds. An alternative would be to provide an explicit pkg_tar
parameter that functions similarly to the existing mode
(although mode
is for files packaged inside the tarball, not for the tarball itself, #2925).
Claiming that output tarballs are executable makes it harder to auto-complete sibling paths that are executable. And if you attempt to execute the tarball, I think an EPERM is more obvious than "cannot execute binary file".
$ bazel version
Build label: 0.15.0- (@non-git)
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Jun 27 14:43:20 2018 (1530110600)
Build timestamp: 1530110600
Build timestamp as int: 1530110600
$ cat WORKSPACE
workspace(name = "test")
$ cat BUILD.bazel
load("@bazel_tools//tools/build_defs/pkg:pkg.bzl", "pkg_tar")
pkg_tar(
name = "test",
srcs = glob(
["*.txt"],
exclude_directories = 1,
),
mode = "0666",
)
$ touch text.txt
$ bazel build test
Starting local Bazel server and connecting to it...
........
INFO: Analysed target //:test (16 packages loaded).
INFO: Found 1 target...
Target //:test up-to-date:
bazel-bin/test.tar
INFO: Elapsed time: 1.360s, Critical Path: 0.11s
INFO: 1 process: 1 processwrapper-sandbox.
INFO: Build completed successfully, 6 total actions
$ ls -l bazel-bin/test.tar
-r-xr-xr-x. 1 wking wking 10240 Jul 12 21:49 bazel-bin/test.tar
RHEL 7.5 (Linux).
bazel info release
?$ bazel info release
release 0.15.0- (@non-git)
bazel info release
returns "development version" or "(@non-git)", tell us how you built Bazel.Installed from COPR.
The output of pkg_tar
is an artefact as any other artefact generated by bazel hence gets the same permissions. So, while it is questionable that outputs always have permissions 0555
, this has nothing to do with pkg_tar
. The most simple way to demonstrate that generic bazel
behavior is a BUILD
file with the following contents.
genrule(
name = "hello",
outs = ["hello.txt"],
cmd = "echo Hello World > $@ && chmod 0444 $@",
)
Even then, bazel build :hello
will generate a file with permissions 0555
.
why do artifacts have a permission of 555, even if they aren't executable? why not use 444 unless it is a directory or marked as an executable?
That's a good question, but changing it is going to be a breaking change for someone, so we are not going to change it without careful reflection.
Is there any compelling reason why 555 modes would be harmful?
@aiuto Breaking in theory or breaking in practice? Are there known tools/systems that expect the execution bit on non-executable files?
We follow Hyrum's Law - that someone will end up depending on every observable behavior of your code, documented or not. So we assume that theory is equivalent to practice. I'm sure there is someone who generates a bash script with one rule, then passes it as data to another genrule and then expects it to run. So changing it to 444 would break them. This is not so far fetched.
What I meant was that if we are going to make any behavior change, let's figure out the real needs first. Is executable/non-executable sufficient? Do we need files which are non-readable by other users? Do we want to be able to set arbitrary modes for output, so that if we repackaged that output with pkg_deb it would have the "right" file modes for the package?
I just found this, and I have a real need. I was setting the modes explicitly to what I needed them to be and they are reverted to 555, which is breaking my rules. Why are the modes adjusted at all? Can we have an opt-out option? Maybe another magic flag in the execution_requirements
dict on a ctx.actions.run()
that prevents the automatic chmod?
Fixing this has nothing to do with local execution and is a broader change to how Bazel actually treats artifacts. Changing teams.
I haven't heard a compelling use case for this (I'd be curious to hear what @alanfalloon experienced with broken rules). Given the complexity of the change, keeping at a low priority.
Here are some of the ways that this behaviour has broken my rules:
Besides those concrete examples, there is also the case that the current behaviour is both wrong and surprising. Modes are just as much a part of the file as the content -- if not the entire POSIX mode, then certainly the executable bits. Chmoding my outputs after an action is just as wrong as normalizing newlines on every output. As a rule author, it breaks basic assumptions about how actions are executed, and requires me to jump through hoops to work around this issue.
It is somewhat understandable that something like the remote cache is limited to a subset of POSIX behaviour (not preserving xattrs for example) so cached artifacts could lose some permission information. But even in those situations, basic permissions should be preserved: for example, even git preserves executable information in its tree representations. That said, local actions should preserve anything the local filesystems supports so you can always no-remote
an action if you need more detailed permissions for its outputs.
I also don't particularly find the Hyrums law argument very convincing: if people were depending on outputs magically becoming executable, then their rules are wrong -- this is just hiding it. We should gate the fix behind an incompatible-flag for a couple of releases to give them a chance to fix their rules, but its not a reason to avoid fixing this bug.
cc @alexjski
Most helpful comment
Here are some of the ways that this behaviour has broken my rules:
Besides those concrete examples, there is also the case that the current behaviour is both wrong and surprising. Modes are just as much a part of the file as the content -- if not the entire POSIX mode, then certainly the executable bits. Chmoding my outputs after an action is just as wrong as normalizing newlines on every output. As a rule author, it breaks basic assumptions about how actions are executed, and requires me to jump through hoops to work around this issue.
It is somewhat understandable that something like the remote cache is limited to a subset of POSIX behaviour (not preserving xattrs for example) so cached artifacts could lose some permission information. But even in those situations, basic permissions should be preserved: for example, even git preserves executable information in its tree representations. That said, local actions should preserve anything the local filesystems supports so you can always
no-remote
an action if you need more detailed permissions for its outputs.I also don't particularly find the Hyrums law argument very convincing: if people were depending on outputs magically becoming executable, then their rules are wrong -- this is just hiding it. We should gate the fix behind an incompatible-flag for a couple of releases to give them a chance to fix their rules, but its not a reason to avoid fixing this bug.