Variables set by --action_env are not forwarded through the tools dependency of a rule. To generate a test example, paste the following lines into a terminal:
mkdir test
cd test
touch WORKSPACE
touch tool.sh
chmod +x tool.sh
cat > BUILD << EOF
genrule(
    name = "rule1",
    srcs = [],
    outs = ["out1.c"],
    cmd = "bash -c 'echo RULE1:TESTVAR=\$\$TESTVAR'; exit 1",
)
sh_binary(
    name = "tool",
    srcs = ["tool.sh"],
    # srcs = [":rule1"],
    data = [":rule1"],
)
genrule(
    name = "rule2",
    srcs = [],
    outs = ["out2.c"],
    cmd = "",
    tools = [":tool"],
)
genrule(
    name = "rule3",
    srcs = [],
    outs = ["out3.c"],
    cmd = "",
    tools = [":rule1"],
)
genrule(
    name = "rule4",
    srcs = [":rule1"],
    outs = ["out4.c"],
    cmd = "",
)
cc_binary(
    name = "main",
    srcs = [
        # ":rule1",
        # ":rule2",
        # ":rule3",
        # ":rule4",
    ],
    data = [
        # ":rule1",
        ":rule2",
        # ":rule3",
        # ":rule4",
        # ":tool",
    ],
)
EOF
bazel build :main --action_env TESTVAR=hello
outputs
RULE1:TESTVAR=
However, depending on which entries are uncommented at the end of BUILD (":rule1", ":rule2", ":rule3", ":rule4", ":tool"), bazel may also output
RULE1:TESTVAR=hello
rule1 gives RULE1:TESTVAR=hello - rule1 is directly referenced
rule2 and rule3 give RULE1:TESTVAR= - rule1 is referenced via the tools option
rule4 and tool again give RULE1:TESTVAR=hello - rule1 is referenced via a middleman
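To make the two outputs concrete, here is a minimal sketch that runs the same command string as rule1 outside Bazel, once with TESTVAR in the environment and once without. It only mimics the shell an action would start and says nothing about how Bazel builds the action environment; the variable names `probe`, `with_var`, and `without_var` are illustrative.

```shell
# Same command string as rule1's cmd, after Bazel turns $$ into $.
probe='echo RULE1:TESTVAR=$TESTVAR'

# TESTVAR present in the environment (what --action_env is meant to achieve).
with_var=$(TESTVAR=hello bash -c "$probe")

# TESTVAR absent, matching what is observed when rule1 is reached via tools.
without_var=$(unset TESTVAR; bash -c "$probe")

echo "$with_var"
echo "$without_var"
```

The first line printed corresponds to the "directly referenced" case, the second to the "referenced via tools" case.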
If rule1 is referenced from the "_tool_" and from another rule as a dependency, it may be executed with a different environment on each run, seemingly at random. Is this the expected behaviour?
Consistent behaviour is obtained by exporting the environment variable into bazel's own environment, i.e.,
export TESTVAR="hello world!"
bazel shutdown
bazel build :main
_NOTE:_ in case there is a linking error, execute the last command (bazel build ...) once more.
_NOTE2:_ if more than one rule dependency is uncommented, it may randomly output both results; execute the build multiple times to verify, e.g.,
for i in 1 2 3 4 5; do bazel build :main --action_env TESTVAR=hello 2> /dev/null | grep TESTVAR ; done
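To see how often each variant shows up across runs, the loop's output can be tallied. The helper below is a hypothetical sketch (`tally_testvar` is not part of Bazel), and the sample input merely simulates three runs rather than reproducing real Bazel output.

```shell
# Hypothetical helper: count how often each RULE1:TESTVAR=... line appears on stdin.
tally_testvar() {
  grep -o 'RULE1:TESTVAR=.*' | sort | uniq -c
}

# Simulated output of three runs, standing in for real Bazel output.
sample=$(printf 'RULE1:TESTVAR=hello\nRULE1:TESTVAR=\nRULE1:TESTVAR=hello\n' | tally_testvar)
echo "$sample"
```

With Bazel available, the loop above could be piped into the helper by appending `| tally_testvar` after `done`.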
Operating System: macOS, Linux
Bazel version: release 0.7.0, release 0.5.4
Encountered the same problem trying to compile TensorFlow with CUDA and LD_LIBRARY_PATH: Nvidia's docs say using LD_LIBRARY_PATH is a way to set up compilation with CUDA. This mostly works with the bazel/TensorFlow build, until we hit a host-compiled tool that links against CUDA; the link then fails.
I encountered the same bug while trying to find the cause of spurious rebuilds: with --action_env PATH=something set, the build still rebuilds all host dependencies every time your PATH changes :(
My repro:
dmg@aurora:/tmp/repro/test$ cat WORKSPACE
dmg@aurora:/tmp/repro/test$ bazel clean --expunge; PATH=/usr/local/bin:/usr/bin:/bin bazel --bazelrc=/dev/null --nomaster_bazelrc build --explain=explain --verbose_explanations --action_env=PATH=/usr/local/bin:/usr/bin:/bin from_host ; PATH=/usr/local/bin:/usr/bin:/bin:/sbin bazel --bazelrc=/dev/null --nomaster_bazelrc build --explain=explain --verbose_explanations --action_env=PATH=/usr/local/bin:/usr/bin:/bin from_host; cat explain
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
Starting local Bazel server and connecting to it...
........
INFO: Analysed target //:from_host (8 packages loaded).
INFO: Found 1 target...
INFO: Writing explanation of rebuilds to 'explain'
Target //:from_host up-to-date:
bazel-genfiles/from_host.txt
INFO: Elapsed time: 2.002s, Critical Path: 0.13s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed successfully, 3 total actions
INFO: Analysed target //:from_host (0 packages loaded).
INFO: Found 1 target...
INFO: Writing explanation of rebuilds to 'explain'
Target //:from_host up-to-date:
bazel-genfiles/from_host.txt
INFO: Elapsed time: 0.454s, Critical Path: 0.12s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed successfully, 3 total actions
Build options: --explain=explain --verbose_explanations --action_env='PATH=/usr/local/bin:/usr/bin:/bin'
Executing action 'BazelWorkspaceStatusAction stable-status.txt': unconditional execution is requested.
Executing action 'Executing genrule //:env [for host]': Effective client environment has changed. Now using
PATH=/usr/local/bin:/usr/bin:/bin:/sbin
.
Executing action 'Executing genrule //:from_host': One of the files has changed.
(With bazel 0.15 and Linux)
Catting the output file indeed shows that the PATH used is the one from the env, not the one from --action_env :( Do that with protobuf and you get to recompile protobuf every time you open a new window with a changed environment.
Now that --incompatible_strict_action_env has been flipped, this has become a more widespread problem, and notably impacts bootstrapping bazel if LD_LIBRARY_PATH or similar is required.
This has now become a blocking issue for us too. We make extensive use of genrules to allow third-party build systems of complicated dependencies to do their thing. We also use --action_env as a way of configuring the behaviour of these genrules depending on environment (e.g., our CI image has stricter memory requirements). Sometimes these genrules need to depend on one another and the configuration then breaks.
This doesn't seem like it would be a hard fix for somebody familiar with the codebase?
There is a workaround:
You write your own custom bazel rule which uses ctx.actions.run to execute the shell script, and make sure to set the use_default_shell_env parameter to True - this parameter controls whether to forward the --action_env env vars to the shell script.
If you need to access the environment variables within the custom rule itself you can use ctx.configuration.
See docs: https://docs.bazel.build/versions/master/skylark/lib/actions.html#run
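A minimal sketch of that workaround, written in the same paste-into-terminal style as the original repro. Everything named here is hypothetical (the `env_rule_demo` directory, `env_rule.bzl`, the `env_script` rule, the `rule1_env` target, and this `tool.sh`), and `cfg = "exec"` is an assumption about the Bazel release in use (older releases spell it `"host"`). It has not been checked against any particular Bazel version, and it only shows the shape of a rule that passes `use_default_shell_env = True` to `ctx.actions.run`.

```shell
# Hypothetical demo directory; all file, rule, and target names are illustrative.
demo=env_rule_demo
mkdir -p "$demo"
touch "$demo/WORKSPACE"

# The script to run: writes the TESTVAR it sees into the file given as $1.
cat > "$demo/tool.sh" << 'EOF'
#!/bin/bash
echo "TOOL:TESTVAR=$TESTVAR" > "$1"
EOF
chmod +x "$demo/tool.sh"

# Custom rule that runs the script through ctx.actions.run.
cat > "$demo/env_rule.bzl" << 'EOF'
def _env_script_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.run(
        outputs = [out],
        executable = ctx.executable.script,
        arguments = [out.path],
        # Run the action with the default shell environment, which is
        # what the workaround relies on to pick up --action_env values.
        use_default_shell_env = True,
    )
    return [DefaultInfo(files = depset([out]))]

env_script = rule(
    implementation = _env_script_impl,
    attrs = {
        "script": attr.label(
            executable = True,
            cfg = "exec",  # assumption: older Bazel releases use "host"
            allow_single_file = True,
            mandatory = True,
        ),
    },
)
EOF

# BUILD file that instantiates the rule in place of a genrule.
cat > "$demo/BUILD" << 'EOF'
load(":env_rule.bzl", "env_script")

env_script(
    name = "rule1_env",
    script = "tool.sh",
)
EOF
```

From inside `env_rule_demo`, running `bazel build :rule1_env --action_env TESTVAR=hello` and inspecting the generated `rule1_env.txt` should show whether the variable reached the script.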
As proposed by @ulfjack at https://github.com/bazelbuild/bazel/issues/6473#issuecomment-441230398, I think we should add --host_action_env (maybe --host_test_env as well). Is there any progress on this?
This has been a problem for many TensorFlow users. (eg. tensorflow/tensorflow#22395)
@lberki @aehlig
I believe I'm running into this in relation to #7899 on macOS. The default path that bazel uses does not find my installation of python3 (macports, /opt/local/bin), and I cannot use action_env to make bazel find it for the actions that run without use_default_shell_env, which I cannot fix as they are in dependencies.
I currently have a 'workaround' with --extra_toolchains=@bazel_tools//tools/python:autodetecting_toolchain_nonstrict but that is just putting off migrating to the correct path.
What is the status of adding --host_action_env to Bazel?
I recently experienced something very similar in TensorFlow, but there it was due to the use of exec_tools, while tools works.