Bazel: --action_env not forwarded through "tools" dependency

Created on 2 Nov 2017 · 9 Comments · Source: bazelbuild/bazel

Description of the problem:

Variables set by --action_env are not forwarded through the tools dependency of a rule. To generate a test example, paste the following lines into a terminal:

mkdir test
cd test
touch WORKSPACE
touch tool.sh
chmod +x tool.sh
cat > BUILD << EOF
genrule(
    name = "rule1",
    srcs = [],
    outs = ["out1.c"],
    cmd = "bash -c 'echo RULE1:TESTVAR=\$\$TESTVAR'; exit 1",
)

sh_binary(
    name = "tool",
    srcs = ["tool.sh"],
    # srcs = [":rule1"],
    data = [":rule1"],
)

genrule(
    name = "rule2",
    srcs = [],
    outs = ["out2.c"],
    cmd = "",
    tools = [":tool"],
)

genrule(
    name = "rule3",
    srcs = [],
    outs = ["out3.c"],
    cmd = "",
    tools = [":rule1"],
)

genrule(
    name = "rule4",
    srcs = [":rule1"],
    outs = ["out4.c"],
    cmd = "",
)

cc_binary(
    name = "main",
    srcs = [
        # ":rule1",
        # ":rule2",
        # ":rule3",
        # ":rule4",
    ],
    data = [
        # ":rule1",
        ":rule2",
        # ":rule3",
        # ":rule4",
        # ":tool",
    ],
)
EOF
bazel build :main --action_env TESTVAR=hello

outputs

RULE1:TESTVAR=

However, depending on which of ":rule1", ":rule2", ":rule3", ":rule4", ":tool" is uncommented at the end of BUILD, bazel may also output

RULE1:TESTVAR=hello

rule1 gives RULE1:TESTVAR=hello - rule1 is directly referenced
rule2 and rule3 give RULE1:TESTVAR= - rule1 is referenced via the tools option
rule4 and tool again give RULE1:TESTVAR=hello - rule1 is referenced via a middleman

If rule1 is referenced both from the "_tool_" and from another rule as a dependency, it may be executed with a different environment on each run, seemingly at random. Is this the expected behaviour?

Consistent behaviour is obtained by exporting the environment variable into bazel's own environment, i.e.,

export TESTVAR="hello world!"
bazel shutdown
bazel build :main

_NOTE:_ if there is a linking error, execute the last command (bazel build ...) once more.
_NOTE2:_ if more than one rule dependency is uncommented, the build may randomly output either result; run it several times to verify, e.g.,

for i in 1 2 3 4 5; do bazel build :main --action_env TESTVAR=hello 2> /dev/null | grep TESTVAR ; done

Environment info

Operating System: macOS, Linux
Bazel version: release 0.7.0, release 0.5.4

P2 misc > misc bug

didzis

👍6

Most helpful comment

What is the status of adding --host_action_env to Bazel?

crorvick on 17 Nov 2019

👍2

All 9 comments

Encountered the same problem trying to compile TensorFlow with CUDA and LD_LIBRARY_PATH: Nvidia's docs say using LD_LIBRARY_PATH is a way to set up compilation with CUDA. This mostly works with the bazel/TensorFlow build, until we hit a host-compiled tool that links against CUDA; the link then fails.

r4nt on 8 May 2018

I ran into the same bug while tracking down the cause of spurious rebuilds: even with --action_env PATH=something set, the build rebuilds all host dependencies every time your PATH changes :(

My repro:

dmg@aurora:/tmp/repro/test$ cat WORKSPACE 
dmg@aurora:/tmp/repro/test$ bazel clean --expunge; PATH=/usr/local/bin:/usr/bin:/bin bazel --bazelrc=/dev/null --nomaster_bazelrc build --explain=explain --verbose_explanations --action_env=PATH=/usr/local/bin:/usr/bin:/bin from_host ; PATH=/usr/local/bin:/usr/bin:/bin:/sbin bazel --bazelrc=/dev/null --nomaster_bazelrc build --explain=explain --verbose_explanations --action_env=PATH=/usr/local/bin:/usr/bin:/bin from_host; cat explain
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
Starting local Bazel server and connecting to it...
........
INFO: Analysed target //:from_host (8 packages loaded).
INFO: Found 1 target...
INFO: Writing explanation of rebuilds to 'explain'
Target //:from_host up-to-date:
  bazel-genfiles/from_host.txt
INFO: Elapsed time: 2.002s, Critical Path: 0.13s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed successfully, 3 total actions
INFO: Analysed target //:from_host (0 packages loaded).
INFO: Found 1 target...
INFO: Writing explanation of rebuilds to 'explain'
Target //:from_host up-to-date:
  bazel-genfiles/from_host.txt
INFO: Elapsed time: 0.454s, Critical Path: 0.12s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed successfully, 3 total actions
Build options: --explain=explain --verbose_explanations --action_env='PATH=/usr/local/bin:/usr/bin:/bin'
Executing action 'BazelWorkspaceStatusAction stable-status.txt': unconditional execution is requested.
Executing action 'Executing genrule //:env [for host]': Effective client environment has changed. Now using
  PATH=/usr/local/bin:/usr/bin:/bin:/sbin
.
Executing action 'Executing genrule //:from_host': One of the files has changed.

(With bazel 0.15 and Linux)

Catting the output file shows that the PATH used is indeed the one from the environment, not the one from --action_env :( Do that with protobuf and you get to recompile protobuf every time you open a new window with a different environment.

damienmg on 1 Jul 2018

Now that --incompatible_strict_action_env has been flipped, this has become a more widespread problem, and notably impacts bootstrapping bazel if LD_LIBRARY_PATH or similar is required.

asuffield on 16 Jan 2019

This has now become a blocking issue for us too. We make extensive use of genrules to allow third-party build systems of complicated dependencies to do their thing. We also use --action_env as a way of configuring the behaviour of these genrules depending on environment (e.g., our CI image has stricter memory requirements). Sometimes these genrules need to depend on one another and the configuration then breaks.

This doesn't seem like it would be a hard fix for somebody familiar with the codebase?

johnmarkwayve on 22 Jan 2019

There is a workaround:
Write your own custom Bazel rule that uses ctx.actions.run to execute the shell script, and make sure to set the use_default_shell_env parameter to True; this parameter controls whether the --action_env variables are forwarded to the shell script.
If you need to access the environment variables within the custom rule itself, you can use ctx.configuration.
See docs: https://docs.bazel.build/versions/master/skylark/lib/actions.html#run
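
A minimal sketch of what such a rule might look like, for anyone trying this. The file name env_run.bzl, the rule name env_run and the tool attribute are made up for illustration, and TESTVAR is just the variable from the original report. I have not confirmed that this forwards the variables for actions built in the host configuration, so treat it as a starting point rather than a verified fix:

```starlark
# env_run.bzl (hypothetical file name)

def _env_run_impl(ctx):
    # Declare a single output file named after the target.
    out = ctx.actions.declare_file(ctx.label.name + ".txt")

    # Optional: read a variable inside the rule itself.
    # default_shell_env is a dict; the key may be absent, hence the default.
    testvar = ctx.configuration.default_shell_env.get("TESTVAR", "")

    ctx.actions.run(
        outputs = [out],
        executable = ctx.executable.tool,
        # The tool is expected to write its result to the first argument.
        arguments = [out.path, testvar],
        # Run the action with the configuration's default shell environment,
        # which is where --action_env values are expected to show up.
        use_default_shell_env = True,
    )
    return [DefaultInfo(files = depset([out]))]

env_run = rule(
    implementation = _env_run_impl,
    attrs = {
        "tool": attr.label(
            mandatory = True,
            executable = True,
            # "host" matches the Bazel versions discussed in this thread;
            # newer releases use "exec" instead.
            cfg = "host",
        ),
    },
)
```

It would then replace the genrule in BUILD, assuming the script behind :tool actually creates the file passed as its first argument (the empty tool.sh from the original report does not, so the action would fail with a missing output):

```starlark
load(":env_run.bzl", "env_run")

env_run(
    name = "rule1_custom",  # hypothetical target name
    tool = ":tool",
)
```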

RNabel on 8 Feb 2019

👍1

As proposed by @ulfjack at https://github.com/bazelbuild/bazel/issues/6473#issuecomment-441230398, I think we should add --host_action_env (maybe --host_test_env as well). Is there any progress on this?
This has been a problem for many TensorFlow users (e.g., tensorflow/tensorflow#22395).

@lberki @aehlig

meteorcloudy on 19 Mar 2019

I believe I'm running into this in relation to #7899 on macOS. The default path that bazel uses does not find my installation of python3 (macports, /opt/local/bin), and I cannot use action_env to make bazel find it for the actions that run without use_default_shell_env, which I cannot fix as they are in dependencies.

I currently have a 'workaround' with --extra_toolchains=@bazel_tools//tools/python:autodetecting_toolchain_nonstrict but that is just putting off migrating to the correct path.