Changing the --output_base parameter breaks the build with very obscure error messages.
When a build is run from some IDEs (VSCode or IntelliJ), they tend to set their own --output_base, different from the one the user configured for command-line builds. This should not cause any problems, but unfortunately the build appears to break after --output_base changes.
Take any simple Bazel project. Build it with bazel --output_base=C:/blah1 build ...
and it builds successfully. Then try to build it with bazel --output_base=C:/blah2 build ...
and this time it breaks with error messages that don't make any sense.
The problem does not seem to be OS dependent.
bazel info release
2.0.0
I figured out that the problem is most probably caused by the stale "courtesy" symlink in the WORKSPACE folder. After the first build, Bazel creates a "courtesy" symlink such as bazel-<workspace_name> in the workspace folder, and it points inside output_base. When we issue a second build command with a different output_base, Bazel is smart enough to notice that and spawn a second build server process, but the stale bazel-<workspace_name> symlink stays and still points inside the old output_base, which seems to confuse Bazel. Running bazel clean between builds, or simply deleting that symlink, fixes the problem. It looks like when Bazel discovers a change in startup parameters that warrants spawning a new build server, it should at the same time remove the existing courtesy symlinks, as they are no longer valid and cause the build to fail.
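The symlink cleanup described above could be scripted roughly as follows. This is only a sketch: the function name remove_stale_symlink is made up for illustration, and it assumes the courtesy symlink is named bazel-<name of the workspace directory> (as with bazel-ws in /tmp/ws) and that its target path starts with the output base it belongs to.

```shell
# Hypothetical helper, not part of Bazel: delete the bazel-<dir> courtesy
# symlink when it points outside the output base about to be used.
# Assumption: the symlink is named after the workspace directory.
remove_stale_symlink() {
  ws_dir=$1        # workspace root, e.g. /tmp/ws
  output_base=$2   # output base for the next build, e.g. /tmp/two
  link="$ws_dir/bazel-$(basename "$ws_dir")"

  # Nothing to do if the symlink does not exist.
  [ -L "$link" ] || return 0

  target=$(readlink "$link")
  case "$target" in
    "$output_base"/*) ;;   # still points into the current output base: keep it
    *) rm -- "$link" ;;    # points somewhere else: remove the stale link
  esac
}

# Example usage before switching output bases:
#   remove_stale_symlink /tmp/ws /tmp/two
#   bazel --output_base=/tmp/two build //...
```

Only the symlink itself is removed, so the old output base and its cached results stay intact.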
Reproduced with Bazel 2.0:
03:30:25 /tmp/ws
$ cat BUILD
genrule(
name = "g",
outs = ["g.txt"],
cmd = "touch $@",
)
03:30:28 /tmp/ws
$ cat WORKSPACE
03:30:31 /tmp/ws
$ bazel --output_base=/tmp/one build //... && bazel --output_base=/tmp/two build //...
INFO: Analyzed target //:g (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:g up-to-date:
bazel-bin/g.txt
INFO: Elapsed time: 0.119s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
ERROR: error loading package 'bazel-ws/external/bazel_tools/tools/build_defs/pkg': Label '//tools/python:private/defs.bzl' is invalid because 'tools/python' is not a package; perhaps you meant to put the colon here: '//:tools/python/private/defs.bzl'?
INFO: Elapsed time: 0.219s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (12 packages loaded)
currently loading: bazel-ws/external/bazel_tools/tools/test/CoverageOutp\
utGenerator/java/com/google/devtools/coverageoutputgenerator ... (2 packages\
)
Fetching @rules_java; fetching
I'm seeing something very similar, but even more basic when using the latest pre-release vscode-bazel (which sets --output_base by default). vscode-bazel runs this command:
bazel --output_base=/tmp/ee79067f914abe58284ab7a8abdc7f7d query ...:* --output=package
It fails with the following output:
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: Call stack for the definition of repository 'rules_cc' which is a http_archive (rule definition at /tmp/ee79067f914abe58284ab7a8abdc7f7d/external/bazel_tools/tools/build_defs/repo/http.bzl:292:16):
- /tmp/ee79067f914abe58284ab7a8abdc7f7d/external/bazel_tools/tools/build_defs/repo/utils.bzl:205:9
- /DEFAULT.WORKSPACE.SUFFIX:302:1
ERROR: error loading package 'bazel-sdk/external/bazel_tools/third_party/jarjar': Label '//tools/jdk:remote_java_tools_aliases.bzl' is invalid because 'tools/jdk' is not a package; perhaps you meant to put the colon here: '//:tools/jdk/remote_java_tools_aliases.bzl'?
If I remove the --output_base, the query succeeds just fine. If I change the --output_base to --output_user_root, the query succeeds just fine. It's almost like bazel cannot modify things in install_base when outputBase has been specified like that. Perhaps the query is sandboxed and overriding output_base is causing install_base to not be in the sandbox? Just wild-guessing.
For reference, I've seen a number of different but very similar errors and they all involve downloading http_archive rules (or similar download rules) for setting up a workspace prior to running a query. I think the intent of output_base was to not disturb the main workspace environment (and thus be able to run concurrently), but maybe the install_base is write-only in that case? For what it's worth the install_base HAD been populated with the downloads prior to running the query with --output_base, so I'm guessing the need to re-download is because changing output_base invalidated something. Maybe that's a clue?
Is anyone actively working on this? This is a pretty annoying bug that affects how e.g. Jenkins build nodes have to be spawned, because they cannot handle multiple executors being isolated with output_base. I'd be willing to investigate a fix but am not familiar enough with Bazel's codebase to even know where to start looking. Any pointers would be appreciated :)
I found a workaround, but I don't know why it works:
bazel-app
echo "dummy" >| bazel-app
failed to create one or more convenience symlinks for prefix 'bazel-'
I tried setting --symlink_prefix and --experimental_use_sandboxfs, but neither worked. It's as if 'bazel-' is hardcoded somewhere.
I hope that helps.
The basic problem here is that the bazel-$WORKSPACE convenience symlink is blindly traversed by Bazel: you can even do bazel build //bazel-$WORKSPACE/... and it will load the labels without any issues, and even build the targets there if you're lucky. When you switch output bases, the symlinks are broken, but the fundamental issue is visiting that convenience symlink in the first place.
My suggestion for a workaround for now is to do echo "bazel-$WORKSPACE" >> .bazelignore. The .bazelignore file in the root of the workspace tells Bazel not to consider those directories. Of course, Bazel should be smart enough to not consider them on its own.
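Since a plain echo with >> adds another line every time it is run, the .bazelignore workaround might be wrapped so it can be repeated safely, for example from a CI setup step. This is a sketch only: the function name add_to_bazelignore is hypothetical, and it assumes $WORKSPACE in the suggestion stands for the name of the workspace directory.

```shell
# Hypothetical helper, not part of Bazel: add the bazel-<dir> convenience
# symlink to .bazelignore unless it is already listed.
# Assumption: the symlink is named after the workspace directory.
add_to_bazelignore() {
  ws_dir=$1   # workspace root, e.g. /tmp/ws
  entry="bazel-$(basename "$ws_dir")"
  ignore_file="$ws_dir/.bazelignore"

  # -x matches whole lines, -F treats the entry as a fixed string, -q is quiet.
  # A missing .bazelignore makes grep fail, so the entry is appended then too.
  if ! grep -qxF -- "$entry" "$ignore_file" 2>/dev/null; then
    printf '%s\n' "$entry" >> "$ignore_file"
  fi
}

# Example usage:
#   add_to_bazelignore /tmp/ws
```

Running it twice leaves a single entry in the file.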
cc @mhy1992