When using the bazel run command, the spawned child has a working directory different from the directory where the command was run. As a result, bazel run //path/to/target and bazel-bin/path/to/target/target behave quite differently. If possible, it would be ideal to have the child inherit the bazel invocation's working directory.
Note the difference between the output of the final two commands (bazel run and the direct bazel-bin invocation):
$ cd $(mktemp -d)
$ pwd
/tmp/tmp.VzGLDvgQvk
$ touch WORKSPACE
$ mkdir -p path/to/target
$ cat <<EOF > path/to/target/BUILD
sh_binary(
    name = "target",
    srcs = ["target.sh"],
)
EOF
$ cat <<EOF > path/to/target/target.sh
#!/bin/bash
pwd
EOF
$ chmod +x path/to/target/target.sh
$ bazel run //path/to/target 2>/dev/null
/home/kamal/.cache/bazel/_bazel_kamal/f4a400f5a0927c37f590fcc3c0d7cf1c/execroot/tmp.VzGLDvgQvk/bazel-out/local-fastbuild/bin/path/to/target/target.runfiles/__main__
$ bazel-bin/path/to/target/target
/tmp/tmp.VzGLDvgQvk
$ head -1 /etc/os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
$ uname -r
4.9.0-3-amd64
$ bazel info release
release 0.5.1
Original bazel-discuss thread: https://groups.google.com/d/msgid/bazel-discuss/CAA%3DsxTgFY9D1kyGaGKCuqUftUBAizNdXu28o_zsKRkgA3F-ZwA
The working directory seems to be set here to the runfiles dir: https://github.com/bazelbuild/bazel/blob/5e60e3868b4bafacc138f786b304589dbd687016/src/main/java/com/google/devtools/build/lib/runtime/commands/RunCommand.java#L220
As I said on the other bug, I'm not sure we should change this. If you want to run your binary from a different working directory, don't use bazel run?
Closing this in favour of #2579; will comment there.
Actually reopening this as I think they're not the same issue. I don't know that the behaviour requested in #2579 is desirable.
@ulfjack
As I said on the other bug, I'm not sure we should change this. If you want to run your binary from a different working directory, don't use bazel run?
bazel run is a really nice shorthand for build and run. It makes instructions for many common tasks simpler. Is there a technical reason for the child's working directory to be its runfiles directory? The binary cannot be written to expect its runfiles relative to its working directory, since it can also be run via the link under bazel-bin. Am I missing something else?
One workaround for your problem is to use absolute paths for the files you pass. Say I want to check, in a CI build, the formatting of all Java files in a specific commit, and I have defined this java_binary rule in the tools/BUILD file:
java_binary(
    name = "gjf",
    main_class = "com.google.googlejavaformat.java.Main",
    visibility = ["//visibility:public"],
    runtime_deps = ["@google_java_format//jar"],
)
and WORKSPACE is defined like this:
# TODO(davido): Switch to use vanilla upstream gjf version when one of these PRs are merged:
# https://github.com/google/google-java-format/pull/106
# https://github.com/google/google-java-format/pull/154
http_jar(
name = "google_java_format",
sha256 = "9fe87113a2cf27e827b72ce67c627b36d163d35d1b8a10a584e4ae024a15c854",
url = "https://github.com/davido/google-java-format/releases/download/1.3-1-gec5ce10/google-java-format-1.3-1-gec5ce10-all-deps.jar"
)
Now I can use bazel run as follows:
$ git show --diff-filter=AM --name-only HEAD | grep java$ | \
sed 's@^@'"$PWD"/'@' | xargs -r bazel run tools:gjf -- --dry-run
INFO: Running command line: bazel-bin/tools/gjf --dry-run
Need Formatting:
[gerrit-server/src/test/java/com/google/gerrit/testutil/InMemoryH2Type.java,
gerrit/gerrit-server/src/test/java/com/google/gerrit/testutil/NoteDbMode.java]
ERROR: Non-zero return code '1' from command: Process exited with
status 1.
Is there a technical reason to have the child's working directory to be its runfiles directory?
It lets you have runtime dependencies that are different from what's in your source tree. There are a couple of nice things this gets you:
You can depend on outputs by their relative path. If you build //foo/bar:my-tool and use it as a runtime dependency, you can refer to it in your binary as foo/bar/my-tool. In your source directory, it'll be under a configuration-dependent symlink (e.g., bazel-bin) that might not even exist.
The binary doesn't "see" files that aren't runtime dependencies. If you have //foo/bar:data-test.csv and //foo/bar:data-prod.csv, your binary can have a runtime dependency on data-test.csv and then, if it's run from its runfiles directory and looks for .csv files under foo/bar, it won't see data-prod.csv.
One thing that our shell tests do that you might want to consider is starting scripts with "if $PWD doesn't end with .runfiles, cd to ${0}.runfiles." That way you'll always be in .runfiles by line 2.
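The shell-test idiom described above could look roughly like the sketch below. It assumes the runfiles tree sits next to the executable as <executable>.runfiles. The helper name runfiles_target_dir is hypothetical, and the check is widened to also accept a working directory inside a runfiles tree (such as the target.runfiles/__main__ directory shown in the original report), which is an adaptation rather than necessarily what those tests do.

```shell
#!/bin/bash
# Sketch only: ensure the script runs from its runfiles directory.
# Assumption: runfiles live at "<path of this script>.runfiles".

# Prints the directory to cd into, or nothing if no change is needed.
#   $1: current working directory
#   $2: path the script was invoked as (normally "$0")
runfiles_target_dir() {
  case "$1" in
    # Already at the top of, or somewhere inside, a runfiles tree.
    *.runfiles|*.runfiles/*) return 0 ;;
  esac
  # Only suggest a directory that actually exists.
  if [ -d "${2}.runfiles" ]; then
    printf '%s\n' "${2}.runfiles"
  fi
}

target="$(runfiles_target_dir "$PWD" "$0")"
if [ -n "$target" ]; then
  cd "$target" || exit 1
fi
```

With a preamble like this, the script would end up in the same place whether it was started by bazel run or through the bazel-bin symlink.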
I believe you can use --run_under to work around this:
bazel run --run_under="cd $PWD && " //my:target -- ARGS
(Obviously doesn't work on Windows.)
@kchodorow
You can depend on outputs by their relative path.
As you point out, you can always do this by looking them up in the runfiles by searching adjacent to the executable. Since the binary might be started in ways other than bazel run, a binary should not rely on its runfiles being found relative to its working directory.
The binary doesn't "see" files that aren't runtime dependencies.
Again, this seems like the binary implementor's responsibility: they should look up runtime dependencies in runfiles. It seems wrong to rely on your binary only ever being called via bazel run.
Neither of these seems like a reason to have the child from bazel run start in its runfiles directory. They are more like reasons to load expected files relative to the executable's path, which I'm completely on board with.
That way you'll always be in .runfiles by line 2.
This is what I think binaries should do if they want to always be in their runfiles directory. Doing this would not conflict with bazel run respecting the working directory I invoke it from: in that situation, running the binary from bazel-bin would do the same.
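As a rough illustration of loading files relative to the executable's path rather than the working directory, a script could resolve its data like this. The function name resolve_runfile is hypothetical, and the <executable>.runfiles/<workspace>/<path> layout is assumed from the bazel run output in the original report (where the workspace directory was __main__).

```shell
#!/bin/bash
# Sketch only: look up a runtime dependency next to the executable,
# independent of the current working directory.
#   $1: path the script was invoked as (normally "$0")
#   $2: workspace directory name inside the runfiles tree (assumed, e.g. __main__)
#   $3: workspace-relative path of the dependency
resolve_runfile() {
  candidate="${1}.runfiles/${2}/${3}"
  if [ -e "$candidate" ]; then
    printf '%s\n' "$candidate"
  else
    # Files that are not declared runtime dependencies are simply absent.
    echo "runfile not found: $candidate" >&2
    return 1
  fi
}

# Example use inside a script:
#   data="$(resolve_runfile "$0" __main__ foo/bar/data-test.csv)" || exit 1
```

Because the lookup never consults $PWD, it behaves the same under bazel run and under a direct bazel-bin invocation.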
@davido, I first came across this with formatting tools as well!
One workaround for your problem is to use absolute path for passed files.
This is a pretty annoying workaround.
Well, at least for format checking, you don't need any hacks: just use Bazel's sh_test rule. For a google-java-format check, this can be achieved easily with my patched gjf version:
WORKSPACE:
# TODO(davido): Switch to use vanilla upstream gjf version when one of these PRs are merged:
# https://github.com/google/google-java-format/pull/106
# https://github.com/google/google-java-format/pull/154
http_jar(
name = "google_java_format",
sha256 = "9fe87113a2cf27e827b72ce67c627b36d163d35d1b8a10a584e4ae024a15c854",
url = "https://github.com/davido/google-java-format/releases/download/1.3-1-gec5ce10/google-java-format-1.3-1-gec5ce10-all-deps.jar"
)
BUILD:
java_binary(
    name = "gjf-binary",
    main_class = "com.google.googlejavaformat.java.Main",
    visibility = ["//visibility:public"],
    runtime_deps = ["@google_java_format//jar"],
)
genrule(
    name = "gjf-check-invocation",
    srcs = glob(["src/main/java/**/*.java"]),
    cmd = "find . -name \"*.java\" | xargs $(location :gjf-binary) --dry-run > $@",
    tools = [":gjf-binary"],
    outs = ["check-java-format.log"],
)
sh_test(
    name = "gjf-check",
    srcs = [":gjf-check-invocation"],
    tags = ["oauth"],
)
Now, checking the format of all files is just a matter of running bazel test gjf-check:
Failure:
$ bazel test gjf-check
.........
INFO: Found 1 test target...
ERROR: /home/davido/projects/oauth/BUILD:28:1: Executing genrule //:gjf-check-invocation failed: Process exited with status 123 [sandboxed].
Need Formatting:
[./src/main/java/com/googlesource/gerrit/plugins/oauth/GoogleOAuthService.java]
Use --strategy=Genrule=standalone to disable sandboxing for the failing actions.
Target //:gjf-check failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 8.210s, Critical Path: 1.25s
Executed 0 out of 1 test: 1 fails to build.
Success:
$ bazel test gjf-check
INFO: Found 1 test target...
Target //:gjf-check up-to-date:
bazel-genfiles/check-java-format.log
bazel-bin/gjf-check
INFO: Elapsed time: 1.323s, Critical Path: 1.09s
//:gjf-check PASSED in 0.1s
Executed 1 out of 1 test: 1 test passes.
I understand the current state, and I think it's probably the right default choice. However, there are lots of use cases where you really do want to know where you were invoked from, because the tool modifies the source (formatters, automated fixers, refactoring tools, etc.), and at the moment that information is not recoverable.
I think it would be sufficient to have bazel run set an environment variable with the original working directory of bazel, so that tools can recover the information (or be wrapped in a sh_binary that cd's back before invoking the underlying tool, for instance).
Bazel now has a --direct_run flag. It implements an idea similar to the one @ianthehat describes above and that was attempted in #3635: it sets $BUILD_WORKING_DIRECTORY to record the working directory at the point where bazel run was called. So I think this ticket can be closed.
--direct_run is deprecated because it's enabled by default now.
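A wrapper script for a source-modifying tool might use $BUILD_WORKING_DIRECTORY along these lines. This is a minimal sketch: the function names invocation_dir and resolve_arg are hypothetical, and falling back to $PWD when the variable is unset (for example, when the binary is started directly from bazel-bin) is an assumption about the desired behaviour.

```shell
#!/bin/bash
# Sketch only: recover the directory the user ran `bazel run` from.

# Prints the invocation directory, falling back to the current
# directory when BUILD_WORKING_DIRECTORY is unset or empty.
invocation_dir() {
  printf '%s\n' "${BUILD_WORKING_DIRECTORY:-$PWD}"
}

# Resolves a possibly relative path argument against the invocation
# directory; absolute paths are passed through unchanged.
resolve_arg() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s/%s\n' "$(invocation_dir)" "$1" ;;
  esac
}

# Example use inside a wrapper, before handing off to the real tool:
#   cd "$(invocation_dir)" || exit 1
```

Either helper would remove the need for callers to rewrite relative paths into absolute ones before passing them to bazel run.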