Bazel: `bazel run` child should run with `bazel` invocation's current directory

Created on 5 Jul 2017 · 12 Comments · Source: bazelbuild/bazel

Description of the problem / feature request / question:

When using the bazel run command, the spawned child has a working directory different from the directory where the command is run. This leads to bazel run //path/to/target and bazel-bin/path/to/target/target being quite different. If possible, it would be ideal to have the child inherit the bazel invocation's working directory.

If possible, provide a minimal example to reproduce the problem:

Note the difference between the output of the last two commands.

$ cd $(mktemp -d)
$ pwd
/tmp/tmp.VzGLDvgQvk
$ touch WORKSPACE
$ mkdir -p path/to/target
$ cat <<EOF > path/to/target/BUILD
sh_binary(
    name = "target",
    srcs = ["target.sh"],
)
EOF
$ cat <<EOF > path/to/target/target.sh
#!/bin/bash
pwd
EOF
$ chmod +x path/to/target/target.sh
$ bazel run //path/to/target 2>/dev/null
/home/kamal/.cache/bazel/_bazel_kamal/f4a400f5a0927c37f590fcc3c0d7cf1c/execroot/tmp.VzGLDvgQvk/bazel-out/local-fastbuild/bin/path/to/target/target.runfiles/__main__
$ bazel-bin/path/to/target/target
/tmp/tmp.VzGLDvgQvk

Environment info

  • Operating System:
$ head -1 /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
$ uname -r
4.9.0-3-amd64
  • Bazel version (output of bazel info release):
$ bazel info release
release 0.5.1

Have you found anything relevant by searching the web?

Original bazel-discuss thread: https://groups.google.com/d/msgid/bazel-discuss/CAA%3DsxTgFY9D1kyGaGKCuqUftUBAizNdXu28o_zsKRkgA3F-ZwA

Anything else, information or logs or outputs that would be helpful?

The working directory seems to be set here to the runfiles dir: https://github.com/bazelbuild/bazel/blob/5e60e3868b4bafacc138f786b304589dbd687016/src/main/java/com/google/devtools/build/lib/runtime/commands/RunCommand.java#L220

feature request

All 12 comments

As I said on the other bug, I'm not sure we should change this. If you want to run your binary from a different working directory, don't use bazel run?

Closing this in favour of #2579; will comment there.

Actually reopening this as I think they're not the same issue. I don't know that the behaviour requested in #2579 is desirable.

@ulfjack

As I said on the other bug, I'm not sure we should change this. If you want to run your binary from a different working directory, don't use bazel run?

bazel run is a really nice shorthand for build and run. It makes instructions for many common tasks simpler. Is there a technical reason to have the child's working directory to be its runfiles directory? The binary cannot be written expecting to find its runfiles relative to its working directory since it can be run via the link under bazel-bin. Am I missing something else?

One workaround for your problem is to use absolute path for passed files. Say I want to check, in a CI build, the formatting of all Java files in a specific commit, and I have defined this java_binary rule in the tools/BUILD file:

java_binary(
    name = "gjf",
    main_class = "com.google.googlejavaformat.java.Main",
    visibility = ["//visibility:public"],
    runtime_deps = ["@google_java_format//jar"],
)

and WORKSPACE is defined like this:

# TODO(davido): Switch to the vanilla upstream gjf version when one of these PRs is merged:
# https://github.com/google/google-java-format/pull/106
# https://github.com/google/google-java-format/pull/154
http_jar(
    name = "google_java_format",
    sha256 = "9fe87113a2cf27e827b72ce67c627b36d163d35d1b8a10a584e4ae024a15c854",
    url = "https://github.com/davido/google-java-format/releases/download/1.3-1-gec5ce10/google-java-format-1.3-1-gec5ce10-all-deps.jar"
)

Now I can use bazel run as follows:

$ git show --diff-filter=AM --name-only HEAD | grep java$ | \
   sed 's@^@'"$PWD"/'@' | xargs -r bazel run tools:gjf -- --dry-run
  INFO: Running command line: bazel-bin/tools/gjf --dry-run
  Need Formatting:
  [gerrit-server/src/test/java/com/google/gerrit/testutil/InMemoryH2Type.java,
   gerrit/gerrit-server/src/test/java/com/google/gerrit/testutil/NoteDbMode.java]
  ERROR: Non-zero return code '1' from command: Process exited with
  status 1.

Is there a technical reason to have the child's working directory to be its runfiles directory?

It lets you have runtime dependencies that are different than what's in your source tree. There are a couple of nice things this gets you:

  1. You can depend on outputs by their relative path: if you build a helper tool like //foo/bar:my-tool and use it as a runtime dependency, you can refer to it in your binary as foo/bar/my-tool. In your source directory, it'll be under a configuration-dependent symlink (e.g, bazel-bin) that might not even exist.
  2. The binary doesn't "see" files that aren't runtime dependencies. If you have //foo/bar:data-test.csv and //foo/bar:data-prod.csv, your binary can have a runtime dependency on data-test.csv and then, if it's run from its runfiles directory and looks for .csv files under foo/bar, it won't see data-prod.csv.

One thing that our shell tests do that you might want to consider is starting scripts with "if $PWD doesn't end with .runfiles, cd to ${0}.runfiles." That way you'll always be in .runfiles by line 2.
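
The idiom described above can be sketched in shell roughly as follows. This is only an illustration, not code taken from Bazel's own shell tests: runfiles_target_dir is a hypothetical helper name, and treating any path inside a .runfiles tree (such as the target.runfiles/__main__ directory printed by bazel run in the original report) as "already there" is an assumption that goes slightly beyond the literal "ends with .runfiles" check.

```shell
#!/bin/bash
# Hypothetical helper (not part of Bazel): print the directory a script
# should switch to before touching its runtime dependencies.
#   $1 = current working directory (normally "$PWD")
#   $2 = path the script was invoked as (normally "$0")
runfiles_target_dir() {
  case "$1" in
    # Assumption: already at or below a *.runfiles tree, so stay put.
    *.runfiles|*.runfiles/*) printf '%s\n' "$1" ;;
    # Otherwise use the runfiles tree that sits next to the script.
    *) printf '%s\n' "$2.runfiles" ;;
  esac
}
```

A script could then begin with cd "$(runfiles_target_dir "$PWD" "$0")" || exit 1, so that it behaves the same whether it is started through bazel run or through the link under bazel-bin.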

I believe you can use --run_under to work around this:

bazel run --run_under="cd $PWD && " //my:target -- ARGS

(Obviously doesn't work on Windows.)

@kchodorow

You can depend on outputs by their relative path.

As you point out, you can always do this by looking them up in the runfiles by searching adjacent to the executable. Since the binary might be started in ways other than bazel run, a binary should not rely on its runfiles being found relative to its working directory.

The binary doesn't "see" files that aren't runtime dependencies.

Again this seems like a binary implementor's responsibility: they should look up runtime dependencies in runfiles. It seems wrong to rely on your binary only ever being called via bazel run.

Neither of these seem like reasons to have the child from bazel run start in its runfiles directory. They are more like reasons to load expected files relative to the executable's path, which I'm completely on board with.
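
As a rough sketch of what "load expected files relative to the executable's path" could look like in a shell binary: runfiles_path is a hypothetical helper name, and the layout it assumes (a <binary>.runfiles tree next to the binary, with a workspace directory that defaults to __main__ as in the bazel run output in the original report) is an assumption based on that output rather than a documented guarantee.

```shell
#!/bin/bash
# Hypothetical helper (not part of Bazel): build the path of a runtime
# dependency from the executable's own location instead of the current
# working directory.
#   $1 = path the binary was invoked as (normally "$0")
#   $2 = workspace-relative path of the dependency, e.g. foo/bar/my-tool
#   $3 = workspace directory name inside the runfiles tree
#        (assumption: defaults to __main__)
runfiles_path() {
  printf '%s\n' "$1.runfiles/${3:-__main__}/$2"
}
```

With this, a binary would refer to "$(runfiles_path "$0" foo/bar/my-tool)" and would not depend on which directory it was started from.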

That way you'll always be in .runfiles by line 2.

This is what I think binaries should do if they want to always be in their runfiles directory. Doing this would not conflict with bazel run respecting the working directory I invoke it from: in that situation, running the binary from bazel-bin would do the same.

@davido, I first came across this with formatting tools as well!

One workaround for your problem is to use absolute path for passed files.

This is a pretty annoying workaround.

One workaround for your problem is to use absolute path for passed files.
This is a pretty annoying workaround.

Well, if you only need format checking, you don't need any hacks: just use Bazel's sh_test rule. For a google-java-format check, this is easy to achieve with my patched gjf version:

WORKSPACE:

# TODO(davido): Switch to the vanilla upstream gjf version when one of these PRs is merged:
# https://github.com/google/google-java-format/pull/106
# https://github.com/google/google-java-format/pull/154
http_jar(
    name = "google_java_format",
    sha256 = "9fe87113a2cf27e827b72ce67c627b36d163d35d1b8a10a584e4ae024a15c854",
    url = "https://github.com/davido/google-java-format/releases/download/1.3-1-gec5ce10/google-java-format-1.3-1-gec5ce10-all-deps.jar"
)

BUILD:

java_binary(
    name = "gjf-binary",
    main_class = "com.google.googlejavaformat.java.Main",
    visibility = ["//visibility:public"],
    runtime_deps = ["@google_java_format//jar"],
)

genrule(
    name = "gjf-check-invocation",
    srcs = glob(["src/main/java/**/*.java"]),
    cmd = "find . -name \"*.java\" | xargs $(location :gjf-binary) --dry-run > $@",
    tools = [":gjf-binary"],
    outs = ["check-java-format.log"],
)

sh_test(
    name = "gjf-check",
    srcs = [":gjf-check-invocation"],
    tags = ["oauth"],
)

Now, checking the format of all files is just a matter of running bazel test foo:

Failure:

$ bazel test gjf-check
.........
INFO: Found 1 test target...
ERROR: /home/davido/projects/oauth/BUILD:28:1: Executing genrule //:gjf-check-invocation failed: Process exited with status 123 [sandboxed].
Need Formatting:
[./src/main/java/com/googlesource/gerrit/plugins/oauth/GoogleOAuthService.java]
Use --strategy=Genrule=standalone to disable sandboxing for the failing actions.
Target //:gjf-check failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 8.210s, Critical Path: 1.25s

Executed 0 out of 1 test: 1 fails to build.

Success:

$ bazel test gjf-check
INFO: Found 1 test target...
Target //:gjf-check up-to-date:
  bazel-genfiles/check-java-format.log
  bazel-bin/gjf-check
INFO: Elapsed time: 1.323s, Critical Path: 1.09s
//:gjf-check                                                             PASSED in 0.1s

Executed 1 out of 1 test: 1 test passes.

I understand the current state, and I think it's probably the right default choice. However, there are lots of use cases where you really do want to know the place you were invoked from, because the tool modifies the source (formatters, automated fixers, refactoring tools, etc.), and at the moment that information is not recoverable.
I think it would be sufficient to have bazel run set an environment variable with the original working directory of bazel, so that tools can recover the information (or be wrapped in a sh_binary that cd's back before invoking the underlying tool, for instance).

Bazel now has a --direct_run flag. It implements an idea similar to what @ianthehat describes above and what was attempted in #3635: it sets $BUILD_WORKING_DIRECTORY to record the working directory at the point where bazel run was called. So I think this ticket can be closed.

--direct_run is deprecated because it's enabled by default now.
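
For tools that take source paths as arguments, a wrapper could use that variable along these lines. This is a minimal sketch, not an official recipe: resolve_user_path is a hypothetical helper name, and falling back to $PWD when BUILD_WORKING_DIRECTORY is unset (for example when the binary is started from bazel-bin directly) is an assumption about the desired behaviour.

```shell
#!/bin/bash
# Hypothetical helper (not part of Bazel): interpret a path argument
# relative to the directory where `bazel run` was invoked.
#   $1 = path given by the user, absolute or relative
resolve_user_path() {
  case "$1" in
    # Absolute paths are passed through unchanged.
    /*) printf '%s\n' "$1" ;;
    # Relative paths are anchored at BUILD_WORKING_DIRECTORY, or at $PWD
    # when the variable is unset (assumption: not started via bazel run).
    *) printf '%s\n' "${BUILD_WORKING_DIRECTORY:-$PWD}/$1" ;;
  esac
}
```

A formatter wrapper could map each of its arguments through resolve_user_path before invoking the real tool, which avoids having to prefix paths with $PWD on the command line.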
