Bazel: Windows: Flipping --incompatible_auto_configure_host_platform flag causes "failed to delete output files before executing action" error

Created on 7 Aug 2019  Â·  24Comments  Â·  Source: bazelbuild/bazel

Related https://github.com/bazelbuild/bazel/issues/7081

To reproduce:

git clone https://github.com/bazelbuild/rules_nodejs.git
cd rules_nodejs
bazel build //e2e:e2e_karma
...
bazel build --incompatible_auto_configure_host_platform //e2e:e2e_karma
ERROR: C:/users/pcloudy/_bazel_pcloudy/owxwy7oq/external/build_bazel_rules_typescript/internal/BUILD.bazel:80:1: failed to delete output files before executing action
Target //e2e:e2e_karma failed to build
ERROR: C:/tools/msys64/home/pcloudy/workspace/rules_nodejs/packages/karma/BUILD.bazel:38:1 failed to delete output files before executing action

This affects project like rules_nodejs where worker is used during the build. If I manually kill the worker after the first build, the second build with the flag won't fail to delete the outputs.

@katre

P2 team-Configurability bug

All 24 comments

I also noticed if you remove the flag again, the build will still fail to delete outputs. This is worth investigating.

Huh, that's a really unexpected error from changing the host platform. I'll try to investigate. Is there a windows VM configured that I can use to reproduce?

This is really weird: flipping the flag causes the failure, regardless of which way it is flipped.

bazel clean
bazel build  //e2e:e2e_karma --incompatible_auto_configure_host_platform
# Succeeds
bazel build  //e2e:e2e_karma --noincompatible_auto_configure_host_platform
# Fails

And...

bazel clean
bazel build  //e2e:e2e_karma --noincompatible_auto_configure_host_platform
# Succeeds
bazel build  //e2e:e2e_karma --incompatible_auto_configure_host_platform
# Fails

Finally managed a clean debug, here is the error:

  exception: java.io.IOException: C:/users/jcater/_bazel_jcater/ayo6twso/execroot/build_bazel_rules_nodejs/bazel-out/host/bin/external/build_bazel_rules_typescript/internal/tsc_wrapped_bin.exe (Permission denied)
  action: action 'Writing script external/build_bazel_rules_typescript/internal/tsc_wrapped_bin.exe [for host]' (FileWrite[[File:[C:/users/jcater/_bazel_jcater/ayo6twso[source]]external/bazel_tools/tools/launcher/launcher.exe] -> [File:[[<execution_root>]bazel-out/host/bin]external/build_bazel_rules_typescript/internal/tsc_wrapped_bin.exe]])

So it appears that changing this flag is tickling a permissions error. Possibly the worker still being active means the file can't be deleted?

When we discard the analysis cache, should workers also be shut down?

I guess that's exactly the problem, we don't shutdown the workers after discarding the analysis cache. @alexeagle @gregmagolan How does the worker get started and what could we do to shutdown the workers?

I would like to submit the flag flip for `--incompatible_auto_configure_host_platform`` (#7081). Should this issue block that?

I don't think the flag causes the error, I think it's just one way to cause re-running the same action, which is what triggers it.

I'm if you can go ahead and flip the flag. But you should do a downstream test for all projects with this flag flipped.

This error is not happening in the latest downstream build: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/1144

(I'm not sure what the error is with res_test, however).

I did a bisect, it turned out the failure of res_test is indeed caused by flipping --incompatible_auto_configure_host_platform. I supposed it affected how windows sdk toolchain was selected. Maybe we should fix from the test side. /cc @laszlocsomor can you take a look?

Thank you for debugging. I was confused because that didn't fail in the actual CI run, only the downstream.

Yes, I'll take a look.

Yes, because that's not a shell test, it's somehow affected by the Bazel binary that runs the test.

@laszlocsomor Thanks!

confusion intensifies

Without the flag:

c:\src\bazel-releases\0.29.0\rc6\bazel.exe --ignore_all_rc_files build //src/test/res:res --noincompatible_auto_configure_host_platform -s --toolchain_resolution_debug
(...)
INFO: Build options --incompatible_auto_configure_host_platform and --platforms have changed, discarding analysis cache.
INFO: ToolchainResolution: Looking for toolchain of type @io_bazel//src/main/res:toolchain_type...
INFO: ToolchainResolution:   Considering toolchain @local_config_winsdk//:local_arm64_tc...
INFO: ToolchainResolution:     Toolchain constraint @platforms//cpu:cpu has value @platforms//cpu:aarch64, which does not match value @platforms//cpu:x86_64 from the target platform @bazel_tools//platforms:target_platform
INFO: ToolchainResolution:   Rejected toolchain @local_config_winsdk//:local_arm64_tc, because of target platform mismatch
INFO: ToolchainResolution:   Considering toolchain @local_config_winsdk//:local_x64_tc...
INFO: ToolchainResolution:   Considering toolchain @local_config_winsdk//:local_x86_tc...
INFO: ToolchainResolution:     Toolchain constraint @platforms//cpu:cpu has value @platforms//cpu:x86_32, which does not match value @platforms//cpu:x86_64 from the target platform @bazel_tools//platforms:target_platform
INFO: ToolchainResolution:   Rejected toolchain @local_config_winsdk//:local_x86_tc, because of target platform mismatch
INFO: ToolchainResolution:   Considering toolchain //src/main/res:empty...
INFO: ToolchainResolution:   For toolchain type @io_bazel//src/main/res:toolchain_type, possible execution platforms and toolchains: {//:default_host_platform -> @local_config_winsdk//:local_x64_tc, @bazel_tools//platforms:host_platform -> @local_config_winsdk//:local_x64_tc}
INFO: ToolchainResolution: Selected execution platform //:default_host_platform, type @io_bazel//src/main/res:toolchain_type -> toolchain @local_config_winsdk//:local_x64_tc
INFO: ToolchainResolution: Selected execution platform //:default_host_platform,
INFO: ToolchainResolution: Selected execution platform //:default_host_platform,
INFO: Analyzed target //src/test/res:res (3 packages loaded, 77 targets configured).

With the flag:

c:\src\bazel-releases\0.29.0\rc6\bazel.exe --ignore_all_rc_files build //src/test/res:res --incompatible_auto_configure_host_platform -s --toolchain_resolution_debug
(...)
INFO: Build options --incompatible_auto_configure_host_platform and --platforms have changed, discarding analysis cache.
INFO: ToolchainResolution: Looking for toolchain of type @io_bazel//src/main/res:toolchain_type...
INFO: ToolchainResolution:   Considering toolchain @local_config_winsdk//:local_arm64_tc...
INFO: ToolchainResolution:     Toolchain constraint @platforms//cpu:cpu has value @platforms//cpu:aarch64, which does not match value @platforms//cpu:x86_64 from the target platform @local_config_platform//:host
INFO: ToolchainResolution:   Rejected toolchain @local_config_winsdk//:local_arm64_tc, because of target platform mismatch
INFO: ToolchainResolution:   Considering toolchain @local_config_winsdk//:local_x64_tc...
INFO: ToolchainResolution:     Toolchain constraint @platforms//os:os has value @platforms//os:windows, which does not match value <missing> from the execution platform //:default_host_platform
INFO: ToolchainResolution:     Toolchain constraint @platforms//cpu:cpu has value @platforms//cpu:x86_64, which does not match value <missing> from the execution platform //:default_host_platform
INFO: ToolchainResolution:   Considering toolchain @local_config_winsdk//:local_x86_tc...
INFO: ToolchainResolution:     Toolchain constraint @platforms//cpu:cpu has value @platforms//cpu:x86_32, which does not match value @platforms//cpu:x86_64 from the target platform @local_config_platform//:host
INFO: ToolchainResolution:   Rejected toolchain @local_config_winsdk//:local_x86_tc, because of target platform mismatch
INFO: ToolchainResolution:   Considering toolchain //src/main/res:empty...
INFO: ToolchainResolution:   For toolchain type @io_bazel//src/main/res:toolchain_type, possible execution platforms and toolchains: {@local_config_platform//:host -> @local_config_winsdk//:local_x64_tc, //:default_host_platform -> //src/main/res:empty}
INFO: ToolchainResolution: Selected execution platform //:default_host_platform, type @io_bazel//src/main/res:toolchain_type -> toolchain //src/main/res:empty
INFO: ToolchainResolution: Selected execution platform //:default_host_platform,
INFO: ToolchainResolution: Selected execution platform //:default_host_platform,
INFO: Analyzed target //src/test/res:res (0 packages loaded, 74 targets configured).

Confusing bit:

... which does not match value <missing> from the execution platform

I think this is because //:default_host_platform is inheriting from the old @bazel_tools//platforms:host_platform. Previously, this had the cpu and os constraints magically added based on the --hpst_cpu flag, but with the incompatible change they are not set, and so //:default_host_platform also has them unset.

I'll send a PR shortly to fix default_host_platform to inherit correctly. Will a normal CI run tell me if the issue is fixed?

Right, I can't update default_host_platform: see https://github.com/bazelbuild/bazel/issues/9148 for details.

OK, looks like we should do the same for res_test?

Yes, I think so.

On Wed, Aug 14, 2019, 7:39 AM Yun Peng notifications@github.com wrote:

OK, looks like we should do the same for res_test?

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
https://github.com/bazelbuild/bazel/issues/9104?email_source=notifications&email_token=AACPW74S42GGA3AS5AKICOTQEPVHZA5CNFSM4IJ7W6D2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4IQZLA#issuecomment-521211052,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AACPW73K5ZVWGGBLK5MIGL3QEPVHZANCNFSM4IJ7W6DQ
.

Sorry, I don't follow.
https://github.com/bazelbuild/bazel/issues/9148 says:

However, that currently fails because in Bazel 0.28.1, @local_config_platform//:host was missing public visibility. This was added in ff81b0c, which is included in Bazel 0.29.0.

So can't we fix the test in 0.29.0?

OK, looks like we should do the same for res_test?

Sorry again, do what?

I suppose if we want to fix the tests (//src/tools/singlejar:combiners_test and //src/test/res:res_test) after enabling --incompatible_auto_configure_host_platform, we'll have to break them with 0.28.1? Therefore we should disable the tests first, wait for the release of 0.29.0, then enable them with the fix. @katre Did I understand correctly?

Thanks @meteorcloudy ! I'm fine with that.

Thanks! I'll take care of cleaning up this (and combiners_test) after the release.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ob picture ob  Â·  3Comments

iirina picture iirina  Â·  3Comments

GaofengCheng picture GaofengCheng  Â·  3Comments

davidzchen picture davidzchen  Â·  3Comments

f1recracker picture f1recracker  Â·  3Comments