$ bazel build java/com/google/gerrit/server/diff:diff -s
INFO: Analysed target //java/com/google/gerrit/server/diff:diff (37 packages loaded).
INFO: Found 1 target...
SUBCOMMAND: # //java/com/google/gerrit/server/diff:diff [action 'Building java/com/google/gerrit/server/diff/libdiff.jar (17 source files) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor)']
(cd /home/davido/.cache/bazel/_bazel_davido/5c01f4f713b675540b8b424c5c647f63/execroot/gerrit && \
exec env - \
LC_CTYPE=en_US.UTF-8 \
external/local_jdk/bin/java -XX:+TieredCompilation '-XX:TieredStopAtLevel=1' -Xbootclasspath/p:external/bazel_tools/third_party/java/jdk/langtools/javac-9-dev-r4023-3.jar -jar external/bazel_tools/tools/jdk/JavaBuilder_deploy.jar @bazel-out/k8-fastbuild/bin/java/com/google/gerrit/server/diff/libdiff.jar-2.params)
Target //java/com/google/gerrit/server/diff:diff up-to-date:
bazel-bin/java/com/google/gerrit/server/diff/libdiff.jar
INFO: Elapsed time: 4.139s, Critical Path: 0.92s
INFO: Build completed successfully, 3 total actions
it will succeed, where the libdiff.jar-2.params - file on Linux is here: http://paste.openstack.org/show/626680.
$ bazel build java/com/google/gerrit/server/diff:diff -s
SUBCOMMAND: # //java/com/google/gerrit/server/diff:diff [action 'Building java/com/google/gerrit/server/diff/libdiff.jar (17 source files) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor)']
cd C:/msys64/tmp/_bazel_davido/_biltsqa/execroot/gerrit
SET LC_CTYPE=en_US.UTF-8
external/local_jdk/bin/java.exe -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xbootclasspath/p:external/bazel_tools/third_party/java/jdk/langtools/javac-9-dev-r4023-3.jar -jar external/bazel_tools/tools/jdk/JavaBuilder_deploy.jar @bazel-out/x64_windows-fastbuild/bin/java/com/google/gerrit/server/diff/libdiff.jar-2.params
[54 / 55] Building java/com/google/gerrit/server/diff/libdiff.jar (17 source files) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor); 1s worker
ERROR: C:/users/davido/projects/gerrit/java/com/google/gerrit/server/diff/BUILD:1:1: Building java/com/google/gerrit/server/diff/libdiff.jar (17 source files) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor) failed (Exit 1)
bazel-out\x64_windows-fastbuild\bin\java\com\google\gerrit\server\diff\_javac\diff\libdiff_sourcegenfiles\com\google\gerrit\server\diff\AutoValue_IntraLineDiffArgs.java:105: error: incompatible types: NameKey cannot be converted to Object
&& (this.project.equals(that.project()))
^
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
Target //java/com/google/gerrit/server/diff:diff failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 44,308s, Critical Path: 4,68s
FAILED: Build did NOT complete successfully
where the libdiff.jar-2.params - file in Windows is here: http://paste.openstack.org/show/626690.
It turns out, that the root cause is missing transitive dependency on gwtorm.jar that was fixed here: [1]. I've created a reproducer: [2].
It seems that the problem is reported by error prone, in compilation unit, that was generated by annotation processor (auto-value). I wrote this issue to improve the misleading error message: #4105.
This issue is to track the different build outcomes on Linux and Windows: success vs. failure, correspondingly.
//CC @cushon @damienmg
It seems that inaccuracy and bad buid outcome is caused by error prone. If I disable all checks on Windows by passing this bazel option: --javacopt=-XepDisableAllChecks, the build succeeds (as it does on Linux without disabling error prone checks):
$ bazel build --javacopt=-XepDisableAllChecks java/com/google/gerrit/server/diff:diff -s
[...]
SUBCOMMAND: # //java/com/google/gerrit/common:server [action 'Building java/com/google/gerrit/common/libserver.jar (57 source files)']
cd C:/msys64/tmp/_bazel_davido/_biltsqa/execroot/gerrit
SET LC_CTYPE=en_US.UTF-8
external/local_jdk/bin/java.exe -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xbootclasspath/p:external/bazel_tools/third_party/java/jdk/langtools/javac-9-dev-r4023-3.jar -jar external/bazel_tools/tools/jdk/JavaBuilder_deploy.jar @bazel-out/x64_windows-fastbuild/bin/java/com/google/gerrit/common/libserver.jar-2.params
SUBCOMMAND: # //java/com/google/gerrit/server/diff:diff [action 'Building java/com/google/gerrit/server/diff/libdiff.jar (17 source files) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor)']
cd C:/msys64/tmp/_bazel_davido/_biltsqa/execroot/gerrit
SET LC_CTYPE=en_US.UTF-8
external/local_jdk/bin/java.exe -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xbootclasspath/p:external/bazel_tools/third_party/java/jdk/langtools/javac-9-dev-r4023-3.jar -jar external/bazel_tools/tools/jdk/JavaBuilder_deploy.jar @bazel-out/x64_windows-fastbuild/bin/java/com/google/gerrit/server/diff/libdiff.jar-2.params
[54 / 55] Building java/com/google/gerrit/server/diff/libdiff.jar (17 source files) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor); 1s worker
Target //java/com/google/gerrit/server/diff:diff up-to-date:
C:/msys64/tmp/_bazel_davido/_biltsqa/execroot/gerrit/bazel-out/x64_windows-fastbuild/bin/java/com/google/gerrit/server/diff/libdiff.jar
INFO: Elapsed time: 4,891s, Critical Path: 4,22s
INFO: Build completed successfully, 37 total actions
Do you have a minimal repro for the part of this that only happens on windows?
I don't have access to a windows machine right now, would you mind adding javacopts = ["-verbose"] to //java/com/google/gerrit/server/diff:diff and then sharing the log output for that action for both windows and linux?
@cushon Sure, it took me quite some time to extract the minimal reproducer for this breakage from Gerrit Code Review and gwtorm. See this repro https://github.com/davido/bazel-annotation-processor-breakage. I also was able to eliminate all external deps, except auto-value.
I also filed this issue upstream: [1].
I don't have access to a windows machine right now, would you mind adding javacopts = ["-verbose"] to //java/com/google/gerrit/server/diff:diff and then sharing the log output for that action for both windows and linux?
Looking into the verbose output, as expected, the root cause is the missing dependency on indirect dependency on Windows on gwtorm:client:
Windows
[search path for class files:
external\local_jdk\jre\lib\rt.jar,
external\local_jdk\jre\lib\resources.jar,
external\local_jdk\jre\lib\jsse.jar,
external\local_jdk\jre\lib\jce.jar,
external\local_jdk\jre\lib\charsets.jar,
external\local_jdk\jre\lib\ext\access-bridge-64.jar,
external\local_jdk\jre\lib\ext\cldrdata.jar,
external\local_jdk\jre\lib\ext\dnsns.jar,
external\local_jdk\jre\lib\ext\jaccess.jar,
external\local_jdk\jre\lib\ext\jfxrt.jar,
external\local_jdk\jre\lib\ext\localedata.jar,
external\local_jdk\jre\lib\ext\nashorn.jar,
external\local_jdk\jre\lib\ext\sunec.jar,
external\local_jdk\jre\lib\ext\sunjce_provider.jar,
external\local_jdk\jre\lib\ext\sunmscapi.jar,
external\local_jdk\jre\lib\ext\sunpkcs11.jar,
external\local_jdk\jre\lib\ext\zipfs.jar,
bazel-out\x64_windows-fastbuild\bin\java\com\google\gerrit\reviewdb\libserver-hjar.jar,
bazel-out\x64_windows-fastbuild\genfiles\external\auto_value\jar\_ijar\jar\external\auto_value\jar\auto-value-1.5.2-ijar.jar]
Whereas on Linux indirect dependency on gwtorm:client is there:
Linux
[...]
bazel-out/k8-fastbuild/bin/java/com/google/gwtorm/libclient-hjar.jar
Thanks for taking the time to make the repro! I think there's a second bug here preventing gwtorm:client from being on the classpath for the windows build.
Can you share the contents of bazel-bin/java/com/google/gerrit/reviewdb/libserver-hjar.jdeps from the windows build? And do either of --java_classpath=off or --nojava_header_compilation make a difference?
Can you share the contents of bazel-bin/java/com/google/gerrit/reviewdb/libserver-hjar.jdeps from the windows build?
Here is the content of bazel-bin/java/com/google/gerrit/reviewdb/libserver-hjar.jdeps:
$ cat bazel-bin/java/com/google/gerrit/reviewdb/libserver-hjar.jdeps
Q
Mbazel-out\x64_windows-fastbuild\bin\java\com\google\gwtorm\libclient-hjar.jar
%
!external\local_jdk\jre\lib\rt.jar(//java/com/google/gerrit/reviewdb:server
And do either of --java_classpath=off
Indeed, passing --java_classpath=off did fix it:
$ bazel build --java_classpath=off java/com/google/gerrit/server/diff
Loading:
Loading: 0 packages loaded
INFO: Analysed target //java/com/google/gerrit/server/diff:diff (0 packages loaded).
INFO: Found 1 target...
[1 / 11] BazelWorkspaceStatusAction stable-status.txt
Target //java/com/google/gerrit/server/diff:diff up-to-date:
C:/msys64/tmp/_bazel_davido/dl4nuxf6/execroot/bazel_annotation_processor_breakage/bazel-out/x64_windows-fastbuild/bin/java/com/google/gerrit/server/diff/libdiff.jar
INFO: Elapsed time: 1,756s, Critical Path: 1,02s
INFO: Build completed successfully, 9 total actions
or --nojava_header_compilation make a difference?
Passing --nojava_header_compilation didn't fix it:
$ bazel build --nojava_header_compilation java/com/google/gerrit/server/diff
Loading:
Loading: 0 packages loaded
INFO: Analysed target //java/com/google/gerrit/server/diff:diff (0 packages loaded).
INFO: Found 1 target...
[0 / 10] BazelWorkspaceStatusAction stable-status.txt
ERROR: C:/users/davido/projects/bazel-annotation-processor-breakage/java/com/google/gerrit/server/diff/BUILD:3:1: Building java/com/google/gerrit/server/diff/libdiff.jar (1 source file) and running annotation processors (AutoAnnotationProcessor, AutoValueProcessor) failed (Exit 1)
bazel-out\x64_windows-fastbuild\bin\java\com\google\gerrit\server\diff\_javac\diff\libdiff_sourcegenfiles\com\google\gerrit\server\diff\AutoValue_IntraLineDiffArgs.java:51: error: incompatible types: NameKey cannot be converted to Object
return (this.project.equals(that.project()))
^
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
Target //java/com/google/gerrit/server/diff:diff failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 1,525s, Critical Path: 0,81s
FAILED: Build did NOT complete successfully
Can you attach libserver-hjar.jdeps to the bug? It's a binary proto.
Can you attach libserver-hjar.jdeps to the bug? It's a binary proto.
Done libserver-hjar.zip.
JavaBuilder compares paths as strings when computing reduced class paths. The params file contains /-separated paths like bazel-out/msvc_x64-fastbuild/bin/java/com/google/gwtorm/libclient-hjar.jar, and the jdeps files contain \-separated paths like bazel-out\msvc_x64-fastbuild\bin\java\com\google\gwtorm\libclient-hjar.jar.
@meteorcloudy - I'm surprised Bazel is using /-separated paths for params files on Windows (see the paste in the original comment: http://paste.openstack.org/show/626690/). Is that expected behaviour?
Yes, for some reasons we still use / as seperator, I think it should work fine on Windows with Java.
https://github.com/bazelbuild/bazel/blob/755e176cc116ab8332e79c52c06441c64459af1e/src/main/java/com/google/devtools/build/lib/vfs/WindowsPathFragment.java#L53-L58
I wonder where does \ come from?
FYI, @laszlocsomor
The \ comes from JavaBuilder, which is writing out a .jdeps proto containing paths to some of the entries on the classpath. Those paths round-trip through java.nio.file.Path, which uses the platform default path separator when they get converted back to strings.
Out of curiosity, why is / still used on Windows?
@meteorcloudy : FYI, I didn't add that TODO, @haxorz did.
@cushon For example, you cannot use \ in BUILD files or label.
But I agree we should switch to \, because it's also causing other problems. like https://github.com/bazelbuild/bazel/issues/3949
@cushon @meteorcloudy @laszlocsomor
Do we understand what is going on here, and why the transitive class path is broken on Windows and why gwtorm:client is not passed to javac/error prone compilers (while all is fine on Linux)?
bazel-out/k8-fastbuild/bin/java/com/google/gwtorm/libclient-hjar.jar
It took me so long time to track that down, create the reproducer and find a workaround: pass gwtorm as direct dependency on Windows instead of relying on transitive dependency to be provided on the classpath also on Windows.
Should we warn the Bazel community (mailing list or Bazel main page where it claims that Windows is supported), and tell them to stop using Bazel on Windows, as it is inherently broken?
Not to mention, that inacurate build outcome (passing on Linux but broken on Windows, that claimed to be supported platform) is at least P1 for me, if not P0!
@davido Thanks for spending so much time debugging this! As you can see, this bug is hidden deeply and it didn't happen very often, that's why we missed it.
You are right, this should be a P1 bug for us. But since it doesn't happen frequently (so far this is the first time I've seen such error) and completely broke Bazel on Windows, I don't think we need to tell the community not to use Bazel on Windows.
@cushon Can you briefly summarize the problem, is there any workaround for this issue before we completely switching to \? For example, comparing paths as PathFragment instead of just string, that would be more robust in my opinion.
@cushon, @meteorcloudy : I'll look into changing the separator we use on Windows to \ when creating Path objects.
/ is a supported path separator on Windows. Is the problem that ErrorProne does not accept such separators?
Can we normalize JavaBuilder to always output /?
.jdeps should be platform-independent, no need to introduce superficial differences.
@dslomov :
/is a supported path separator on Windows.
AFAIK that's not universally the case. API functions like CreateFile{A,W} only support / in paths if you don't use the \\?\ prefix.
Can you briefly summarize the problem, is there any workaround for this issue before we completely switching to \?
The problem is that JavaBuilder uses a heuristic to reduce the number of jars it reads at compile-time. It filters the classpath to only contain direct dependencies and the jars that were needed to compile those direct dependencies (as reported in the dependencies' jdeps files). Since the jar paths in the current compilation's params file don't match the paths in the jdeps files for its direct dependencies, the heuristic is failing to include the jars that were needed when compiling the direct dependencies.
There's a fall back path in case the heuristic is too aggressive that retries with a full classpath, but that wasn't happening due to the Error Prone issue that was spun off into google/error-prone#827.
For example, comparing paths as PathFragment instead of just string, that would be more robust in my opinion.
Note that the comparison is happening in JavaBuilder, not in Bazel, so we probably don't want to use PathFragment. Using java.nio.file.Path might be better than strings, but if we're using a consistent string representation of the paths it doesn't matter.
.jdeps should be platform-independent
@dslomov can you expand on this? They're intermediate build artifacts that should only be used in the context of a single build, and a single build configuration. I wouldn't expect them to be more platform-independent than params files.
@laszlocsomor @meteorcloudy do you have a sense of how long it will take to change the separator on Windows? (I'm wondering if we should be looking for a work-around.)
.jdeps should be platform-independent
@dslomov can you expand on this? They're intermediate build artifacts that should only be used in the context of a single build, and a single build configuration. I wouldn't expect them to be more platform-independent than params files.
Ah I see so the contents of it is between JavaBuilder and JavaBuilder :) (or other Java tools)
I think the right fix is for Java tools to be robust and accept multiple valid path representations referring to the same path (if jdeps and command line point to the same path, they should be accepted as the same, and on Windows a/b/c.java and a\b\c.java are the same paths.
@dslomov :
/ is a supported path separator on Windows.
AFAIK that's not universally the case. API functions like CreateFile{A,W} only support / in paths if you don't use the \?\ prefix.
"But they do support it" (cue Jack Sparrow meme :)). For relative paths (as is the case here) '/' and '\' are supported and equivalent.
I think the right fix is for Java tools to be robust and accept multiple valid path representations
Are there disadvantages to also using the platform-dependent name separator in Bazel?
I'm wondering if we should be looking for a work-around.
+1. I would define this issue as release blocker and wait with releasing 0.8 until it is fixed.
@davido is it a regression from 0.7.0?
I don't think it is a regression, but, given that bazel build outcome differs on diferent platform and given that a workaround for "\" vs. "/" comaprisson mismatch should be trivial, it should be fixed prior 0.8 is released.
I'm wondering if we should be looking for a work-around.
@cushon As you suggested, the workaround is here: [1]. I tested it with my reproducer on both platforms Linux and Windows. It works as expected.
@davido, thanks for your patch!
I'm looking at changing the path separators for Windows. I expect it will break some tests I'm not yet aware of.
Our release policy is clear: not a regression => no cherry-pick and no release blockage (https://bazel.build/support.html#releases)
Given that this survived at least one release, the problem is not that acute.
I think the right fix is for Java tools to be robust and accept multiple valid path representations
Are there disadvantages to also using the platform-dependent name separator in Bazel?
The more uniform platform-independent commands are, the better (for remote execution, action cache, etc).
The present issue is in Java tools - Bazel gives valid (on Windows) command lines to Java tools, and Java tools do not handle them correctly- so let's fix it there.
I would define this issue as release blocker and wait with releasing 0.8 until it is fixed.
Our release policy is clear: not a regression => no cherry-pick and no release blockage (https://bazel.build/support.html#releases)
OTOH, The release policy is also claiming that:
Only critical, high-impact bugs will be fixed in a release candidate.
Given that nothing causes more pain, frustration, and disappointment than unfulfilled expectations. As mentioned in the original report to this issue,
Linux
$ bazel build java/com/google/gerrit/server/diff:diff
Pass
Windows
$ bazel build java/com/google/gerrit/server/diff:diff
broken
Needless to say, I was shocked to find out, that a simple java_library rule -- no any system specifics like genrule or what not involved -- works on Linux and is failing on Windows.
So my POV that we are talking about critical and high-impact bug: reliability and accuracy of build outcome. What else can compromise a build tool, when its outcome is flaky?
Very recently, Bazel team removed the notice from the FAQ, that Bazel on Windows platform is highly experimented, bleeding edge and may produce unexpected results.
Last but not least, Bazel has this slogan on its main page:
Build and test software of any size, quickly and reliably
I suggest, you remove reliably untill bazel build foo for simple java_library rule produces the same (!) results on all supported platforms. Or your stop claiming that platform X is 100% supported platform, unless you have opened issues (with existing workarounds that you are rejecting to cherry-pick) causing transitive classpath computation to be inherently broken. IOW, revert this CL.
I'm really not amused with your assessment of that issue (as you may have noticed). One reason for it, is because Gerrit Code Review project's founder promissed me in person that Bazel team is going to be very supportive for Gerrit Code Review issues, for reasons (that I'm not going to enter into here).
David, I agree this is an important bug to fix, and we will fix it. I think it should block the next (0.9) release.
On the other hand, every software project has bugs, and whenever the user is hit by a bug, that bug is, for that user, the most important thing and the biggest fault of the project ("the sky is falling").
In this case, we just have no evidence that many users are hit by this specific issue: the issue is not a regression, it has been present in more than one release, we build multiple large projects with Bazel very regularly (including on ci.bazel.build), we have multiple large customers with substantial Java codebases, and nobody yet complained. There are workarounds (disable errorprone) that will get you through until this issue is fixed. There is no justification whatsoever to block 0.8 release on this.
[...] the workaround is here: [1]. I tested it with my reproducer on both platforms Linux and Windows. It works as expected.
[1] https://bazel-review.googlesource.com/#/c/bazel/+/22730
I was looking into adding a test, for the CL above, as was requested in review. It seems that a similar test already exists. So that I was curious, whether or not this test still works with the diff applied.
Unfortunately, the java_tools don't work here. I wrote this issue: #4165.
@davido There is a internal change under review from @cushon, which will fix this problem in a more principle way, maybe we can just wait for that one. :)
@meteorcloudy @cushon
Thanks. I abandoned my CL.
The work-around was submitted as 0b2352de3101e87647d083f6089246079dda0f75.
The work-around was submitted as 0b2352d.
Works like a charm, thanks!
Update:
\ on Windows, but gave up because there are hundreds of call sites and I didn't want to go ahead and audit all, and also:I have no active proposal to use backslashes for Windows. In fact, I am
saying we should not do that unless absolutely necessary. Forward slashes
are put into command lines in (guesstimate) ~10,000 places in the bazel
code base, it would be very difficult to find them all, fix them all, and
keep them fixed.
We would have to vet every command line constructed and make sure artifacts
are used directly, or when path strings are used they would have to be
funneled through some object that knows what the execution platform is
going to be. We don't have such an object, the analysis phase has no idea
where actions are going to get executed. In skylark paths are stuck
straight on as strings, so there's simply no way to know what a path is.
The whole API would have to be remade to stop returning plain strings,
which isn't backwards compatible.
This can all be done, but keep in mind a decision to use backslashes
throughout will probably cost order of magnitude 1 SWE year.
On Tue, Nov 28, 2017 at 11:10 AM, László Csomor notifications@github.com
wrote:
Update:
- @cushon https://github.com/cushon submitted a principled fix to
JavaBuilder in 0b2352d
https://github.com/bazelbuild/bazel/commit/0b2352de3101e87647d083f6089246079dda0f75,
which fixed this bug.- I looked at changing the code to use '' on Windows, but gave up
because there are hundreds of call sites and I didn't want to go ahead and
audit all, and also:- @tomlu https://github.com/tomlu is working on a design proposal to
tighten path semantics inside Bazel, and I'd rather he did the work, not me
;)—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/bazelbuild/bazel/issues/4108#issuecomment-347573866,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABbnSrJiFyWzFk14icxWJWBXg-7OJKT9ks5s7DCGgaJpZM4Qh6tb
.