Bazel: CI,windows: //src/test/java/com/google/devtools/build/lib/skyframe:SkyframeTests is failing on Windows

Created on 2 Mar 2018  路  12Comments  路  Source: bazelbuild/bazel

This is only failing on the CI machine, not for me locally, so I'm debugging on the testing CI machine.

Given the track record of the last 2 test failures (https://github.com/bazelbuild/bazel/issues/4752, https://github.com/bazelbuild/bazel/issues/4723), I'll go out on a limb and say this could be another line ending issue.

Time: 45.609
There were 3 failures:
1) testExplicitFiles(com.google.devtools.build.lib.skyframe.FilesystemValueCheckerTest)
java.lang.AssertionError: Not true that <[FILE_STATE:[<absolute root>]/[/foo1]]> contains exactly <[FILE_STATE:[<absolute root>]/[/foo1], FILE_STATE:[<absolute root>]/[/foo2]]>. It is missing <[FILE_STATE:[<absolute root>]/[/foo2]]>
    at com.google.devtools.build.lib.skyframe.FilesystemValueCheckerTest.assertDiffWithNewValues(FilesystemValueCheckerTest.java:801)
    at com.google.devtools.build.lib.skyframe.FilesystemValueCheckerTest.testExplicitFiles(FilesystemValueCheckerTest.java:298)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.junit.runners.Suite.runChild(Suite.java:127)
    at org.junit.runners.Suite.runChild(Suite.java:26)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at com.google.testing.junit.runner.internal.junit4.CancellableRequestFactory$CancellableRunner.run(CancellableRequestFactory.java:89)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
    at com.google.testing.junit.runner.junit4.JUnit4Runner.run(JUnit4Runner.java:112)
    at com.google.testing.junit.runner.BazelTestRunner.runTestsInSuite(BazelTestRunner.java:144)
    at com.google.testing.junit.runner.BazelTestRunner.main(BazelTestRunner.java:82)
2) testTraversalOfGeneratedDirectory(com.google.devtools.build.lib.skyframe.RecursiveFilesystemTraversalFunctionTest)
java.lang.AssertionError: Not true that <[RegularFile(path=[/workspace/out]/[dir/bar.txt]), RegularFile(path=[/workspace/out]/[dir/baz/qux.txt])]> contains exactly <[RegularFile(path=[/workspace/out]/[dir/bar.txt]), RegularFile(path=[/workspace/out]/[dir/baz/qux.txt]), RegularFile(path=[/workspace/out]/[dir/foo.txt])]>. It is missing <[RegularFile(path=[/workspace/out]/[dir/foo.txt])]>
    at com.google.devtools.build.lib.skyframe.RecursiveFilesystemTraversalFunctionTest.traverseAndAssertFiles(RecursiveFilesystemTraversalFunctionTest.java:283)
    at com.google.devtools.build.lib.skyframe.RecursiveFilesystemTraversalFunctionTest.assertTraversalOfDirectory(RecursiveFilesystemTraversalFunctionTest.java:492)
    at com.google.devtools.build.lib.skyframe.RecursiveFilesystemTraversalFunctionTest.testTraversalOfGeneratedDirectory(RecursiveFilesystemTraversalFunctionTest.java:538)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.junit.runners.Suite.runChild(Suite.java:127)
    at org.junit.runners.Suite.runChild(Suite.java:26)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at com.google.testing.junit.runner.internal.junit4.CancellableRequestFactory$CancellableRunner.run(CancellableRequestFactory.java:89)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
    at com.google.testing.junit.runner.junit4.JUnit4Runner.run(JUnit4Runner.java:112)
    at com.google.testing.junit.runner.BazelTestRunner.runTestsInSuite(BazelTestRunner.java:144)
    at com.google.testing.junit.runner.BazelTestRunner.main(BazelTestRunner.java:82)
3) testCacheBypassingActionWithMtimeChangingInput(com.google.devtools.build.lib.skyframe.SkyframeAwareActionTest)
java.lang.AssertionError: Not true that <CLEAN> is equal to <BUILT>
    at com.google.devtools.build.lib.skyframe.SkyframeAwareActionTest.assertActionExecutions(SkyframeAwareActionTest.java:435)
    at com.google.devtools.build.lib.skyframe.SkyframeAwareActionTest.assertActionWithMtimeChangingInput(SkyframeAwareActionTest.java:502)
    at com.google.devtools.build.lib.skyframe.SkyframeAwareActionTest.testCacheBypassingActionWithMtimeChangingInput(SkyframeAwareActionTest.java:495)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.junit.runners.Suite.runChild(Suite.java:127)
    at org.junit.runners.Suite.runChild(Suite.java:26)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at com.google.testing.junit.runner.internal.junit4.CancellableRequestFactory$CancellableRunner.run(CancellableRequestFactory.java:89)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
    at com.google.testing.junit.runner.junit4.JUnit4Runner.run(JUnit4Runner.java:112)
    at com.google.testing.junit.runner.BazelTestRunner.runTestsInSuite(BazelTestRunner.java:144)
    at com.google.testing.junit.runner.BazelTestRunner.main(BazelTestRunner.java:82)

FAILURES!!!
Tests run: 674,  Failures: 3

(https://buildkite.com/bazel/bazel-bazel/builds/447#254d39b2-459a-48e3-829d-b4ce62681246)

P1 misc > testing windows bug

Most helpful comment

Found the culprit:

public class Millis {
  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    while (System.currentTimeMillis() < start + 100) {
      System.out.println(System.currentTimeMillis());
    }
  }
}

Output:

...
1520352722671
1520352722671
...
1520352722671
1520352722671
1520352722671
1520352722687
1520352722687
1520352722687
...
1520352722687

All 12 comments

/cc @buchgr @philwo @meteorcloudy

thanks @laszlocsomor 鉂わ笍

And I found the fix for it, except I don't yet understand why it fixes the test.

Here's what I know:

  • changing "path1" to "path2" in this line, as I believe it was meant to be but never was, makes the test pass on Windows, but has no effect on test success on Linux
  • this function ultimately calls this one, which on Linux reports that both files are dirty (even though we only overwrote "path1"), and on Windows it reports that "path2" is not dirty, which is wrong if you consider that its mtime is changed (though it's still empty)
  • the thing is, it's fine that the test is passing on Linux as it is now, because it's accidentally testing more than it meant to: it tests that both an "empty-to-nonempty" transition and an "empty-to-empty-with-different-mtime" transition are reported as dirty files; on windows the latter one isn't reported as dirty, and that may be a bug
  • I have to leave now so I can't debug further
  • being able to RDP into a testing machine is fantastic

/cc @janakdr and @haxorz : see my previous comment about Skyframe file state checking; maybe it rings a bell for you; if not, i'll keep debugging on Monday

It very much seems as if these tests are flaky on Windows.

@buchgr : why do you think? I could reliable repro this on the testing CI VM.

@laszlocsomor The postsubmit https://buildkite.com/bazel/bazel-bazel pipeline sometimes succeeds and else fails with e.g. 2/3 flaky test attempts.

A wild guess, but given the symptoms of issue #4734 there might be a bug on Windows in detecting file changes.

Found the culprit:

public class Millis {
  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    while (System.currentTimeMillis() < start + 100) {
      System.out.println(System.currentTimeMillis());
    }
  }
}

Output:

...
1520352722671
1520352722671
...
1520352722671
1520352722671
1520352722671
1520352722687
1520352722687
1520352722687
...
1520352722687

/cc @haxorz : see my comment above

In the first comment on this bug:

  • testTraversalOfGeneratedDirectory() is failing because it relies on creating symlinks; I'm not sure what exactly causes the failure, maybe that Bazel fails to create the symlink or that it creates a copy of the target file instead and some later assertion fails; need to investigate
  • testCacheBypassingActionWithMtimeChangingInput() looks like a timing flake similar to testExplicitFiles(); what I know for sure is that SkyframeAwareActionTest.maybeChangeFile uses the TimestampGranularityMonitor incorrectly (because it calls neither .setCommandStartTime() nor .notifyDependenceOnFileTime(), although both are required for correct operation; see PR #4786 )
Was this page helpful?
0 / 5 - 0 ratings