Node: Flaky test-timers-block-eventloop

Created on 19 Oct 2017 · 10 Comments · Source: nodejs/node

Test added in https://github.com/nodejs/node/pull/15072

Example failure:
https://ci.nodejs.org/job/node-test-commit-linuxone/9533/nodes=rhel72-s390x/console

not ok 1945 sequential/test-timers-block-eventloop
  ---
  duration_ms: 0.440
  severity: fail
  stack: |-
    assert.js:45
      throw new errors.AssertionError({
      ^

    AssertionError [ERR_ASSERTION]: eventloop blocked!
        at Timeout.mustNotCall [as _onTimeout] (/data/iojs/build/workspace/node-test-commit-linuxone/nodes/rhel72-s390x/test/common/index.js:571:12)
        at ontimeout (timers.js:478:11)
        at tryOnTimeout (timers.js:302:5)
        at Timer.listOnTimeout (timers.js:262:5)
  ...
CI / flaky test test timers

All 10 comments

/cc @nodejs/platform-s390

I guess the reported error is on RHEL on S390

@targos and @gireeshpunathil
I reproduced this on my laptop (ThinkPad X240, Ubuntu 16.04) with the following steps:

  1. Start 4 processes that each occupy a CPU with while(1);
  2. Run the test case every other second.
  3. The error shows up roughly once in every 10 runs.

So I guess the test server was busy when this happened.
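
The reproduction steps above could be scripted roughly as follows. This is only a sketch: the helper names startCpuHogs, stopCpuHogs and countFailures are hypothetical, and for brevity it runs the test back to back instead of every other second.

```javascript
'use strict';
const { spawn, spawnSync } = require('child_process');

// Start `count` child Node processes that each spin in while(1); to keep a CPU busy.
function startCpuHogs(count) {
  const hogs = [];
  for (let i = 0; i < count; i++) {
    hogs.push(spawn(process.execPath, ['-e', 'while (1);'], { stdio: 'ignore' }));
  }
  return hogs;
}

// Terminate the busy children started by startCpuHogs.
function stopCpuHogs(hogs) {
  for (const hog of hogs) hog.kill();
}

// Run the given test file `runs` times and count the runs that exit non-zero.
function countFailures(testPath, runs) {
  let failures = 0;
  for (let i = 0; i < runs; i++) {
    const result = spawnSync(process.execPath, [testPath], { stdio: 'ignore' });
    if (result.status !== 0) failures++;
  }
  return failures;
}
```

Calling startCpuHogs(4), then countFailures('test/sequential/test-timers-block-eventloop.js', 10), then stopCpuHogs on the returned array would approximate steps 1 to 3.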

Fix is easy: just change the t3 timeout to 200ms.

--- a/test/sequential/test-timers-block-eventloop.js
+++ b/test/sequential/test-timers-block-eventloop.js
@@ -11,7 +11,7 @@ const t2 = setInterval(() => {
   common.busyLoop(15);
 }, 10);

-const t3 = setTimeout(common.mustNotCall('eventloop blocked!'), 100);
+const t3 = setTimeout(common.mustNotCall('eventloop blocked!'), 200);

With this fix, no failures occurred in 1000 runs. Is a PR needed?

@zhangzifa - makes sense, the test seems to be walking on a ridge: it is highly sensitive to the timing of events and does not account for environmental factors in the system. While we wait to hear from @nodejs/platform-s390, this experiment and the resulting proposal look reasonable to me, thanks!

@jBarz can you take a look and comment?

It seems like we are depending on the following fs.stat call to call back within 50 milliseconds. Otherwise the test fails.

fs.stat('./nonexistent.txt', (err, stats) => {...});

Increasing the timeout is one way to fix this.
@zhangzifa, could we stat something like /dev/zero, which won't depend on the speed of the filesystem mount?
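
To make the timing dependency described above concrete, here is a self-contained sketch of the pattern, not the actual test file. It assumes the structure suggested by the diff and by this comment: two intervals that block the event loop, a watchdog timer (t3), and an fs.stat callback that must run before the watchdog fires. busyLoop is a local stand-in for common.busyLoop, runScenario and its result strings are hypothetical, and the 12 ms busy time of the first interval is an assumption.

```javascript
'use strict';
const fs = require('fs');

// Local stand-in for the test helper common.busyLoop: spin synchronously for `ms`.
function busyLoop(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* block the event loop */ }
}

// Resolves with 'stat-first' if the fs.stat callback runs before the watchdog,
// or with 'watchdog-first' otherwise. watchdogMs plays the role of the t3 timeout.
function runScenario(watchdogMs, statDelayMs = 50, statPath = './nonexistent.txt') {
  return new Promise((resolve) => {
    const finish = (result) => {
      clearInterval(t1);
      clearInterval(t2);
      clearTimeout(t3);
      resolve(result);
    };
    // Two intervals whose callbacks take longer than their 10 ms period.
    const t1 = setInterval(() => busyLoop(12), 10); // 12 ms is an assumption
    const t2 = setInterval(() => busyLoop(15), 10);
    // Watchdog: in the real test this would be common.mustNotCall(...).
    const t3 = setTimeout(() => finish('watchdog-first'), watchdogMs);
    // The stat is issued after statDelayMs; its callback has to win the race.
    setTimeout(() => {
      fs.stat(statPath, () => finish('stat-first'));
    }, statDelayMs);
  });
}
```

Running runScenario(100) and runScenario(200) repeatedly on a loaded machine is one way to compare how often each watchdog value loses the race; passing a different statPath allows trying the /dev/zero idea as well.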

@jBarz I tested just now with /dev/zero, and it still fails occasionally when the CPU is busy. In any case, I will update the test case to use /dev/nonexistent.

@zhangzifa a PR would be great.

@gibfahn #16314 is there now.
