master, Travis: https://travis-ci.com/nodejs/node/jobs/128751722#L9568
not ok 17 async-hooks/test-fseventwrap
---
duration_ms: 0.135
severity: fail
exitcode: 1
stack: |-
internal/fs/watchers.js:155
throw error;
^
Error: ENOSPC: no space left on device, watch '/home/travis/build/nodejs/node/test/async-hooks/test-fseventwrap.js'
at FSWatcher.start (internal/fs/watchers.js:149:26)
at Object.watch (fs.js:1234:11)
at Object.<anonymous> (/home/travis/build/nodejs/node/test/async-hooks/test-fseventwrap.js:16:20)
at Module._compile (internal/modules/cjs/loader.js:702:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
at Module.load (internal/modules/cjs/loader.js:612:32)
at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
at Function.Module._load (internal/modules/cjs/loader.js:543:3)
at Function.Module.runMain (internal/modules/cjs/loader.js:744:10)
at startup (internal/bootstrap/node.js:267:19)
...
I mean, is it flaky if the device is out of space?
And here I was thinking Travis would give us a stable infrastructure for lightweight test runs ¯\_(ツ)_/¯
I mean, is it flaky if the device is out of space?
I think so, since this will result in failures on PRs. But are there any actions we can take to avoid this?
I mean, is it flaky if the device is out of space?
If the test does not run reliably and gives intermittent false negatives IMHO that's the definition of flaky.
Also, it's hard to say whether ENOSPC is the real error, since it happened only at the beginning of the suite and ~2000 tests ran after it.
Anyway, this issue is here so that peeps seeing this can cross-reference.
According to Travis' documentation there should be ~9GB of disk space available for the configuration we're currently using. If we instead used a VM (sudo: required and dist: trusty) we could double the amount of available memory and disk space, if disk space is indeed the issue here.
Has anyone tried to rerun an instance with debug mode enabled so you can ssh in and verify disk space availability?
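For what it's worth, a rough sketch of what the VM suggestion above could look like in .travis.yml (placement is hypothetical; sudo: required and dist: trusty are the keys already mentioned), with a df -h step added so the job log would show how much disk space is actually free:

# switch from the container-based environment to a full VM
sudo: required
dist: trusty
before_script:
  - df -h   # print free disk space so we can see whether ENOSPC is plausible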
According to Travis' documentation there should be about ~9GB of disk space available for the configuration we're currently using.
My guess is that it's not an actual out-of-space condition, but some other fragility. We do run the tests, and they pass, in our CI cluster on machines with ~1GB of free disk space.
I was quick to skip it, since we currently use Travis only for sanity testing and still require our own Jenkins run before we land.
Hrmm... from what I'm finding it could also be that the system's maximum number of inotify file watchers is too low. This could be checked with sysctl fs.inotify.max_user_watches. Perhaps we could try executing something like this before running the test(s):
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
or some other value suitably larger than whatever is currently set on Travis.
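A sketch of how that could be wired into .travis.yml (hypothetical; 524288 is just an example value, and if I recall correctly sudo is not available in the container-based environment, so this would need the sudo: required setup mentioned earlier):

before_install:
  - echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
  - sudo sysctl -p
  - sysctl fs.inotify.max_user_watches   # verify the new limit took effect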
Flaky on CI too:
https://ci.nodejs.org/job/node-test-commit-linux/22236/nodes=ubuntu1804-docker/console
22:32:46 not ok 2100 async-hooks/test-fseventwrap
22:32:46 ---
22:32:46 duration_ms: 0.208
22:32:46 severity: fail
22:32:46 exitcode: 1
22:32:46 stack: |-
22:32:46 internal/fs/watchers.js:173
22:32:46 throw error;
22:32:46 ^
22:32:46
22:32:46 Error: ENOSPC: System limit for number of file watchers reached, watch '/home/iojs/build/workspace/node-test-commit-linux/nodes/ubuntu1804-docker/test/async-hooks/test-fseventwrap.js'
22:32:46 at FSWatcher.start (internal/fs/watchers.js:165:26)
22:32:46 at Object.watch (fs.js:1269:11)
22:32:46 at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux/nodes/ubuntu1804-docker/test/async-hooks/test-fseventwrap.js:16:20)
22:32:46 at Module._compile (internal/modules/cjs/loader.js:706:30)
22:32:46 at Object.Module._extensions..js (internal/modules/cjs/loader.js:717:10)
22:32:46 at Module.load (internal/modules/cjs/loader.js:604:32)
22:32:46 at tryModuleLoad (internal/modules/cjs/loader.js:543:12)
22:32:46 at Function.Module._load (internal/modules/cjs/loader.js:535:3)
22:32:46 at Function.Module.runMain (internal/modules/cjs/loader.js:759:12)
22:32:46 at startup (internal/bootstrap/node.js:303:19)
22:32:46 ...
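That message ("System limit for number of file watchers reached") lines up with the inotify-watch theory above. If someone has access to that Jenkins host, the current limit could be checked and bumped the same way, assuming the host's sysctl is what applies inside the Docker container (hypothetical commands, run on the host):

sysctl fs.inotify.max_user_watches
sudo sysctl -w fs.inotify.max_user_watches=524288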