https://ci.nodejs.org/job/node-test-commit-osx/10336/nodes=osx1010/console
not ok 167 async-hooks/test-callback-error
---
duration_ms: 60.132
severity: fail
stack: |-
timeout
I think we should make the test more robust by only running the spawnSync part if process.argv[2] is not set. The way it is currently written, If there is a bug it will cause a recursion resulting in a timeout.
Happened again today:
https://ci.nodejs.org/job/node-test-commit-osx/10360/nodes=osx1010/console
not ok 167 async-hooks/test-callback-error
---
duration_ms: 60.68
severity: fail
stack: |-
timeout
/cc @nodejs/async_hooks
It might be good to split the three test cases in that file into three separate tests/test files so that we can see which one is the one that's timing out. (I'm pretty sure it's test_callback_abort.)
could add console.log('staring X'/'stopping X')
Cooking now.
So finally I have data, seems like it is test_callback_abort. Maybe the time to create a coredump? anyway it fluctuates from 19s to over 60s. @AndreasMadsen what can we do? skip this case on macOS?
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
TAP version 13
1..1
not ok 1 async-hooks/test-callback-error
---
duration_ms: 31.297
severity: fail
stack: |-
2017-06-09T04:03:20.423Z init
2017-06-09T04:03:20.457Z required
2017-06-09T04:03:20.467Z required ./init-hooks
2017-06-09T04:03:20.467Z staring test_init_callback
2017-06-09T04:03:20.547Z finished test_init_callback
2017-06-09T04:03:20.548Z staring test_callback
2017-06-09T04:03:20.626Z finished test_callback
2017-06-09T04:03:20.626Z staring test_callback_abort
2017-06-09T04:03:51.210Z finished test_callback_abort
assert.js:586
assert.ifError = function ifError(err) { if (err) throw err; };
^
Error: spawnSync /Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/out/Release/node ETIMEDOUT
at exports._errnoException (util.js:1012:11)
at spawnSync (child_process.js:546:20)
at Object.<anonymous> (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/test-callback-error.js:56:12)
at Module._compile (module.js:569:30)
at Object.Module._extensions..js (module.js:580:10)
at Module.load (module.js:503:32)
at tryModuleLoad (module.js:466:12)
at Function.Module._load (module.js:458:3)
at Function.Module.runMain (module.js:605:10)
at startup (bootstrap_node.js:158:16)
...
+ NOK=1
+ echo '1 OK: 0 NOT OK: 1 TOTAL: 100'
1 OK: 0 NOT OK: 1 TOTAL: 100
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
TAP version 13
1..1
ok 1 async-hooks/test-callback-error
---
duration_ms: 19.5
...
+ OK=1
+ echo '2 OK: 1 NOT OK: 1 TOTAL: 100'
2 OK: 1 NOT OK: 1 TOTAL: 100
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
TAP version 13
1..1
ok 1 async-hooks/test-callback-error
---
duration_ms: 18.279
...
+ OK=2
+ echo '3 OK: 2 NOT OK: 1 TOTAL: 100'
3 OK: 2 NOT OK: 1 TOTAL: 100
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
TAP version 13
1..1
ok 1 async-hooks/test-callback-error
---
duration_ms: 17.398
...
+ OK=3
+ echo '4 OK: 3 NOT OK: 1 TOTAL: 100'
4 OK: 3 NOT OK: 1 TOTAL: 100
/cc @DavidCai1993
Skip this case on macOS?
We don't have a good reason to expect spawnSync to timeout on MacOS. Also, we do we get an ETIMEDOUT now, if it is not a recursion then I would expect the same error as before.
-1 on skip this case on macOS at this time, too. Seems that the problem is somehow related to process.abort() ...?
We don't have a good reason to expect spawnSync to timeout on MacOS. Also, we do we get an ETIMEDOUT now, if it is not a recursion then I would expect the same error as before.
I added a timeout on spawnSync
const c3 = spawnSync(process.execPath, a, { timeout: 30000 });
console.log((new Date()).toISOString() + ' finished test_callback_abort');
assert.ifError(c3.error);
and ifError so we'd see if and why, the call failed.
I could add:
if (common.isOSX && c3.error && c3.error.code === 'ETIMEDOUT') return common.skip()
Or simply return (no skip) so we acknowledge this might fail, but essentially that's ok.
I ran this test 1000 times on my Mac, all tests finished within 200-400 ms.
Thanks for explaining the ETIMEOUT. But I still think, we don't have a good reason to expect spawnSync to timeout on MacOS.
does duration_ms: 31.297 mean 31 seconds? That is confusing.
@Trott why does the abort tests not run on the CI? At least that is what the test/README.md says.
I think the idea was to avoid possibly creating a bunch of core files.
I see, then maybe we should move the test to that directory.
does duration_ms: 31.297 mean 31 seconds? That is confusing.
Yes, that's what it means, and yes it's confusing, but duration_ms is TAP format, not something we determined. So we may be kinda sorta stuck with it (unless/until TAP changes and the Jenkins plugin changes too).
I see, then maybe we should move the test to that directory.
I'd agree with that. Curious if we ever run the abort tests, like maybe before a release or something? @nodejs/release
Oh, by the way, another one on CI today: https://ci.nodejs.org/job/node-test-commit-osx/10391/nodes=osx1010/console
not ok 175 async-hooks/test-callback-error
---
duration_ms: 60.25
severity: fail
stack: |-
timeout
I'd agree with that.
Or at least moving the abort-on-uncaught-exception thing into its own test? I don't know what I think anymore.
Looks like what we have here may be a bug in Node.js master branch on macOS and not a problem with the test.
With this one-line test file called test.js:
throw new Error('foo');
If I run it with current master:
$ ./node --abort-on-uncaught-exception test.js
Uncaught Error: foo
FROM
Object.<anonymous> (/Users/trott/io.js/test.js:1:1)
Module._compile (module.js:1:1)
Object.Module._extensions..js (module.js:1:1)
Module.load (module.js:1:1)
tryModuleLoad (module.js:1:1)
Function.Module._load (module.js:1:1)
Function.Module.runMain (module.js:1:1)
startup (bootstrap_node.js:1:1)
bootstrap_node.js:1:1
Received signal 4 <unknown> 000100c84632
==== C stack trace ===============================
[0x000100c83442]
[0x7fff8200652a]
[0x000000000000]
[0x000100954b4b]
[0x0db5628843fd]
[0x0db562987c7a]
[0x0db56293be7c]
[end of stack trace]
Illegal instruction: 4
$
There is a looooong pause between printing [end of stack trace] and Illegal instruction: 4.
If I use Node.js 8.1.0, I get similar output, but there's no long pause.
Can someone try to replicate that just to confirm it's not some oddity in my own setup?
@Trott I cannot reproduce it on macOS with neither the master branch, nor 8.1.0. It just aborts immediately.
@aqrln Do you get the Illegal instruction: 4 as the last line of your output? (If I run as root, I don't get that line and it exits immediately.)
I can't reproduce on master, this is what I'm getting:
Uncaught Error: foo
FROM
Object.<anonymous> (/Users/Andreas/Sites/node/node-master/test.js:1:1)
Module._compile (module.js:1:1)
Object.Module._extensions..js (module.js:1:1)
Module.load (module.js:1:1)
tryModuleLoad (module.js:1:1)
Function.Module._load (module.js:1:1)
Function.Module.runMain (module.js:1:1)
startup (bootstrap_node.js:1:1)
bootstrap_node.js:1:1
Received signal 4 <unknown> 000100c3eff2
==== C stack trace ===============================
[0x000100c3de24]
[0x7fff8d423bba]
[0x7fff5fbfdea0]
[0x0001009205cb]
[0x0608333043fd]
[0x060833407c7a]
[0x0608333bbe7c]
[end of stack trace]
[1] 44955 illegal hardware instruction ./node --abort-on-uncaught-exception test.js
@Trott
Uncaught Error: foo
FROM
Object.<anonymous> (/Users/alex/projects/aqrln/node/test.js:1:1)
Module._compile (module.js:1:1)
Object.Module._extensions..js (module.js:1:1)
Module.load (module.js:1:1)
tryModuleLoad (module.js:1:1)
Function.Module._load (module.js:1:1)
Function.Module.runMain (module.js:1:1)
startup (bootstrap_node.js:1:1)
bootstrap_node.js:1:1
Received signal 4 <unknown> 000100c3e022
==== C stack trace ===============================
[0x000100c3ce54]
[0x7fffaedc8b3a]
[0x7fff5fbfd960]
[0x00010092014b]
[0x23ac401043fd]
[0x23ac40207c7a]
[0x23ac401bbe7c]
[end of stack trace]
[1] 32357 illegal hardware instruction ./node --abort-on-uncaught-exception test.js
Doesn't take long for me:
./node --abort-on-uncaught-exception test.js 0.04s user 0.74s system 46% cpu 1.669 total
Last line:
[2] 39788 illegal hardware instruction (core dumped) ./node --abort-on-uncaught-exception test.jsIllegal instruction: 4 (core dumped)However the generated cores are 1.3GB, so if your mac has an HDD then that might explain the delay.
➜ node git:(master) ✗ ❯ ls -lh /cores
-r-------- 1 gib admin 1.3G 9 Jun 19:11 core.39856
-r-------- 1 gib admin 1.3G 9 Jun 19:11 core.39868
Would be good to try on test-requireio-osx1010-x64-1 in the Build/CI cluster to see if it behaves like my machine or like (apparently) everyone else's. If it matters (and it might), I'm running OS version 10.11.6. And we're running 10.10 in CI, it appears. So if everyone else is running something newer, maybe that's part of the issue?
I'm running MacOS 10.12.3
Now trying on 10.12.5 and unable to reproduce. Anyone have OS X/macOS 10.11 or 10.10 lying around and can try the test.js thing above?
Finally got into the CI machine. It doesn't delay as long as my 10.11 machine, but there is a tiny delay that doesn't exist on my 10.12 machine. So, uh, inconclusive, I guess? Harumph.
Something changed on the CI machine from:
https://ci.nodejs.org/job/node-stress-single-test/1286/nodes=osx1010/console
TAP version 13
1..1
ok 1 async-hooks/test-callback-error
---
duration_ms: 18.83
...
+ OK=22
+ echo '25 OK: 22 NOT OK: 3 TOTAL: 100'
25 OK: 22 NOT OK: 3 TOTAL: 100
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
TAP version 13
1..1
ok 1 async-hooks/test-callback-error
---
duration_ms: 17.609
...
+ OK=23
+ echo '26 OK: 23 NOT OK: 3 TOTAL: 100'
26 OK: 23 NOT OK: 3 TOTAL: 100
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
to
https://ci.nodejs.org/job/node-stress-single-test/1287/nodes=osx1010/console
TAP version 13
1..1
ok 1 async-hooks/test-callback-error # skip test_callback_abort timed out
---
duration_ms: 0.319
...
+ OK=34
+ echo '34 OK: 34 NOT OK: 0 TOTAL: 100'
34 OK: 34 NOT OK: 0 TOTAL: 100
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
TAP version 13
1..1
ok 1 async-hooks/test-callback-error # skip test_callback_abort timed out
---
duration_ms: 0.318
...
+ OK=35
+ echo '35 OK: 35 NOT OK: 0 TOTAL: 100'
35 OK: 35 NOT OK: 0 TOTAL: 100
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 35 async-hooks/test-callback-error
@refack Yikes, yes, that is a change (for the better).
I don't trust unexplained changes :shipit:
It is back 😨
not ok 177 async-hooks/test-callback-error
---
duration_ms: 60.75
severity: fail
stack: |-
timeout
https://ci.nodejs.org/job/node-test-commit-osx/10457/nodes=osx1010/
And now gone again after 45 minutes and three-out-of-three failures. Am I wrong that when it happens, it seems to happen consistently, all the time? Until it stops happening altogether?
Let's land #13559
And now gone again after 45 minutes and three-out-of-three failures. Am I wrong that when it happens, it seems to happen consistently, all the time? Until it stops happening altogether?
There is definitely a correlation.
And now gone again after 45 minutes and three-out-of-three failures. Am I wrong that when it happens, it seems to happen consistently, all the time? Until it stops happening altogether?
There is definitely a correlation.
That may be problematic in terms of running a stress test.
I just saw this too a little bit ago.
https://ci.nodejs.org/job/node-test-commit-osx/10467/nodes=osx1010/console
I got a good picture of the all elusive timeout:
https://ci.nodejs.org/job/node-stress-single-test/nodes=osx1010/1303/console
It's a bit longer than the ones above, and it demonstrates the pause between the end of stack (visible), and the last line .*Illegal instruction.* (didn't make it in time).
P.S. internal test timeout set to 15s.
TAP version 13
1..1
not ok 1 async-hooks/test-callback-error
---
duration_ms: 17.732
severity: fail
stack: |-
==== stdout ====
==== stderr ====
Error: test_callback_abort
at ActivityCollector.initHooks.oninit.common.mustCall (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/test-callback-error.js:35:45)
at ActivityCollector.oninit (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/common/index.js:517:15)
at ActivityCollector._init (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/init-hooks.js:170:10)
at init (async_hooks.js:467:43)
at Object.emitInitS [as emitInit] (async_hooks.js:337:3)
at Object.<anonymous> (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/test-callback-error.js:37:17)
at Module._compile (module.js:569:30)
at Object.Module._extensions..js (module.js:580:10)
at Module.load (module.js:503:32)
at tryModuleLoad (module.js:466:12)
1: node::Abort() [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/out/Release/node]
2: node::Chdir(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/out/Release/node]
3: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/out/Release/node]
4: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/out/Release/node]
5: v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/out/Release/node]
6: 0x2a86092043fd
7: 0x2a86092f7c49
Received signal 6
==== C stack trace ===============================
[0x000100c4402b]
[0x7fff8acf9f1a]
[0x7fff5fbfcf60]
[0x7fff882a59a3]
[0x000100ac4e0d]
[0x000100ac8a2e]
[0x000100216c92]
[0x0001002925d0]
[0x000100291b59]
[0x2a86092043fd]
[0x2a86092f7c49]
[end of stack trace]
assert.js:60
throw new errors.AssertionError({
^
AssertionError [ERR_ASSERTION]: Error: spawnSync /Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/out/Release/node ETIMEDOUT
at Object.<anonymous> (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/test-callback-error.js:75:12)
at Module._compile (module.js:569:30)
at Object.Module._extensions..js (module.js:580:10)
at Module.load (module.js:503:32)
at tryModuleLoad (module.js:466:12)
at Function.Module._load (module.js:458:3)
at Function.Module.runMain (module.js:605:10)
at startup (bootstrap_node.js:158:16)
at bootstrap_node.js:575:3
...
+ NOK=154
+ echo '159 OK: 5 NOT OK: 154 TOTAL: 200'
159 OK: 5 NOT OK: 154 TOTAL: 200
+ for i in '`seq $RUN_TIMES`'
+ python tools/test.py -p tap --mode=release --timeout 30 async-hooks/test-callback-error
I tried disabling core dumps (sudo sysctl -w kern.coredump=0) but that did not make the delay go away, in case anyone else is having the same (probably naive) thought I just did.
also failed on OSX in https://github.com/nodejs/node/pull/13673
also in https://ci.nodejs.org/job/node-test-commit-osx/10580/ for #13760
Well... #13559 should stop the CI from going red, but we still don't know what's the root cause for the timeouts
I think we have concluded that neither async_hooks or child_process is at fault, but it has something to do with MacOS and --abort-on-uncaught-exception. I have updated the labels accordingly.
I think we have concluded that neither async_hooks or child_process is at fault, but it has something to do with MacOS and --abort-on-uncaught-exception. I have updated the labels accordingly.
I'm not so sure, don't we have a test for just that (--abort-on-uncaught-exception works, and finishes in reasonable time)? And if not that's a nice first contribution...
I'm not so sure.
The test in https://github.com/nodejs/node/issues/13527#issuecomment-307440797 is pretty conclusive.
don't we have a test for just that (--abort-on-uncaught-exception works, and finishes in reasonable time)? And if not that's a nice first contribution...
We have tests, but they are not executed on the CI because they are the special abort directory. https://github.com/nodejs/node/blob/master/test/abort/test-abort-uncaught-exception.js
@nodejs/platform-macos
Well... #13559 should stop the CI from going red, but we still don't know what's the root cause for the timeouts
If all we want is to stop the CI from going red, maybe we should have a commit that does nothing (or nearly nothing) but update the relevant .status file. That more extensive change appears to have broken make test on macOS. :-(
If all we want is to stop the CI from going red, maybe we should have a commit that does nothing (or nearly nothing) but update the relevant .status file. That more extensive change appears to have broken make test on macOS. :-(
(To be a bit more clear, though: No shame intended! A lot of great work has happened so far and some really important information has been uncovered. These things happen.)
This is interesting, I think:
Using Node.js 8.1.2 installed by nvm:
$ time node --abort-on-uncaught-exception -e 'throw new Error("foo")'
Uncaught Error: foo
FROM
[eval]:1:1
ContextifyScript.Script.runInThisContext (vm.js:44:33)
Object.runInThisContext (vm.js:116:38)
Object.<anonymous> ([eval]-wrapper:6:22)
Module._compile (module.js:569:30)
evalScript (bootstrap_node.js:432:27)
startup (bootstrap_node.js:139:9)
bootstrap_node.js:575:3
Illegal instruction: 4
real 0m0.222s
user 0m0.034s
sys 0m0.011s
222 ms execution time! No problem!
Same thing with Node.js compiled from master:
$ time ./node --abort-on-uncaught-exception -e 'throw new Error("foo")'
Uncaught Error: foo
FROM
[eval]:1:1
ContextifyScript.Script.runInThisContext (vm.js:1:1)
Object.runInThisContext (vm.js:1:1)
Object.<anonymous> ([eval]-wrapper:1:1)
Module._compile (module.js:1:1)
evalScript (bootstrap_node.js:1:1)
startup (bootstrap_node.js:1:1)
bootstrap_node.js:1:1
Received signal 4 <unknown> 000100c87282
==== C stack trace ===============================
[0x000100c86092]
[0x7fff8200652a]
[0x000000000000]
[0x000100954a1b]
[0x05ef3e4843fd]
[end of stack trace]
Illegal instruction: 4
real 0m12.560s
user 0m0.034s
sys 0m0.012s
More than 12.5 seconds execution time. Yikes!
Interesting that the stack traces are so very different....
I did a git checkout upstream/v8.x-staging && git checkout a021720 && ./configure && make -j4 to create 8.1.2, but it still takes a long time to exit:
$ ./node -v && time ./node --abort-on-uncaught-exception -e 'throw new Error("foo")'
v8.1.2
Uncaught Error: foo
FROM
[eval]:1:1
ContextifyScript.Script.runInThisContext (vm.js:44:33)
Object.runInThisContext (vm.js:116:38)
Object.<anonymous> ([eval]-wrapper:6:22)
Module._compile (module.js:569:30)
evalScript (bootstrap_node.js:432:27)
startup (bootstrap_node.js:139:9)
bootstrap_node.js:575:3
Illegal instruction: 4
real 0m12.163s
user 0m0.035s
sys 0m0.011s
Now the stack traces match at least, but the time difference is certainly troubling.
When run as root, there is no delay and there is no "Illegal instruction: 4".
I think the issue is in V8 (or perhaps more precisely, an interaction of macOS with V8 code--not sure if the bug is in one place, the other, or really neither). If I don't pass --abort-on-uncaught-exception on to V8, I don't get the ~12 second delay on exit. @nodejs/v8
cc @gsathya @ajklein
If I'm reading the history correctly, this seems like a bug with --abort-on-uncaught-exception and not async-hooks. Correct?
Re-routing to @schuay @fhinkel
If I'm reading the history correctly, this seems like a bug with --abort-on-uncaught-exception and not async-hooks. Correct?
Yes. Although I now have the confounding additional information that the issue manifests itself with only some users on my laptop. If I switch to other users, the issue goes away. It's entirely possible the problem is entirely my setup, but I suspect it's not quite that simple because we're seeing this in CI as well.
OK, it's almost definitely user setup at this point. When I run the program, while it's hanging, a process named ReportCrash goes way up to 99% CPU. Main thing at this point is probably to:
Although...if I run as root, things work just fine, no delay, etc. So maybe the real issue is to figure that out rather than disabling crash reporting... ¯\_(ツ)_/¯
Again, I wouldn't be going on and on about problems with my own personal setup if it weren't for the fact that we're seeing it in CI too, and we need to fix it there....
I wonder what the node::Chdir is about in the stack trace...
Logging on to the CI machine, I don't get the long delay with simple cases like that described in
https://github.com/nodejs/node/issues/13527#issuecomment-307440797. But I do get a long delay if I do what the test does from the command line:
test-nodesource-osx1010-x64-1:osx1010 iojs$ time ./node --abort-on-uncaught-exception test/async-hooks/test-callback-error.js test_callback_abort
Error: test_callback_abort
at ActivityCollector.initHooks.oninit.common.mustCall (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/test-callback-error.js:30:45)
at ActivityCollector.oninit (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/common/index.js:517:15)
at ActivityCollector._init (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/init-hooks.js:170:10)
at init (async_hooks.js:467:43)
at Object.emitInitS [as emitInit] (async_hooks.js:337:3)
at Object.<anonymous> (/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/test/async-hooks/test-callback-error.js:33:17)
at Module._compile (module.js:569:30)
at Object.Module._extensions..js (module.js:580:10)
at Module.load (module.js:503:32)
at tryModuleLoad (module.js:466:12)
1: node::Abort() [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/./node]
2: node::Chdir(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/./node]
3: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/./node]
4: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/./node]
5: v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/Users/iojs/build/workspace/node-stress-single-test/nodes/osx1010/./node]
6: 0x3a7608e043fd
7: 0x3a7608ef7c49
Received signal 6
==== C stack trace ===============================
[0x000100c4402b]
[0x7fff8d42bf1a]
[0x000000000002]
[0x7fff8a9d79a3]
[0x000100ac4e0d]
[0x000100ac8a2e]
[0x000100216c92]
[0x0001002925d0]
[0x000100291b59]
[0x3a7608e043fd]
[0x3a7608ef7c49]
[end of stack trace]
Abort trap: 6
real 2m2.042s
user 0m0.072s
sys 0m0.051s
test-nodesource-osx1010-x64-1:osx1010 iojs$
And if I run top -o cpu in another terminal, it shows ReportCrash at the top, unsurprisingly.
I don't have root or sudo access on the machine so I can't see what happens if I run as root and I can't experiment with using launchctl to try to disable ReportCrash. (Running launchctl list without root doesn't show ReportCrash.) @nodejs/build
Assuming you're using test-requireio-osx1010-x64-1, then https://github.com/nodejs/build/pull/753#issuecomment-307750195 is the discussion around getting root.
@rvagg could you put the test key in the rvagg user? Presumably we'd also need the password to run sudo commands with it, IDK how we handle that normally.
...
real 2m2.042s
...
😵
If the plan is to move to MacStadium and not use requireio for Mac testing, then this may all be a moot point, at least in terms of CI. How soon can we move to MacStadium?
Good news! @rvagg reports that disabling ReportCrash with launchctl reduces the time from over two minutes to just 220ms when running time ./node --abort-on-uncaught-exception test/async-hooks/test-callback-error.js test_callback_abort. So this should be enough to turn CI green for OS X. All that's needed is to document this in the ansible scripts that build out the OS X hosts. That's a PR for the build repo.
This has been fixed since @rvagg ran these two commands on the host:
launchctl unload -w /System/Library/LaunchAgents/com.apple.ReportCrash.plist
sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.ReportCrash.Root.plist
Note that they need to be run from a GUI login or else they error out.
There's an issue open to get this into the provisioning for OS X machines. But for now, the problem is resolved, so I'm going to close this.
Most helpful comment
This has been fixed since @rvagg ran these two commands on the host:
Note that they need to be run from a GUI login or else they error out.
There's an issue open to get this into the provisioning for OS X machines. But for now, the problem is resolved, so I'm going to close this.