Seems like this test sometimes times out.
https://ci.nodejs.org/job/node-test-binary-arm/12927/RUN_SUBSET=6,label=pi3-raspbian-jessie/console
20:28:04 not ok 15 parallel/test-benchmark-path
20:28:04 ---
20:28:04 duration_ms: 240.61
20:28:04 severity: fail
20:28:04 stack: |-
20:28:04 timeout
This looks more like a bug in the benchmark tooling.
I think this one got resolved at some point? At least that's my recollection.
Started to see this again; for example: https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix61-ppc64/15848/consoleFull
I will see if I can recreate locally.
06:32:20 not ok 235 parallel/test-benchmark-path
06:32:20 ---
06:32:20 duration_ms: 120.191
06:32:20 severity: fail
06:32:20 stack: |-
06:32:20 timeout
Failing on AIX in parallel with a timeout suggests that perhaps it should be moved to sequential.
I see on successful runs that it can still take 25 seconds or so, which is a lot for a test in parallel. Get the machine busy doing something else, have this test compete with just the wrong other tests in parallel, and it times out. Or at least, that would be the first theory.
So I'd say move this to sequential. Alternatively, check whether there's a way to reduce the time this test takes to run, but I already did that and I don't think there's anything to do there unless we want to combine benchmarks into fewer files. Launching processes seems to be where the time goes.
PR to move this test to sequential: https://github.com/nodejs/node/pull/21393
Agree that it is a manifestation of parallelism and has little to do with the test as such. Running independently, it completes in ~9 seconds.
Do we know which tunables decide the number of tests that run in parallel? I am just wondering if there is a mismatch between those tunables and the resource limits on the failing AIX box: for example, if the number of parallel tests is derived from the number of CPUs, but the amount of usable memory, the user process count, etc. are otherwise constrained through ulimit settings, then the tests could appear to time out. /cc @mhdawson
On a related note: what is the rationale behind running benchmarks in parallel? What inference can one derive from them, and what purpose do they serve?
Do we know which tunables decide the number of tests that run in parallel?
If JOBS is set as an environment variable, that is the number of parallel jobs that get run. If that variable is not set, then the Makefile invokes test.py with -J, which causes the number of parallel jobs to be determined by Python's multiprocessing.cpu_count().
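A minimal sketch of that fallback logic (this is a hypothetical illustration, not the actual Makefile or test.py code):

```python
import multiprocessing
import os


def parallel_jobs():
    """Return the number of test jobs to run in parallel.

    Mirrors the behaviour described above: an explicitly set JOBS
    environment variable wins; otherwise fall back to the CPU count,
    which is what test.py's -J option effectively does via
    multiprocessing.cpu_count().
    """
    jobs = os.environ.get("JOBS")
    if jobs:
        return int(jobs)
    return multiprocessing.cpu_count()


print(parallel_jobs())
```

So on the AIX run above, JOBS=5 would take precedence over whatever cpu_count() reports for that machine.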
On the AIX failure linked above (https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix61-ppc64/15848/consoleFull), JOBS is explicitly set to 5. I don't know where that value comes from.
On a related note: what is the rationale behind running benchmarks in parallel?
The general practice is: if a test can run in parallel, put it in parallel, because that keeps the duration of test runs shorter. So, until a test causes problems, it stays in parallel. I'm not saying that's a good approach (although perhaps it is), but that's how we ended up with the current set of test/parallel/test-benchmark-* tests.
JOBS is explicitly set to 5
Thanks for the info. That is a minimal value and should not collide with the resource limits set on a user, so I guess we can do nothing about it other than move the test to sequential.
If a test can run in parallel, put it in parallel
Sure, that looks like a good strategy, but the issue is that such tests won't reveal any benchmark insights; they only serve as unit tests, or perhaps as a stress test?
such tests won't reveal any benchmark insights; they only serve as unit tests, or perhaps as a stress test?
Yeah, that's their purpose. The benchmark tests are not used to produce benchmark results. They are minimal tests to make sure that the benchmarks included in the source code aren't totally broken (which is definitely a thing that has happened). For example, they will catch a breaking change to an API that did not get propagated to a benchmark that uses the API.
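The idea can be sketched like this (a hypothetical Python analogue for illustration only; the real tests are JavaScript files under test/parallel/ that use helpers from test/common):

```python
import subprocess
import sys


def smoke_run(argv, timeout=60):
    """Run a benchmark command once and only check that it exits
    cleanly; the timing output is deliberately ignored.

    This is the kind of check the test-benchmark-* tests perform:
    they catch benchmarks broken by API changes, not performance
    regressions, so they run with minimal iteration counts.
    """
    result = subprocess.run(argv, capture_output=True, timeout=timeout)
    return result.returncode == 0


# Stand-in command for illustration; a real test would invoke
# something like `node benchmark/...` with tiny parameter values.
print(smoke_run([sys.executable, "-c", "print('benchmark ran')"]))
```

Because the check is only "did it exit zero", running these tests in parallel costs nothing in correctness, only in contention for the machine, which is exactly the timeout problem discussed above.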