Seems like this test sometimes times out.
https://ci.nodejs.org/job/node-test-binary-arm/12927/RUN_SUBSET=6,label=pi3-raspbian-jessie/console
20:28:04 not ok 15 parallel/test-benchmark-path
20:28:04 ---
20:28:04 duration_ms: 240.61
20:28:04 severity: fail
20:28:04 stack: |-
20:28:04 timeout
This looks more like a bug in the benchmark tooling.
I think this one got resolved at some point? At least that's my recollection.
Started to see this again; for example: https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix61-ppc64/15848/consoleFull
I will see if I can recreate locally.
06:32:20 not ok 235 parallel/test-benchmark-path
06:32:20 ---
06:32:20 duration_ms: 120.191
06:32:20 severity: fail
06:32:20 stack: |-
06:32:20 timeout
Failing on AIX in parallel with a timeout suggests that perhaps it should be moved to sequential.
I see on successful runs that it can still take 25 seconds or so, which is a lot for a test in parallel. Get the machine busy doing something else, have this test compete with just the wrong other tests in parallel, and it times out. Or at least, that would be the first theory.
So I'd say move this to sequential. Alternatively, check whether there's a way to reduce the time this test takes to run, but I already did that and I don't think there's anything to do there unless we want to combine benchmarks into fewer files. Launching processes seems to be where the time goes.
PR to move this test to sequential: https://github.com/nodejs/node/pull/21393
Agree that it is a manifestation of parallelism and has little to do with the test as such. Running independently, it completes in ~9 seconds.
Do we know which tunables decide the number of tests that run in parallel? I am just wondering if there is a mismatch between those tunables and the resource limits on the failing AIX box: for example, if the number of parallel tests is derived from the number of CPUs, but the amount of usable memory, the user process count, etc. are otherwise constrained through ulimit settings, then the tests could appear to time out. /cc @mhdawson
On a related note: what is the rationale behind running benchmarks in parallel? What inference can one derive from them, and what purpose do they serve?
Do we know which tunables decide the number of tests that run in parallel?
If JOBS is set as an environment variable, that is the number of parallel jobs that get run. If that variable is not set, then the Makefile invokes test.py with -J, which causes the number of parallel jobs to be determined by Python's multiprocessing.cpu_count().
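A minimal sketch of that fallback logic (this is a hypothetical illustration, not the actual Makefile or test.py code):

```python
import multiprocessing
import os


def parallel_jobs():
    """Return the number of test jobs to run in parallel.

    Mirrors the behaviour described above: an explicitly set JOBS
    environment variable wins; otherwise fall back to the CPU count,
    which is what test.py's -J option effectively does via
    multiprocessing.cpu_count().
    """
    jobs = os.environ.get("JOBS")
    if jobs:
        return int(jobs)
    return multiprocessing.cpu_count()


print(parallel_jobs())
```

So on the AIX run above, JOBS=5 would take precedence over whatever cpu_count() reports for that machine.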
On the AIX failure linked above (https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix61-ppc64/15848/consoleFull), JOBS is explicitly set to 5. I don't know where that value comes from.
On a related note: what is the rationale behind running benchmarks in parallel?
The general practice is: if a test can run in parallel, put it in parallel, because that keeps the duration of test runs shorter. So, until a test causes problems, it stays in parallel. I'm not saying that's a good approach (although perhaps it is), but that's how we ended up with the current set of test/parallel/test-benchmark-* tests.
JOBS is explicitly set to 5
Thanks for the info. That is a minimal value and should not collide with the resource limits set on a user, so I guess we can do nothing about it other than move the test to sequential.
If a test can run in parallel, put it in parallel
Sure, that looks like a good strategy, but the issue is that such tests won't reveal any benchmark insights; they only serve as unit tests, or perhaps as a stress test?
such tests won't reveal any benchmark insights; they only serve as unit tests, or perhaps as a stress test?
Yeah, that's their purpose. The benchmark tests are not used to produce benchmark results. They are minimal tests to make sure that the benchmarks included in the source code aren't totally broken (which is definitely a thing that has happened). For example, they will catch a breaking change to an API that did not get propagated to a benchmark that uses the API.
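The idea can be sketched like this (a hypothetical Python analogue for illustration only; the real tests are JavaScript files under test/parallel/ that use helpers from test/common):

```python
import subprocess
import sys


def smoke_run(argv, timeout=60):
    """Run a benchmark command once and only check that it exits
    cleanly; the timing output is deliberately ignored.

    This is the kind of check the test-benchmark-* tests perform:
    they catch benchmarks broken by API changes, not performance
    regressions, so they run with minimal iteration counts.
    """
    result = subprocess.run(argv, capture_output=True, timeout=timeout)
    return result.returncode == 0


# Stand-in command for illustration; a real test would invoke
# something like `node benchmark/...` with tiny parameter values.
print(smoke_run([sys.executable, "-c", "print('benchmark ran')"]))
```

Because the check is only "did it exit zero", running these tests in parallel costs nothing in correctness, only in contention for the machine, which is exactly the timeout problem discussed above.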