Stryker: Better default `--concurrency` for CI pipelines

Created on 8 Oct 2020 · 7 Comments · Source: stryker-mutator/stryker

Is your feature request related to a problem? Please describe.
In v4, we've changed the default for --concurrency (previously called --maxConcurrentTestRunners) to n-1 (where n is the number of logical cores).

The reason for this is that we want to reserve 1 CPU for the main Stryker process to orchestrate the workers (both test runner and checker child processes). This is helpful for dev laptops with a lot of logical cores (for example 16), but on GitHub Actions, where the number of cores is 2, it is more or less pointless (the main process won't have a lot to do).

Describe the solution you'd like
Use n-1 unless n <= 4; in that case, use n.
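
A minimal sketch of that rule, for illustration only. The function name `defaultConcurrency` is hypothetical and not taken from the Stryker codebase; the lower bound of 1 is an added safety assumption.

```typescript
// Hypothetical helper: computes the proposed default for --concurrency.
// `logicalCores` is n, the number of logical cores on the machine.
function defaultConcurrency(logicalCores: number): number {
  // Assumption: never return less than 1, even for odd inputs.
  const n = Math.max(1, Math.floor(logicalCores));
  // On small machines (n <= 4) use every core; otherwise reserve
  // one core for the main process that orchestrates the workers.
  return n <= 4 ? n : n - 1;
}
```

In Node.js, n would typically come from `os.cpus().length`, so a 2-core GitHub Actions runner gets 2 workers and a 16-core laptop gets 15.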

Describe alternatives you've considered
Use n when process.env.CI === 'true', but I think using n on low core counts might be useful for all PCs, not just in CI.
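
For comparison, the CI-based alternative could look roughly like this. Again a sketch: `ciAwareConcurrency` is a hypothetical name, the environment is passed in as a plain object so the function stays easy to test, and the lower bound of 1 is an added assumption.

```typescript
// Hypothetical helper: use all cores in CI, n-1 everywhere else.
function ciAwareConcurrency(
  logicalCores: number,
  env: Record<string, string | undefined>
): number {
  const n = Math.max(1, Math.floor(logicalCores));
  // Many CI providers set CI=true; in that case use every core.
  if (env.CI === "true") {
    return n;
  }
  // Outside CI, reserve one core for the main process,
  // but never drop below a single worker (assumption).
  return Math.max(1, n - 1);
}
```

In Node.js this would be called with `process.env`. Note that a 2-core local VM without the `CI` variable would still get only 1 worker under this variant, which is the gap the core-count rule avoids.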

Additional context
@mthmulders experienced a slowdown after migrating from Stryker @3 to Stryker @4.

🚀 Feature request

All 7 comments

I feel like the first solution is better: use n-1 unless n <= 4, in that case use n. Normally there should be no problem with n-1, since modern CPUs are very likely to have 4+ cores.

Though in CI and VMs you can have 2-4 cores, so it makes sense.

Just some random thoughts:

What about scaling the other way? For example, if you're running a 64 core, 128 thread Threadripper. I can imagine the orchestration thread will get overwhelmed real fast trying to keep up with the 127 worker threads! These CPUs are definitely edge-cases right now. And I imagine it's not simple, or maybe even impossible, to scale up the number of orchestration threads. It might warrant a default max concurrency though.
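
To make the "default max concurrency" idea concrete, here is a rough sketch. Both `cappedConcurrency` and the `maxWorkers` parameter are hypothetical; no actual cap value is proposed in this thread, so it is left as a required argument.

```typescript
// Hypothetical helper: the n / n-1 rule with an upper bound on workers.
function cappedConcurrency(logicalCores: number, maxWorkers: number): number {
  const n = Math.max(1, Math.floor(logicalCores));
  // Same rule as proposed in the issue: all cores when n <= 4, else n-1.
  const uncapped = n <= 4 ? n : n - 1;
  // Clamp so a single orchestration process is not asked to
  // drive more workers than the chosen cap (at least 1).
  return Math.min(uncapped, Math.max(1, Math.floor(maxWorkers)));
}
```

What the cap should be would have to come from measurements on high-core-count hardware.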

Back to the point of low power situations. CPU manufacturers are starting to use big.LITTLE architectures in laptops. big.LITTLE is basically a combination of low-power cores and performance cores. It'll use the low-power whenever possible. Phones have used this for years. With these chips, you might have the orchestration thread being stuck on a low-power core. It'll be interesting to see if the orchestration thread will get overwhelmed in this situation. Thinking about it, this might be really bad on high-power big.LITTLE CPUs. The 2021 Intel chips are rumoured to have an 8+8 core model. A single low-power core can probably not keep up with 7 low-power and 8 performance cores. Again, edge-case for now.

Back to CI, wow I'm off-topic today. Do we actually know at what point we can skip the dedicated orchestration thread, or is 4 just a guesstimate? On GitHub, you get 2 cores, but at work, I get 4 in Azure Pipelines. If the sweet spot is 3, we're leaving performance on the table.

Unfortunately, I think we cannot handle all edge cases. We are probably not going to support quantum or mega computers ;)
We should focus on the setups that are commonly used, so PCs and the machines used in CI. And I think the debate over whether we should use 3 or 4 cores in Azure Pipelines is a bit hard, since 4 cores there and 4 cores on a PC are different. We are working with logical cores, so which should perform better: 2 physical cores with 2 threads each, or 4 real cores? :/
We have options, so users can change them. Defaults are defaults: they should just work, and they should be a compromise, just like you normally don't run games always on ultra or always on low settings (unless they check your computer, but let's skip this topic).
IMO the currently proposed setting is close to optimal. Unless we have to support more complex setups, but then we will need better maths / more conditions.

In order to make predictions about a high number of cores, we would need the hardware to test it on, with the performance tests in place to make that easier.

For now, I want to keep it simple. I think choosing n (instead of n-1) for n <= 4 is safe to do, since that was the previous default in Stryker 3 and I didn't have a problem with that in CI/CD, only on my dev laptop.

> unless they check your computer, but let's skip this topic

Actually, that is what we're doing by checking the number of logical cores. It's also a part of why all these edge cases pop up.

> For now, I want to keep it simple.

Agreed. Let's keep it simple for now and keep an eye out for potential issues like the ones above.

@Lakitna it's just that it would be much easier if we had machines like these at home ;D (i want supercomputer :c ;D) + if someone has a problem with this, they can always open a new issue, right? :)

> i want supercomputer :c ;D

While you're at it, get one for me too! :D I don't really need it, but it would be fun!

I fear that the high core count issues are hard to spot as a user of Stryker. A user doesn't know that there's a single orchestration thread. All they would see is disappointing scaling, something that can easily be attributed to the law of diminishing returns.

In any case, it's not something we can test with our hardware right now. So we can't find out if it's an issue in the first place.
