Cucumber-js: Add custom work partitioning for parallel runtime.

Created on 2 Mar 2018 · 7 Comments · Source: cucumber/cucumber-js

I've looked at the experimental parallel testing feature, and I think it may need a mechanism for controlling how different slaves are given work.

Looking at cucumber-js/src/runtime/parallel/master.js:110, I can see that slaves are given work from a queue. That sounds great in the general case, but I couldn't make use of it as-is. If my case is similar to others, I'd like to suggest that it should be possible to change the work allocation strategy.

I'm running e2e tests using Selenium, and different tests access several different databases, so my features look like this:

Scenario: I can log in to the db1 database
 When I log in to the "db1" database
 Then ...

Scenario: I can create a user in the db2 database
 When I log in to the "db2" database
 Then ...

Those databases are shared resources: they are restored to a server, used, and dropped during the tests. If two slaves try to access the same database at the same time, the tests will fail.

On my end, I can partition feature files out to slaves appropriately:

| feature         | database | slave |
| test1.feature   | dbA      |     1 |
| test2.feature   | dbB      |     1 |
| test3.feature   | dbC      |     2 |
| test4.feature   | dbC      |     2 |
| test5.feature   | dbC      |     2 |

But I don't see a mechanism where the parallel test runner can ask me where I'd like to run the scenario.

Would it be possible to have an extension point, a 'custom work partitioner', which would ask something like:

getSlaveAffinity(featureFilePath, numberOfSlaves);

which would return the slave to put the feature file on?
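A minimal sketch of what such a partitioner might look like, not an existing cucumber-js API. The `databaseByFeature` lookup and the `hashString` helper are hypothetical, and the function returns a zero-based slave index. Hashing the database name guarantees that features sharing a database land on the same slave, although it will not necessarily reproduce the exact assignment in the table above and may put several databases on one slave.

```javascript
// Hypothetical lookup: which database each feature file uses.
// In practice this might be derived by parsing the feature files.
const databaseByFeature = {
  'test1.feature': 'dbA',
  'test2.feature': 'dbB',
  'test3.feature': 'dbC',
  'test4.feature': 'dbC',
  'test5.feature': 'dbC',
};

// Small deterministic string hash (31-based rolling hash, unsigned 32 bits).
function hashString(text) {
  let hash = 0;
  for (const char of text) {
    hash = (hash * 31 + char.charCodeAt(0)) >>> 0;
  }
  return hash;
}

// Returns a zero-based slave index. Features that share a database
// always map to the same slave, so they never run concurrently.
function getSlaveAffinity(featureFilePath, numberOfSlaves) {
  const fileName = featureFilePath.split('/').pop();
  const database = databaseByFeature[fileName];
  // Features with no known database are keyed by their own path.
  const key = database === undefined ? featureFilePath : database;
  return hashString(key) % numberOfSlaves;
}
```

With this sketch, `getSlaveAffinity('features/test3.feature', 2)` and `getSlaveAffinity('features/test4.feature', 2)` return the same index because both features use dbC.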

enhancement

stevecooperorg

👍2

Most helpful comment

What is the current status with this feature? I'm happy to contribute time to implementing a POC if you can point me in the right direction. I like the idea of using tags on scenarios to define which resources they require a mutex on to execute.

ncjones on 10 May 2019

👍3

All 7 comments

Hmm, so currently it is using a queue of pickles (a scenario or one example from a scenario outline), not features. And it appears the main goal is that no two conflicting scenarios run at the same time. Which slave they run on shouldn't matter.

Thoughts on an API like

const {setParallelCanAssignFn} = require('cucumber')

// testCase = {uri, pickle}
// runningTestCases = [{uri, pickle}, ... ]
setParallelCanAssignFn((newTestCase, runningTestCases) => {
  // return true if newTestCase can run alongside the currently running test cases, false otherwise
})

Then, when a slave becomes free, the runtime looks down the list of pickles and assigns that slave the first one for which this function returns true. The default function always returns true.
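The selection step described above could be sketched roughly as follows. This is an illustration only, not the actual cucumber-js runtime code: `takeNextTestCase`, `alwaysAssign`, and the "same uri" conflict rule are hypothetical names and choices made for the example.

```javascript
// Default predicate: every test case may run alongside any others.
const alwaysAssign = () => true;

// Removes and returns the first queued test case that canAssign accepts,
// or null if nothing in the queue can run right now.
function takeNextTestCase(queue, runningTestCases, canAssign = alwaysAssign) {
  const index = queue.findIndex((testCase) => canAssign(testCase, runningTestCases));
  if (index === -1) {
    return null;
  }
  return queue.splice(index, 1)[0];
}

// Example: test cases are {uri, pickle}; here a conflict means "same uri".
const queue = [
  { uri: 'a.feature', pickle: { name: 'A1' } },
  { uri: 'b.feature', pickle: { name: 'B1' } },
];
const running = [{ uri: 'a.feature', pickle: { name: 'A0' } }];
const differentUri = (newTestCase, runningTestCases) =>
  runningTestCases.every((other) => other.uri !== newTestCase.uri);

const next = takeNextTestCase(queue, running, differentUri);
console.log(next.pickle.name); // B1
```

A1 is skipped because another test case from a.feature is still running, so it stays in the queue until a later call accepts it.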

charlierudolph on 29 Mar 2018

👍1

I like it! I have a lot of little thoughts...

It definitely seems like a more powerful option, yes. It has the effect of balancing out the work between the workers in the pool nicely. In my earlier suggestion, I had to estimate the time cost of running a test, and tried to balance things out; not as efficient or easy as assigning to idle workers!

I also prefer the use of pickles - my partitioner worked by globbing to find feature files, loading them via the Gherkin parser and examining the ASTs for relevant steps - so almost all my code is recreating some of the internals of cucumber. Since you have them to hand and can pass them as parameters, that makes my life much easier! :) I notice this API is close to, but not the same as, the one for formatters; in a formatter I can write

    options.eventBroadcaster.on('test-case-started', ({ sourceLocation }) => {
        const { gherkinDocument, pickle } = options.eventDataCollector.getTestCaseData(sourceLocation);
        // ... use gherkinDocument and pickle here
    });

Could the test case passed to the function get the same parameters or test case structure ({ gherkinDocument, pickle }) as formatter events? I'm thinking about a simple case where you need to partition based on tags (say, @uses(DB1)), which I'd find on the gherkinDocument. Also, a similar API may mean less documentation and more code re-use?
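As a rough sketch of the tag-based case, a predicate for the proposed setParallelCanAssignFn might look like the following. It assumes the tags are readable from `pickle.tags` as an array of `{name}` objects; `resourcesOf` and `canAssignByTags` are hypothetical helper names, and a scenario may carry several @uses(...) tags.

```javascript
// Extracts resource names from tags shaped like @uses(DB1).
// Assumes pickle.tags is an array of {name} objects.
function resourcesOf(testCase) {
  const resources = new Set();
  for (const tag of testCase.pickle.tags || []) {
    const match = /^@uses\((.+)\)$/.exec(tag.name);
    if (match) {
      resources.add(match[1]);
    }
  }
  return resources;
}

// True when newTestCase shares no resource with any running test case.
function canAssignByTags(newTestCase, runningTestCases) {
  const wanted = resourcesOf(newTestCase);
  return runningTestCases.every((running) => {
    for (const resource of resourcesOf(running)) {
      if (wanted.has(resource)) {
        return false;
      }
    }
    return true;
  });
}
```

Test cases without any @uses(...) tag claim no resources, so under this sketch they can always run alongside anything else.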

Just to clarify: setParallelCanAssignFn would be designed to run on the master ('who should get the work?') rather than on the slaves ('can I accept this work?')? An implementer would not need to worry about multi-threading, right, since the master would have sole, single-threaded responsibility for assigning work?

I should also mention sequence of execution, I think. This mechanism lets you reject a test then accept it later. So a feature file written as

Feature: X
    Scenario: A
    Scenario: B

could actually be executed in the order [B, A]. That's likely to break someone's test suite somewhere in the world. Not that it should - the order of execution should not matter - but this seems like the first time cucumber would allow scenarios to be executed out of feature-file order.
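To make the reordering concrete, here is a toy simulation under assumed conditions: scenario A conflicts with a scenario Z from some other feature that is already running, while B conflicts with nothing. The `conflicts` table and the string-based test cases are purely illustrative.

```javascript
// Both scenarios come from Feature X, queued in file order.
const queue = ['A', 'B'];
// Hypothetical rule: A conflicts with an already running scenario Z; B does not.
const conflicts = { A: ['Z'], B: [] };
const canAssign = (name, running) =>
  conflicts[name].every((other) => !running.includes(other));

const running = ['Z'];
const startOrder = [];

// A slave frees up while Z is still running: A is rejected, B starts.
let index = queue.findIndex((name) => canAssign(name, running));
let started = queue.splice(index, 1)[0];
running.push(started);
startOrder.push(started);

// Z finishes; the next free slave can now take A.
running.splice(running.indexOf('Z'), 1);
index = queue.findIndex((name) => canAssign(name, running));
started = queue.splice(index, 1)[0];
running.push(started);
startOrder.push(started);

console.log(startOrder); // [ 'B', 'A' ]
```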

Anyway, thanks for thinking about it!

stevecooperorg on 30 Mar 2018

Hello,

I am in the same database-reset situation as @stevecooperorg and would also love the ability to parallelize tests by tag.

At least I could have a "no database" group and a "using database" one.

adrien-carre on 18 Jul 2020

Yup! I'm also really desiring this functionality. I'm going to whip up a POC (unless something is incoming). It'll be my first public contribution. Let's see how that goes. Guess I'll quickly browse the guidelines first.

eman2673 on 24 Feb 2021

@eman2673 Just to let you know: I resolved the parallel issue by starting one instance per tag through my host's CLI (Scaleway, for me).

It's absolutely not cucumber-js native, but it's very efficient and allows us to run a full test on each merge to master.
And with this method, I could also do one instance per test file if I liked.

adrien-carre on 24 Feb 2021

@adrien-carre Thanks for the heads up. I'm looking for a little more control. I want to throw all the workers at everything and have them magically work only on test cases that don't conflict resource-wise. I also want to be able to tag scenarios as utilizing multiple resources (typically 1 or 2 tables). I'm trying to implement the approach @charlierudolph mentioned above.

eman2673 on 25 Feb 2021
