While working with parallel test execution in JUnit 5.3.1, I came across a mystery that I have not yet solved wherein I can trick JUnit into running more concurrent tests than it is configured to run as a side-effect of sleeping a thread inside of a CompletableFuture i.e.
CompletableFuture.runAsync(
() -> {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
})
.join();
When this code is present in a test, JUnit will run an additional concurrent test. For example, if junit.jupiter.execution.parallel.config.fixed.parallelism=4, one expects at most 4 tests to run concurrently; however if this CompletableFuture code snippet is present in all running tests, I observe at most 8 concurrent tests (assuming the total number of tests is greater than 8).
Maven project gilday/junit-parallelization-issue contains code to reproduce this issue. It includes a counter class HighWaterMark which observes at most 8 concurrent tests running when junit.jupiter.execution.parallel.config.fixed.parallelism=4.
I dug into this a bit. JUnit's ForkJoinPoolHierarchicalTestExecutorService uses ForkJoinTask.getSurplusQueuedTaskCount() to hold the "current level of parallelism" count when deciding to start another concurrent test. The value returned by ForkJoinTask.getSurplusQueuedTaskCount() is affected by other ForkJoinTask instances run during the test (such as CompletableFuture). I can imagine a couple of solutions for this issue, but before sending a PR I first want to ask if the current behavior is intentional.
I am trying to control the number of _concurrent_ tests executing using JUnit's _parallelism_ setting, but as we know concurrency != parallelism. I think one could argue that JUnit's parallelism support is working as expected, and if users need to control concurrent tests they should do so themselves (for example using a semaphore). Thoughts?
Thanks for the analysis! So this only happens for repeated, dynamic or parameterized tests? "Static" tests are executed using the invokeAll() method of ForkJoinPoolHierarchicalTestExecutorService.
Whatever we decide, I think it's fair to say that, as you mentioned, _parallelism_ is not about controlling the guaranteed maximum _concurrency_. So, if you need guarantees, I would agree that you should use an explicit synchronization mechanism such as a semaphore.
I cloned your sample project and tried it out. The reason the ForkJoinPool spawns additional workers is not the call to Thread.sleep() but the one to CompletableFuture.join() which calls ForkJoinPool.managedBlock(). Since the thread is about to be blocked, it spawns an additional "spare thread" to "ensure sufficient parallelism while the current thread is blocked" (from the method's Javadoc).
Good catch re ForkJoinPool.managedBlock().
This behavior is repeatable with both static tests and repeated / parameterized tests; I just used the @Repeated support in the example for brevity.
If I configure the ForkJoinPool that JUnit creates in its ForkJoinPoolHierarchicalTestExecutorService to use asyncMode = true, then the ForkJoinPool executes CompletableFutures that tests create and join before executing more tests, though I don't 100% understand why so I'm not sure if perhaps this is an abuse of asyncMode. Assuming there are no obvious downsides to asyncMode = true, it feels intuitive to make a best effort (avoid guarantees) to execute any asynchronous tasks started by a single test before executing the next test, so I propose we make this change.
After re-reading the documentation on the async-mode it doesn't sound like it's a good fit for our general use case. We do use fork _and_ join to ensure all children are executed before finishing a container etc. But I will do a few experiments and report back here.
@marcphilipp I feel like we were building towards some consensus that JUnit should not make attempts to control concurrency and that users should implement their own means for doing so. Do you agree? Do you think it's worth documenting this explicitly, or shall we close this out without any changes?
I think we should document that parallel execution uses a ForkJoinPool and its worker threads so there might be unexpected side effects, such as the one in this issue, when tests use threads as well. Moreover, as you mentioned, we should indeed state that the parallelism level is a hint, not a strict maximum number of concurrent threads and that tests need to use their own means of controlling concurrency should they so desire.
I've rephrased the issue title accordingly.
I added a note in ab1707479c.
@gilday Is that clear enough?
Just seeing this now, sorry. Yes, that's definitely clear, thanks for adding the note
Most helpful comment
Just seeing this now, sorry. Yes, that's definitely clear, thanks for adding the note