Version Used:
Steps to Reproduce:
Interestingly, tests never hang on macOS, but they consistently hang on ubuntu and windows.
The code basically uses the following code to run source generators:
ISourceGenerator generator = new AvatarSourceGenerator();
var driver = CSharpGeneratorDriver.Create(generator);
// Make sure we don't hang forever during compilation.
var cts = new CancellationTokenSource(10000);
driver.RunGeneratorsAndUpdateCompilation(compilation, out var output, out diagnostics, cts.Token);
Interestingly, the cancellation token with 10s auto-cancel doesn't kick on.
I found the workaround through an asp.net core issue. The key issue is that locally, things run just fine. I never get the hangs on windows while I always get it on CI. There must be something different in the local run vs CI run (even though I use the exact same command line argument,
Additionally, hangs started happening after I moved some tests to use a theory instead. You can see that after that commit all subsequent builds failed: https://github.com/kzu/avatar/commits/feature/hang
Expected Behavior:
No hangs
Actual Behavior:
Hangs, consistently.
Suboptimal Workaround:
Add this assembly attribute to the test project:
[assembly: Xunit.CollectionBehavior(MaxParallelThreads = -1)]
Note that the value -1 for max paralell threads isn't even a documented valid value in xunit (that I could find), but setting it to 1 (supposedly the right way to force single-threaded execution) instead brings back the hangs 馃槙.
https://xunit.net/docs/configuration-files#maxParallelThreads:
Set this to override the maximum number of threads to be used when parallelizing tests within this assembly. Use a value of
0to indicate that you would like the default behavior; use a value of-1to indicate that you do not wish to limit the number of threads used for parallelization._JSON schema type: integer_
_Default value: the number of logical processors in your PC_
By setting it to -1, you are saying "run as many Tasks as you want without limit".
The fact that 1 exacerbates your problem tells me you almost certainly have an async deadlock happening because of the use of unsafe Task calls like .Result, .GetAwaiter().GetResult(), Task.WaitAny, Task.WaitAll, etc.
Yep, the fact that the source generator driver generates code and updates the compilation, apparently in a synchronous fashion looked fishy from the beginning. As a consumer of everything internal it hides behind that class, there isn't much I can do :(. Hopefully this can serve as a repro for the team to look into it more deeply.
Do you have any gut feeling as to why this might happen consistently on CI but consistently not happen on local runs? That was quite puzzling...
Do you have any gut feeling as to why this might happen consistently on CI but consistently not happen on local runs?
Fewer and/or slower cores in CI?
Makes sense... probably the hosted pool is just a bunch of VMs/docker containers sharing scarce resources with other concurrent builds :(
@bradwilson analysis matches my intuition here. I've hit this behavior several time inside the Roslyn test base when effectively spawned a number of threads greater than the max allowed threads and ended up calling Wait, Result.
@chsienki do u think an async version of this API is appropriate here? Most of the compilation driving APIs are synchronous but this is more in a hosting layer.
I haven't had a chance to dive into this fully yet, but I would point out that we have some known issues with the cancellation tokens, and we don't fully utilize it during generation: https://github.com/dotnet/roslyn/issues/46528.
@jaredpar My immediate gut is that it shouldn't be async. We're not doing any disk bound I/O; it should all be compute bound, so it wouldn't really gain us anything. Interestingly we're not parallelized yet inside the driver either, so I'm not sure how anything we're doing could be causing a deadlock.
Interestingly we're not parallelized yet inside the driver either, so I'm not sure how anything we're doing could be causing a deadlock
xUnit manages the amount of threads available for the unit tests. Consider a situation where it's constrained to a single thread. If the driver, or any code it calls, spawns a thread and then does .Wait then we will deadlock in xUnit.
I'm fairly aware of this problem cause ... I'm a bad dev, and I've done this a lot in xUnit :smile:
If the generator driver isn't spawning threads then I don't think we should be doing async here. The repo though strongly suggests that something in this chain is spawning a thread and doing some form of synchronous wait.
One possible cause for the use of Parallel.For in the compiler. That is effectively doing a sync for / join operation with threads. If so and the unit test wants to consume us in an environment hard limited to one thread then parallelization should be disabled in the compilation options. Enabling parallelization in the compiler and keeping us to one thread is going to have issues.
To clarify a bit: xUnit.net does not limit the number of threads available in the system; our limitation is around the size of pool of threads available to run Tasks for tests. A test is free to spawn as many threads as it likes (and/or spawn new tasks outside the current sync context) and xUnit shouldn't care.
@bradwilson yeah my wording was a bit stronger than the actual model. Thanks for the clarification.
@bradwilson that thread pool size limitation includes the thread running the test correct?
Closing as this does not appear to be an issue with our code but more the environment that it is being hosted in.
Most helpful comment
https://xunit.net/docs/configuration-files#maxParallelThreads:
By setting it to
-1, you are saying "run as many Tasks as you want without limit".The fact that
1exacerbates your problem tells me you almost certainly have an async deadlock happening because of the use of unsafeTaskcalls like.Result,.GetAwaiter().GetResult(),Task.WaitAny,Task.WaitAll, etc.