Docs: "Exceptions in Parallel Loops" is misleading

Created on 7 Feb 2020 · 11 Comments · Source: dotnet/docs

Issue description

How to: Handle Exceptions in Parallel Loops has a misleading description of behavior when exceptions are thrown.

The Parallel.For and Parallel.ForEach overloads ... resemble regular for and foreach loops ... an unhandled exception causes the loop to terminate immediately.

This is confusing. In this context, it seems like loop refers to the loop inside the Parallel.ForEach(...) command which is iterating through each object in 'source' and using it to execute 'body'. Following this assumption, the following code should throw an exception:

var bools = new List<bool> { true, false };

Parallel.ForEach(
    source: bools,
    body: (b) =>
    {
        while (true) { if (b) throw new Exception(); }
    });

However, it does not - it runs infinitely, even after the first thread throws an exception. This is misleading and should be corrected.

I'm assuming what the author meant was that an exception thrown during any execution of the 'body' action will act like a normal exception. While this is true inside the body of the action, it is not true of how the exception is handled outside the action by the Parallel.ForEach(...) code.

[Edited by gewarren to add links in Document Details]


Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

Area - .NET Guide Pri2 doc-bug dotnet-standard/tech dotnet/prod

All 11 comments

However, it does not - it runs infinitely, even after the first thread throws an exception.

As far as I understand, as soon as an exception is thrown in any call to the body, the ForEach loop waits for all current individual runs of body to finish and throws after that.

In the example it would wait forever for

while (true) if (false) ...

to finish.

The following throws:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class FlagsEnumExample
{
    public static void Main()
    {
        var bools = new List<bool> { true, false };

        Parallel.ForEach(
            source: bools,
            new ParallelOptions { MaxDegreeOfParallelism = 1},
            body: (b) =>
            {
                while (true) if (b) throw new Exception();
            });
    }
}

because there is no chance to start processing the false item before true is processed.

@pkulikov I think this could be described better as the issue author has indicated. Perhaps change

an unhandled exception causes the loop to terminate immediately.

to

an unhandled exception causes the loop to terminate once all iterations currently running finish.

I'm not exactly sure what to change the text to, but the original

The Parallel.For and Parallel.ForEach overloads do not have any special mechanism to handle exceptions that might be thrown. In this respect, they resemble regular for and foreach loops (For and For Each in Visual Basic); an unhandled exception causes the loop to terminate immediately.

paragraph is demonstrably incorrect. There is a mechanism to handle exceptions being thrown - at the very least to aggregate them and re-throw when all parallel tasks are completed. And an unhandled exception does not cause the loop to terminate immediately.

It follows that I believe @Thraka's suggested correction and PR #17035 to be insufficient. I would recommend further clarifying the matter, with a change similar to below:

The Parallel.For and Parallel.ForEach overloads handle exceptions differently to typical loops. If an exception is thrown by any iteration of the body of the loop during execution, both overloads will wait for _all current parallel executions of the body to complete_. Upon completion, both overloads will throw an AggregateException that will contain all exceptions thrown by completed executions of the body.

And, as an additional note:

It should be noted that if an iteration of the body of the loop does not complete (say, for example, it contains its own inner loop that does not terminate), the parallel overload will never throw an AggregateException. In such a case, any exceptions that were thrown in other iterations of the body will not be reported.

Thanks for clarifying @jeremy-collette I'll reopen this.

Thanks @BillWagner. Do you have any comments on my suggested modification?

Cheers

@jeremy-collette

The first edit is fine. I'd change "will wait" to "waits". Same with "will contain" to "contains".

In the second one, is that specifically for infinite loops? "does not terminate for some reason" leads to that question. Or could it be waiting for an asynchronous process that never completes?

@BillWagner

In the second one, is that specifically for infinite loops? "does not terminate for some reason" leads to that question. Or could it be waiting for an asynchronous process that never completes?

It is not specifically for infinite loops - any Task that does not complete will cause this behavior. I was just trying to give an example. Any suggestions for modification there?

Cheers

Thanks for clarifying @jeremy-collette Here's my suggestion:

If an iteration of the body of the loop does not complete (for example, a Task that doesn't complete, an infinite loop, or some other condition), the parallel overload never throws an AggregateException. In such a case, any exceptions that were thrown in other iterations of the body aren't reported.

@BillWagner hmm I'm just worried that sentence might be a bit long. How about something like this?

Note that if an iteration of the body of the loop does not complete, the parallel overload never throws an AggregateException and will continue to run indefinitely. An example of this is an iteration that is awaiting a Task that doesn't complete, an infinite loop, or a deadlock. In such a case, any exceptions that were thrown in other iterations of the body will not be reported.
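The behaviour described in that note can be observed without hanging the process. The following is only a sketch under stated assumptions: the class name BlockedIterationSketch, the method Run, and the timeouts are hypothetical, and a ManualResetEventSlim stands in for "an iteration that does not complete" so that the blocked iteration can be released at the end. Whether the two iterations really overlap depends on the scheduler, so StillRunning is likely, but not guaranteed, to be true.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BlockedIterationSketch
{
    // Returns whether the loop was still running shortly after one iteration threw,
    // and how many exceptions the loop surfaced once it was allowed to finish.
    public static (bool StillRunning, int FailureCount) Run()
    {
        using var blockerStarted = new ManualResetEventSlim(false);
        using var release = new ManualResetEventSlim(false);
        using var thrown = new ManualResetEventSlim(false);

        Task loop = Task.Run(() =>
            Parallel.ForEach(new[] { true, false }, isThrower =>
            {
                if (isThrower)
                {
                    // Give the other iteration a chance to start. The wait is bounded
                    // so the sketch cannot deadlock if the items run one after another.
                    blockerStarted.Wait(TimeSpan.FromSeconds(2));
                    thrown.Set();
                    throw new InvalidOperationException("iteration failed");
                }

                blockerStarted.Set();
                release.Wait(); // stands in for an iteration that has not completed
            }));

        thrown.Wait();

        // WaitAny returns -1 on timeout and does not rethrow the loop's exception.
        bool stillRunning = Task.WaitAny(new[] { loop }, TimeSpan.FromMilliseconds(500)) == -1;

        // Let the blocked iteration finish; only now can the loop report the failure.
        release.Set();

        try
        {
            loop.Wait();
        }
        catch (AggregateException ae)
        {
            // Task.Run wraps the AggregateException thrown by Parallel.ForEach,
            // so flatten before counting.
            return (stillRunning, ae.Flatten().InnerExceptions.Count);
        }

        return (stillRunning, 0);
    }

    public static void Main()
    {
        var (stillRunning, failures) = Run();
        Console.WriteLine($"Still running after the throw: {stillRunning}");
        Console.WriteLine($"Exceptions reported after release: {failures}");
    }
}
```

If release.Set() were never called and the second iteration had started, Parallel.ForEach would not return and the InvalidOperationException would never be reported, which is the situation the note warns about.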

Reading further down the article, it appears there are more issues:

When you add your own exception-handling logic to parallel loops, handle the case in which similar exceptions might be thrown on multiple threads concurrently, and the case in which an exception thrown on one thread causes another exception to be thrown on another thread.

It has already been demonstrated that throwing an exception in an iteration of the body will not affect other iterations of the body. The overload waits for all current iterations to complete and then throws an AggregateException. Exceptions thrown in one iteration of the body will not cause an exception in another thread.

You can handle both cases by wrapping all exceptions from the loop in a System.AggregateException. The following example shows one possible approach.

The Parallel overloads already do this. Doing so manually is superfluous.

Looking further down at the example, it is overcomplicated and unnecessary. I think we could demonstrate everything we need with a couple of short examples. Firstly, an example of an AggregateException being thrown:

Parallel.For(
    fromInclusive: 0,
    toExclusive: 10,
    parallelOptions: new ParallelOptions { MaxDegreeOfParallelism = 5 },
    body: (i) =>
    {
        if (i % 2 == 0)
        {
            throw new ArgumentException(message: $"{i} is an even number!");
        }
    });

This will throw an AggregateException with _up to_ 5 ArgumentExceptions.
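For readers who want to see what the caller receives, here is a minimal sketch that wraps the first example in a try/catch and inspects AggregateException.InnerExceptions. The class name AggregateExceptionSketch and the method CountFailures are hypothetical names used for illustration only. The exact number of inner exceptions varies from run to run, because the loop stops starting new iterations once one has failed.

```csharp
using System;
using System.Threading.Tasks;

public static class AggregateExceptionSketch
{
    // Runs the loop from the example above and returns how many
    // exceptions were collected into the AggregateException.
    public static int CountFailures()
    {
        try
        {
            Parallel.For(0, 10, new ParallelOptions { MaxDegreeOfParallelism = 5 }, i =>
            {
                if (i % 2 == 0)
                {
                    throw new ArgumentException($"{i} is an even number!");
                }
            });
            return 0;
        }
        catch (AggregateException ae)
        {
            // Each failed iteration contributes one inner exception.
            foreach (Exception inner in ae.InnerExceptions)
            {
                Console.WriteLine($"{inner.GetType().Name}: {inner.Message}");
            }
            return ae.InnerExceptions.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine($"Caught {CountFailures()} exception(s).");
    }
}
```

The caller only ever sees the AggregateException; the individual ArgumentExceptions are reachable through InnerExceptions.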

And then an example that demonstrates a call that never throws an exception:

Parallel.ForEach(
    source: new List<int> { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 },
    body: (i) => 
    {
        while (true) 
        {
            if (i % 2 == 1)
            {
                throw new ArgumentException(message: $"{i} is an odd number!");
            }
        }
    });

The second example could have the following note appended:

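Note that in the second example, if we called Parallel.ForEach with the optional parallelOptions argument and set the MaxDegreeOfParallelism property to 1, the outcome would depend on which item is processed first. If an odd item ran first, the ForEach overload would throw an AggregateException as soon as that iteration of the body threw an Exception (as no other iterations of the body would be running in parallel). With the list as written, the first item, 0, is even, so in practice its iteration would never complete and nothing would be thrown.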

It seems very confusing that this whole Data Parallelism chapter keeps referring to Parallel.For and Parallel.ForEach as "Loops". This breaks the abstraction of executing tasks in parallel.

Then this article's title uses "Parallel Loops" to mean, not "loops running in parallel", but rather "Parallel.For/ForEach executions", which adds to the confusion.

My opinion is that parallel execution should be _contrasted_ with "loops", not _compared_ with them.
