Roslyn-analyzers: New rule: Detection of PLINQ nops

Created on 4 Sep 2019  Â·  13Comments  Â·  Source: dotnet/roslyn-analyzers

A split off issue from corefx/issues/40739 to track the suggested rule

PLINQ nops. Using .AsParallel() at the end of a LINQ query, e.g. foreach (var item in src.Select(…).Where(…).AsParallel(…)), is a nop and should either be removed or the AsParallel moved earlier in the query. I’ve even seen developers write foreach (var item in src.AsParallel()) thinking it parallelizes the foreach loop, which it doesn’t… it’d be good to warn about such misuse.


The detection for this proposal should be simple - just find any places where AsParallel is being enumerated in a foreach while being the last LINQ element.

Questions:

  1. Can we generalize the rule to detect all situations where the type being foreached is ParallelQuery or its descendant?
  2. Automated fix might not be as simple here, the variable being iterated over might come from another method, tracing this to a place where the AsParallel actually originates might be expensive. Can we go best effort and only offer a quick fix if the AsParallel is used directly in the foreach?
  3. As this most likely won't cause an issue in the code (being a no-op), default severity should be warning, yes?
Approved-Rule Feature Request help wanted

All 13 comments

@stephentoub @@danmosemsft @sharwell @terrajobst Are we planning/encouraging creating separate issues in this repo for each proposed analyzer in https://github.com/dotnet/corefx/issues/40739 right now or waiting for some more feedback on the original codefx issue and scheduling time/resources to work on it first? Just trying to understand the logistics here...

@mareklinka Can we wait on confirmation from the corefx folks about the logicstics before opening more issues in this repo?

@mavasani Of course, wasn't planing on opening up any more of these anyway. I'll wait for the result of the logistics discussion.

@mareklinka Thanks. By the way, I have no pushback on splitting off these into separate issues in this repo - it seems reasonable to avoid cluttering the primary corefx issue with discussion/brainstorming about each individual analyzer and having separate issues. Just want to double check how/when/where the corefx team would prefer the splitting off to happen.

Can we generalize the rule to detect all situations where the type being foreached is ParallelQuery or its descendant?

No, that would effectively ban all use of PLINQ, e.g.
```C#
foreach (var item in src.AsParallel().Where(...).Select(...)) { ... }

is perfectly valid.

> Can we go best effort and only offer a quick fix if the AsParallel is used directly in the foreach?

Yes, what I had in mind was to flag cases where .AsParallel() is literally used in the `foreach` at the end of the source, e.g. it would flag:
```C#
foreach (var item in src.AsParallel())

but not:
C# var srcp = src.AsParallel(); foreach (var item in srcp)
and the fix offered would be to remove the .AsParallel(); in addition, if it was a case like foreach (var item in src.Where(...).Select(...).AsParallel()), a fix could be to move the .AsParallel() to the beginning of the sequence rather than the end, e.g. foreach (var item in src.AsParallel().Where(...).Select(...).

It's possible more could be done than this, but this should be simple and safe and always correct, and would catch complete misuse like https://dotnettips.wordpress.com/2019/07/08/performance-tip-for-vs-foreach-in-microsoft-net/.

Yeah, handling the simple case should be a good starting point.

I'm not sure if the Roslyn API allows this (haven't tried yet), but we could potentially offer both the Move AsParallel to the start and Remove AsParallel code fix. If it's just foreach (var x in col.AsParallel()), we'd just offer the remove fix.

This might serve as a hint to the user as well. If we have a chain of LINQ calls, moving the AsParallel call to the beginning will effectively change the semantics/behavior of the code and maybe the user does not want that. In that case just removing the AsParallel call would be a more suitable solution as it keeps the original, no-op behavior.

As the official docs state:
However, parallelism can introduce its own complexities, and not all query operations run faster in PLINQ. In fact, parallelization actually slows down certain queries.

@mavasani

@stephentoub @@danmosemsft @sharwell @terrajobst Are we planning/encouraging creating separate issues in this repo for each proposed analyzer in dotnet/corefx#40739 right now or waiting for some more feedback on the original codefx issue and scheduling time/resources to work on it first

I'd start with a list first as that's easier to manage than separate issues. Once we flesh them out, separate issues make more sense IMHO.

I have an analyzer and a code fix provider implemented for this scenario. The analyzer checks for foreach statements where the last call is either AsParallel() or AsParallel() followed by either ToList or ToArray. The fix provider simply removes both of those statements. If this is available I would like to work on this

@jeffhandley Can you please transfer this issue to dotnet/runtime repo for triage?

They already have this rule there in this issue

@Mrnikbobjeff Thank you pointing that out. @jeffhandley @terrajobst are we open to contributions for .NET API analyzers targeted in future milestone?

Ping @jeffhandley @terrajobst and @mavasani

Adding @carlossanlop and @buyaa-n for their visibility on this as well. Yes, we are open to new analyzers for the future milestone and we're starting to take a look at the proposals that have come in.

Was this page helpful?
0 / 5 - 0 ratings