Roslyn: Fix CA1827 (DoNotUseCountWhenAnyCanBeUsed) violations in Roslyn.sln

Created on 13 Aug 2019  路  17Comments  路  Source: dotnet/roslyn

Attempting to migrate Roslyn.sln to latest version (2.9.4) of FxCop analyzer packages leads to a bunch of CA1827 violations where the code is using Count() Linq invocation when Any() could be used. This CA rule was added in 2.9.4 analyzer package. A ruleset suppression would be added for this rule and this issue tracks fixing these violations and removing the ruleset entry.

Area-Infrastructure Blocked

All 17 comments

You'll want to be cautious with this one. In both .NET Framework and .NET Core, Enumerable.Count special-cases ICollection<T> and ICollection, such that using Count on an instance that implements those will avoid allocating, but Enumerable.Any doesn't check for such interfaces and will always end up allocating an enumerator. We could consider changing Any to do similar checks (with the pros and cons that come with that), but such a change would only impact .NET Core.

It calls into question the validity of the added rule.

Thanks @stephentoub. Tagging @paulomorgado who added this rule very recently.

@stephentoub, I would say that Enumerable.Any should be changed. As a developer, when I receive an IEnumerable<T> from an API call, I don't know which it is. It might be coming from a LINQ query.

By the way, ImmutableArrayExtensions has an Any extension that use the internal array length. But not Count (corefx/#39007).

I would say that Enumerable.Any should be changed

As I noted, that might be something to do for .NET Core. It will not be changed in .NET Framework. So the rule is problematic on .NET Framework at the very least.

There are also downsides to changing it, which is why it hasn't been done thus far, e.g. the extra interface checks add overhead, which is less impactful when the alternative is a potentially long iteration through every element, and more impactful when at most one element will be considered.

@stephentoub, are your comments only related to Enumerable.Count<TSource>() vs. Enumerable.Any<TSource>(), or are they also related to Enumerable.Count<TSource>(Func<TSource, bool> predicate) vs. Enumerable.Any<TSource>(Func<TSource, bool> predicate)?

Just the ones that don't take a predicate.

So. this is still a valid analyzer/fixer for those that take the predicate, right?

Yes

Sorry, one more question: this also doesn't apply to Queryable.Count<TSource>(), right?

Tagging @sharwell for his views.

The analyzer attempts to avoid an O(n) amount of work during the switch from Count() to Any(). This is a more acute problem for general-purpose code than avoiding an O(1) allocation on a fast path. I would further argue that code sensitive enough to performance where the special-case behavior matters would be using more concrete types than IEnumerable<T>. I do not see any issues with the rule as-presented which would alter or special-case its behavior.

And this will catch most of them using Enumerable.Count<TSource>(IEnumerable<TSource> source).

@paulomorgado - can we change the analyzer to do the following:

  1. Detect all uses of Count() on ICollection or ICollection<T> receivers and recommend using Count or Length as appropriate (your current PR seems to do this)
  2. Only recommend replacing Count() with Any() on receivers of type IEnumerable or IEnumerable<T> if it does not implement ICollection or ICollection<T>, given that the former rule will already cover this case.

This should address @stephentoub's concern while also avoiding both the above diagnostics on certain code patterns, when clearly accepting the first diagnostic/fix is preferred.

FYI, I did submit https://github.com/dotnet/corefx/pull/40377, however a) this will be post-.NET Core 3.0, and b) it will not change .NET Framework.

Given @stephentoub's fix and @sharwell's point about performance sensitive code using more concrete types then IEnumerable<T>, I think it should be fine to even keep @paulomorgado's analyzer as-is and ignore my suggestion above https://github.com/dotnet/roslyn/issues/37959#issuecomment-523015275. @stephentoub does that seem reasonable?

Going forward, we will involve the corefx team in the early design/triage phase for any new analyzers being added to Microsoft.NetCore.Analyzers and Microsoft.NetFramework.Analyzers

It's up to you. I'm more conservative when it comes to code change recommendations, and so unless something is strictly better 99.9% of the time, I hesitate to suggest something that someone might blindly auto-fix. In this case, on .NET Framework and on .NET Core 3.0 and earlier, it is still the case that Any() will allocate in many situations where Count() won't, and it'll be impossible for the analyzer to always know the right answer, nor whether Count() will be O(N) or O(1), nor whether the cost of an allocation from Any() will be more or less impactful than iteration that might happen as part of Count(). It's fine if you want to keep the analyzer, but I suggest then that the associated text/description/docs make it clear that it's a trade-off.

We can exclude from this analyzer/fixer the use cases covered by pull/2736 and add replacement of Any() alongside with Count() to pull/2736.

Performance wise, it should be the best compromise.

As for users, it might be a bit more confusing, but they might learn from it.

When the framework changes, we'll change the analyzers/fixers accordingly.

Was this page helpful?
0 / 5 - 0 ratings