Roslyn: IOperation missing IsExpression / IsStatement

Created on 16 Nov 2017  路  25Comments  路  Source: dotnet/roslyn

Version Used:
master

Recently IsExpression and IsStatement were removed from IOperation in https://github.com/dotnet/roslyn/commit/a289e370526a0051e83f8c75a6e1844d34b3498e. I've been trying to upgrade my existing analyzer pack, which contains about 40 analyzers, half of them using IOperation.

I have not yet found a way to fix an existing analyzer that counts the number of statements in methods. A few weeks ago I was struggling with IThrowExpression (a replacement/merge of IThrowStatement), but had to revert back to syntax to differentiate between expression and statement form. Now that the distinction no longer exists, I would need to revert back to syntax for all possible statements in the language.

Another case is an analyzer that checks for block scope in if statements, where I have do deal with the merge to IConditionalOperation. The expression form should be filtered out by this analyzer.

So my request is to bring back IsExpression and IsStatement in some form in the 15.5 timeframe. If that's not possible, any advise on how to proceed would be welcome.

Tagging @mavasani

Area-Analyzers Discussion Feature - IOperation Feature Request _Product-level triaged

Most helpful comment

Thanks @bkoelman for digging through this! I will take a look at your implementation in detail later today.

Regardless, given the non-trivial nature of this detection logic, I believe we should expose IOperation.IsExpression and IOperation.IsStatement properties in Roslyn. They are trivial to implement in our operation factory/generator and trivial to validate in the test operation dumper for each operation node based on syntax checks.

All 25 comments

Adding @dotnet/analyzer-ioperation @DustinCampbell @tmat @jinujoseph @AlekseyTs @cston

I presume the revert was done as part of https://github.com/dotnet/roslyn/pull/22774, where we decided not to expose IOperation.IsExpression and IOperation.IsStatement until we receive customer requests. 15.5 is probably at a very high bar to enable this, but something to surely consider for 15.6

A few weeks ago I was struggling with IThrowExpression (a replacement/merge of IThrowStatement), but had to revert back to syntax to differentiate between expression and statement form.

Probably related to https://github.com/dotnet/roslyn/issues/23152?

any advise on how to proceed would be welcome

A simple heuristic that statements are usually immediate children of blocks and expressions aren't might be sufficient. Of course, children of complex statements 'if', loops, etc might need to be specially recognized too.

You can also check for IOperation.Type. Should be null for statements and initializers and non-null for expressions (except for expressions with ConstantValue null).

Thanks for the suggestions, will try them at a later time.

I wonder though whether checking for IOperation.Type being null would work for assignment statements like:

isEnabled = x > 0;

because they (used to?) surface as an IExpressionStatement.

Also I'm thinking to use reflection, if possible. Future versions of my analyzer can then probe at runtime for the to-be-added-back properties and fallback to reflection for earlier API versions.

I'm still hoping this can be made part of 15.5. Is it being triaged?

Unfortunately, there's no way that this will make it into 15.5, 15.6 is the earliest we can hope for here. @jinujoseph for triage.

ISimpleAssignmentOperation, used as you showed above, is still inside an IExpressionStatementOperation. The ExprStatement has a Type of null, and the ISimpleAssignmentOperation has a type of bool.

@333fred Thanks for the clarification.

The .Type check works for all except IThrowOperation, because its expression form also returns null for its Type. I'll use Syntax in that case then.

@bkoelman you can check Parent for that.

I ran into another case: IVariableDeclarationOperation.Type is null when used in expression form. For example:

using (IDisposable i = j = null) { }

This seems like an actual bug. Filed https://github.com/dotnet/roslyn/issues/23400.

It's still unclear to me how to see that int j = 0 is an expression and int k = 4 is a statement. They both have:

  • operation.Type == null
  • operation.Parent is of type IForLoopOperation (not IBlockOperation, because { } are missing)
  • .Syntax does not tell the difference either.
for (int i = 0; i < 10; i++)
    for (int j = 0; j < 20; j++)
        int k = 4;

It's still unclear to me how to see that int j = 0 is an expression and int k = 4 is a statement.

int j = 0 is neither an expression, nor a statement, it is a declaration, which is a part of the for. That can be detected by checking if the node is the Body of the loop.

Thanks @bkoelman for digging through this! I will take a look at your implementation in detail later today.

Regardless, given the non-trivial nature of this detection logic, I believe we should expose IOperation.IsExpression and IOperation.IsStatement properties in Roslyn. They are trivial to implement in our operation factory/generator and trivial to validate in the test operation dumper for each operation node based on syntax checks.

Also tagging @AnthonyDGreen

@mavasani would be great to have these properties builtin in vNext. Meanwhile, did you see anything noteworthy in my code?

I'd like to start work on a PR to add IsExpression and IsStatement to IOperation in the master branch. It will probably become a large PR. Anything I should be aware of before putting effort into this?

@bkoelman First, we should get consensus from the design team on this API. Adding @AlekseyTs @DustinCampbell @tmat @gafter @333fred @jinujoseph @heejaechang for their views.

Once approved, I already had a PR/commit for this work, which was reverted. I think we can start from there.

I think this can and should be implemented as an extension helper method with common implementation for both languages. This should not require anything stored inside nodes themselves and should not be part of the IOperation interface

For example, see how control flow graph builder deals with this. It does this from the top, but it should be possible to follow the same logic going from the bottom up.

My personal thoughts are:

  1. What purpose do we need this for?
  2. What defines a statement or an expression?
  3. What do you do when the languages disagree? For example, assignment is an expression in C#, but a statement in VB. Perhaps you expose these as extensions, and then the language itself has final say over what it thinks are statements vs expressions?
  4. What do you do for constructs that are both statements and expressions? i.e. IThrowOperation?
  5. What do you do when the language changes, and things that were previously statements are not expressions?

The OP lists this as a use case:

I have not yet found a way to fix an existing analyzer that counts the number of statements in methods.

However, that's not a super compelling example for me.

  1. I need this information in multiple analyzers by now (see https://github.com/bkoelman/CSharpGuidelinesAnalyzer/search?q=isstatement&unscoped_q=isstatement). With help of @AlekseyTs and @mavasani I was able to come up with heuristics in the past that seem to work. My project also has lots of unit tests (https://github.com/bkoelman/CSharpGuidelinesAnalyzer/blob/4614080258bfa6544d4a425d4ca84393956d3048/src/CSharpGuidelinesAnalyzer/CSharpGuidelinesAnalyzer.Test/Specs/OperationIsStatementSpecs.cs) for statement forms if that helps.
  2. The language spec lists which are possibly statements. The usage determines when an operation is a statement or an expression. Example: IConditionalOperation is a statement for if but an expression when used as ?:.
  3. C# wraps them in an ExpressionStatement if I recall correctly. Context determines we are dealing with statement form. Not sure about VB.
  4. IThrowOperation would be a statement or an expression, depending on how it is used. Because this is tricky to get right, having an API for that is valuable.
  5. Then implementation of these properties changes along with the language. Both properties stand on their own and may always be false today and may become conditional later.

Hope this helps.

THanks for the response. My feedback to the individual points.

  1. I need this information in multiple analyzers by now (see https://github.com/bkoelman/CSharpGuidelinesAnalyzer/search?q=isstatement&unscoped_q=isstatement).

Looking at teh use cases... i'm not sure why you need this. I mean, your code already goes back to syntax depending on if things are a statement or not. i.e.: https://github.com/bkoelman/CSharpGuidelinesAnalyzer/blob/97c8bca9201bd8c57da2b3525ec3b2f8427103b5/src/CSharpGuidelinesAnalyzer/CSharpGuidelinesAnalyzer/Extensions/OperationExtensions.cs#L220-L224

If you're going to use the syntax anyways, then why not just do: "operation.Syntax is StatementSyntax"?

Similarly, i'm very wary of use cases like this: AvoidMemberWithManyStatementsAnalyzer

Basically, that code would could an if-statement as a statement and count that against the user. But if the user instead used a conditional ?: expression, it wouldn't count against them. But certainly each are just as complex as they both involve branching and whatnot.

Indeed, it seems like a better way to determine complexity would be to precisely stick with the operations, and not the weak statement/expression analysis. That way the code correctly understand that my code is still as complex, even if i do a bunch of cute things to build a huge expression instead of having a bunch of individual statements.

IThrowOperation would be a statement or an expression, depending on how it is used.

As before, i'm actually more skeptical that this is valuable. The idea is to move people away from the language's idea of 'statements/expressions'. If you want that, i would actually go back to the syntax as that really is going to be accurate and in line with the individual languages.

--

Note: i'm not hugely against this. As these could just be props that could change independently, and could be different per language, or per context. However, it feels oogy to me, and i'm def at the point where i do question if it's the right thing to do.

THanks!

Counting statements is for a coding guidelines document that I did not write. The guidance there describes number of statements, not (another indication of) complexity. Given that the document has 324 stars, I am confident there is value in it. But the analyzer should probably be implemented using a syntax walker, which is what I am currently working on.

Some usages can be easily replaced by a syntax check, yes. I am just reusing the bits I created for other scenarios.

That leaves the assignments-per-statement analyzer. Given we have no way to mix operations and syntax in walkers, how do you feel about that?

Was this page helpful?
0 / 5 - 0 ratings