Roslyn: Usage analysis ignores references in documentation comments

Created on 15 May 2015  路  47Comments  路  Source: dotnet/roslyn

_Migrated from CodePlex_

Consider the following class.

namespace SampleNamespace
{
    using System;
    using System.Threading.Tasks;

    /// <summary>
    /// Some stuff is at
    /// <see cref="Task.ContinueWith(Action{Task, object}, object)"/>.
    /// </summary>
    public class SampleClass
    {
    }
}

The behavior of Organize Usings varies by whether or not the current project is configured to generate an XML documentation file with the output.

  • If XML documentation file is enabled: The Organize Usings feature will not make any changes to the file shown above.
  • If XML documentation file is disabled: The Organize Usings feature will remove both of the using declarations in the file shown above.

At least for the purpose of analyzing using declarations, the behavior of Roslyn should not depend on whether or not the project is currently configured to produce an XML documentation file during the build.

Area-Compilers Bug Resolution-Fixed

Most helpful comment

@gafter That seems a worse experience, in my mind.
There was some analysis (above) of parsing the documentation in all cases (but without emitting extra files). It looked like the performance hit was minimal. Why not do that?

All 47 comments

The "unused using" diagnostic is emitted by the compiler. It looks like the compiler is not even looking at doc comments when it doesn't need to generate a xml doc file and so doesn't include that code in it's analysis.

We have an option to "Diagnose" xml doc comments but not emit them, but that also triggers all the other warnings and errors, and so I don't think VS integration sets it.

We might need some finer grained control over the compiler's handling of xml doc comments. I remember discussing this with @amcasey a LONG time ago, but I don't think we resolved anything.

+1

Hitcount++ from internal bug 1167079

If removing the using breaks a cref, it should not be suggested. Please fix.

Currently this results in bouncing back and forth between ReSharper complaining that you should add the using for the XML documentation and Roslyn complaining that you should remove it. It's a bad user experience.

If the only correct solution is truly to enable XML doc generation, then please stop suggesting removal of the using and start suggesting enabling XML documentation output.

This may be a moonshot, but in an ideal world, I think I should be able to leave XML documentation output off and still have all the syntax checking as though it was on, minus the warning about publicly visible types missing comments.

@jaredpar @jcouv This issue has been raised so often that I'm moving this issue to milestone 16.0 so it will get attention during triage.

also reported at Link

wow, I didn't expect to see this issue is raised since 2015, I thought it should be a minor fix.

@MkazemAkhgary It was actually originally filed 25 July 2014, before Roslyn was even on GitHub. 馃榿

@MkazemAkhgary Do you want to see how far you get implementing it?

@jnm2 is that a threat? 馃樃

@MkazemAkhgary Don't give me ideas 馃構

I was just a little surprised, but I guess I'm cool :)

Seriously, surprised at what? Given the help wanted label, I'm genuinely asking whether you want to get involved. @CyrusNajmabadi and @sharwell and others would would give you plenty of support.

Oh, I thought that's a sarcasm, my apologize.

yes, I would like to see where to start, like some references where I should look into. I will spare some time for that.

The basic problem is that for performance, the compiler doesn't bother parsing and binding xml documentation comments if you aren't going to generate an xml doc comment file.

Without the compiler having done the binding, the IDE feature doesn't recognize that the using is required.

@Pilchie Have perf mitigations been considered? What about only binding comments in scopes with otherwise-unused using directives?

if generating a file looks expensive, why not keeping it memory? then if option for generating doc file is enabled also store that as a file.

@jnm2: That's an interesting possibility, though it means binding those documents twice. The way unused usings works, is that the compiler basically tracks what usings get touched as it binds things.

To do what you propose, we'd first need to bind the whole document as is, determine that a using was unused, and then bind it again including xml doc comments to see if it was still unused.

@MkazemAkhgary: It's not the cost of writing the file that is the concern, it's the cost of parsing the xml doc comments and binding their identifiers itself that is the concern.

I think what we really need is an extra doc comment option that the IDE can use between Diagnose and Parse that does the binding for the purposes of answering the unused diagnostics, but doesn't emit diagnostics related to to the doc comments themselves. (Or maybe change the behavior of Parse so that it does this).

bear with my Ignorant, but it seems like intellisense is already parsing the docs? like the way its displayed, highlighted. how does that work?

well intellisense and compiler are two separate components but can you somehow make this connection? like if there is cref usage (or other similar references) then somehow tell compiler to do the parsing and binding.

@Pilchie A lite-diagnostic mode was the next thing I was going to suggest. =) Can you flesh that out enough for a community member to start implementation?

@MkazemAkhgary Yes, the IDE always parses the syntax. What it doesn't always do is bind the syntax to a semantic model (which is needed to determine whether a using is really used). It only does the latter if you are outputting an XML doc file.

In performance traces I've seen for users that have doc comments enabled, the majority of the performance impact is emitting the XML file itself. Restricting the operation to only binding should not cause observable problems.

鈿狅笍 It's likely the binding cost is indistinguishable in these performance traces, so it's also possible the overhead is significant but I don't have the differential data necessary to analyze it specifically.

Full diagnostics would actually be great, thinking of a long-standing peeve of mine. People end up putting invalid crefs in projects that don't output an XML file.

Restricting the operation to only binding should not cause observable problems.

This sounds good to me. It feels like we could accomplish this by augmenting CompilationOptions so that Xml handling is not just on/off. instead there would be separate knobs for binding vs emit (with emit implying binding). IDE would always set at least the 'bind' flag. The 'emit' flag would be set if that was how things were specified in the proj file.

--

There is an interesting wrinkle to consider though. if we do this (naively), it's likely that you would start getting squiggles for xml doc comments, even if you didn't have the 'emit' option on. I don't think we'd want that. I base this on using several projects that actually have errors in this xml doc comments. While it could be argued it would be nice if those were fixed, it's also an onerous burden we're placing on people who may not want it. So we might need to be careful to make sure that this binding did not affect the error list/squiggles.

@sharwell What are your thoughts here?

@jnm2 - I'm not familiar enough with the compiler code to answer that. @jaredpar / @jcouv - who knows the most about xml doc comment stuff these days?

@CyrusNajmabadi We already have the Parse and Diagnose options. We need another option somewhere in between, or to update the meaning of Parse based on the assumption that the IDE is the only one that ever uses it.

Ah, gotcha! Yes, that would make sense. Having 'Parse' take on new meaning is somewhat distasteful to me. Esp as 'Parse' is so clear, and maybe other clients want that behavior to stay the same. Having a 'Bind' (or 'Analyze' or some other good name) option added to the enum makes sense as a way to bridge between the behaviors. Diagnose would imply 'Bind' (as it implies 'Parse' already). 'Bind' would imply 'Parse'. Parse would only guarantee the parse tree. Bind would guarantee semantic information. Diagnose would guarantee diagnostics. That makes sense to me.

I would love for the default to change so documentation comments are bound by the compiler but associated warnings (e.g. CS1591) have severity None. This would allow analyzers to operate without having to warn users to turn on these comments.

That's an even better idea!
You're right. I hadn't considered that the lack of diagnostics jsut means we're punting other problems down the road. We should have diagnostics. But having them have None severity makes sense for the user who is saying "do not emit", but who still wants IDE features to work well in doc comments.

if we do this (naively), it's likely that you would start getting squiggles for xml doc comments, even if you didn't have the 'emit' option on. I don't think we'd want that. I base this on using several projects that actually have errors in this xml doc comments. While it could be argued it would be nice if those were fixed, it's also an onerous burden we're placing on people who may not want it.

The times I've run into this, the people who caused the invalid crefs wanted to know and fix them and were surprised that they weren't checked.
I really want squiggles at the most visible level that back-compat will allow.

I'm on the fence. Introducing squiggles has often been viewed as problematic. however, that's usually in teh context of "build" where people may have "build as error" on and then find all their code no longer builds.

It's unclear to me what is means more for IDE contexts.

That said, i know someone could def be annoyed if they upgrade and get 1k warnings in the task list they then need to go fixup.

--

Note: i'm not saying that this means we shouldn't do this. Just that we need to be conscientious of this possible fallout and how it may impact people. It's possible, in balance, that it's worth doing. It should just not be done haphazardly.

+1 to @CyrusNajmabadi's suggestion.

@jnm2 - we could consider an IDE option to show suggestions/infos/etc, even if the compiler wouldn't emit on build. Tagging @kuhlenh for that.

The warning severity can always be altered. In theory a .editorconfig setting could set the severity for all such warnings about existing comments, but leave warnings about missing comments hidden (e.g. CS1573, CS1591).

I would wish for errors in the IDE even if the compiler can't report anything. I believe it's highly surprising to people whenever crefs are not validated.

It's really a shame that we have to assume warnings-as-errors folks are people who, in a seeming contradiction, _wouldn't_ actually want to see build failures when they upgrade their compiler. I interpret the setting as asking for pedantic mode and would welcome build failures due to new warnings. If fixing all invalid crefs happens to be too much of a burden, that means it's time to go back to non-pedantic mode for the time being, for that warning at least.
The upshot of this is that there is no way for the compiler to give diagnostic information about the compilation without us being afraid someone will get mad.

@sharwell

In performance traces I've seen for users that have doc comments enabled, the majority of the performance impact is emitting the XML file itself. Restricting the operation to only binding should not cause observable problems.

I thought it would be interesting to test this, so I created a Benchmark.Net benchmark for this. The benchmark compiles the largest C# project with documentation comments I had sitting around, Microsoft.CodeAnalysis.CSharp, using the Roslyn API. It compares the different DocumentationModes and also what effect does emitting with and without specifying xmlDocumentationStream have. The results:

| EmitMode | DocumentationMode | Mean | Error | StdDev |
|----------- |------------------ |--------:|---------:|---------:|
| None | None | 12.19 s | 0.1834 s | 0.1716 s |
| None | Parse | 12.52 s | 0.1907 s | 0.1784 s |
| None | Diagnose | 12.86 s | 0.1969 s | 0.1842 s |
| Null | None | 25.32 s | 0.2050 s | 0.1601 s |
| Null | Parse | 24.91 s | 0.3018 s | 0.2823 s |
| Null | Diagnose | 26.37 s | 0.3224 s | 0.3016 s |
| NullStream | None | 25.36 s | 0.3234 s | 0.3025 s |
| NullStream | Parse | 25.33 s | 0.2122 s | 0.1985 s |
| NullStream | Diagnose | 26.02 s | 0.3476 s | 0.3252 s |

Based on this, when you're just creating a compilation, setting DocumentationMode does have a small, but measurable effect (3% increase for Parse and 5% for Diagnose).

The results for also emitting the compilation are harder to make sense of (DocumentationMode.Parse is faster than None?), but they imply that generating XML documentation causes a slowdown of at most 4 % (writing it to disk might take more, I haven't measured that).

@svick are you able to test this on tunnelvisionlabs/dotnet-threading?

FWIW in thinking about the tradeoffs, the workflow I generally use is to configure debug builds to be as fast as possible, so I leave XML comments off. I rely on a release build, with all diagnostics, including static code analysis and XML comments, to be run often enough to catch deeper problems in a timely fashion.

It's OK for the diagnostic build to be slower. However, it's not OK when I see that the usings in a C# file have gotten jumbled and so press Ctrl+R, Ctrl+G (Remove and Sort Warnings), to subtly introduce warnings or errors into the next diagnostic build.

@dotnet/roslyn-ide @dotnet/roslyn-analysis @Pilchie @DustinCampbell I'm working on this issue now. My proposal is that the compiler will not produce any "unused using" diagnostics when the syntax trees are parsed with DocumentationMode.None, because in that case we have no way of knowing if they were used or not. I believe the IDE uses DocumentationMode.Parse, which gives the compiler enough information to determine if the usings are used or not, so this should not detract from the IDE experience. The IDE experience after my fix would be that we now properly recognize that usings (imports for VB) are used in doc comments.

Are you OK with this?

@gafter That seems a worse experience, in my mind.
There was some analysis (above) of parsing the documentation in all cases (but without emitting extra files). It looked like the performance hit was minimal. Why not do that?

I think the same way as @jcouv.

There was some analysis (above) of parsing the documentation in all cases (but without emitting extra files). It looked like the performance hit was minimal. Why not do that?

Because we get a different syntax tree, and that is a compat break for an unknown number of clients. Are you suggesting that we Obsolete DocumentationMode.None, which parses doc comments as ordinary comments?

I am +1 on @gafter's proposal as it retains the existing API and parse semantics for all documentation comment modes, does not increase the command line build time (however negligible) and improves the user experience for IDE scenario.

API and parse semantics are very explicit on the DocumentationMode enum:

    public enum DocumentationMode : byte
    {
        /// <summary>
        /// Treats documentation comments as regular comments.
        /// </summary>
        None = 0,

        /// <summary>
        /// Parses documentation comments as structured trivia, but do not report any diagnostics.
        /// </summary>
        Parse = 1,

        /// <summary>
        /// Parses documentation comments as structured trivia and report diagnostics.
        /// </summary>
        Diagnose = 2,
    }

None should parse doc comments as regular comments, which means unused using diagnostics can _never_ be produced reliably. IDE does not use this mode and command line compilation would have never output these diagnostics as they have hidden severity. This is also likely the most commonly used mode in command line compilation, and I am not sure if it is worth risking any performance hit here by trying to parse documentation comments.

Parse should parse documentation comments, and not report any documentation comment diagnostics. We should fix this mode to correctly consume documentation comments for determining unused usings. I don't think this can realistically break any external API consumers, as we not going to produce any new diagnostics, just stop producing incorrect unused using diagnostics. Hence, I don't think we need any new documentation mode such as Bind.

Hurray! Finally! Thanks @gafter!

Was this page helpful?
0 / 5 - 0 ratings