We currently have following related dispose rules that ensure that disposable objects are disposed by the creating method and/or containing type:
1) CA2000 (Dispose objects before losing scope): A local object of a IDisposable type is created but the object is not disposed before all references to the object are out of scope.
2) CA2213 (Disposable fields should be disposed): A type that implements System.IDisposable declares fields that are of types that also implement IDisposable. The Dispose method of the field is not called by the Dispose method of the declaring type.
So, a disposable object created in a method body, which is not saved into an instance field and/or escaped via either calling into an external library or returned by the method, must be disposed in a method. If assigned to a field of a disposable object, then the Dispose method (or call graph rooted at Dispose) of the containing type must dispose the field. If escaped via returning the disposable object, then the caller takes over the ownership to dispose the object.
These rules seem appropriate, except for the cases where a _dispose ownership transfer_ takes place from method that creates a disposable object to a wrapper object that holds onto this disposable object into a field and disposes the field when the wrapper is disposed. For example, see ModuleMetadata. According to above dispose design, the callers of the ModuleMetadata constructor have dispose ownership of the passed in dispsoable PEReader argument, and hence are all flagged. In practice, ModuleMetadata is wrapper that creates a PEModule object that wraps the PEReader object and disposes it when the PE module is disposed. Dispose ownership is transferred from creators of ModuleMetadata, to the ModuleMetadata object to the PEModule object. We need a complete inter-procedural dataflow analysis to figure out this dispose ownership transfer and ensure that either PEModule.Dispose disposes the PE reader (CA2213) or the methods creating PE reader disposes it (CA2000). Legacy FxCop as well as our current implementation do not perform inter-procedural analysis, and generates false negative CA2000 instances for these cases.
Design choices:
I believe this issue captures one of the core design issues around dataflow analysis -
I prefer 3 as it also provides an explicit (pseudo) documentation/contracts and allows user configuration for cases that analyses just cannot detect. Primary downside would be if such annotations or user configuration just grow too much and get out of sync with actual user code.
Putting some more thought, I think approach 3., i.e. interprocedural analysis, to track how the disposable instance moves around shouldn't be super expensive as the instance generally does not get passed around multiple times (given it is a disposable object!), and we can enhance CA2000 and CA2213 to do a lot better job at reducing false positives and also enhance CA2213 (currently it misses out some true leaks due to its design: CA2213 fires only if a disposable field was assigned to a locally created disposable object in one of the containing type's methods. This causes us to miss reporting cases where there is an actual dispose violation.)
Tagging @dotnet/roslyn-analysis
I would be interested in seeing some improvements to the heuristics involved with intra-procedural analysis.
A warning should be reported if code flow results in ownership transfer of an object not owned by the current method/instance. This covers multiple transfers of the same object as well as unintentional transfer of ownership e.g. via an argument. This rule is intended to expose false negatives.
The first such rule would involve transfer of ownership via object construction. This pattern occurs when an argument is passed to a constructor under the following conditions:
The caller is assumed to transfer ownership of an object when passing an argument to the constructor with the properties given above.
A constructor is assumed to have gained ownership of an object when passing an argument under the conditions above.
The first such rule would involve transfer of ownership via object construction
I am strongly opposed to adding any heuristic for dispose ownership transfer. There are multiple reasons:
Dispose rules are much special then any rules - lots of false positives are fine, but we should not mask _true_ leaks. This heuristic guarantees that we are masking cases where passing an object to a constructor does not lead to eventual dispose of that object.
Legacy FXCop implementation did not have such a heuristic for a very specific reason - it was designed to be more aggressive because customers asked for such a behavior. When we have tried to cut down implementation of dispose analyzers citing high level of noise as the underlying reason, there has been an extremely strong opposition from the community. In fact it has been explicitly stated multiple times that noise level is not an issue with dispose as it allows users to understand and/or improve their own code structure, and if it indeed turns out be a false report, suppressing is no big deal. See examples here: https://github.com/dotnet/roslyn-analyzers/issues/695#issuecomment-182399857, https://github.com/dotnet/roslyn-analyzers/issues/695#issuecomment-224253449 and https://github.com/dotnet/roslyn-analyzers/issues/695#issuecomment-232643339.
If the false positive rate is indeed concerning we have approach 2 and 3 available to us - A quick prototype can help estimate the improvements from 3, with any related performance implications (which I believe should not be too much as disposable objects do not get down around 100 successive method calls or such). If inter-procedural is not what we want to go, we should take approach 2 and allow users to explicitly define dispose ownership contracts to avoid in-source suppressions.
Overall, I think the least preferred approach for dispose ownership rule is to add any heuristic to mask real failures. Current state gives us parity with FXCop and hence we can wait on implementing 2. or 3 for v2 of DFA rule, i.e. once we have ported all DFA rules and also moved to compiler's CFG. @jinujoseph to confirm, but I think we have had parity as our v1 benchmark since starting work in this area.
@sharwell I also just read your comment here: https://github.com/dotnet/roslyn/pull/25179/files#r171921736
We can avoid masking by pairing the transfer heuristics - each case where signatures treat the code as a transfer of ownership, both the caller and callee agree this is the case
This is not possible without inter-procedural analysis. To identify who owns or disposes an object that moves around methods needs inter-prodedural analysis. Otherwise, we are going to have false positives either at the caller or callee. Legacy FXCop implementation decided to retain the false positives in CA2000 (i.e. the caller) and your recommendation will move to them to the callee (CA2213). I don't think we gain anything by moving away from original FXCop design as any design without inter-procedural analysis OR annotations will lead to false positives. In fact we introduce a big risk of introducing a barrage of new false positives for CA2213. Legacy FXCop customers now end up with needing to suppress them at caller (done in past for CA2000 false positives) and now at callee (your proposed heuristic).
don't we need some kind of annotation at the end, anyway? we don't need annotation for parity but to be better than FxCop, I think we said we need annotation before? since .net or language itself doesn't have concept (or a way to expresses explicitly) of ownership transfer ?
we don't need annotation for parity but to be better than FxCop, I think we said we need annotation before
We should surely add something here to improve over FXCop - no arguments to this. I think we want to at least prototype or design an inter-procedural approach that needs least user work and understand the performance and implementation costs. If these are indeed reasonable, then we want to try it out. Otherwise, we can always implement the annotation strategy, whose cost is already pretty well understood.
I am strongly opposed to adding any heuristic for dispose ownership transfer ...
I don't think we gain anything by moving away from original FXCop design as any design without inter-procedural analysis OR annotations will lead to false positives.
The rule already operates on heuristics, and already produces a non-trivial number of false positives. Furthermore, it is trivial to demonstrate that some intra-procedural approaches produce more false positives than others. To demonstrate this, we simply remove the treatment of return values as a transfer of ownership. Given that we have not tried any alternatives to these rules in practice, it is not unreasonable to think that other approaches could produce more accurate results without resorting to to inter-procedural analysis or annotations not already in place.
An ideal solution would have the following characteristics:
The inter-procedural approach may produce better overall results in the end, but relying on it too early will limit our ability to meet the third goal, or worse, the second goal. Annotations (attributes or otherwise) have a similar impact on the fourth goal.
I prefer 3 as it also provides an explicit (pseudo) documentation/contracts
Yes, I agree. Don't forget that it is not only the analyzer that needs to understand the code, but users who reads the code needs to understand how to interact with it. I would prefer not only have to rely on tooling for that.
Primary downside would be if such annotations or user configuration just grow too much and get out of sync with actual user code.
Analyzers will be able to check that annotations are in sync in many places.
users who reads the code needs to understand how to interact with it.
That's actually the case right now. Users have to read the docs to know whether a return value must be disposed or not.
We have had the same discussion with @JohanLarsson on https://github.com/DotNetAnalyzers/IDisposableAnalyzers/issues/126 (and on the IDisposableAnalyzers' gitter), and we came to the conclusion that annotations were needed to make it correct, but the adoption would be limited by the fact that the Annotations package would need to be implemented in every project.
We then ended to the conclusion that something needed to be done in .NET core directly, and this is why I proposed https://github.com/dotnet/corefx/issues/37873 . Opinions welcomed there 馃槂
Most helpful comment
https://github.com/dotnet/corefx/issues/37873