Roslyn: [Proposal] enable code generating extensions to the compiler

Created on 30 Sep 2015 · 190Comments · Source: dotnet/roslyn

Often when writing software, we find ourselves repeatedly typing similar logic over and over again, each time just different enough from the last to make generalizing it into an API impractical. We refer to this type of code as boilerplate, the code we have to write around the actual logic we want to have, just to make it work with the language and environment that we use.

One way of avoiding writing boilerplate code, is to have the computer generate it for us. After all, computers are really good at that sort of thing. But in order for the computer to generate code for us it has to have some input to base it on. Typical code generators are design-time tools that we work with outside of our codebase, that generate source that we include with it. These tools usually prefer their input to be XML or JSON files that we either manipulate manually or have some WSIWYG editor that lets us drag, drop and click it into existence. Other tools are build-time, that get run by our build system just before our project is built, but they too are driven by external inputs like XML and JSON files that we must manipulate separately from our code.

These solutions have their merits, but they are often intrusive, requiring us to structure our code in particular ways that allow the merging of the generated code to work well with what we’ve written. The biggest drawback, is that these tools require entire facets of our codebase to be defined in another language outside of the code we use to write our primary logic.

Some solutions, like post-build rewriters, do a little better in this regard, because they operate directly on the code we’ve written, adding new logic into the assembly directly. However, they too have their drawbacks. For instance, post-build rewriters can never introduce new types and API’s for our code to reference, because they come too late in the process. So they can only change the code we wrote to do something else. Even worse, assembly rewriters are very difficult to build because they must work at the level of the IL or assembly language, doing the heavy lifting to re-derive the context of our code that was lost during compilation, and to generate new code as IL and metadata without the luxury of having a compiler to do it. For most folks, choosing this technique to build tools to reduce boilerplate code is typically a non-starter.

Yet the biggest sin of all, is that all of these solutions require us to manipulate our nearly unfathomable build system, and in fact requires us to have a build system in the first place, and who really wants to do that. Am I Right?

Proposal: Code Injectors

Code injectors are source code generators that are extensions to the compiler you are using, as you are using it. When the compiler in instructed to compile the source code you wrote, code injectors are given a chance to exam your code and add new code that gets compiled in along with it.

When you type your code into an editor or IDE, the compiler can be engaged to provide feedback that includes the new code added by the code generators. Thus, it is possible to have the compiler respond to your work and introduce new code as you type that you can directly make use of.

You write a code injector similarly to how you write a C# and VB diagnostic analyzer today. You may choose to think of code injectors as analyzers that instead of reporting new diagnostics after examining the source code, augment the source code by adding new declarations.

You define a class in an assembly that gets loaded by the compiler when it is run to compile your code. This could easily be the same assembly you have used to supply analyzers. This class is initialized by the compiler with a context that you can use to register callbacks into your code when particular compilation events occur.

For example, ignoring namespaces for a moment, this contrived code injector gives every class defined in source a new constant field called ClassName that is a string containing the name of the class.

[CodeInjector(LanguageNames.CSharp)]
public class MyInjector : CodeInjector
{
    public override void Initialize(InitializationContext context)
    {
        context.RegisterSymbolAction(InjectCodeForSymbol);
    }

    public void InjectCodeForSymbol(SymbolInjectionContext context)
    {
        if (context.Symbol.TypeKind == TypeKind.Class)
        {
            context.AddCompilationUnit($@”partial class {context.Symbol.Name} 
                      {{ public const string ClassName = “”{context.Symbol.Name}””; }}”);
        }
    }
}

This works because of the existence of the C# and VB partial class language feature.
Of course, not all code injectors need to be in the business of adding members to the classes you wrote, or especially not adding members to all the classes you wrote indiscriminately. Code injectors can add entirely new declarations, new types and API’s that are meant to simply be used by your code, not to modify your code.

Yet, the prospect of having code injectors modify the code you wrote enables many compelling scenarios that wouldn’t be possible otherwise. A companion proposal for the C# and VB languages #5292 introduces a new feature that makes it possible to have code generators not only add new declarations/members to your code, but also to augment the methods and properties you wrote too.

Now, you can get rid of boilerplate logic like all that INotifyPropertyChanged code you need just to make data binding work. (Or is this that so last decade that I need a better example?)

Subjects not covered in this proposal but open for discussion too

Ordering of Injectors – this concerns the order of injectors, which are run first etc., and the order of the new sources as presented to the compiler. The is of interest to the supersedes feature proposed in #5292
Callback events – beyond callbacks for type symbols declared in source, what other callback patterns would be useful for code generators, keeping in mind that these will likely need to be invoked by the IDE as well.
Having multiple injection event handlers lead to the generation of the same source code, only once, and being smart about it.
Recursion - Can generated code trigger additional code injection events for the newly injected declarations? I’d rather the answer be No, since this will make the system much simpler.
More?

3 - Working Area-Compilers Area-Language Design Feature Request New Language Feature - ReplacOriginal

Source

mattwar

👍46 ❤6 😄1

Most helpful comment

@jods4 all of this you can already do, if you run roslyn "from code", like in, wire up everything on your own, call the emit API yourself, if you _really_ want to. At Stack Overflow we do (using StackExchange.Precompilation), because we _really really_ want to do as much as possible at compile-time. For example we have compilation hooks, that bake localized strings into the assembly, doing additional optimizations when we detect that a parametrized string (we have something similar to string interpolation, but with additional localization features, like pluralization based on the value of multiple numerical tokens, and markdown formatting) is used in a razor view, which is pretty straightforward as soon as you have the semantic model. There, we avoid allocating the full string by directly calling WebViewPage.Write on it's tokens. The concept is the same as in my blog post where I discuss how to replace StringBuilder.Append($"...") calls with StringBuilder.Format.

The _actual_ problem here is, that there's no streamlined interface for such hooks in the command-line & dotnet CLI tools. I really hope that at the end of this, we get something akin to analyzers in terms of pluggability, but with the power to modify the Compilation before it hits Emit.

Here are some open points that I think should be considered:

One of the big caveats is, that as soon as you modify a SourceTree, the SemanticModel you had becomes invalid. The way we worked around it, was by calculating all the modifications first in form of TextChanges, and then applying them _in batch_ via SourceText.WithChanges. It would be nice to have an API around that.
Another one is, how to get debugging to work correctly, basically masking the rewrite from the end-user/developer stepping through code in VS. Basically what razor does to enable it to step through your code in views, utilizing #line directives. I would be nice to get such support out of the box from the API above
We'll probably need a to introduce compile-time-only references, in case your compilation wants to reference something you don't need at run-time.

I'm OK with not solving the question above, a low-level thing we can just hook in and replace the Compilation would do just fine for starters. We can always add nice APIs on top of it at a later time.

m0sa on 19 Apr 2016

👍16

All 190 comments

Looking forward to seeing where this discussion goes. The last time I remember metaprogramming being discussed, or more generally compile-time hooks, it sounded like the team wanted a little more time to see how things shook out (https://github.com/dotnet/roslyn/issues/98#issuecomment-71780597). Hopefully now that it's 8 months later and the library and tooling is in the wild it's a good time to revisit.

I personally really like the idea of using code diagnostics and fixes to drive this functionality. We already have great tooling around developing and debugging them, and Roslyn already knows how to apply them. Once developed, there's also already a mechanism for explicitly applying them to target code to see how they'll look.

I can envision a variety of use cases for this concept, from simple one-off fixes to things like entire AOP frameworks based on applying code fixes in the presence of attributes that guide their behavior.

daveaglick on 30 Sep 2015

👍1

I wrote the code generation portions for the C# targets of ANTLR 3 and ANTLR 4 using a pattern similar to XAML. Two groups will shudder at this statement: developers working on MSBuild itself, and the ReSharper team. For end users working with Visual Studio, the experience is actually quite good. There are some interesting limitations in the current strategy.

Reference limitations

It is possible for C# code to reference types and members defined in the files which are generated from ANTLR grammars. In fact, from the moment the rules are added to the grammar (even without saving the file), the C# IntelliSense engine is already aware of the members which will be generated for these rules.

However, the code generation step itself cannot use information from the other C# files in the project. Fortunately for ANTLR, we don't need the ability to do this because the information required to generate a parser is completely contained within the grammar files.

Undocumented build and IntelliSense integrations

The specific manner in which a code generator integrates with the IntelliSense engine (using the XAML generator pattern) is undocumented. This led to a complete lack of support for the code completion functionality described in the previous section in other IDEs and even in ReSharper.

sharwell on 30 Sep 2015

👍1

Let's deal with the problem of undefined order of such transformations. In normal OO code, this pattern is conceptually similar to decorator pattern. Take a look at this code:

var logger = new CachingLogger(new TimestampedLogger(new ConsoleLogger()));

var logger = new TimestampedLogger(new CachingLogger(new ConsoleLogger()));

The nice thing is that we have to manually specify the order of wrapping the class. This seems like the most obvious answer: Let the programmer specify the order.

I think this would be simple to do cleanly with attributes which would be somehow wired up to cause this transformations

[INoifyPropertyChanged, LogPropertyAccess]
public class ViewModel
{
   //gets change notification and logging
   public int Property { get; set; } 
}

They could be specified at assembly/class/member level to represent all kinds of transformation scope.

The problem is that until now, attributes serve as metadata only option - now instead they modify source code. Maybe user defined class modifiers would be better:

public notify logged class ViewModel
{
    public int Property { get; set; }
}

Where notify and logged are user defined somehow.

ghord on 1 Oct 2015

@ghord That's a good idea. If we limit the code generators to only working on symbols with custom attributes explicitly specified, we can order the code generators by the order of the attributes as specified in the source.

mattwar on 1 Oct 2015

@mattwar @ghord While I think the use of attributes to guide the code generation process could work (it's worked well for PostSharp, for example), I'd love to see a more general solution that isn't directly tied to a specific language syntax or feature. That's why I mentioned being able to apply analyzers and code fixes as a possible approach.

The way I would envision this working is that the compiler would be supplied with a list of analyzers and code fixes to automatically apply just before actual compilation. It would work as if the user had manually gone through the code and applied all of the specified code fixes by hand before compiling.

Benefits

I suspect that this could be achieved with a minimal amount of changes to existing Roslyn, at least functionality-wise (though it may take some serious refactoring - I have no idea). Of course, the compiler would need a mechanism for specifying the analyzers and code fixes and applying them during compilation. Note the following:

We already have the concept of _diagnostics_ to describe those portions of code we're interested in changing. Perhaps a new DiagnosticSeverity is needed to indicate potential code generation, or maybe a DiagnosticSeverity of Hidden could be used with some other indication. Regardless, diagnostics can be used to identify where code generation should take place.
We also already have the concept of _analyzers_ to figure out where such diagnostics should apply.
We also already have the concept of _code fixes_ to specify, in as flexible a way as possible, what should be done to the code.
The distinction between analyzers, diagnostics, and code fixes is a nice separation that could also be leveraged for code generation. For example, I could write a code generation code fix that would fix up some built-in diagnostic on every build.
Visual Studio already has support for developing analyzers, diagnostics, and code fixes. We can scaffold them with a template, debug them on real code, etc.
Visual Studio also already has support for applying analyzers and code fixes so that code generation authors and users can see exactly where their code generation will apply and can preview what it will do (and even apply it beforehand if needed).

There would also be some synergy with this approach between existing authors of conventional analyzers and code fixes and those intended to be used for code generation. Existing code fixes could also be adapted or possibly applied wholesale during the code generation stage (if specified). The tooling and process would be the same so skills could be leveraged for either.

Challenges

I do see the following questions or complications with this approach:

How to supply the analyzers and code fixes to the compiler? Would it be a command-line argument? An external file? Something in the .config file (or equivalent).
There's still the problem of ordering. What if one code fix changes the code is such a way that the next one to be run no longer applies? How could that be reported to the user? Would chaining too many code fixes together create unexpected behavior (though I think this is a challenge with any automatic code generation process)?
Debugging. How can the newly generated and/or changed code be debugged? How to make sure that the .pdb or other debugging artifacts can still trace back to the original code?
Analyzers and code fixes would have to be totally decoupled from Visual Studio. My understanding right

An example that uses attributes

Getting back to the use of attributes, one of the big examples of this approach that I've been thinking about is using it to build out a full AOP framework similar to what PostSharp does. In this case, an analyzer would be written that looks for the presence of specific attributes as defined in a referenced support library. When it finds them, it would output diagnostics that a code fix would then act on. The code fix would then apply whatever code generation is appropriate for the attribute.

My favorite PostSharp _aspects_ is OnMethodBoundaryAspect, which allows you to execute code defined in your aspect attribute class before method entry and after method exit. Something similar could be constructed by having a code fix inject calls to methods contained in a class derived from a specific attribute for any method that has said attribute applied to it.

You could potentially build up an entire AOP framework by creating analyzers and code fixes that act on pre-defined attributes and their derivatives. The point, though, is that you wouldn't have to. The code generation capability could be as flexible and general as analyzers and code fixes themselves, which because they directly manipulate the syntax tree can do just about anything.

daveaglick on 1 Oct 2015

👍1

Very happy to see a proposal for this on the table. Are attributes how you envision applying a Code Injector? I think specifying the syntax for applying them is needed in the proposal.

Is the idea that CodeInjections all take place prior to build so that you can see and possibly interact with the members it generates? If so, I think being able to interact with the generated code is a another huge benefit that you should mention in your proposal. When using PostSharp, anything you have it generate doesn't exist until build time, so you can't reference any of it in your code.

@ghord The problem with your proposal on ordering is that you might not define the injections in the same place. For example, you could have an code-injection attribute on a class NotifyPropertyChanged and then a code-injection attribute on a method Log. Which one should be applied first? I think you need a way of explicitly specifying an overall ordering when you invoke a CodeInjector (probably just an integer).

MgSam on 1 Oct 2015

👍1

Using a property on an attribute to specify order is not new to the framework:

DataMemberAttribute.Order Property

But I think that, if order is important, then there's something wrong.

Notifying property change is something that is expected form the consumers of an object, not that the object expects itself. So, as long as it's done, the order doesn't matter.

Logging is the same thing. If you want to log the notification, than that is not logging the object but logging the notification extension.

Is there any compeling example where one extension influences the other and order matters and it can still be considered a good architecture?

paulomorgado on 2 Oct 2015

👍1

@paulomorgado Yes, there are any number of use cases. For example, you want to have some authorization code run before some caching code. PostSharp has several documentation pages about ordering.

MgSam on 2 Oct 2015

👍1

Rather than ordering at use site, why not let injectors specify their dependencies using something akin to those PostSharp attributes @MgSam is linking to? (Or OrderAttribute in VS.) Depending on the order of the attributes at the use site seems very brittle to me and prevent using them at different scopes.

MrJul on 2 Oct 2015

There are some issues with attributes which we will have to overcome for this to work:

Partial classes and members in separate files: attributes from which part of the class/member have the priority?
Assembly attributes: attributes from which file have the priority?

We could make the order alphabetical according to file names. I'm pretty sure that in 99% cases the order won't matter, but leaving undefined behavior such as this in the language is very dangerous - application could crash or not depending on the applying order of transformations.

ghord on 2 Oct 2015

@ghord, what in this proposal influences assembly attributes?

paulomorgado on 5 Oct 2015

I think code generation support for the compiler would be fantastic. I'd love to be able to do something similar to what PostSharp provides.

PostSharp's more limited free version, and the requirement to submit an application to get an open source project license makes me unwilling to look at it for anything but larger projects at work that we would invest money in.

I'd like to be able to have great AOP tools for everyday/hobby projects without additional hassle.

@daveaglick For debugging, if code generation is only happening after things are sent to the compiler, wouldn't inserting line directives into the syntax tree preserve the integrity of the debugging experience?

I did make a syntax tree rewriter for Roslyn to implement a simple method boundary aspect. I used a Roslyn fork to get this hooked in during compile time. Line directives ensured there was no issue with debugging. It was an interesting experience and an example of something I'd like to be able to do without jumping through hoops.

One issue I had though was the fact that I was working at the syntax tree stage deprived me of information that was needed from the bound tree stage. Is there a way to know about type relationship information at this point? When you see an attribute on a class how will you know that it subclasses MethodBoundaryAspect or whatever?

Inverness on 21 Oct 2015

👍2

Is this like F#'s type providers?

AdamSpeight2008 on 20 Nov 2015

It's great to see this being proposed, I remember asking a while back if this was being considered.

I think that it should be possible to have the _modified_ source code written out to a temp folder, to make debugging easier. Either by default or controllable via a flag.

I also think that having to apply an attribute to the parts of the source code that can be re-written is a nice idea as it makes the feature less magical and it's easier reason about.

mattwarren on 24 Nov 2015

@AdamSpeight2008 I don't think so, I see this feature more as a compiler step that lets you modify code before it's compiled. But crucially this isn't meant to be seen by the person who wrote the code, it happens in the background when the compiler runs.

My understanding of type providers is that they integrate more into the IDE and help you when you are writing code that works against a particular data source (by providing intellisense, generate types that match the contents of a live database, etc)

mattwarren on 24 Nov 2015

Hi Team,

Just thought I'd share my thoughts regarding this, coming from a DNX/ICompileModule background.

Since the introduction of ICompileModule it enable me a few avenues for tackling some common runtime-problems as part of the compilation process. The key areas I've used compile modules so far include:

Module composition - Instead of using reflection to identify my module types, I was able to use a compile module to discover any types that implement my IModule interface, and generate at compile time a ModuleProvider which was given a static list of types. I still maintain the ability to reference new modules, and then through compilation that dynamic set of modules becomes a static set of modules. This was achieved by doing this workflow
- For the project being compiled - using the semantic model - find any classes that implement the interface
- For any assembly references - using the cached semantic model provided by Roslyn, derive any times that implement the interface
- For any project references for the current project apply the above steps and recurse through all project references
- With the derived set of available modules, generate a class, and inject the static list where appropriate.
EF DbContext composition - Because the framework I am creating is modular by design, so to must the DbContext stuff. I've done a considerable amount of customisation of the EF7 stack to enable better support for modular DbContext, including support for multiple contexts with shared entities, and baking in support for cross-database navigation properties. This enables modules to define a DbContext, add appropriate DbSet<Entity> properties, regardless of the source module. A ICompileModule is provided to:
- Discover any DbSet<Entity> properties of a context, and for each of those entities:
- Walk the semantic model to understand other candidate entities (navigation properties, both one to one and one to many)
- With a now fixed set of entities that require configuration to work with the DbContext, I then search for appropriate instances of an IEntityConfiguration<Entity> type for the given entity, which wraps up the IModelBuilder grunt work in isolation. This enables me to compose the configuration of a DbContext at compile time, thus this is one of the foundation aspects of my modular DbContext approach.
As an extension of the dynamic DbContext work, the standard EF migrations wouldn't work because they are not designed with multi-tenancy in mind. So, I've had to roll my own for migrations, and again, using an ICompileModule, I was able to look through the items marked as resources at compiler/resources/data/<version>/**.sql and generate a class at compile time that provided a descriptor of a versioned migration, e.g. it would generate a (potential) series of classes like MigrationToV1_0_0, MigrationToV1_0_1 and a custom build DNX command would allow me to deploy those migrations against a target.

In all instances this has allowed me to improve the runtime experience by reducing the moving parts of my application. In hindsight, not having access to compile modules through the early evolution of DNX was perhaps best, as now I've come to depend/expect a certain level of functionality, because I now know _what I have already acheived_, so obviously my expectation of the replacement to ICompileModule is now set quite high.

I've already taken the approach of branching my code and implementing a reflection-based alternative to all functionalities I've mentioned above, but obviously I'd still prefer to tackle these sorts of tasks at compile time, because I want to make my framework as performant as possible. I've done this because when ASP.NET 5 ships, it would have migrated from DNX through to dotnet CLI, and therefore until there is metaprogramming support for Roslyn, I have to provide an alternative.

Broadly summing up, what I'd like from Roslyn's implementation for metaprogramming is:

The ability to look through the references to a compilation either through a new semantic model, or a cached IAssemblySymbol from references.
The ability to look at any resources included in the compilation as embedded resources
The ability to generate code based on declarative configuration, as well as attributed (as described in previous replies in this thread).

My last point really relates to how ICompileModule instances are currently configured (and this may be more of a dotnet CLI issue rather than Roslyn), currently you have to drop code files at compiler/preprocess and a dynamic preprocess assembly is generated and compiled ahead of main compilation, referenced and executed. It would be a nice experience through the project.json file to simply have a key/value pair, something like precompile: [<identifier-or-path-to-module>]. Again, that's probably more of a CLI issue.

cc @davidfowl

Antaris on 15 Dec 2015

It would be great if we could enable those extensions using NuGet packages.

fubar-coder on 4 Jan 2016

ICompileModule seems to be a worthwhile standard to adopt from DNX.

Inverness on 4 Jan 2016

            context.AddCompilationUnit($@”partial class {context.Symbol.Name} 
                      {{ public const string ClassName = “”{context.Symbol.Name}””; }}”);

So we have to write a HUGE "interpolated string", without real-time compiler verification, autocompletion, etc? What about typesafe, debuggable macros?

alrz on 20 Jan 2016

@alrz What if we combine this, with T4 and #174

``` c#
template ComplilationUnit foo ( SymbolContext context )
{
partial class <#= context.Symbol.name #>
{
public const string ClassName = "<#= context.Symbol.Name #>" ;
}
}

```

AdamSpeight2008 on 20 Jan 2016

👍2 😕1

@AdamSpeight2008 That would be nice actually, I don't know why macros are off the table per @gafter's comment,

we do not want to add a macro system to C# of VB.net.

But this one's a little horrible.

alrz on 20 Jan 2016

Why isn't this also considered a macro system?

AdamSpeight2008 on 20 Jan 2016

A "macro system" would make implementing this kind of libraries a piece of cake which is not limited to that case (REST APIs).

alrz on 21 Jan 2016

When the compiler in instructed to compile the source code you wrote, code injectors are given a chance to exam your code and add new code that gets compiled in along with it.

This proposal is also "injecting" at compile-time, so what is the difference? @alrz
In my addition to this proposal it have the "template" (or injected code) be a separate construct, ie it isn't a string. ( #174 see comment 5 )

AdamSpeight2008 on 21 Jan 2016

@alrz you don't have to write a huge interpolated string, you just have to produce a string and pass it to this API. It would probably be perfectly fine to use T4 or some other template engine to do this. You could also use Roslyn syntax building API's.

mattwar on 21 Jan 2016

👍2

@alrz and @mattwar see #8065.

AdamSpeight2008 on 21 Jan 2016

@mattwar It'll become huge and error-prone if you actually use this feature. If we have to fallback to T4 or some other template engine what's the the point of this proposal then? It provides Roslyn API sure, but the fact that we have to build up a "string" for injecting code doesn't feel right.

@AdamSpeight2008 I'm not against of macros, I think they would be more powerful and safer to use compared to interpolated string injection!

alrz on 21 Jan 2016

@AdamSpeight2008 Something like that is certainly interesting, but outside the scope of this proposal.

mattwar on 21 Jan 2016

@alrz You have to at some point produce text for the compiler to consume. That's what a template like T4 is doing. This proposal is not tied to any particular means of producing that text.

mattwar on 21 Jan 2016

👍1

@mattwar If so, what's the advantage of this feature over macros (which seemingly have been dismissed in favor of this proposal)?

alrz on 21 Jan 2016

Just to be clear #8065 is not a macro system, it is just a way of producing the "code / expression" string.
_(True a macro system could be built using them)_

AdamSpeight2008 on 21 Jan 2016

@alrz I don't think macros are related to this proposal.

mattwar on 21 Jan 2016

For AOP, this is pretty much wrong idea. Devs want to write code, instead provide string with code to some internals within compilers. Roslyn is big and complicated tool, and average dev won't write AOP code this way. Beside of this, there is no way to debug this provided code.

5292 is much better

vbcodec on 13 Feb 2016

+1 for DNX style preprocessing

The dotnet cli currently just ignores the preprocess sources - https://github.com/dotnet/cli/issues/1812

m0sa on 11 Mar 2016

👍2

@vbcodec when you refer to devs do you mean the devs writing the aspects or the devs using the aspects? Why do you think you won't be able to debug the code injectors? You certainly will be able to debug them since they are just dotnet libraries. So, for example, you can debug the compiler and step through the individual code injectors. Or are you worried you won't be able to debug the injected code, because you can do that too. The injected code will appear as source that the debugger can find. The debugger will be able to step though and set break points in that as well. Which is a far cry better than most existing AOP solutions.

mattwar on 11 Mar 2016

👍3

As mentioned in https://github.com/dotnet/cli/issues/1812#issuecomment-196079253, this feature is going be the underlining provider for aspnet core view precompilation. Because of this fact, I think this should be high priority. Aspnet core won't see any serious usage without view precompilation, simply because it promotes compile-time errors from view files to run-time errors.

m0sa on 14 Mar 2016

👍4

I'd like some clarification as to whether or not the code generators feature being worked on will allow existing code to be replaced entirely, as opposed to only adding new code using partial classes and the new _replace_ keyword.

I see SourceGeneratorContext on the features/generator branch only offers AddCompilationUnit() for mutation.

Inverness on 25 Mar 2016

@Inverness The approach in the features/generator branch allows replacing a method implementation using replace. The original method is still part of compilation, but not callable outside of the replacement.

cston on 26 Mar 2016

👍1

@cston Yes I looked over the code and saw this. There are two things I'd like clarified:

Does this work for other type members like fields and properties?

Do these code generators function independent of Visual Studio? Specifically, can I run the compiler from command line and have code generators be applied? I assume this is the case for build server scenarios.

Inverness on 27 Mar 2016

And also, does it work for replacing whole classes, not just their members?

m0sa on 27 Mar 2016

@m0sa the code generators just add new source files to the compilation. The supersedes (now replace/original) language feature allows you to replace existing members in a class from a declaration in the same class (though realistically its from a separate partial declaration of the same class.)

mattwar on 27 Mar 2016

@marttwar can you add more than just source files?

davidfowl on 27 Mar 2016

@davidfowl what kind of files would you like to add?

mattwar on 28 Mar 2016

@Inverness It should be possible to replace methods, properties, and event accessors. Other type members can be added but not replaced. And generators would be executed by the command-line compilers.

cston on 28 Mar 2016

@mattwar adding resources (replacing resgen), adding/removing references.

davidfowl on 28 Mar 2016

@cston That is useful for typical code gen scenarios, but why limit code generation to that? Simply allow the Compilation instance to be replaced. This would allow both the editing of existing syntax trees, and the editing of references as @davidfowl suggested.

Inverness on 28 Mar 2016

I'm probably going a little crazy, but I would consider using this to replace C# features with efficient codegen, in particular LINQ and string interpolation.

There are (several) issues on Github about the perf of these features, but they are hard to optimize in general because of edge cases in their specs that they have to support (e.g. interpolation has to handle the culture properly and formats everything, including strings).

With such a facility in place, the developer could opt-in optimized codegen at specific points.

LINQ could be turned into optimized loops, similar to LinqOptimizer but at compile-time (would still love to see the JIT do that, though).
String interpolation could be turned into optimal String.Concat code. When possible, interpolation could be replaced by a constant string at compile-time. (Those optimizations have to be opt-in because they don't faithfully implement the spec.)

jods4 on 19 Apr 2016

👍3

Here are some open points that I think should be considered:

One of the big caveats is, that as soon as you modify a SourceTree, the SemanticModel you had becomes invalid. The way we worked around it, was by calculating all the modifications first in form of TextChanges, and then applying them _in batch_ via SourceText.WithChanges. It would be nice to have an API around that.
Another one is, how to get debugging to work correctly, basically masking the rewrite from the end-user/developer stepping through code in VS. Basically what razor does to enable it to step through your code in views, utilizing #line directives. I would be nice to get such support out of the box from the API above
We'll probably need a to introduce compile-time-only references, in case your compilation wants to reference something you don't need at run-time.

m0sa on 19 Apr 2016

👍16

@m0sa Thanks! That's interesting and useful.
As you noted, today setting this up is complicated. Where I work we would not accept that, at least not on "normal" projects. Being able to do it as easily as adding analyzers to your solution (i.e. pulling a Nuget package) would drastically change the game.

What we do instead is that we avoid C# productivity features on hot paths. LINQ can be handwritten code, same for string interpolation. It's sad to pass on those but what we loose in productivity and readability we win on perf.

Of course, your needs seem much more stringent than ours!

jods4 on 19 Apr 2016

👍1

@m0sa Regarding your first remark "_One of the big caveats is, that as soon as you modify a SourceTree, the SemanticModel you had becomes invalid"_: is this not solved by the DocumentEditor which allows for multiple modifications to a single Document? Or how does this not apply to your scenario?

Vannevelj on 19 Apr 2016

@Vannevelj we are not using Workspaces. IIRC DocumentEditor is only around for code fixes, so it requires VS to be around. SE.Precompilation's contract is somewhat similar to DNX's CompileModule/BeforeCompileContext. It has to create the whole compilation using only command-line arguments for csc.exe, you don't get the .csproj/.xproj/.sln. I guess I could try and create an AdHocWorkspace and fiddle around with that. Good pointer though.

m0sa on 19 Apr 2016

https://github.com/dotnet/cli/issues/1812#issuecomment-213028113

The Roslyn source generator feature work is actually well under way. The progress can be followed on this branch:

https://github.com/dotnet/roslyn/tree/features/generators
Live design document of the implementation:

https://github.com/dotnet/roslyn/blob/features/generators/docs/features/generators.md

m0sa on 21 Apr 2016

@jaredpar is there another, _official_, issue with discussions around the whole design etc?

m0sa on 21 Apr 2016

Hi,

if I understand the current work in progress correctly, this should allow many AOP-like features.

In practice, I'd imagine the most common use cases to heavily rely on Attributes, much like the current AOP libraries work. Decorate your class/method/field with certain attributes ([Logging], [ImplementPropertyChanged] or whatever), and the generator library will generate its code based on this.

For this, I think it would be very useful to extend the AttributeUsage-class. It might be very helpful to be able to specify that an attribute must be applied to a target which is replacable; i.e. something like
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Replacable)]
would imply the attribute can only be applied to replacable methods - i.e. wouldn't be applicable to methods defined non non-partial classes, or methods that are for some other reason non-replacable.

Reason is pretty obvious - it wouldn't make much sense to apply something like a [Logging] attribute to a non-replacable target. And it'd certainly be nicer to have immediate compiler feedback of wrong attribute usage, rather than some runtime Exception the generator library would throw or so.

fabianoliver on 5 May 2016

@BogeyCode

While I agree that such a warning/error would be useful, I think it should be enforced by the generator itself. Attribute usage is limited to those values defined by the CLR. Expanding that would require modifying the CLR. Since partial classes don't actually exist as far as the CLR is concerned it doesn't make a lot of sense to add a language feature to a part of the run time itself.

Check out #6953.

HaloFour on 5 May 2016

Does this feature allow to generate new temporary source files, and use them in compilation instead of original sources ?

vbcodec on 5 May 2016

@HaloFour

Thanks, did'nt see the discussion in #6953 . Will post follow up suggestion/discussion in there.

fabianoliver on 5 May 2016

I imagine that for many AOP scenarios that a generator would be paired with an attribute in order to declare which members are to be replaced/augmented. Is there any consideration for allowing that generator to remove that attribute during the compilation process so that the resultant assembly isn't required to take a dependency on whatever assembly defined that attribute?

HaloFour on 5 May 2016

👍3

Is there any consideration for allowing that generator to remove that attribute during the compilation process so that the resultant assembly isn't required to take a dependency on whatever assembly defined that attribute?

I'm imagining most generators that see use will allow assemblies to self define the necessary attributes as internal.

bbarry on 5 May 2016

@bbarry

I'm imagining most generators that see use will allow assemblies to self define the necessary attributes as internal.

I don't get how that would work. If the attribute is internal how could anyone actually annotate their source with it? And even if that attribute was public wouldn't that require that the project take a direct dependency on the generator assembly itself? Unless references would work very differently when it comes to generators?

Update: Oh, nevermind, I misread that. You meant that the consuming assembly would define the attribute themselves internally. That is better, but not great.

HaloFour on 5 May 2016

internal sealed class NameAttribute : Attribute
{
    public NameAttribute(string name){ }
}

public partial class Program
{
    [Name("fdsa")]
    public static void Thing() {}
}

This compiles just fine. There is no reason I can see that a source generator working on a syntax tree of the Program class cannot see the Name attribute and use that as a trigger to generate some code.

bbarry on 5 May 2016

I suppose the generator would still support adding diagnostics, right? So if it comes across it's target marking attribute, and the target is not partial it can emit an appropriate diagnostic itself.

m0sa on 5 May 2016

@vbcodec

Does this feature allow to generate new temporary source files, and use them in compilation instead of original sources ?

It allows for generation of source files in addition to the ones originally specified in the compilation

@m0sa

I suppose the generator would still support adding diagnostics, right?

The current design allows for generators to create diagnostnics.

jaredpar on 5 May 2016

@HaloFour

Is there any consideration for allowing that generator to remove that attribute during the compilation process so that the resultant assembly isn't required to take a dependency on whatever assembly defined that attribute?

That is already supported today. Just apply the Conditional("UNDEFINED_CONSTANT") attribute to your code gen attribute and it is removed during compilation. If the only references to the assembly declaring the code gen attribute are such attribute usages, then the reference to the assembly will also be removed during compilation. That's how the JetBrains.Annotation NuGet package works, for instance.

axel-habermaier on 6 May 2016

👍1

@axel-habermaier

That is already supported today. Just apply the Conditional("UNDEFINED_CONSTANT") attribute to your code gen attribute and it is removed during compilation. If the only references to the assembly declaring the code gen attribute are such attribute usages, then the reference to the assembly will also be removed during compilation. That's how the JetBrains.Annotation NuGet package works, for instance.

Wouldn't that cause the attribute class to be generated or not and, for code using the attribute, either have the dependency to the attribute (and its assembly) in the assembly of the compiled code or the code failing to compile?

paulomorgado on 6 May 2016

Once I wrote T4 which used NRefactory(now it would be Roslyn) which access Host, parses some files in project and generates some new files into project. I would be interesting to access running Roslyn (and whatever he already knows and parsed) via Host. May be it is already possible via some reflection?

Googling T4 host roslyn shows similar usages.

dzmitry-lahoda on 24 May 2016

We were discussing this in amongst all the .NET Core talk at NDC Oslo last week. One of the most exciting things about Core is the AOT native statically-linked compilation, but with that comes the loss of runtime, JIT-based code-gen, and that's used all over the place, not just for ASP.NET Core View compilation. JSON.NET, Dapper, EF (I think), Simple.Data... basically any general-purpose library that does reflection is going to do code-gen as an optimization eventually.

A good example from another ecosystem is Rust's rustc_serialize module, which provides a couple of traits (Encodable and Decodable) that can be automatically implemented by the compiler just by adding a #[derive(RustcDecodable, RustcEncodable)] annotation to your struct.

So: I'm looking forward to JSON.NET becoming a compiler extension.

markrendle on 15 Jun 2016

👍2 😕1

Is there any information on what is going on with the source generation API right now? I see there is some new ITypeBasedSourceGenerator interface being added. I also see that the AddCompilationUnit method only accepts the source string but not a syntax tree.

Finally, I wanted to ask again, why is it not an option to replace the Compilation instance directly? What if the user wants to add references or resources to the compilation?

Inverness on 3 Aug 2016

👍1

@Inverness: Interesting questions; additionally, I'd really like to know if it is still planned to ship source generates with C# 7? I'm kind of looking forward to this feature more than to any other proposals made for C# 7...

As for why AddCompilationUnit only accepts source strings, I assume it has to do with the fact that Roslyn is only able to compile syntax trees that it has parsed itself. Otherwise, the tree might be conceptually valid but of a form that violates certain assumptions made by the compiler, resulting in all sorts of problems. Thus, even if you construct syntax trees using Roslyn APIs in your source generator, you have to serialize them into strings and let Roslyn parse them again. It's a bit inefficient, but I understand the reasons.

axel-habermaier on 3 Aug 2016

I would like to second that. I like the other features currently being developed but this is the one I am most excited about.

christianklauss on 4 Aug 2016

@axel-habermaier The latest status can be found here: https://github.com/dotnet/roslyn/blob/master/docs/Language%20Feature%20Status.md

As you can see, source generation is not in C# 7.

gafter on 4 Aug 2016

@gafter
Original still in C# 7 / VB 15 ?
Typeswitch in VB 15 ? (I have read rumors that may be skipped)

vbcodec on 4 Aug 2016

@gafter That's very disappointing to hear.

Source generation would have had a greater positive impact on my programming experience than anything in C# 7.

Inverness on 4 Aug 2016

👍4

Disappointing but not entirely unexpected considering the activity on the feature branch. Still crossing my fingers that the release modalities will change in the future so we won’t have to wait till VS15+ to get access to this feature (and maybe get it in a point release VS15 update) since it enables some interesting use cases.

christianklauss on 4 Aug 2016

👍1

Yeah, I was looking to it as well. Meanwhile we have Fody and Postsharp.

MihaMarkic on 8 Aug 2016

👍3

I am interested in this sort of feature for customizing the auto implementation of properties. Instead of there being just one auto-implementation, why not let it be extensible? This could be useful for further making class definitions more declarative, which I guess is the purpose of auto-implemented properties to begin with.

I created a library for reactive properties which could benefit from this if my library, as it generates the replacement code, could get the property value, written with this C# 6 feature, as a LINQ expression.

HappyNomad on 10 Aug 2016

I would love to have a code generation API that allows me to basically consume a Compilation and produce a modified Compilation which Roslyn would then use to emit. This would save a lot of repeated work and headaches over a solution which involves running a custom code generator executable that calls Roslyn before running csc as part of the build process.

RikkiGibson on 2 Sep 2016

👍5

This could replace tools like ccrewrite from Code Contracts (in fact, making your own re-writer will be trivial). It would be great to see this feature implemented.

Dennis-Petrov on 21 Nov 2016

Would a generator allow, for example:

Detect anywhere a bool b is cast to object and replace with BoxCache.Box(b) where BoxCache is defined as the following?

public static class BoxCache
{
    private static readonly object BoxedTrue = true;
    private static readonly object BoxedFalse = false;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static object Box(this bool b)
    {
        return b ? BoxedTrue : BoxedFalse;
    }
}

benaadams on 3 Dec 2016

@benadams

I imagine that you could do that only if you replace any member that performed such a cast and completely rewrote it without calling the original member. The fact that each type would have to be marked partial might make that a bit tedious.

HaloFour on 3 Dec 2016

I still hope that one day we'll get full get/set access to the compilation
object, like we had in dnx icompilemodules. Crazy things (totally valid in
some use cases) like the one @benadams suggest were perfectly doable there.
What's the point of having the whole compiler model available, but not
beeing able to trivially hook into it and modify the compilation, other
than forking csc.exe? Instead we come up with new language level constructs
that sort of allow you to do a subset of things that are actually possible
with the api at compile time, out of the box...

On Sat, Dec 3, 2016, 22:56 HaloFour notifications@github.com wrote:

@benadams https://github.com/benadams

I imagine that you could do that only if you replace any member that
performed such a cast and completely rewrote it without calling the
original member. The fact that each type would have to be marked partial
might make that a bit tedious.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/dotnet/roslyn/issues/5561#issuecomment-264668199, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAevljA9AarjtBL7j7jVVj2aAKUHfeDnks5rEeV9gaJpZM4GGIaO
.

m0sa on 3 Dec 2016

👍2

Would a generator allow, for example:
Detect anywhere a bool b is cast to object and replace with BoxCache.Box(b) where BoxCache is defined as the following?

Most likely yes.

CyrusNajmabadi on 3 Dec 2016

👍3 ❤2 🎉1

I'm agreeing with @m0sa and @RikkiGibson a lot here.

I would like an explanation as for why full access to the compilation object is not being provided.

Inverness on 4 Dec 2016

I would like an explanation as for why full access to the compilation object is not being provided.

The moment we give access to something, it becomes an API that we cannot change. That is an enormous burden and we are simply not at the point where we can provide that.

We are trying to expose more of the Roslyn compilation model through the IOperation work. But even starting that work, we discovered an enormous amount of work that would be necessary to do it properly. The testing burden alone would be massive.

Imagine if someone came to you and said: "make everything in your library public. And now never break any of that surface area." Would such a thing be trivial for your project? :)

CyrusNajmabadi on 4 Dec 2016

👍4

@CyrusNajmabadi I don't understand the argument because CSharpCompilation already has many members that are part of the public API.

How would allowing the code generator to use methods like AddReferences() then return the new CSharpCompilation instance be creating an enormous burden?

Inverness on 4 Dec 2016

I don't get it either. Even Dustin Campbell demonstrated a prototype of it a year or two ago AFAIK.

MihaMarkic on 4 Dec 2016

I don't get it either. Even Dustin Campbell demonstrated a prototype of it a year or two ago AFAIK.

There's a big difference between a prototype and a real product. And as we worked more on the feature we discovered that it was larger than expected. i.e. it would take many more devs across many more teams to do this properly. There wasn't enough time for all of that for VS17. Hopefully we'll be able to fit it into a future release.

CyrusNajmabadi on 4 Dec 2016

Or, in other words, it's much simpler to make something that just has to work for a demo scenario. When you start dealing with teh complexities around making that scale up well to humongous projects, and work well with debugging, and work well with versioning, and work well with editing, and work well with refactoring, then things get much much more difficult.

CyrusNajmabadi on 4 Dec 2016

Again, I don't understand your argument. What is the burden created by allowing the code generator to return a new CSharpCompilation instance that would be used by subsequent code generators and then the compiler? CSharpCompilation already has public API's.

Inverness on 4 Dec 2016

👍2

I've explained the difficulties. We already discovered how complex and challenging the space was, when we only gave the ability to add source files. Now we have to compound that with the difficulties that would arise if we exposed all the capabilities of the compilation.

Let's just look at a simple case. Say we exposed the full compilation and you used it to remove a file from the compilation. We now have to design and implement a sensible intellisense model that works in such a system. i.e. if you try typing in that file, we need to give sensible intellisense results, even though that file doesn't exist in the final compilation where we want to get symbols/semantic information from.

There's also just the issue of how you even present this information to the user in an IDE setting. With the initial design, we felt it would be fairly easy to show users the 'generated files' (possibly in a special sub-folder of the project). This was someone could easily see the world pre-generation (just ignore the 'generated files' folder), or post-generation (don't ignore them). In a world with arbitrary mutations, this gets much more challenging, and user confusion around understanding what's going on becomes a primary concern.

Solutions can be designed/attempted for all these issues. But that raises the cost. A lot. And the cost of only generating source was already too costly for us for VS17.

TL,DR: The feature was already too complex and costly to make it to VS17. Your question boils down to "why didn't you make it more complex and costly?"

CyrusNajmabadi on 4 Dec 2016

❤1 👍1

@CyrusNajmabadi , IMO, you're making things too complex.
Are you sure, that someone wants to remove source files (especially, ones added by user)?
Could you consider much more simple scenario?

I think, we need something like one can do with Expressions and their visitors: to get some tree as input, visit it to build new one and return it as the result to compiler. No Intellisense, no design-time, no file/project/solution changes - personally, it will be a madness to me, if some plugin will re-write/add/remove my source files when I type code without prompting me.

Dennis-Petrov on 5 Dec 2016

👍1

@Dennis-Petrov That's exactly what i was saying. This is why we don't want to expose the full roslyn compilation API. Just representing what it would mean if someone removed a file is extremely difficult.

No Intellisense, no design-time

I don't think we're going to ever do a system unless it fits cleanly into the IDE. If you just want a system that goes and transforms C# files, then you can already do that today with Roslyn. The point of us making this a first class feature would be precisely so that we could ensure that things like the editing/debugging/navigating/refactoring scenarios worked well.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi I am looking for a Fody/Postsharp replacement. Compile time AOP without any adding or removing of files - just modifying the trees and injecting / replacing code.

These existing tools are extremely valuable - but come at a cost (slower build times (because assemblies need to be re-processed), debugging issues because of incorrectly written pdb metadata etc.)

If we could consume the compilation of roslyn, before emitting assemblies, we could build tools similar to Fody/Postsharp, but without the overhead and complexity of re-parsing/updating/writing assemblies.

But If i understand you correctly, this will never happen? :/

davidroth on 5 Dec 2016

👍1

I was under impression of AOP as well, same as @davidroth. Hence I can't figure out what is a big deal about it from implementation side (editor, intellisense, etc.). Except for debugging.

MihaMarkic on 5 Dec 2016

👍1

If you just want a system that goes and transforms C# files

No, I just want a system, that allows to inject code at compile-time, like AOP-tools/Code Contracts/etc doing (yes, I know, that they are rewriting assemblies). The same as @davidroth and @MihaMarkic mentioned. And I want it because of the same reason - why one should re-write IL (which at least slowly), if he can re-build tree and give it to compiler?

Debugging - yes, there are details to think about, but support from editor side isn't required, really.

Dennis-Petrov on 5 Dec 2016

😕1

Except for debugging.

This alone, is a huge deal.

I'm not sure what else i can say at this point. We worked on this. It turned out to be too large to fit into the VS17 schedule. So it was cut from that release. It may make it into a future release if we feel like we can design these experience well enough (including debugging, intellisense, refactoring, etc.), and we have the resources to fit it in.

CyrusNajmabadi on 5 Dec 2016

@Dennis-Petrov Again That's exactly what i was saying. This is why we don't want to expose the full roslyn compilation API. Just representing what it would mean if someone removed a file is extremely difficult.

Please go back and read the conversation.

@Inverness asked:

I would like an explanation as for why full access to the compilation object is not being provided.

And

Again, I don't understand your argument. What is the burden created by allowing the code generator to return a new CSharpCompilation instance that would be used by subsequent code generators and then the compiler?

As i've explained multiple times, the reason why we would not provide the full compilation object is precisely because people could then do things (like remove files) that we feel would be very hard to design a good developer experience around.

You seem to be under the impression that i'm stating that removal of files is a good thing. I'm not. I'm simply explaining why giving out the full compilation model, and allowing full access to its capabilities is something we are not likely to do.

CyrusNajmabadi on 5 Dec 2016

👍2

@CyrusNajmabadi I am looking for a Fody/Postsharp replacement. Compile time AOP without any adding or removing of files - just modifying the trees and injecting / replacing code.

Great. That doesn't require being able to provide a new compilation that roslyn would then use (i.e. what @Inverness was asking for). That sounds very much like what we wanted to make possible with SourceGenerators. My hope is that we will be able to eventually ship this. But it will take a lot of design work, and efforts across several teams to do well.

CyrusNajmabadi on 5 Dec 2016

👍2

And I want it because of the same reason - why one should re-write IL (which at least slowly), if he can re-build tree and give it to compiler?

Because everything has a cost. These sorts of feature requests aren't cheap at all. We actually started doing development on this. And almost immediately kept running into issues all over the place.

If you literally just want to be able to "re-build tree and give it to compiler", and you don't care at all about the remainder of the development experience, then you can already do that today. Just write a tool that uses the roslyn API to read in code, create a compilation, rewrite it as you see fit, and then emit the compilation. All the APIs are there for you to do this. Indeed, here's the rough sketch of what you would do:

c# void Main() { var args = CSharpCommandLineParser.Parse(...); var trees = ParseTheTrees(...); var compilation = CSharpCompilation.Create(assemblyName, trees); var newCompilation = RewriteAllTheThings(compilation); newCompilation.Emit(...); }

That's all you really need. Indeed, it's one of the main reasons that we've created and exposed so many compiler APIs today.

The reason we would take on this feature would be precisely because we'd want to actually make sure the full experience would be great. Otherwise, why bother? If it's just taking in a compilation, rewriting it, and emitting it, then that's easy to do today. The benefit of a fully integrated approach would be that you could trust all our tooling to work well with it.

CyrusNajmabadi on 5 Dec 2016

👍2

Hence I can't figure out what is a big deal about it from implementation side (editor, intellisense, etc.). Except for debugging.

Here's a very simple example:

c# void Foo() { ... int i = ... Bar(i); ... }

Say i want to extract out those two statements into their own method. Today, extract-method will decide what to do based on how 'i' is used in that method. If there is AOP code that changes how 'i' is used (for example, for contract checking), then extract-method may behave improperly.

Any sort of diagnostic/analysis tool we write/run would also potentially report false positives/negatives for code where some AOP generation was going to do its work. Customers would not be happy if we added a new code-manipulation system that then ended up making large scale development difficult because all their common tooling now behaved in a subpar manner on that code.

CyrusNajmabadi on 5 Dec 2016

Here's a very simple example:

Well, but refactoring comes before AOP. AOP works on refactored code. It isn't the other way round. So it is AOP that has to pay attention, not design time refactoring.

Any sort of diagnostic/analysis tool we write/run would also potentially report false positives/negatives for code where some AOP generation was going to do its work.

This could be true, but OTOH, customers have to know what AOP is and what it does. You simply can't control it. Look, are you familiar with Fody? If not, go check it out. It does wonders and happy customers for years. Heck, it can even produce an invalid IL. Users of Fody are well aware of the possible problems but still use it. If you enabled what we are asking for, AOP would be much more robust and easy - the situation would improve a lot.

MihaMarkic on 5 Dec 2016

Add AOP support similar to postsharp and that will be enough for 95% needs. Look like team prefer convoluted workaround with source generators, replace/original and pluggable compiler in future. Do you really think that average developer, will create their own code generator (sepearate for C# and VB) just to hook into methods / properties ? This is pretty naive.

vbcodec on 5 Dec 2016

👎4 👍1

Postharp level AOP can be built on top of Fody-like level.

MihaMarkic on 5 Dec 2016

So it is AOP that has to pay attention, not design time refactoring.

I don't think we can make that statement. Someone may have had code that worked correctly before, and used an AOP system that did an post-invariant check. Then, they do a simple refactoring, and now the post-check fails because the refactoring overly aggressively did something.

Other customers will not see this as a problem with their AOP system. Once we pull this into the main product, they'll see it as a problem with the tooling not properly understanding it.

CyrusNajmabadi on 5 Dec 2016

This could be true, but OTOH, customers have to know what AOP is and what it does. You simply can't control it.

Saying "you can't control it" is likely not going to cut it. Same as not addressing things like debugging. Again, if you want this in the core product, then people are going to expect that the tooling around it works well.

If you don't care about tooling working well. Then you don't need this in the core product. Just use the current Roslyn APIs to do rewriting and emitting.

CyrusNajmabadi on 5 Dec 2016

👍1

@vbcodec I don't know what you're asking for.

CyrusNajmabadi on 5 Dec 2016

Heck, it can even produce an invalid IL. Users of Fody are well aware of the possible problems but still use it. If you enabled what we are asking for

Please be specific. At this point, i've heard multiple different statements about what people want.

What precisely do you want provided? What features do you need that are not available to you today with teh Roslyn API? What exactly are you "asking for"?

One of the few actual requests i can make out is @dennis-petrov stating "why one should re-write IL (which at least slowly), if he can re-build tree and give it to compiler?" But that functionality is already available today. So i don't know what's being asked for.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi

What precisely do you want provided?

I think #15649 is relevant. Would we be able to do that sort of thing with generators?

alrz on 5 Dec 2016

@alrz I don't know what this means:

But I'd greatly appreciate it if we could intercept the compilation pipeline and access to all the information that the compiler produces.

What information are you looking to access?

CyrusNajmabadi on 5 Dec 2016

But that functionality is already available today. So i don't know what's being asked for.

@CyrusNajmabadi Does that mean I can build something like Fody/Postsharp as of today which would work out of the box when someone clones my solution without shipping a forked roslyn compiler?

I only found this solution so far, but it requires forking the compiler which doesn`t seem like a production solution:
https://github.com/russpowers/roslyn/commit/7094454b4544c127bccb332a8372beba9a0d690c

davidroth on 5 Dec 2016

@CyrusNajmabadi Can I build something like Fody/Postsharp as of today so that it works out of the box when someone clones my solution without shipping a forked roslyn compiler?

Sure. I don't see what would stop them. As solutions can make use of things like pre/post steps, or complex build tasks, you could easily ship what was necessary with your solution to make things work, regardless of forking.

I mean, Roslyn already does something like this today. We have tools which take code of one form and translate it into code of another form. We happen to have not written those tools with Roslyn itself (since we needed them early on when we started Roslyn itself, and bootstrapping would have been hard), but i'd certainly like to port them over to be Roslyn based in the future.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi :

So i don't know what's being asked for

I (we? :) ) want:

create new project (say, console app) using Visual Studio
add nuget package with some custom code re-writer (say, "my super AOP tool")
add following method in my code (assuming, that we're currently in default console app code):

void Foo([NotNull]object o)
{
    Console.WriteLine(o);
}

when I hit F5, I want compiler to detect, that "my super AOP tool" has a plugin, that must be called before compilation will run;
run this plugin. It will re-write code (not real CS file, just a tree) like this:

void Foo(object o)
{
if (o == null)
throw new ArgumentNullException(nameof(o));
Console.WriteLine(o);
}
compile new Foo implementation.
this must be seamlessly. User must not know details about how the code was re-written. There must not be need to run any external tool or other magic.

That's all.

Dennis-Petrov on 5 Dec 2016

👍2

@CyrusNajmabadi

What information are you looking to access?

Like syntax tree of a member body and then replace nodes with something else. Plus semantic information as what this nodes refer to e.g. in case of a method invocation etc.

alrz on 5 Dec 2016

@Dennis-Petrov Why do you need anything added to Roslyn to do this? You can already do all of that yourself today with the functionality that Roslyn has provided. Just write a tool that looks for the attributes you care about, transforms them however you want, and emits the final code.

CyrusNajmabadi on 5 Dec 2016

@alrz

Like syntax tree of a member body and then replace nodes with something else.

Why are the existing Roslyn APIs not suitable for this? We already expose SyntaxTrees. We already have a wealth of APIs dedicated to manipulating that syntax (i.e. visitors, rewriters, editors, etc.). What new functionality do you need that we do not have today?

The main problem with the solution i've outlined above is that it is basically awful from a tooling perspective. i.e. debugging things is just enormously difficult. The purpose of SyntaxGenerators was to expose this existing functionality in a manner that did not make tooling terrible. i.e. you'd actually be able to debug properly. You'd be able to do things like 'find references' properly, even if those references were in generated code. etc. etc.

If you don't need or care about a good tooling experience, then it's not clear to me why you can't just use the existing functionality Roslyn has for manipulating trees and emitting IL.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi:

Why do you need anything added to Roslyn to do this?

Because it would be great, if this feature will be part of regular development process.
The same as code analyzers do: you can write your own analyzer, and it can propose you code fixes. You don't need to run some hacky external tool - just follow the specification, and you'll get what you want.

This is the same feature, but it has different goal - to make some final tuning of the code before it will be compiled by the same good Roslyn, but hide changes from the user. These changes usually will emit some boilerplate code, like logging, InotifyPropertyChanged, pre- and post-conditions checking. It's undesirable to show these changes to user.

Dennis-Petrov on 5 Dec 2016

It must be some build-in functionality in default VS build function, so people will start to write some useful rewriters to the Nuget. And other people (not so good in Roslyn API) will start to learn and use this rewriters commonly.

xiety on 5 Dec 2016

@xiety I believe everything i've mentioned is built-in today.

CyrusNajmabadi on 5 Dec 2016

Because it would be great, if this feature will be part of regular development process.

I still literally do not know what you are asking for. Right now it seems to be like you are asking for the ability to rewrite trees, then compile them. I'm stating that such capabilities are available today. Roslyn already exposes the APIs to do this. And through bog standard .net project capabilities, you can put this together.

CyrusNajmabadi on 5 Dec 2016

You don't need to run some hacky external tool

I don't understand how it's hacky to write a tool that uses the existing Roslyn API. But it's not hacky to write a library that uses some new Roslyn API.

In each case you're writing code that integrate into the Roslyn object model, manipulates it, and produces results.

Because it would be great, if this feature will be part of regular development process.
The same as code analyzers do: you can write your own analyzer, and it can propose you code fixes

This is not the same. In particular, there was no existing API for producing code fixes. There are existing APIs for the manipulation of syntax trees, and subsequent emitting of them.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi

@vbcodec I don't know what you're asking for.

I mean small facility that invoke my provided code, when function / property / event is called. I do not want to work with any IL code reweriters or source generators or replace/original feature, as this small facility will make almost everyone happy.

vbcodec on 5 Dec 2016

I still literally do not know what you are asking for. Right now it seems to be like you are asking for the ability to rewrite trees, then compile them. I'm stating that such capabilities are available today. Roslyn already exposes the APIs to do this. And through bog standard .net project capabilities, you can put this together.

I think what's being asked is that just like generators, when you build, those rewriters should be triggered i.e. pre-build rewriters which operate on trees compared to post-build rewriters which operate on il.

alrz on 5 Dec 2016

👍2

I still literally do not know what you are asking for.

@CyrusNajmabadi, I assume that when you are saying that we can do this already today, it is by rewriting a small csc frontend to Roslyn, right? (though in order to integrate this, you have also to plug into the msbuild system directly)

What @Dennis-Petrov is asking is the existing csc accepting an option like csc --rewriter:MyAOPRewriter.dll (coming from nugets rewriters installed in the project)

xoofx on 5 Dec 2016

👍4

@vbcodec I see nothing stopping you from doing that today :)

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi :

OK. Could you provide simple step-by-step guide, what should I do today to use re-writer from my post from above? (void Foo([NotNull]object o))? Maybe, I just didn't understand you.

I mean, imagine, that I wrote that re-writer. How can I plug it into some console app project to get it fired on F5 hit?

Dennis-Petrov on 5 Dec 2016

@CyrusNajmabadi
how ?
I want my code to remain unchanged

vbcodec on 5 Dec 2016

@alrz

I think what's being asked is that just like generators, when you build, those rewriters should be triggered i.e. pre-build rewriters which operate on trees compared to post-build rewriters which operate on il.

That seems to be already possible today.

As mentioned already, Roslyn already does something like this when you download and build it. The build triggers tools to go and produce code which then feed into the compilation pipeline.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi, It is possible but not so obvious and not common. Look at the Fody with it's great community and enormous number of addins. If everyone will need to write it's own tool for post build step, it will never be widespread.

xiety on 5 Dec 2016

@Dennis-Petrov Just as they would have to add an analyzer reference, you would have to package your tool up with a build task that invoked it and allowed you to process the source code and do whatever rewriting you wanted.

CyrusNajmabadi on 5 Dec 2016

If everyone will need to write it's own tool for post build step, it will never be widespread.

And yet, if it's a .dll instead of a .exe, then everyone will? :)

--
My core point is this:

Right now you're not asking for anything new from Roslyn. If you want such a capability, it seems like you (or someone else in the community) could just invest in writing a new tool called "STody". STody would be used for the same sorts of things that Fody is used for. Except it would operate on SyntaxTrees. You could have a plugin model and then see this get widespread adoption.

I even outlined precisely how the tool would be written. We've given all the building blocks in Roslyn to enable it. And we exposed these building blocks so that the community could pick up these sorts of tasks, instead of us just continually jamming things into the core compiler. That's why Roslyn is useful as an actual API, instead of just wanting it to be a compiler that people add more and more stuff into.

Now, we recognize that what i discussed above is pretty easy. The hard part comes when you want the above to have a great tooling experience in IDEs like VS. When you get to that level of difficulty, that's when we believe that we should really invest and do this sort of work as it's just so much harder to do externally.

CyrusNajmabadi on 5 Dec 2016

👍1

Whether I completely agree or not, I think what @CyrusNajmabadi is saying is there's no point making it a simple IDE experience to add rewriters, automatically hooked up like analyzers via NuGet, unless you're going to go all the way and have the IDE respond to the changes made so that for instance flow analysis knows that the null check has already been done because of the [NotNull].

jnm2 on 5 Dec 2016

Now, we recognize that what i discussed above is pretty easy. The hard part comes when you want the above to have a great tooling experience in IDEs like VS. When you get to that level of difficulty, that's when we believe that we should really invest and do this sort of work as it's just so much harder to do externally.

Exactly. I'm thinking that this is not a compiler request, rather it's about Roslyn itself and generators' API in particular. Though a similar IDE experience would be nice, just like what we would have with generators.

alrz on 5 Dec 2016

On the other hand, no one wants to be modifying their project files to msbuild script the rewriting. That seems reasonable. I myself see that as an onerous and bug-prone job, and I'm not even in the anti-msbuild camp.

Is there a middle ground?

jnm2 on 5 Dec 2016

Wouldn't you have to modify the project no matter what? First, to add the nuget package reference. Second, to tell the compiler to use the 'generator' plugins from that package (like how you have to tell it to use the 'analyzer' plugins).

CyrusNajmabadi on 5 Dec 2016

Yes. But declaring a reference is different than implementing the logic for the feature in msbuild.

jnm2 on 5 Dec 2016

I really have to agree iwth @CyrusNajmabadi here. As much as I want a feature like this (and I really do want it, as soon as possible): I think the team should take the time and do this properly, with the tooling experience one knows from Microsoft, instead of hastely hacking together a half-hearted (or half-featured) solution. Do this right instead of doing it fast.

MartinJohns on 5 Dec 2016

👍2

@jnm2 In one case you add <BeforeBuild>. In another, you add <Analyzer> :)

CyrusNajmabadi on 5 Dec 2016

This is too good an opportunity for me to miss:

I too agree with @CyrusNajmabadi here. Generators, properly integrated into the IDE/intellisense/debugger as well well as the compiler chain, are what's needed. Not a "hastily hacked together and half-hearted solution".

DavidArno on 5 Dec 2016

👍6 😄1

@CyrusNajmabadi I'm trying to follow. I know I'm not much worse than average intelligence. :) What does the <BeforeBuild> that you add do in order to accomplish the rewrite? If it invokes a tool, how does that tool pass a modified syntax tree to Roslyn?

In an ideal world, I'd rather have all this stuff hidden behind an <Analyzer> or <Rewriter> declaration.

jnm2 on 5 Dec 2016

So, let me step back:

I want Roslyn to work on features that are, for all intensive porpoises, far too difficult for the community to provide on their own. Things can be too difficult for many reasons, including (but not limited to):

The appropriate information is just not exposed from the system.
The feature would require work from teams outside of Roslyn (like the debugger team). Getting such work to happen without Roslyn directly driving things is extremely difficult.
The feature cross cuts large swaths of the Roslyn IDE experience (i.e. it would take updating IntelliSense, refactoring, etc.). In this case, it's just an enormous undertaking, and i woudl not expect the community to have to shoulder that work.

If something is simple, and is just a matter of writing up a little tool with existing shipped features of Roslyn, there's much less motivation on my part to have Roslyn take on that work.

Indeed, as ecosystems like Fody have shown. If you have the APIs and capabilities exposed, it's a perfect opportunity for the community to pick things up and encourage others to enhance.

CyrusNajmabadi on 5 Dec 2016

@jnm2

If it invokes a tool, how does that tool pass a modified syntax tree to Roslyn?

Roslyn is just a set of APIs :) The "csc.exe" executable is pretty non-special. It just uses Roslyn's exposed APIs to effective do nothing more than:

Read in command line arguments.
Parse source files and create a Compilation object.
Add metadata references to that Compilation.
Compute diagnostics for that Compilation.
Emit that Compilation

That's really it. Because these are all just APIs, you can call all of them yourself.

Or, alternatively, if you didn't want to explicitly emit. You could just write out the source files to some location (likely the 'obj' dir). Your msbuild task would then add those generated items as <Compile> items to the csc task.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi If I emit, I have to hook into all of the references and resources etc which msbuild .targets gather. That sounds nontrivial. If I simply transform .cs files and output to a temporary location, it still requires knowledge of the msbuild system which I don't have to get the csharp .targets to ignore the original sources and use my temp folder instead. And still get resources etc correct.

jnm2 on 5 Dec 2016

it still requires knowledge of the msbuild system which I don't have.

In order to use any of the systems being discussed, you would need to gain knowledge. :)

The point is taken that one might not want to learn anything about how to do this in MSBuild. But, presumably that's going to be done by the main group providing this system. Just as i presume you can use something like Fody where all the heavy lifting has been done, you would use STody in a similar manner.

I just don't see a lot of value in Roslyn going and doing this. The reason that we produce APIs, and why MSbuild produces APIs, is precisely so that the developer ecosystem out there can utilize them and combine them in these sorts of ways. If basic jobs like this are not something the community is even willing to do, then we'll find ourselves in Roslyn doing nothing but biting off these small projects instead of working on the things that would be really quite difficult for the community to actually provide.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi Standardization exactly. Say this whole thing relied on a group maintaining an alternate .targets. They will forever be playing catchup with the normal msbuild csharp .targets. There will be duplication of work and inconsistencies and time wasted. I think it would be better for features like this to be given thought by Microsoft with every update.

jnm2 on 5 Dec 2016

I'm happy to have the community do that. The lack of standardization does not seem to have been a hindrance to tools like Fody.

Furthermore, i actually worry about us 'standardizing' this. Let's take a simple case: there are two rewriters that want to update a method body. What order should they run in? How do you express constraints between them? Should writers run until you reach some sort of fixed point? etc. etc. etc.

The moment we pick anything ourselves, we cannot change it or break it in the future. The community has a far better ability to go with different ideas and rev on this stuff.

CyrusNajmabadi on 5 Dec 2016

I do not think Roslyn shoudl be in the job of taking any combination of .Net tools and just adding it into the core compiler. Every time anyone in the ecosystem does this sort of thing, one could argue "roslyn should do this, so it is standardized".

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi at a minimum, you could unblock the ability to hook in and transform syntax trees in-process right before compilation, without maintaining duplicated targets, and still leave all the responsibility with the community.

jnm2 on 5 Dec 2016

👍1

@CyrusNajmabadi If I emit, I have to hook into all of the references and resources etc. which msbuild scripts gather. That sounds nontrivial.

The plumbing tasks Fody handles

Injection of the MSBuild task into the build pipeline

https://github.com/Fody/Fody

I do not see why STody could not do the same.

CyrusNajmabadi on 5 Dec 2016

Let's take a simple case

Aren't all of those also applicable to generators? I think it would make the most sense to consider this use case to shape generators' API as it's the best place to put all the things together.

alrz on 5 Dec 2016

👍1

@CyrusNajmabadi at a minimum, you could unblock the ability to hook in and transform syntax trees right before compilation and still leave all the responsibility with the community.

No. We can't do that. The moment we do that then we take responsibility for that scenario. That means, for example, supporting it for the MS support lifecycle. We don't get to inject things into Roslyn, then wipe our hands of it.

I mean, imagine if we did that, and said "hey community, this is your responsibility now". Then we broke it a couple of years down the line. That would be terrible.

CyrusNajmabadi on 5 Dec 2016

Aren't all of those also applicable to generators?

Yes. Another reason why generators did not make it into VS17 :)

As an aside, we do not do things in an off the cuff manner. If we're going to work on a feature, we're going to really beat on it to make sure that it's of a design quality that we think is worthwhile. If it can't meet that bar, then we're not going to do it. While that might be disappointing to people who have an itch that this would really scratch, it's vastly better than having Roslyn be littered with a bunch of bad choices and poorly thought through designs.

I mean, i'm sure you all can imagine that sort of thing happening elsewhere. You may have already experienced it yourself. I'm loathe to let that happen to Roslyn :)

CyrusNajmabadi on 5 Dec 2016

❤1 👍1

@CyrusNajmabadi Yes, a minimum amount of support. An "If --experimental-rewriter exists, load and invoke it on the syntax tree."

jnm2 on 5 Dec 2016

@CyrusNajmabadi

I do not see why STody could not do the same.

Fody takes the already compiled assembly, modifies the IL and re-writes the new IL to the same file.

The feature people want here is an option to intercept and modify the syntax trees before they're passed to the default compiler. This is currently not possible with the given API. There is no way to pass modified syntax trees to the compiler. You would either need to create new source files (which does not work when you want rewrite existing source files), modify existing source files (a source control nightmare) or fork the compiler and provide a custom possibility to pass the syntax trees (which defeats the point of using the default compiler).

This is not considering the tooling support.

MartinJohns on 5 Dec 2016

@CyrusNajmabadi Fody doesn`t replace csc.exe - it just adds a after build target. So I don't think fodies msbuild files can help here .

@MartinJohns

I think we should take these solutions and improve upon them:

https://github.com/antiufo/roslyn-linq-rewrite

Basically it looks like we can replace csc.exe via CscToolPath ....

See also this previous comment (- https://github.com/dotnet/roslyn/issues/3248):

The CscToolPath causes a single VBCSCompiler.exe to start up when you first compile (which takes a couple of seconds), but it reuses that for subsequent builds.

So everything should be there - we just have to start building a new ecosystem of plugins.
The only thing we have to "fork" and modify is csc.exe, which isn't a lot. See also:

https://github.com/antiufo/roslyn-linq-rewrite/tree/master/RoslynLinqRewrite/dotnet-compile-csc-linq-rewrite

davidroth on 5 Dec 2016

@CyrusNajmabadi Fody doesn`t replace csc.exe - it just adds a after build target. So I don't think fodies msbuild files can help here .

I was using Fody as an example of how you can distribute a package that takes care of the annoying MSBuild integration problem. Clearly, it's possible.

CyrusNajmabadi on 5 Dec 2016

👍1

@CyrusNajmabadi Yes, a minimum amount of support.

A minimum amount of support is by requirement the MS support lifecycle. That's the minimum. That's why i recommend the community pick this up.

CyrusNajmabadi on 5 Dec 2016

@CyrusNajmabadi Minimum _scope_ to support. You load a single DLL, if specified, and call a well known static function that takes a syntax tree and returns a syntax tree. Yes, this takes a bit of planning, but it doesn't require having IDE support designed and shipped before it can be used. It can be done in isolation.

jnm2 on 5 Dec 2016

The feature people want here is an option to intercept and modify the syntax trees before it's passed to the default compiler. This is currently not possible with the given API. There is no way to pass modified syntax trees to the compiler.

I do not see why that is the case. You have your task run your tool run first, generate the modified sources into some directory (likely 'obj') and then have the task modify the msbuild c# task so that it points at whatever files you think are appropriate for Roslyn to compile.

Fody doesn`t replace csc.exe - it just adds a after build target.

Yes, and in my example, you'd have a target which ran before build, did whatever generation it needed, and then updated the msbuild properties so that when vbcscompiler was invoked it was passed the data you wanted.

CyrusNajmabadi on 5 Dec 2016

It can be done in isolation.

It can be. It shouldn't.

Those things should be planned and thought through, not rushed and quickly implemented because it can be done. Cyrus already mentioned it, and most people know that Microsoft is very keen on keeping compability with earlier versions. So if they add this feature now as you want it, they have to stick with it, they have to support it.

MartinJohns on 5 Dec 2016

👍1

@CyrusNajmabadi Minimum scope. You load a single DLL, if specified, and call a well known static function that takes a syntax tree and returns a syntax tree.

And immediately the next request is "but i need access to semantics to do this work properly". And then the next is "but i need all these functions to run after these otehr functions." And then the next is "i need the functions to run until no more changes happen.

In this very thread, we have all sorts of examples where a pure syntactic translation would not be sufficient :)

CyrusNajmabadi on 5 Dec 2016

but i need access to semantics to do this work properly

Very true.

And then the next is "but i need all these functions to run after these otehr functions." And then the next is "i need the functions to run until no more changes happen.

All can be handled from inside the DLL you pass, no?

jnm2 on 5 Dec 2016

It can be done in isolation.

No, it really can't be. Again, we now have to support this design for the entirety of hte MS support lifecycle. That means we can't change that. That puts an enormous constraint on all future work here. Say we actually come up with a design we like better later on. Now we have to support both designs. Now we have to consider customers who are going to work with both. etc. etc.

CyrusNajmabadi on 5 Dec 2016

👍1

Because of msbuild interop, sounds like the most efficient and forward-compatible plan is to write your own csc, and make it pluggable yourself, and use a targets line in the project file to get msbuild to use your csc. Sounds like none of these things are hard and all can be accomplished using NuGet references today.

It would be nice if there was an official sample of this (or a better tactic if one exists) so that people like me would be able to get started composing Roslyn right away and not have to debug what we missed in msbuild scripting. And so that Microsoft can keep an eye on not breaking that scenario with new msbuild targets.

jnm2 on 5 Dec 2016

Very true.

Ok. So now you have semantics. So now you have to decide how that works. You're going to have customer who say "i want to get the initial semantics, and then use that when i process each file". Another customer will say "after i update a file, i want to know the new semantics, as that will affect the other files i'm changing".

Then you have the case where you have multiple dlls (which of course people will want). One customer will say "ok, run the function from A.dll first. Then the function from B.dll". But B.dll will want the semantics of all the files before B runs. Another customer will want A.dll to run across all files first, then have B.dll process all those results.

etc. etc. etc.

This is not something you whip up as a prototype and then ship out into the world. It needs a real forward thinking design, and it has to acknowledge and plan for the complex scenarios that people absolutely do want.

CyrusNajmabadi on 5 Dec 2016

👍7 ❤1

write your own csc, and make it pluggable yourself

@jnm2 that's exactly what we do in StackExchange.Precompilation.

@CyrusNajmabadi the issue with doing your own CSC is that you loose all of the shared compilation optimizations that CSC does to make itself load up fast. Any way that framework can be made more public?

m0sa on 5 Dec 2016

👍2

the issue with doing your own CSC is that you loose all of the shared compilation optimizations that CSC does to make itself load up fast.

Are you referring to VBCSCompiler? If so I think there is a lot of misconception about what benefits this brings to the compilation pipeline.

If you're referring to startup time of the compiler as installed by Visual Studio then VBCSCompiler doesn't make an appreciable difference here. The compiler when installed by Visual Studio is NGEN'd. Hence startup time of the compiler is pretty much negliable as compared with the IPC overhead of VBCSCompiler. True VBCSCompiler will be strictly faster if you measure it, but not at a level that is really appreciably by end users.

The main benefit the VBCSCompiler implementation provides is for batch compilation. Essentially when an MSBuild operation would kick off two or more compilation actions. There are a couple of benefits this provides:

Caching parsed DLL symbols: for example why parse mscorlib.dll once per compilation when you can parse one and re-use the immutable symbols?
Re-using other caches that are lazily populated and not symbol related (tokens, shared sources, etc ...)

Any way that framework can be made more public?

Can you be more specific? The VBCSCompiler source is public and if you're building your own csc you would presumably also be building VBCSCompiler along with it.

jaredpar on 5 Dec 2016

@jnm2 The easiest way to do it is to define the _CscToolPath_ property in a property group in your csproj file, which is the path to the directory containing your custom csc.exe, not the path to the file itself. If you want to keep the original csc.exe in place and make a second one for some custom behavior, you can do that too:

<CscToolExe>csc2.exe</CscToolExe>
<CscToolPath>..\..\External\Compiler</CscToolPath>

Inverness on 11 Dec 2016

Analyzers have ~80% of the plumbing for this feature to run inside of the IDE and in csc. That said, there are certain kinds of transformations that can completely ruin most of the IDE experiences including intellisense, formatting, debugging, refactoring etc. We decided to just punt on those when we did compile modules in DNX. One way to think about it is that by enabling this feature, every C# file turns into a razor file (I'm on the ASP.NET team so there's my analogy 😄). Every modification to a C# file now needs to generate the real C# file and that's what matters for compilation (the first C# file is just input), and line pragmas have to match up so that debugging works... (that's just the tip of the iceberg).

That said, people have been doing hacks (like double compiling) to do extra code generation before compile and it hasn't been great. With things like .NET native on the rise, features like this become super important as code generation shifts from a runtime feature to a compile time feature. If there was a first class way to do that as part of a library that didn't require components to each repeat the same boilerplate logic (get references, get sources, make a compilation, do magic, spit out new code), that would be great.

davidfowl on 20 Dec 2016

👍7

Analyzers have ~80% of the plumbing for this feature to run inside of the IDE and in csc.

It's important to separate out the plumbing though. The compiler plumbing for analyzers and generators in the is virtually identical. Generators from that respect are actually pretty boring feature: take a compilation and add new source to it. The only real decisions to be made is what the command line argument is, ordering and whether or not to allow recursion. That part of the prototype was done in a very short time.

The IDE plumbing that exists today for analyzers is not really usable for generators. From an IDE perspective they're solving very different problems:

analyzers: tell me what is wrong with my code by adding squiggles and populating the error list.
generators: produce new code for my PE which can be leveraged by existing code.

The user expectation around diagnostics is that they can happen later. There has always been a small delay in between typing and getting squiggles. This gives the IDE a lot of flexibility when hosting analyzers. In particular there is no reason that actions like Intellisense need to be blocked on the result of an analyzer. They can be run after the user has paused for a small amount of time, in the background, with no appreciable loss to the development experience.

The user expectation around code is much different. The IDE should be a window into the code that makes up my project. If Intellisense / quick info / etc ... isn't showing me the state of the code that's being compiled then it subverts the users expectations (and it will be seen definitively as a bug). This really constrains how the IDE can host generators because they need to execute as a part of the core experience.

Take for example extension methods. Generators adding extension methods to the compilation is a core scenario. What good are these extension methods though if developers can't see them in intellisense? When stepping into them during debugging what will the user think if they get dumped into disassembly instead of C#? Basically they'll think the experience is broken and won't feel very C# at all. The only way to fix this is sound IDE integration and that is a large problem (which we did make considerable progress on).

If there was a first class way to do that as part of a library that didn't require components to each repeat the same boilerplate logic (get references, get sources, make a compilation, do magic, spit out new code), that would be great

Agreed. This is why we spent a lot of time digging into this feature. We did make a lot of progress around the IDE experience and getting the core scenarios fleshed out. But in the end we did find there was considerable IDE work needed to make this a viable feature.

Another option we discussed was to simply move this out of the core compiler. Instead move it to say the MSBuild pipeline. The same generator experience can be defined in MSBuild but it doesn't have the same expectations around an IDE experience.

jaredpar on 20 Dec 2016

👍5

For those of us interested in AOP, I think the best thing to do now is just have someone fork Roslyn, add code for an "advanced code generator" or something similar that gives full control of the compilation object, and package that as a NuGet package that replaces CscToolPath with its own path.

Then we can all develop our AOP style code generators against that.

I should also note that I experimented with something like this myself with Roslyn at one point. I used line directives to ensure that the debugging experience was not broken based on my additions to the syntax tree.

Inverness on 21 Dec 2016

@Inverness https://github.com/StackExchange/StackExchange.Precompilation already does just that.

mausch on 21 Dec 2016

🎉2

Another option we discussed was to simply move this out of the core compiler. Instead move it to say the MSBuild pipeline

Debugging of generated code inside IDE is important. As far as I understand, this will be unavailable in case of MSBuild.

Dennis-Petrov on 22 Dec 2016

Another option we discussed was to simply move this out of the core compiler. Instead move it to say the MSBuild pipeline. The same generator experience can be defined in MSBuild but it doesn't have the same expectations around an IDE experience.

I would start with that but still have a hook at the csc level (similar to analyzers). That makes it run at the "right" time. This is essentially what compile modules did (ignored the IDE 😄 because it was hard.).

davidfowl on 22 Dec 2016

^^ saying that though, you could debug them by throwing in a Debugger.Launch and attaching Visual Studio - wasn't perfect, but was doable.

Antaris on 22 Dec 2016

Debugging of generated code inside IDE is important. As far as I understand, this will be unavailable in case of MSBuild.

That should work fine actually. The MSBuild approach would augment the compilation with additional files. As this isn't done in the compiler these files would need to reside physically on disk (likely in the obj folder). Hence debugging would just work.

jaredpar on 22 Dec 2016

Could something like CallerMemberNameAttirbute be implemented by generators? I believe caller info attributes are just a matter of code generation and can be done outside of the compiler, however, it needs "inspecting" the code, rather than adding a compilation unit and replacing members in declaration-site i.e. replace/origin. I'm sure a lot more interesting AOP scenarios can be implemented with generators if said API exists.

alrz on 22 Dec 2016

@alrz

Could something like CallerMemberNameAttirbute be implemented by generators?

There are two types of generators to consider:

Augmenting: this simply adds additional source files, or possibly references, to the compilation.
Modifying: this can modify any aspect of the compilation including source, references, etc ...

An modifying generator would be able to implement CallerMemberNameAttribute. It has the ability to modify the source file where the user added the attribute and hence can add the necessary information.

An augmenting generator would not be able to. It can only add source files hence can't modify the file authored by the user where the [CallerMemberName] annotation was used.

Note that when I've discussed generators on this thread I've mostly been talking about an augmenting generator. Those IDE problems I've discussed for augmenting generators pale in comparison to the challenges faced by a modifying generator. How for instance do you design a rational IDE experience around a plugin that can virtually erase the keystroke you are currently typing in the emitted binary? It's quite daunting and likely there is no sane possible experience.

These problems are why the compiler team eventually took on a two prong solution: augmenting generators + language features to make generators more powerful. The latter has been done in the past (think partial types and methods). The original / replaces model extended that to allow a lot more flexibility.

jaredpar on 22 Dec 2016

@jaredpar

How for instance do you design a rational IDE experience around a plugin that can virtually erase the keystroke you are currently typing in the emitted binary?

It modifies the AST that is handed to the emitter to generate the final binary. I think this shouldn't go back and forth in the same assembly boundary. For this particular scenario, modifying the invocation wouldn't even affect other parts of the code, so there is no need to know what has been changed. I agree in any other cases that need dramatic changes to members declarations, replace/original can do a better job.

alrz on 22 Dec 2016

@alrz

It modifies the AST that is handed to the emitter to generate the final binary

Sure but what does Intellisense say? How does debugging work? The final syntax tree will be, possibly, very different than what lives in your source repo. When you F5 and step into that file what happens?

jaredpar on 22 Dec 2016

@jaredpar Right, the only observable thing for CallerMemberNameAttribute usages in debugging is the value passed to the method. However, if that was not a constant it wouldn't work well in debugging.

alrz on 22 Dec 2016

@alrz it's also affects all call sites, e.g.

Foo(); Bar();

where

public static string Foo([CallerMemberName] string caller = null);

So when you're debugging the first snippet, and your breakpoint is on the Foo(); invocation, after step-over - F10 , you need to land on Bar();, but if the rewritten source line is

Foo("myMember"); Bar();

Visual Studio would highlight Member, inside the string... That's why you'd also need to add #line directives, which is not the most straightforward thing to do. And this is just the tip of the iceberg...

@jaredpar I think having something like a structured representation of the source map when rewriting SourceTrees (either on the SourceTree, or on the entire Compilation) would solve a lot of problems we face RE proper debugging support. Having dealt with it in both C# and JavaScript, I must say I think the JS SourceMap approach is superior to what we currently have in C#. I think it shouldn't be to difficult to adjust the map automatically for unmodified parts of the SourceTree on modifications, since we already have a structured representations for changes.

m0sa on 22 Dec 2016

@jaredpar
Augmenting code is pretty limited and have high cost (creating generators and significantly increasing amount of code).
Better ways is move from replace/original + generators to pluggable solution driven by attributes. This will allow to intercept methods, properties and events and call provided code. Additionally, after implementing #6671, there will be possible to dig into methods and track / modify default code execution:. For example

[MyLoopMonitor()]
foreach (string x in coll)
{
    ...
}

Where MyLoopMonitor class can track and modify x variable and finish loop enumeration at any time (revolution).

vbcodec on 23 Dec 2016

@mattwar There is/was an interesting, but unknown project called Genuilder, written which hooked into a pre-compile event in msbuild, and passed the source tree to custom c# generators (which are in standalone dlls), which could then output extra source code. Here's a repo using it: Magic.

The classes would get picked up by VS and you had working intellisense for generated code within the same project (R# intellisense wouldn't pick up the generated classes though).

mcintyre321 on 24 Jan 2017

Just an idea, talking about modifying generators, as @jaredpar classified them.

What if there will be some "special" type of project item, which can be edited by user as regular C# source file, using Intellisense, code analyzers, syntax highlighting, etc, but this project item will be able to run generators to get actual source code to compile? It's some sort of "advanced T4 template".

This project item must not allow to modify its source code, typed by user, directly from generator - modification must be applied to the output. Also, debugger must step into actual, modified source code. This could solve problems with IDE experience, since there won't be any code (user typed) modifications "on-the-fly". Also, user can see, what he gets from generators - this will decrease level of "magic", brought by one or another generator.

There are some things to think about in context of breakpoints - user can put breakpoint at line, which will be absent in output, but this could be solved by disabling such breakpoints.

What do you think about that?

Dennis-Petrov on 24 Jan 2017

Which is more or less what this does: https://github.com/AArnott/CodeGeneration.Roslyn

/cc @AArnott

kzu on 24 Jan 2017

There's also Scripty, which is similar: https://github.com/daveaglick/Scripty

reduckted on 24 Jan 2017

I'm almost sure, that there are number similar 3rd party tools.

The basic problem is that they are 3rd party tools. The probability of abandoning them is rather high. Moreover, if this is non-commercial projects for contributors, I'd afraid to bring them into real projects. E.g., both mentioned projects have less than 70 commits. This is negligibly small. Compare number of commits to alive and popular projects, like Autofac: https://github.com/autofac/Autofac.

Remember Code Contracts?

IMHO, tools with impact like this should be an official part of .NET ecosystem.

Dennis-Petrov on 25 Jan 2017

@CyrusNajmabadi

Thanks for all the time you spent this December patiently explaining your POV on this. It's certainly of value for stakeholders such as me and my company, trying to assess the likeliness of source generators to be realized some time soon.

If you'd find a minute or two to spare at some point, I'd highly appreciate your view on my recent question here.