PowerShell: How to write cross-PowerShell tools in .NET, particularly AST visitors?

Created on 22 Oct 2019 · 21 comments · Source: PowerShell/PowerShell

Now that PowerShell has added new syntax and new features elsewhere, and also targets .NET Core 3.0, a challenge has become apparent for maintainers of tools, in particular binary modules, that target multiple PowerShell versions.

For example, PSScriptAnalyzer currently targets:

  • PowerShell Standard with netstandard2.0 for PowerShell 6+
  • PowerShell reference libraries with net452 for Windows PowerShell 5.1 and below (it actually ships one set of DLLs per PowerShell version, making the module quite large)

But in PowerShell 7, new syntactic structures with their own ASTs like ternary expressions and pipeline chains have been added.

PSScriptAnalyzer can't compile an AstVisitor2 overriding VisitTernaryExpression() while targeting PSStandard. So perhaps it can target the PowerShell 7 SDK instead? Not without retargeting from .NET Standard to .NET Core 3.0.
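For concreteness, a minimal sketch of the kind of override at issue; this compiles against the PowerShell 7 SDK, but not against PowerShellStandard.Library, whose AstVisitor2 has no such virtual method:

```csharp
using System.Management.Automation.Language;

// Minimal sketch of the override at issue. This compiles against the
// PowerShell 7 SDK, but NOT against PowerShellStandard.Library, whose
// AstVisitor2 has no VisitTernaryExpression to override.
public class TernaryFindingVisitor : AstVisitor2
{
    public override AstVisitAction VisitTernaryExpression(TernaryExpressionAst ternaryExpressionAst)
    {
        // React to the PS 7-only syntax node, then keep walking the tree.
        return AstVisitAction.Continue;
    }
}
```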

So a list of some choices I see:

  • Continue with PSStandard and .NET Standard and be blind to new syntax. This is particularly unfriendly to PSScriptAnalyzer.
  • Add a .NET Core 3 compile target. For PSScriptAnalyzer this might be palatable, but in a case like PSES, where the explicit move was made to PSStandard, this complicates the code and the build all over again; and if we support a .NET Core target, why not a .NET Framework one?
  • Build tooling to dynamically create the relevant methods, while still compiling for .NET Standard. This option avoids build difficulties, but makes writing the code much harder and more dangerous. More importantly, it's something that third-party tool creators would have to grapple with (we would need to publish tools to make it possible); a small reflection-based sketch of this route follows this list.
  • Include new APIs in PowerShell Standard despite them not existing in old PowerShell versions. This would go against the idea of PSStandard, and is particularly bad since the old platforms won't throw nice exceptions (unlike in .NET Standard, where it's the new platform that throws the exception)
  • Refactor relevant public parts of PowerShell (like the parser) to make it work with .NET Standard as a standalone (so for example, PSSA could embed it rather than relying on the current PowerShell version's parser). This would (1) be a fair amount of work, (2) mean removing .NET Core-specific code, and (3) change the mandate of these public components from "works in PS 7" to "is backward compatible while still exposing new features"
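To illustrate the third option above, here is a hedged sketch of the reflection route: stay on PSStandard/netstandard2.0 and recognize the PS 7-only nodes by type name at runtime. The property names used match the public PS 7 API, but nothing here requires compiling against it:

```csharp
using System;
using System.Management.Automation.Language;

// Sketch of the reflection route: stay on netstandard2.0/PSStandard and
// recognize PS 7-only AST nodes by type name at runtime. The property names
// (Condition/IfTrue/IfFalse) match the public PS 7 API, but nothing here
// requires compiling against the PS 7 SDK.
public static class TernaryInspector
{
    public static bool TryGetTernaryParts(Ast ast, out Ast condition, out Ast ifTrue, out Ast ifFalse)
    {
        condition = ifTrue = ifFalse = null;
        Type type = ast.GetType();
        if (type.FullName != "System.Management.Automation.Language.TernaryExpressionAst")
        {
            return false;
        }

        condition = (Ast)type.GetProperty("Condition")?.GetValue(ast);
        ifTrue = (Ast)type.GetProperty("IfTrue")?.GetValue(ast);
        ifFalse = (Ast)type.GetProperty("IfFalse")?.GetValue(ast);
        return true;
    }
}
```

This is exactly the kind of code that's easy to get subtly wrong, which is why the option calls for published tooling rather than every module author hand-rolling it.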

None of the choices above stands out to me, although there might be others.

TL;DR: What is PowerShell's story for exposing new important APIs to .NET libraries targeting multiple PowerShell versions?

This question affects tools like PSScriptAnalyzer and PowerShell Editor Services, but also any other .NET library targeting multiple PowerShells, in particular binary modules and syntax-facing libraries.

/cc @JamesWTruher @bergmeister @TylerLeonhardt @SeeminglyScience @daxian-dbw @joeyaiello @SteveL-MSFT

Issue-Question WG-Engine


All 21 comments

It's worth pointing out that a refactor could also mean a fork, as in two distinct code bases. This could be the simplest option, given that the language is somewhat stable and porting parsing changes should be relatively easy.

@rjmholt We already discussed this in #7408. Dup?

Given that most of the recent changes to the grammar have been purely additive, I'd be inclined to say:

  • Revisit the decision to change AstVisitor2/ICustomAstVisitor2 rather than extending
  • Refactor PS7 Parser implementation with optional "version-awareness".
  • If the user specifies a previous version of the language: emit parse-time errors when encountering syntactic constructs that fall outside the pre-7 grammar

(yes, I know there's _a lot_ of work to be done to achieve that) :)
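There's no version-aware switch in today's parser, but as a rough illustration of the idea, a tool running on the PS 7 engine could parse with the current grammar and then flag nodes that older engines can't parse. A sketch, where the list of v7-only node types is illustrative rather than exhaustive:

```csharp
using System.Collections.Generic;
using System.Management.Automation.Language;

// Approximate "version-awareness" on top of the PS 7 parser: parse with the
// current grammar, then flag AST nodes that older engines cannot parse.
// The type list below is illustrative, not a complete inventory.
public static class DownlevelChecker
{
    private static readonly HashSet<string> V7OnlyAstTypes = new HashSet<string>
    {
        "TernaryExpressionAst",
        "PipelineChainAst",
    };

    public static IEnumerable<Ast> FindV7OnlySyntax(string script)
    {
        ScriptBlockAst ast = Parser.ParseInput(script, out Token[] tokens, out ParseError[] errors);
        return ast.FindAll(node => V7OnlyAstTypes.Contains(node.GetType().Name), searchNestedScriptBlocks: true);
    }
}
```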

@IISResetMe

Revisit the decision to change AstVisitor2/ICustomAstVisitor2 rather than extending

I would say that it was extended: current visitors will continue to work for the most part, just ignoring the new elements. The problem is how you actually account for those new syntax elements without explicitly targeting netcoreapp3.0 and referencing the full SDK.

@lzybkr

It's worth pointing out that a refactor could also mean a fork, as in two distinct code bases. This could be the simplest option, given that the language is somewhat stable and porting parsing changes should be relatively easy.

Yes please! A separate, publicly consumable, standalone package for parsing would be phenomenal. Bonus: if there are any plans to make breaking changes to AST structure or anything like that (e.g. spanifying/adding trivia), those changes could be demoed much more easily in a separate library.

A separate, publicly consumable, standalone package for parsing would be phenomenal.

Personally this is my ideal, to have a parser that will parse all PowerShell syntax ever valid, and then for the appropriate runtime to decide whether or not it supports it. (Maybe you won't have a runtime at all, because you're a linter...)

But that's a pretty significant change, and we'd need to identify some intermediate steps to make it possible. In the case of Java for example, this was enough work that they built the Eclipse compiler rather than refactor javac.

So really I think we should work out what the next best plan is if those changes can't be made in PowerShell. Because with PS 7 in LTS, libraries are going to need to support it for some time to come.

@rjmholt We already discussed this in #7408. Dup?

There's definitely overlap, but the scenario here is different and broader. There's a larger number of people for whom addressing this issue is going to be important.

From #7408 we see that decoupling the Parser in any way is a lot of work. Also, we would presumably have to sync the projects manually. This looks too complicated and unreliable, given that over time the version gap will widen and new tools, versions, and targets will appear.

There was an idea (from Jason?) to implement a hosting model to speed up the startup scenario. We could adopt the idea for the Parser scenario too. This implies creating an intermediate API (REST?) which can easily be PSStandard 2.* compliant. Since these would be separate processes (Tool and Parser), there would be full compatibility with any version (Tool <-> Parser).
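A very rough sketch of that out-of-process shape, assuming the tool shells out to whichever pwsh it targets; quoting, the JSON shape, and error handling are all simplified here:

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Rough sketch of the out-of-process idea: let the target pwsh parse the
// file with its own engine and ship the errors back as JSON, so the tool
// never links against that engine's SMA. Quoting and error handling are
// deliberately simplified for illustration.
public static class OutOfProcessParser
{
    public static string GetParseErrorsAsJson(string pwshPath, string scriptPath)
    {
        string command =
            "$tokens = $errors = $null; " +
            $"[void][System.Management.Automation.Language.Parser]::ParseFile('{scriptPath}', [ref]$tokens, [ref]$errors); " +
            "$errors | ConvertTo-Json -Depth 4";

        // -EncodedCommand takes base64-encoded UTF-16LE, avoiding quoting issues.
        string encoded = Convert.ToBase64String(Encoding.Unicode.GetBytes(command));
        var psi = new ProcessStartInfo(pwshPath, $"-NoProfile -EncodedCommand {encoded}")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,
        };

        using (var process = Process.Start(psi))
        {
            string json = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return json;
        }
    }
}
```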

@SeeminglyScience

I would say that it was extended: current visitors will continue to work for the most part, just ignoring the new elements.

I think I probably should have qualified "extended" in this context: https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle

Observing the Open/Closed principle would have made this a non-issue:

The problem is how you actually account for those new syntax elements without explicitly targeting netcoreapp3.0 and referencing the full SDK.

:-(

I think @SteveL-MSFT and team were working on putting together reference libraries for v7 (a la PowerShellStandard.Library) which might make it a bit easier, but even then it's a bit tricky to work with supporting both, for sure.

@IISResetMe Yeah that'd be awesome. Can you think of a way to apply that to this without taking a large performance hit (and still maintaining backward compatibility)?

They have added another method to the visitor classes (at least, I believe so, correct me if I'm wrong @rjmholt @daxian-dbw) called VisitOther or DefaultVisit that sort of allows that. Though if you want to actually account for the new syntax type, you'd probably need to use reflection. Of course, since that will be new in PS v7, it doesn't really help this specific scenario.
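If I have the mechanics right, using that hook looks roughly like the sketch below (again, PS 7 SDK only, so it doesn't solve the downlevel problem):

```csharp
using System;
using System.Management.Automation.Language;

// Sketch of the PS 7 escape hatch: DefaultVisit is called for any node type
// this visitor has no specific override for, so future syntax can at least
// be detected instead of silently skipped.
public class ForwardCompatibleVisitor : AstVisitor2
{
    public override AstVisitAction DefaultVisit(Ast ast)
    {
        // Fall back to reflection here, or just record the unknown node.
        Console.WriteLine($"Unhandled AST node: {ast.GetType().Name}");
        return AstVisitAction.Continue;
    }
}
```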

@SeeminglyScience The only gotcha is that we obviously complicate the type hierarchy every time we expand the grammar (I discussed with @rjmholt lumping in abstract member AST types for a future interface/abstract type implementation at the same time for this reason, although we didn't really reach a satisfying conclusion).

WRT version-aware parsing, let me throw together a simple example; I'll link it here later tonight

One possibility that strikes me, for the AstVisitor problem at least, is to have a targetable ICustomAstVisitor2 with all the up-to-date bits, but under .NET Standard.

As an interface, there's no override problem. But you don't get the nice AstVisitor2 logic...

Perhaps we need an IAstVisitor2?

Oops wrong button

Sorry for the late response, I've been off for a while.
PSSA does not use PowerShell Standard (solely). The engine has to reference the SMA package of version 6.1 to get the APIs that it needs, such as SemanticVersion, which was only added in v6. As I've discussed already in issue #10372, newer SMA packages do not allow .NET Standard projects to reference the library and now require the specific .NET Core version. I opened issue https://github.com/PowerShell/PowerShellStandard/issues/67 for getting a PowerShell Standard v6 for references against v6.x only. Similarly, we'd either need a package for v7.x only and then conditionally compile against v6 and v7, or just have one package that is the union of v6/v7 (the former would be required only if PSSA has specific rule modifications for features only added in v7).
Therefore the first, blocking issue is the lack of proper reference libraries that also support .NET Standard. This should be solved first; the PS team has ignored community complaints from e.g. @Jaykul for a while, and now is the time to address them.
Separating out the parser into its own project will be a bigger undertaking, but maybe a first step would be to try to compile SMA against netstandard2.1 so that the SMA NuGet package becomes more useful for reference-only use. Otherwise, we'll probably have to live with conditional compilation for a while.
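For reference, the conditional-compilation fallback might look something like this, assuming a multi-targeted csproj that defines a compile constant for the target referencing the PS 7 SDK (the constant name POWERSHELL7 is illustrative):

```csharp
using System.Management.Automation.Language;

// Conditional-compilation fallback: multi-target the project (e.g. net452 +
// netcoreapp3.1), reference PowerShellStandard for the old target and the
// PS 7 SDK for the new one, and gate v7-only overrides behind a compile
// constant defined in the csproj (POWERSHELL7 here is illustrative).
public class CompatibleVisitor : AstVisitor2
{
#if POWERSHELL7
    public override AstVisitAction VisitTernaryExpression(TernaryExpressionAst ternaryExpressionAst)
    {
        // Rule logic for the PS 7-only syntax goes here.
        return AstVisitAction.Continue;
    }
#endif
}
```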

Separating out the parser into its own project will be a bigger undertaking, but maybe a first step would be to try to compile SMA against netstandard2.1 so that the SMA NuGet package becomes more useful for reference-only use.

The problem there is that netstandard2.1 doesn't support .NET Framework at all.

Meanwhile, S.M.A explicitly uses .NET Core-specific APIs now, so it won't compile against netstandard2.0.

The solution to that is perhaps:

  • Generate reference libraries for SMA APIs that are netstandard2.0 compliant
  • Adopt a netstandard2.0 public API philosophy (not restricting APIs to that, but ensuring that there is a netstandard2.0-compliant version of each API)
  • Have testing to ensure ongoing API support

Therefore the first, blocking issue is the lack of proper reference libraries that also support .NET Standard. This should be solved first; the PS team has ignored community complaints from e.g. @Jaykul for a while, and now is the time to address them.

Maybe we can consolidate those concerns here. I opened this issue with precisely these problems in mind. There's no easy solution here, but as PowerShell 7 stabilises, I also feel we need to address them head-on.

When I look at this issue, there's an emphasis on how PSScriptAnalyzer will work, which is a really great example.

At least in terms of that, AST-based static analysis, I would love to understand how other linters achieve this (if they do).

For example, when I use ESLint, the most popular linter for JavaScript, I see that it offers the ability to specify the version of ECMAScript (aka JavaScript) via the parserOptions.ecmaVersion config. I've not yet dug into what parser they're using and how they're using it, but that seems to be a good model from the user's perspective...

I would love to understand how other linters achieve this (if they do).

I'm pretty sure they use a multi-target parser which will recognise all the syntaxes in each version of the language. Then the caller gets to decide based on the AST what works on their platform.

The alternative is that they use a different parser on demand, which blows out the size of the thing.

This is one of the reasons why syntactic breaking changes are very problematic; we have to work much harder to build/maintain a parser that works with other versions of the language.

It looks like eslint uses acorn, which also seems to allow specifying ecmaVersion.

I'm going to play the role of API consumer who doesn't care about the implementation details:

Obviously ScriptAnalyzer running in PowerShell 5 doesn't _necessarily_ need to understand syntax that won't work in PowerShell 5 -- but if all of these sorts of tools have to change which API they call in order to support new syntax, that puts a lot of burden on developers. And if those new APIs only work on new compilation targets, that puts a lot more burden on them, forcing them to maintain multiple incompatible versions of the tools.

  • Is impact on tool developers part of the conversation in the development of features? If not, why not?
  • What part of the API surface should remain backward compatible?
  • What justifies changing the surface in a breaking manner?
  • How much bigger is the justification for changing compilation targets?

In terms of this specific type of problem:

  • Is there any value in keeping an old parser API if it can't handle the current syntax?
  • Can't new tokens be added without breaking changes to the parser API?
  • Shouldn't parsers, specifically, be down-level compatible, at least to the previous version?

I put together an example of how you can sort of polyfill new AST types while still targeting PowerShellStandard and still maintain the same type identity. It basically works like netstandard.

It doesn't really help too much until we can all target a version that includes ICustomAstVisitor2.DefaultVisit, but it may still have value.

https://github.com/SeeminglyScience/AstPolyfillExample

With support from PowerShell, you could ship a similar reference lib package every time new ASTs are added. Implementers would just add the new package and add a new clause to DefaultVisit. The version of PowerShell that actually contains the new types would simply load their version with TypeForwardedTo decorations at startup.
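The forwarding piece of that design is the standard facade trick, roughly as sketched below; the type name is the real PS 7 one, while the facade arrangement itself is the hypothetical part:

```csharp
// Facade sketch: the polyfill package defines TernaryExpressionAst with the
// same full name as PS 7. The copy shipped alongside the real PS 7 engine
// defines nothing itself and instead forwards to the engine's type, so both
// worlds agree on type identity (the same trick netstandard facades use).
using System.Runtime.CompilerServices;

[assembly: TypeForwardedTo(typeof(System.Management.Automation.Language.TernaryExpressionAst))]
```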
