PowerShell: Feature: Register-OutputParser

Created on 9 Oct 2019 · 16 Comments · Source: PowerShell/PowerShell

Summary of the new feature/enhancement

The concept is the ability to register a scriptblock for a specific native command, so that the scriptblock can parse the command's output and return objects. A format.ps1xml file could optionally be supplied as a parameter at registration to get nice formatting.

The assumption is that in most cases, if a parser is registered, the default is to use it, so that objects are returned even though the native command itself returns text. One possible heuristic: if the command is last in the pipeline, it is ok to execute the parser. If the next command in the pipeline is native, text is expected, so the parser is skipped. If the next command is a PowerShell command, the parser is executed and objects are passed down the pipeline.

There may still be a need to require text even when working with PowerShell commands in the pipeline, so we need a way to specify that selectively.

Proposed technical implementation details (optional)

Register-OutputParser -NativeCommand kubectl -ScriptBlock { param($stdout, $args) ... }

$stdout, the first parameter, would contain the text output, probably without streaming semantics.
$args is the arguments passed to the native command, as an array of strings.

If the scriptblock determines it can't process the data, it should just return $stdout unchanged.
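
The contract above can be sketched in ordinary PowerShell. This is only an illustration under stated assumptions: no Register-OutputParser cmdlet exists, so the function below is a stand-in backed by a hashtable, and Invoke-OutputParser, the mytool command, and the sample text are all hypothetical. The parser's second parameter is named $nativeArgs here because $args is an automatic variable in PowerShell.

```powershell
# Hypothetical registry: native command name -> parser scriptblock.
$OutputParsers = @{}

function Register-OutputParser {
    param(
        [Parameter(Mandatory)][string] $NativeCommand,
        [Parameter(Mandatory)][scriptblock] $ScriptBlock
    )
    # Index assignment mutates the shared hashtable found in the parent scope.
    $OutputParsers[$NativeCommand] = $ScriptBlock
}

# Hypothetical helper standing in for what the engine would do after the
# native command has finished and its text output has been collected.
function Invoke-OutputParser {
    param(
        [string]   $NativeCommand,
        [string]   $StdOut,
        [string[]] $NativeArgs = @()
    )
    $parser = $OutputParsers[$NativeCommand]
    if ($null -eq $parser) { return $StdOut }   # nothing registered: text stays text
    & $parser $StdOut $NativeArgs
}

# Example parser for an imaginary 'mytool status' that prints 'Key: Value' lines.
Register-OutputParser -NativeCommand 'mytool' -ScriptBlock {
    param($stdout, $nativeArgs)

    # Anything other than 'mytool status ...' is handed back untouched.
    if ($nativeArgs.Count -eq 0 -or $nativeArgs[0] -ne 'status') { return $stdout }

    $properties = [ordered]@{}
    foreach ($line in $stdout -split '\r?\n') {
        if ($line -match '^\s*(?<key>[^:]+):\s*(?<value>.*)$') {
            $properties[$Matches.key.Trim()] = $Matches.value.Trim()
        }
    }
    [pscustomobject]$properties
}

# Usage with made-up output text.
$sample = "Name: demo`nReplicas: 3"
$status = Invoke-OutputParser -NativeCommand 'mytool' -StdOut $sample -NativeArgs 'status'
$status.Replicas
```

Returning $stdout when the arguments are not recognized keeps the fallback behavior from the proposal: callers see plain text exactly as they would without a parser.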

Issue-Enhancement WG-Engine

All 16 comments

I prefer an approach like PSMore, using class-based wrappers, which is more flexible, extensible, and faster. It would also allow us to have universal code without many hooks or workarounds for every special case. With this approach we could cover most scenarios, such as:

  • encoding for input and output
  • completers for parameters and arguments
  • output formatting
  • ...

public class BaseWrapper : IEncoding, ICompletor, IFormatting

In that case, we'd need only one Register-Wrapper.

For reference #10722

How would this affect a tool already using one of these applications directly? I know you mentioned a way to selectively disable that, but if it's not opt-in that could be problematic.

Can you speak a little about the advantages this has over defining a function of the same name?

@SeeminglyScience See #7857 for history and motivations.

@iSazonov isn't that mostly formatting? Maybe I'm reading the OP wrong but I took it to mean that the executable would output objects instead of text. In other words (kubectl).GetType() wouldn't be string anymore.

@SeeminglyScience PSMore is about formatting, but we can generalize the approach. In that case, script hooks for formatting or completers would be a special case of a more general approach.

@iSazonov right but this issue doesn't seem related to formatting or completers. It reads to me like it's talking about the literal output of a command, with an optional way of providing a format definition for that output.

@SeeminglyScience The idea is to have one typed wrapper that also covers the output-parsing scenario.

I understand why it's valuable to have objects instead of text, but if I'm explicitly calling something like this:

$dotnet = $ExecutionContext.InvokeCommand.GetCommand('dotnet.exe', 'Application')
& $dotnet

I shouldn't all of a sudden start getting objects. That call already has to be pretty explicit to safeguard against functions/aliases.

More than that though, I'm asking what advantage the proposed feature has over this:

function kubectl {
    $stdout = if ($MyInvocation.ExpectingInput) {
        $input | kubectl.exe @args
    } else {
        kubectl.exe @args
    }

    # do the parsing here
}

Yeah it's a bit more wordy, but not by much. It also seems really difficult for a user who doesn't have an output parser registered to see what's going on. Also if you're "powershellizing" a native command, wouldn't you want the wrapper to be Verb-Noun? That makes it very clear that the output is altered in some way.

I shouldn't all of a sudden start getting objects.

The idea is to have a universal engine. No full design exists. For example, it could be as simple as <ping wrapper>.Pipeline.Output.DisableParsing(), and we could do the same on the fly too in pipeline like PSMore demo for formatting.

The idea is to have a universal engine. No full design exists. For example, it could be as simple as <ping wrapper>.Pipeline.Output.DisableParsing(),

Well it'd be more like:

$isParsingEnabled = [Wrapper.Pipeline.Output]::IsEnabled
[Wrapper.Pipeline.Output]::DisableParsing()
try {
    # use executable
} finally {
    if ($isParsingEnabled) {
        [Wrapper.Pipeline.Output]::EnableParsing()
    }
}

That's assuming that try/finally becomes reliable at some point. I know that was just an example; if it's scope-based, it may not be as big of a problem.

Either way though, that's still more that needs to be done on top of an already pretty explicit process. Any script that tries to parse the output of an executable would break on machines with one of these parsers registered.

and we could do the same on the fly too in pipeline like PSMore demo for formatting.

I'm having a really hard time seeing how this is related to PSMore. Wouldn't the output of these parsers still be the same as the output of my function example? (i.e. whatever the parser author outputs)

We looked at this way back in V1 as it seemed very important at the time. (There was even some prototyping and, I think, a patent). The problem is that the output of native commands varies wildly based on the parameters used as well as the immediate environment. This makes it extremely difficult to write a general purpose text-to-object converter. Even a purpose-built converter is very hard (IIRC we tried doing this with ipconfig.exe but never shipped it.)

This is why the command line args are passed to the parser so it can determine if it can handle that specific type of output.

@SeeminglyScience a function wrapper means that if different people want to support parsing different output of the same native command, they'd all have to incorporate it into that one function. The idea here is that if someone just wants to get objects from kubectl cluster-info dump and is able to write a parser for it, they could register a parser just for that, similar to registering argument completers. Getting the text output when you really want it is a concern, particularly when the heuristics I indicated aren't sufficient. I haven't thought of any good idea that doesn't involve a sigil or some other indicator that would be hard to discover (like --%...).
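
To illustrate the point about independent authors, here is a rough sketch in which several parsers are registered for the same command and the arguments decide which one applies. Everything in it is an assumption made for illustration: the Register-NativeParserSketch and Resolve-NativeParserSketch functions, the wildcard-based -ArgumentPattern parameter, and the sample table text, which is made up rather than captured from a real kubectl run.

```powershell
# Hypothetical registry: each entry pairs a command and an argument pattern with a parser.
$NativeParsers = [System.Collections.Generic.List[object]]::new()

function Register-NativeParserSketch {
    param(
        [Parameter(Mandatory)][string] $Command,
        [Parameter(Mandatory)][string] $ArgumentPattern,   # wildcard matched against the joined arguments
        [Parameter(Mandatory)][scriptblock] $ScriptBlock
    )
    $NativeParsers.Add([pscustomobject]@{
        Command         = $Command
        ArgumentPattern = $ArgumentPattern
        ScriptBlock     = $ScriptBlock
    })
}

# Returns the first registered parser whose pattern matches, or nothing at all.
function Resolve-NativeParserSketch {
    param([string] $Command, [string[]] $NativeArgs = @())
    $joined = $NativeArgs -join ' '
    foreach ($entry in $NativeParsers) {
        if ($entry.Command -eq $Command -and $joined -like $entry.ArgumentPattern) {
            return $entry.ScriptBlock
        }
    }
}

# One author contributes a parser for whitespace-separated tables with a header row.
Register-NativeParserSketch -Command kubectl -ArgumentPattern 'get pods*' -ScriptBlock {
    param($stdout, $nativeArgs)
    $lines = @($stdout -split '\r?\n' | Where-Object { $_.Trim() })
    if ($lines.Count -lt 2) { return $stdout }   # not a table: hand the text back
    $headers = -split $lines[0]
    foreach ($line in $lines[1..($lines.Count - 1)]) {
        $cells = -split $line
        $row = [ordered]@{}
        for ($i = 0; $i -lt $headers.Count; $i++) { $row[$headers[$i]] = $cells[$i] }
        [pscustomobject]$row
    }
}

# Another author independently covers a different subcommand.
Register-NativeParserSketch -Command kubectl -ArgumentPattern 'version*' -ScriptBlock {
    param($stdout, $nativeArgs)
    [pscustomobject]@{ Raw = $stdout.Trim() }
}

# Usage with made-up text standing in for the command's output.
$sample = "NAME READY STATUS`nweb-1 1/1 Running`nweb-2 0/1 Pending"
$parser = Resolve-NativeParserSketch -Command kubectl -NativeArgs 'get', 'pods'
$pods = if ($null -ne $parser) { & $parser $sample @('get', 'pods') } else { $sample }
$pods | Where-Object STATUS -eq 'Pending'
```

Neither author has to edit the other's code, which is the difference from a single kubectl function that must contain every parsing branch.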

Could this be expanded to include parsing stderr? The cmdlet name might be something like:

Register-NativeParser -Command kubectl -ScriptBlock { param($stdout, $stderr, $args) ... }

Regardless of the final form, invoking such a parser should always be a deliberate action. Nobody should be surprised by this behaviour, and it should not take over standard native command executions.

Since we're talking about a Register-OutputParser cmdlet anyway, I think we should pair it with an invocation command, an explicit opt-in to this kind of behaviour. Something like Invoke-NativeCommand, paired with @tnieto88's cmdlet name suggestion.

But taking a step back for a minute, I'm pretty sure this kind of feature will largely go unused. Defining a custom function or even a cmdlet with the appropriate parameters and features will always fit better in the PowerShell ecosystem. This feels like a half-solution at best.

@vexx32 I was thinking of something like an Invoke-NativeCommand as well. Two other commands that would pair well are Enable-NativeParser and Disable-NativeParser, with a -Scope parameter to allow users to opt in or out at whatever scope they would like. For example, I could have a script with native parsing turned on for all of my commands without having to use Invoke-NativeCommand each time.

I think Invoke-NativeCommand (there are several issues already that propose such a thing) would solve a different class of issues. Although this could be encapsulated into that cmdlet, it would be odd to have to use a cmdlet to execute native commands in most cases. Alternatively, there could be a ConvertTo-Object cmdlet that uses the registered parsers to output objects and returns an error if there isn't one to handle the output.
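
A rough sketch of how such an explicit ConvertTo-Object step might behave, written as a plain function. Only the cmdlet name comes from the comment above; the -Command and -ArgumentList parameters, the $RegisteredParsers table, the mytool command, and the key=value sample text are all assumptions made for illustration.

```powershell
# Hypothetical registry of parsers keyed by native command name.
$RegisteredParsers = @{
    # Illustrative parser: turns each 'key=value' line into an object.
    'mytool' = {
        param($stdout, $nativeArgs)
        foreach ($line in $stdout -split '\r?\n') {
            $key, $value = $line -split '=', 2
            if ($null -ne $value) {
                [pscustomobject]@{ Key = $key.Trim(); Value = $value.Trim() }
            }
        }
    }
}

function ConvertTo-Object {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][string] $Command,
        [string[]] $ArgumentList = @(),
        [Parameter(ValueFromPipeline)][string] $InputText
    )
    begin   { $lines = [System.Collections.Generic.List[string]]::new() }
    process { if ($null -ne $InputText) { $lines.Add($InputText) } }
    end {
        $parser = $RegisteredParsers[$Command]
        if ($null -eq $parser) {
            # No parser can handle this output: report an error instead of guessing.
            Write-Error "No output parser is registered for '$Command'."
            return
        }
        # The parser sees the whole text at once, so there is no streaming here.
        & $parser ($lines -join "`n") $ArgumentList
    }
}

# Usage: the text would normally come from the native command itself,
# for example: mytool status | ConvertTo-Object -Command mytool -ArgumentList status
$objects = "name=demo`nreplicas=3" -split "`n" | ConvertTo-Object -Command mytool
$objects | Where-Object Key -eq 'replicas'
```

Because conversion only happens when the caller pipes into ConvertTo-Object, existing scripts that parse the raw text of a native command are unaffected.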
