PowerShell: Allow anonymous functions to be treated as filters

Created on 26 May 2019 · 3 comments · Source: PowerShell/PowerShell

Summary of the new feature/enhancement

If we permit the IsFilter property on a scriptblock to be settable, either internally or publicly, we can make more effective use of anonymous scriptblocks as impromptu filters. These could be used in a pipeline to emulate ForEach-Object without the extra features and overhead of that cmdlet, for scenarios where speed is significantly more important.

Proposed technical implementation details (optional)

https://github.com/PowerShell/PowerShell/blob/bd6fdae73520931f0d27a29d6290e18761772141/src/System.Management.Automation/engine/runtime/CompiledScriptBlock.cs#L325

This property needs at least an internal setter so that the following property, which references it, can also be set instead of just throwing:

https://github.com/PowerShell/PowerShell/blob/bd6fdae73520931f0d27a29d6290e18761772141/src/System.Management.Automation/engine/lang/scriptblock.cs#L572-L577

With that permitted, we can perhaps implement an operator or an attribute so that something like this becomes possible as an impromptu ForEach-Object with none of the fluff:

1..10 | & [filter]{ $_ % 3 } 

Currently, this is possible by specifying the process block, but this syntax becomes quite cluttered very easily:

1..10 | & { process { $_ % 3 } }
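For comparison, here is a minimal sketch of the forms that already work today. The name Get-Mod3 is hypothetical and chosen only for illustration; all three pipelines should emit the same values.

```powershell
# Named filter: the body runs once per pipeline item, like a process block.
# 'Get-Mod3' is a made-up name for this sketch.
filter Get-Mod3 { $_ % 3 }

$viaFilter  = 1..10 | Get-Mod3
$viaProcess = 1..10 | & { process { $_ % 3 } }
$viaForEach = 1..10 | ForEach-Object { $_ % 3 }

# IsFilter can be read today; it is the setter that throws.
(Get-Command Get-Mod3).ScriptBlock.IsFilter   # scriptblock behind the named filter
{ $_ % 3 }.IsFilter                           # anonymous scriptblock
```

The proposal amounts to letting the anonymous scriptblock on the last line behave like the named filter on the first, without having to declare a name or write out the process block.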
Issue-Enhancement WG-Language

All 3 comments

@vexx32 Ad hoc filtering is what "ForEach-Object" is for:

1..10 | foreach { $_ % 3 }
1..20 | % { $_ * 14 }

What do you think is missing?

(Aside - the whole filter thing was a mistake I made back in V1. At one point, functions weren't going to have begin/process/end so there would be no way of writing _functions_ that streamed. As a partial mitigation, I proposed the Filter keyword as a way to allow at least some stream processing. When we eventually did add begin/process/end, we really should have removed "filter". It just confuses people).

What's missing is any semblance of decent performance with ForEach-Object. Compared to an ad hoc function-style filter, it is extremely slow.

The fact that an ad hoc filter function can outstrip it by such large margins should be a pretty big red flag for the code that is used in ForEach-Object. Some additional overhead may be warranted, but not to this degree, surely.

There are other issues about this, which I'll happily dig up later if you need me to.
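A rough way to check this on your own machine is sketched below. The collection size of 100,000 is an arbitrary assumption, and timings vary with PowerShell version and hardware, so no figures are quoted here.

```powershell
# Arbitrary input size for this sketch; adjust as needed.
$items = 1..100000

# Measure-Command discards pipeline output and returns the elapsed TimeSpan.
$cmdletTime = Measure-Command { $items | ForEach-Object { $_ % 3 } }
$filterTime = Measure-Command { $items | & { process { $_ % 3 } } }

'ForEach-Object : {0:N0} ms' -f $cmdletTime.TotalMilliseconds
'process block  : {0:N0} ms' -f $filterTime.TotalMilliseconds
```

Running each measurement several times and comparing the spread gives a fairer picture than a single run.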

Performance of ForEach-Object can be significantly improved without really touching the parameter binder.

One option is to rewrite the pipeline. See how | Out-Null is special-cased; you can make similar changes to handle limited uses of ForEach-Object. For example, if you have a single literal scriptblock argument, you could rewrite the pipeline to not use ForEach-Object and instead invoke the scriptblock as a filter.

Alternatively, you could create a special parameter binder that has special knowledge about ForEach-Object. This is a little riskier but a reasonable option for such an important cmdlet.
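The "single literal scriptblock argument" condition can be illustrated from script using the public AST API. This is only a sketch of the detection step, not how the engine's compiler would implement the rewrite; the helper name Test-SimpleForEachCall is hypothetical.

```powershell
# Hypothetical helper: reports whether the given code contains a call to
# ForEach-Object (or its aliases) with exactly one argument that is a
# literal scriptblock.
function Test-SimpleForEachCall {
    param([Parameter(Mandatory)][string] $Code)

    $tokens = $null
    $errors = $null
    $ast = [System.Management.Automation.Language.Parser]::ParseInput(
        $Code, [ref] $tokens, [ref] $errors)

    # Collect every command invocation, including those in nested scriptblocks.
    $commands = $ast.FindAll(
        { param($node) $node -is [System.Management.Automation.Language.CommandAst] },
        $true)

    foreach ($command in $commands) {
        $elements = $command.CommandElements
        $isForEach = $command.GetCommandName() -in 'ForEach-Object', 'foreach', '%'
        # Element 0 is the command name; element 1 must be the only argument.
        $isSingleLiteral = $elements.Count -eq 2 -and
            $elements[1] -is [System.Management.Automation.Language.ScriptBlockExpressionAst]
        if ($isForEach -and $isSingleLiteral) { return $true }
    }
    return $false
}

Test-SimpleForEachCall '1..10 | ForEach-Object { $_ % 3 }'
Test-SimpleForEachCall '1..10 | ForEach-Object -Begin { } -Process { $_ }'
```

A call that matches could, in principle, be treated like `& { process { ... } }`; calls that use named parameters or multiple scriptblocks would keep going through the cmdlet.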
