PowerShell: Suggestion: New Function Input Processing Method: First

Created on 9 Jun 2017 · 5 Comments · Source: PowerShell/PowerShell

Currently PowerShell cmdlets support Begin, Process, and End blocks (see https://ss64.com/ps/syntax-function-input.html).

Begin executes before any pipeline input is available, so it cannot access a value bound from the pipeline.
Process executes once for every element in the pipeline.
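To make the timing concrete, here is a minimal sketch (the function name `Show-BlockTiming` is made up for illustration) showing that the pipeline-bound parameter is still empty in Begin and only receives values in Process:

```powershell
function Show-BlockTiming {
    [CmdletBinding()]
    param (
        [Parameter(ValueFromPipeline = $true)]
        $InputObject
    )
    begin {
        # No pipeline object has been bound yet, so $InputObject is $null here.
        "begin: InputObject is null = $($null -eq $InputObject)"
    }
    process {
        # Runs once per pipeline element, with $InputObject bound to that element.
        "process: $InputObject"
    }
}

$result = 1..3 | Show-BlockTiming
```

With the input `1..3`, `$result` holds one line from Begin followed by one line per element from Process.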

There are a number of scenarios where you need to inspect the first item in the pipeline but don't want to run that inspection for every item. Typically this means coding a pattern such as:

    Function Test-Example {
        [CmdletBinding()]
        param (
            [Parameter(ValueFromPipeline = $true)]
            $InputObject
        )
        begin {
            [bool]$first = $true
        }
        process {
            if ($first) {
                #do something with the first $InputObject in the pipeline
                $first = $false
            }
            #do whatever else
        }
        end {} #if needed
    }

An example scenario would be where we need to analyse what properties are available on the object; e.g. if writing a function to convert data from an array of PSObject to a DataTable.
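As a rough illustration of that scenario with today's syntax, the sketch below builds a DataTable whose columns come from the first pipeline object. The name `ConvertTo-DataTableSketch` is hypothetical, every column is left as the default string type, and the code assumes all objects share the shape of the first one:

```powershell
function ConvertTo-DataTableSketch {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [PSObject]$InputObject
    )
    begin {
        $table = New-Object System.Data.DataTable
        [bool]$first = $true
    }
    process {
        if ($first) {
            # Assumption: the first object defines the columns for the whole table.
            foreach ($prop in $InputObject.PSObject.Properties) {
                [void]$table.Columns.Add($prop.Name)
            }
            $first = $false
        }
        $row = $table.NewRow()
        foreach ($column in $table.Columns) {
            $prop = $InputObject.PSObject.Properties[$column.ColumnName]
            # Properties missing from this object are left as DBNull.
            if ($null -ne $prop -and $null -ne $prop.Value) {
                $row[$column.ColumnName] = $prop.Value
            }
        }
        [void]$table.Rows.Add($row)
    }
    end {
        # The unary comma stops PowerShell from enumerating the table's rows.
        , $table
    }
}

$dt = @(
    [PSCustomObject]@{ Name = 'a'; Size = 1 }
    [PSCustomObject]@{ Name = 'b'; Size = 2 }
) | ConvertTo-DataTableSketch
```

The `if ($first)` check in Process is exactly the boilerplate that the proposed block would replace.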

Having support for syntax such as the below would help make code more readable, and may aid optimisation (e.g. if there's a good way to skip the repeated IsFirst check on each iteration under the hood).

    Function Test-Example {
        [CmdletBinding()]
        param (
            [Parameter(ValueFromPipeline = $true)]
            $InputObject
        )
        begin {} #if needed
        first {
            #do something with the first $InputObject in the pipeline
        }
        process {
            #do whatever else
        }
        end {} #if needed
    }

Issue-Discussion Issue-Enhancement WG-Language

JohnLBevan

All 5 comments

Just to clarify: you're looking to have the first object processed by _both_ the first and process blocks?

mklement0 on 12 Jun 2017

@mklement0 Good catch.
Yes; I believe in most scenarios it makes sense for the first object to be included in both blocks.

i.e. the process block contains code which should be applied to every item; generally there's nothing special about the first item in a list which means it should be excluded.

The reason for the proposed first block is for setup steps, which require any object from the pipeline to be available. e.g. many scripts work on the assumption that all objects in an array will have the same properties, so may have something like this:

    Function Find-ObjectWithValue {
        [CmdletBinding()]
        param (
            [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
            [PSObject]$InputObject
            ,
            [Parameter(Mandatory = $true, Position = 0)]
            [PSObject]$Value
        )
        #begin {
        #    [bool]$first = $true
        #} 
        first {
            [string[]]$props = $InputObject | Get-Member -MemberType @('NoteProperty','Property') | Select-Object -ExpandProperty Name | sort 
        }
        process {
            #if ($first) {
            #    [string[]]$props = $InputObject | Get-Member -MemberType @('NoteProperty','Property') | Select-Object -ExpandProperty Name | sort 
            #    $first = $false
            #}
            $props | %{
                (New-Object -TypeName PSObject -Property @{
                    InputObject = $InputObject
                    Property = $_
                    Matched = ($Value -eq $InputObject."$_")
                })
            }
        }
    }
    [string[]]$users = @('samAccountName1','anotherUser')
    $users | Get-AdUser -Properties * | Find-ObjectWithValue 'sipfed.online.lync.com' | ? Matched

JohnLBevan on 16 Jun 2017

> An example scenario would be where we need to analyse what properties are available on the object; e.g. if writing a function to convert data from an array of PSObject to a DataTable.

In general, a pipeline accepts input objects of different types and performs parameter binding for every input object. From this perspective, a "first" block is a bad user experience.
In this particular example, we could pass the object type as a cmdlet argument so that it can be used in the "Begin" step.

iSazonov on 16 Jun 2017

@JohnLBevan You're assuming that the first object has all the properties. That is not true in many cases. Each item in a collection can have its own set of properties, as @iSazonov mentions above. So if you only check the first item, you lose any property that wasn't specified on it.

For example, let's say that you have a collection of people's names. You have First, Middle, Last, Suffix. First and Last are likely to be populated most of the time. However, Middle and Suffix are less reliable. Some people populate Middle, others don't, and not everyone has a Suffix to their name. Let's represent this in PowerShell.

    $inputObject = @(
        [PSCustomObject] @{ First="William"; Last="Riker" }
        [PSCustomObject] @{ First="Robert"; Last="Downy"; Suffix="Jr." }
        [PSCustomObject] @{ First="Sarah"; Middle="Jessica"; Last="Parker" }
    )

As you can see, the object we are accustomed to is really an array of hash tables cast as PSCustomObjects, with each having their own properties. This is very common when pulling data from APIs and Excel into PowerShell.

PowerShell also uses heuristics to combine all output and figure out which properties to show you in Format-Table and Format-List, as well as Get-Member. The less used properties may not be found and displayed by default.

You're not going to lose anything by processing the properties in a Process block; that is its intended purpose. You need to go through each item to get the individual object and handle it anyway.

BTW, I've actually run into this recently with Get-Member not accurately giving me all properties when I was trying to compare properties of two objects and only get the unique columns. I had to loop through all objects to most accurately get the properties to prevent bugs.
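A minimal sketch of that "loop through all objects" approach, assuming a hypothetical helper named `Get-PropertyUnion`: it collects property names from every pipeline object in Process and only emits the combined list in End, so nothing depends on the shape of the first object.

```powershell
function Get-PropertyUnion {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [PSObject]$InputObject
    )
    begin {
        # Keeps names in first-seen order without duplicates.
        $names = New-Object 'System.Collections.Generic.List[string]'
    }
    process {
        foreach ($prop in $InputObject.PSObject.Properties) {
            if (-not $names.Contains($prop.Name)) {
                $names.Add($prop.Name)
            }
        }
    }
    end {
        # Emit the combined property names once all input has been seen.
        $names
    }
}

$people = @(
    [PSCustomObject]@{ First = 'William'; Last = 'Riker' }
    [PSCustomObject]@{ First = 'Robert'; Last = 'Downy'; Suffix = 'Jr.' }
    [PSCustomObject]@{ First = 'Sarah'; Middle = 'Jessica'; Last = 'Parker' }
)
$allProps = $people | Get-PropertyUnion
# $allProps now holds: First, Last, Suffix, Middle
```

Checking only the first object here would have reported First and Last and missed Suffix and Middle.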

Where Begin and End blocks become truly useful is when you think of how to read a DataReader object across multiple cmdlets in a pipeline....

dragonwolf83 on 17 Jun 2017

Fair points. The below can be used where the behaviour I describe is required, and by not having the first block you avoid people using it when they shouldn't (i.e. when the shape of the objects passed down the pipeline is inconsistent).

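    # Option 1: use the first element of the list as the template.
    [psObject[]]$list = Get-Data
    $list | Invoke-SomeMethod -TemplateObject $list[0]

    # Option 2: build the template explicitly.
    [psObject[]]$list = Get-Data
    $list | Invoke-SomeMethod -TemplateObject (New-Object PSObject -Property ([ordered]@{FirstName='';LastName=''}))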
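For illustration, the receiving side of a template parameter could look something like the sketch below. The function name `Find-ObjectWithValueSketch` and the sample data are made up; the point is only that the property list can be computed in Begin because the template arrives as an ordinary argument instead of from the pipeline.

```powershell
function Find-ObjectWithValueSketch {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [PSObject]$InputObject,

        [Parameter(Mandatory = $true, Position = 0)]
        [PSObject]$Value,

        # Hypothetical parameter: an object whose properties describe the expected shape.
        [Parameter(Mandatory = $true)]
        [PSObject]$TemplateObject
    )
    begin {
        # The template is a normal argument, so it is already available in begin.
        [string[]]$props = $TemplateObject.PSObject.Properties | ForEach-Object { $_.Name }
    }
    process {
        foreach ($prop in $props) {
            [PSCustomObject]@{
                InputObject = $InputObject
                Property    = $prop
                Matched     = ($Value -eq $InputObject.$prop)
            }
        }
    }
}

$list = @(
    [PSCustomObject]@{ FirstName = 'Ada'; LastName = 'Lovelace' }
    [PSCustomObject]@{ FirstName = 'Alan'; LastName = 'Turing' }
)
$hits = $list | Find-ObjectWithValueSketch 'Turing' -TemplateObject $list[0] | Where-Object Matched
```

Here `$hits` contains the single match, on the LastName property of the second object.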

I'll close this request.

JohnLBevan on 28 Jun 2017
