Powershell: Property-based pipeline-parameter binding should be applied to argument-based parameter binding as well

Created on 29 Jan 2018  路  18Comments  路  Source: PowerShell/PowerShell

Note: Fixing this would be a breaking change, but arguably one that falls into Bucket 3: Unlikely Grey Area.

Currently, when you use a ValueFromPipelineByPropertyName parameter attribute, the property named there is only bound when using _pipeline_ input and not also when passing the same object as a direct _argument_.

To illustrate the discrepancy with a simple example:

# Declare a function with a parameter that binds pipeline input by the value of 
# property .LiteralPath or its alias .PSPath
function foo { 
  param(
   [Parameter(ValueFromPipelineByPropertyName)]
   [Alias('PSPath')]
   [string[]] ${LiteralPath}
  ) 
  "`$LiteralPath: [$LiteralPath]" 
}

# Create object with a .PSPath property.
$obj = [pscustomobject] @{ PSPath = 'pspath' }

# Pass the object to the function, first via the pipeline, then as a direct argument
$obj | foo
foo -LiteralPath $obj

The output shows that the .PSPath property is only bound via the pipeline, and not as a direct argument:

$LiteralPath: [pspath]                       # OK (pipeline): .PsPath value was bound.
$LiteralPath: [@{PSPath=pspath}]  # !! (argument): $obj was simply stringified (.ToString())

The binding of the direct _argument_ pays attention only to the parameter _type_, causing it to simply _stringify_ $obj with .ToString() (not with default output formatting).

This discrepancy is not only surprising, it can have grave consequences when dealing with [System.IO.FileInfo] instances output by Get-ChildItem, because the latter often stringify to a _mere filename_, potentially causing unexpected and destructive behavior with -LiteralPath cmdlets such as Remove-Item, Move-Item and Copy-Item:

# Create sample file 'tmpFile' in subfolder 'tmpDir'; i.e.: tmpDir/tmpFile
$null > (Join-Path (New-Item -force -Type Directory tmpDir) tmpFile)

# Obtain a [System.IO.FileInfo] instance representing the new file.
# Note that `Get-ChildItem tmpDir` rather than `Get-Item tmpDir/tmpFile` (or
# `Get-ChildItem tmpDir/tmpFile` is purposely used, because it is only
# `Get-ChildItem` without a filename component in the path that stringifies 
#  the resulting instances to their filename only.
$f = Get-ChildItem tmpDir

# Demonstrate that $f stringies to just 'tmpFile' - the mere filename.
"`$f stringifies to: [$f]"

# Now bind $f to the -LiteralPath parameter as a *direct argument* of 
# several cmdlets, which *fails*, because trying to locate the input object
# as a *string* that is the *filename only* fails, given that file is in
# a *subdirectory*.
Remove-Item -WhatIf -LiteralPath $f
Move-Item -WhatIf -LiteralPath $f foo
Copy-Item -WhatIf -LiteralPath $f foo

The last 3 calls fail unexpectedly; while this happens not to be destructive / make unexpected changes in this example, it would be the case if namesake files happened to exist in the current location.

The fact that whether this problem arises depends on the specific method through which the [System.IO.FileInfo], [System.IO.DirectoryInfo] instances were obtained makes this behavior particularly insidious.

Environment data

PowerShell Core v6.0.0 on macOS 10.13.2
PowerShell Core v6.0.0 on Ubuntu 16.04.3 LTS
PowerShell Core v6.0.0 on Microsoft Windows 10 Pro (64-bit; v10.0.15063)
Windows PowerShell v5.1.15063.674 on Microsoft Windows 10 Pro (64-bit; v10.0.15063)
Area-Cmdlets-Core Issue-Discussion

All 18 comments

We could get significant performance improvements if our cmdlets could accept not only string paths but also [FileInfo] objects - currently we do a lot of unnecessary work on the path objects obtained from the file system (globbing, normalization ...).

@iSazonov:

That's a great idea - is there an issue for that yet?

What you propose would fix the problems for file-related cmdlets, but I feel that the general discrepancy between via-pipeline and by-argument parameter binding is worth addressing as such.

No Issue. The nearest is #5789
I don't know the code yet, and I'm moving forward very slowly. 馃槙

If we implement using [FileInfo] internally then maybe your offer will look a little different (simpler).

The big issue here is that we cannot assume anything about the providers. FileSystem is only one provider, and the common cmdlets are generic in how they work with respect to the providers.

However, it's maybe not unreasonable to say that "All Providers are equal, but some Providers (cough* FileSystem*cough) are more equal than others".

Maybe we should have special treatment for the FileSystem provider since is is verly likely not only the most used provider, but also a provider with large data sets.

The API is not public so we are free to enhance it or re-design to make it more flexible.

@iSazonov What API specifically are you talking about here?

Mostly I meant our globbing and that #5789 API don't exist still.

@powercode:

Good point about different providers: given that the items returned across providers don't implement a shared interface, we can't anticipate all item types using dedicated parameter sets.

While giving preferential treatment to the FileSystem provider sounds sensible, wouldn't that require us to declare the -LiteralPath parameters as [object] and then reflect on the type (given that you can't have the same parameter name with different types in different parameter sets)?
For all other types, .PSPath-based could (continue to) be used, albeit with the fix proposed here for argument passing.

@iSazonov:

If we implement using [FileInfo] internally then maybe your offer will look a little different (simpler).

Again, I feel the discrepancy discussed should be fixed in general, irrespective of the performance issues around file-related cmdlets.

There are two additional aspects:

  • For providers whose items are PowerShell-extended versions of general-use .NET types - [System.IO.FileInfo] again being a prime example - you don't get the .PSPath property if you instantiate such types directly (as opposed to obtaining instances via provider cmdlets) - see #4347.

    • If a reflection-based [object] parameter is introduced to detect [System.IO.FileInfo], [System.IO.DirectoryInfo] instances, that problem would go away for the FileSystem provider, but it could affect other providers too.
  • Arguably, for symmetry with ValueFromPipelineByPropertyName, a ValueByPropertyName parameter attribute should be introduced that allows the same type of property-name-based binding for _non_-pipeline parameters.

Note: in .Net Core 2.1 File enumeration extensibility. is expected

Some points:

  • The default position 0 parameter on commands that take paths is Path not LiteralPath so even if we enabled property binding from the command line it would have no effect.
  • It's Path not LiteralPath because people type paths and want to be able to use globbing.
  • It's of type string because that's what people type and that's what the providers consume universally.

So I don't see doing property binding from the command line as a win. On the other hand, this

ls -rec | foreach { test-path $_ }

is a recurring problem. Perhaps adding an argument transformation attribute to Path that converts FileInfo to a string by getting .FullName is the way to go.

Perhaps adding an argument transformation attribute to Path that converts FileInfo to a string by getting .FullName is the way to go.

We could enhance -Path and -LiteralPath to accept FileInfo/DirectoryInfo/PathInfo/etc - it allow us to skip tons of re-parsings and tons of path normalizations.

@BrucePay:

Good point about the 1st positional argument binding to -Path, so adding an argument transformation for FileInfo / DirectoryInfo to all cmdlets that accept filesystem items via -Path sounds like a great idea (it should probably bind via .PSPath for symmetry with how -LiteralPath pipeline input currently binds).

Longer-term, @iSazonov's idea of supporting these types _directly_ is desirable (though as previously discussed, that may require [object] parameters combined with reflection, so as to remain provider-neutral - or is there a better alternative?).

That said, the general symmetry that this issue asks for - bind by property whether the object came from the pipeline or by argument - is still worth implementing.

In the case at hand, if someone used -LiteralPath / -lp with an argument, they'd expect reliable full-path binding too:

ls -rec | foreach { test-path -LiteralPath $_ }  # !! Currently also unreliable

On a side note:

The stringification behavior of FileSystem actually changed since v6.0.2 and it seems that FileInfo instances now consistently stringify to their full path.
While this would help, I suspect it was an unintentional breaking change - see #7132

Rather than making -Path and -LiteralPath take object, we could add another parameter set that took a FileSystemInfo object. That violates the egalitarian notion of all providers are equal but it solves the file system name problem. However, even if we get the object into the cmdlet, using it more efficiently might be problematic as the entire provider infrastructure is based on the abstract notion of a path.

@BrucePay:

Irrespective of efficiency issues, wouldn't the parameter in that new parameter set have to have a _different name_?

I guess we could bind to that parameter _positionally_, but introducing yet another parameter name may be confusing (-FileSystemInfo / -FileInfo / -DirectoryInfo?).

Also, many cmdlets only accept _file(s)_, so for them the type would have to be FileInfo only.

We could have unified wrapper for all providers - PathInfo.

Here's a proof-of-concept (in PowerShell code) that makes do with the existing parameters and only requires attaching a new argument-transformation attribute to -LiteralPath.

It should work with all providers, given that they all decorate their items with a .PSPath property.
That said, I know little about providers, so there may be something I'm missing.

The approach relies on _non-string_ arguments seemingly getting bound to the parameter with the argument-transformation argument (-LiteralPath) even with _positional_ binding, which with _string_ input defaults to a different parameter, -Path, as per the default parameter set.

using namespace System.Management.Automation

# Transforms objects that happen to have a  .PSPath property - assumed to be
# PS provider items as returned by Get-Item / Get-ChildItem - to that property's
# value.
# Instances of all other types are passed through as-is.
class ProviderItemPathTransformationAttribute : ArgumentTransformationAttribute  {
  [object] Transform([EngineIntrinsics] $engineIntrinsics, [object] $inputData) {
    return $(foreach ($o in $inputData) { 
      if ($psPathProp = $o.psobject.Properties['PSPath']) {
        $psPathProp.Value
      } else {
        $o
      }
    })
  }
}

function pathdemo {

  [CmdletBinding(DefaultParameterSetName='Path', PositionalBinding=$false)]
  param (
    [Parameter(Mandatory, Position=0, ParameterSetName='Path', ValueFromPipeline, ValueFromPipelineByPropertyName)]
    [string[]] $Path
    ,
    [Parameter(Mandatory, Position=0, ParameterSetName='LiteralPath', ValueFromPipelineByPropertyName)]
    [ProviderItemPathTransformation()]
    [Alias('LP', 'PSPath')]
    [string[]] $LiteralPath
  )

  process {
    @"
    Path:        $Path
    LiteralPath: $LiteralPath
"@

  }
}

Sample calls:

# String input by argument: bind to -Path, as before:
PS> pathdemo 'foo'
    Path:        foo  
    LiteralPath: 

# String input from the pipeline: bind to -Path, as before:
PS> 'foo' | pathdemo
    Path:        foo  
    LiteralPath: 


# NEW: Provider-item input by argument: bind to -LiteralPath, via .PSPath, 
# using the argument-transformation attribute.
PS> pathdemo (Get-Item /)
    Path:        
    LiteralPath: Microsoft.PowerShell.Core\FileSystem::/

# Provider-item input from the pipeline: bind to -LiteralPath, via .PSPath, 
# using the by-property binding via the 'PSPath' parameter alias, as before.
PS> Get-Item / | pathdemo
    Path:        
    LiteralPath: Microsoft.PowerShell.Core\FileSystem::/

Again, note that while this approach happens to restore symmetry in recognizing input objects with a .PSPath property from both the pipeline and an argument _in this specific case_, that symmetry should be implemented generically.

@mklement0 Thanks for the great sample!

If we extract a string from PSPath argument and assign it to LiteralPath, we have to parse this string again internally. Seems we need something like PSPathObject - may be with PathInfo type. Although in this case the new parameter set will be better to exclude overheads.

@iSazonov:

I fully agree that in the long run we should be passing _objects_ (as opposed to strings) around so as to also improve _performance_.

My approach was meant as an intermediate, easy-to-implement step in the right direction that at least addresses the _functional_ problem.

If someone is willing to take on the proper solution right away, that's great.

As for terminology: If we pass the output from, say, Get-ChildItem around, we're passing _items themselves_, not path-information objects.

Therefore, my suggestion is to name the parameter -Item, and to implement a provider-independent item type named, e.g., PSProviderItem.

Was this page helpful?
0 / 5 - 0 ratings