PowerShell: Get-ChildItem -Recurse treats literals passed to -Path like wildcard expressions

Created on 15 Dec 2017 · 21 Comments · Source: PowerShell/PowerShell

The 1st positional argument passed to Get-ChildItem binds to -Path, meaning that wildcard expressions _are_ supported.

However, values that are _literal_ paths - values that do not contain wildcard metacharacters such as * - should be treated as such.

  • _Without_ -Recurse, that is already the case: A nonexistent literal path triggers an error.

  • _With_ -Recurse:

    • The nonexistent path (assuming that it is a mere filename or its parent path exists) is treated like a wildcard expression that happens to match nothing.
    • Additionally, this matching is apparently performed in _all_ the directories in the subtree.

Workaround (and generally the better choice if you know a path to be a literal one):
Get-ChildItem -Recurse -LiteralPath <path>
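As a minimal sketch of the workaround above, the snippet below combines -LiteralPath with -ErrorAction Stop so that a missing path surfaces as a catchable error rather than a silent no-op. The GUID-based path and the variable names $missing and $result are assumptions made for illustration only; they are not part of the original report.

```powershell
# Assumption: a path that is guaranteed not to exist, built from a fresh GUID.
$missing = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())

try {
    # -LiteralPath takes the value verbatim, so a missing item raises an error
    # instead of being searched for throughout the subtree.
    $items  = @(Get-ChildItem -LiteralPath $missing -Recurse -ErrorAction Stop)
    $result = "Found $($items.Count) item(s)."
}
catch {
    # -ErrorAction Stop turns the "path not found" error into a terminating one.
    $result = "Lookup failed: $($_.Exception.Message)"
}

$result
```

With a nonexistent path, the catch branch should run, which is the behavior the report expects from -Path as well.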

_Update_: Get-ChildItem -Recurse <name> may also yield _false positives_, by matching <name> _anywhere in the subtree_ rather than just in the current location - see below.

Steps to reproduce

Get-ChildItem NoSuch
Get-ChildItem NoSuch -Recurse

Expected behavior

Both commands should report an error re nonexistent item NoSuch.

Actual behavior

Get-ChildItem NoSuch -Recurse produces no output at all (neither success nor error), and $? is set to $True.

Environment data

PowerShell Core v6.0.0-rc.2 (v6.0.0-rc.2) on Microsoft Windows 7 Enterprise  (64-bit; v6.1.7601)
Windows PowerShell v5.1.14409.1012 on Microsoft Windows 7 Enterprise  (64-bit; v6.1.7601)
Breaking-Change Committee-Reviewed WG-Engine-Providers

All 21 comments

The inconsistent behavior is wrong, but given the -LiteralPath workaround and the option of checking the returned count, I'm not sure we would change this. It seems to me that if this were v1, I would have -Path and -LiteralPath be the root of the search and require -Filter for filtering, rather than mixing filtering into the path.

I understand that the behavior is useful if the only / last path component is an actual wildcard pattern (e.g., Get-ChildItem *.txt -Recurse - though it gets confusing with Get-ChildItem /tmp/*.txt -Recurse), but with literal components you not only may get quiet no-ops when there should be an error, you may also get _false positives_, which is more problematic:

> mkdir -p /tmp/sub/tmp2; sl /tmp; Get-ChildItem -Recurse tmp2  | % fullname
/tmp/sub/tmp2

Note how tmp2 is unexpectedly matched _anywhere in the subtree_ of /tmp instead of complaining that /tmp has no child item named tmp2.

Needless to say, this can have grave consequences.

It's fine to keep the existing behavior with actual wildcard expressions in the last path component, but my vote is to fix the behavior with literals.

@PowerShell/powershell-committee discussed this and is concerned about the compatibility risk with existing scripts. We acknowledge that the behavior is surprising and would defer to potentially writing a new FileSystemProvider in the future to resolve these types of issues. Feel free to mention me on other ones that should be addressed in FileSystemProvider v2.

To the developer for this issue,
I noticed this issue is relevant to this PR: #5896

Work on a new FileSystemProvider should probably wait until all the changes to System.IO in .NET Core have shipped. They are doing a lot of high-performance, low-allocation work on the file system. Can barely wait! :)

See for example https://github.com/dotnet/designs/pull/24

It works as expected in v2. As far as I remember, per the PowerShell team's explanation, this was a design decision made to match the behavior of CMD's dir and improve the user experience. There was a Microsoft Connect issue, 766100, but half of the Connect issues were removed, including this one.

@PowerShell/powershell-committee - @nightroman is 100% correct. _This is a deliberate feature_ implemented to make it easier (and more friendly to cmd users) to recursively search for a file name. It is extremely useful, likely broadly used, and consequently extremely likely to break both scripts and the user experience if it is changed.

If the consensus is that the risk of breaking existing scripts is too high, the best we can do is to _document_ the pitfalls - and the pit can be deep, as my example with the false positives shows.

But let's be clear that while the ability to match on all levels of the subtree is indeed very useful, it is regrettable that such a treacherous implementation was chosen (even if as a nod to cmd users) - all the more so, given that the correct implementation of the feature is _also_ available (-Filter, -Include).
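For comparison, here is a hedged sketch of the explicit form mentioned above: the search root is given with -LiteralPath and the name to look for is given with -Filter, so the subtree-wide match is requested on purpose rather than inferred from the path. The temporary directory and the names $root, $found, and $direct are assumptions for illustration; the sub/tmp2 layout mirrors the earlier false-positive example.

```powershell
# Assumption: a throwaway directory tree <temp>/<guid>/sub/tmp2 for the demo.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
$null = New-Item -ItemType Directory -Path (Join-Path $root 'sub/tmp2') -Force

# Explicit subtree search: the root is literal, the name to match is a filter.
$found = @(Get-ChildItem -LiteralPath $root -Recurse -Filter 'tmp2')
$found | ForEach-Object FullName

# A literal check of the direct child is a separate, unambiguous question.
$direct = Test-Path -LiteralPath (Join-Path $root 'tmp2')
"Direct child named tmp2 exists: $direct"

# Clean up the demo tree.
Remove-Item -LiteralPath $root -Recurse -Force
```

Keeping the two questions apart ("is there a tmp2 somewhere below?" versus "is there a tmp2 right here?") avoids relying on how -Path interprets a literal name when -Recurse is present.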

This is extremely dangerous when used like this:

Get-ChildItem -Path "C:\DoesNotExist" -Recurse | ForEach-Object { Remove-Item -Path $_.FullName -Recurse }

This will enumerate all files on the C: drive and start deleting any matches, whereas my expectation, which I think is reasonable, is that it would try to enumerate the files in the specified path and, if none exist, quit immediately rather than trawl through every file.
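One defensive pattern, sketched here under stated assumptions, is to confirm that the directory exists as a literal path before anything is piped to Remove-Item. The function name Clear-DirectoryStrict is hypothetical and not part of PowerShell; it removes the directory's direct children recursively, which has the same end result as the pipeline above when the path does exist.

```powershell
function Clear-DirectoryStrict {
    # Hypothetical helper: deletes the contents of a directory only if the
    # directory itself exists as a literal path.
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)]
        [string] $LiteralPath
    )

    # Refuse to do anything when the literal path is not an existing directory.
    if (-not (Test-Path -LiteralPath $LiteralPath -PathType Container)) {
        Write-Warning "Directory '$LiteralPath' does not exist; nothing to delete."
        return
    }

    # Enumerate only the direct children; Remove-Item -Recurse handles the rest.
    # Remove-Item honors -WhatIf / -Confirm passed to this function.
    Get-ChildItem -LiteralPath $LiteralPath -Force |
        Remove-Item -Recurse -Force
}
```

A dry run such as `Clear-DirectoryStrict -LiteralPath 'C:\DoesNotExist' -WhatIf` should only emit the warning, because the existence check fails before any enumeration starts.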

It seems that if we know the path is literal (c:\foo or even .\foo), we shouldn't treat it as a search, and that might be acceptable from a compatibility point of view.

@SteveL-MSFT: I agree, and note that that is the very thing my original post asked for:

values that are _literal_ paths - values that do not contain wildcard metacharacters such as * - should be treated as such.

A really dangerous bug open for an eternity :(
Is it so hard to fix?
Would a headline like 'PowerShell killed all my data' be good marketing?

@mi-hol the change itself is not hard, but the hard problem is maintaining backwards compatibility with existing scripts that depend on this behavior.

@SteveL-MSFT This new feature in v3 was breaking for v2 scripts, too. And lots of people were surprised right from the beginning. Unfortunately, the history on Connect (the issue mentioned above) was erased.

Is there some way to have new behavior behind a flag/version? So we somehow could opt in to fixed, otherwise breaking changes?

A bit like what @daxian-dbw did with experimental feature, but for production? Or does it become a maintenance hell?

@powercode We could certainly put this behind an experimental feature flag (which is opt-in by definition, and thus fewer people will use it). We currently don't have a good solution for when something goes from experimental to stable in the case where it's a breaking change. I'm concerned that having something that is breaking and opt-in will fragment the community.

For the argument that v3 broke v2, I don't think we can use that as justification to break v3, v4, v5 by itself.

For the argument that v3 broke v2, I don't think we can use that as justification to break v3, v4, v5 by itself.

Not at all, of course. I mentioned the history just as the precedent of making breaking changes for the greater good. Justifications should be different, something like "potentially dangerous feature", "violation of the least surprise principle", etc.

@nightroman Agree 100%. v3 was before my time on this team, so I don't know the history of whether the break was intentional or a mistake (it feels like the latter).

Agreeing with @Atheuz here, we just encountered this for the second time in 6 months.

It is insane that when the below folder doesn't exist, the following code:
$Output = '\\s-buildserv\[...]\s\Output'
Get-ChildItem -Path $Output -Directory -Recurse |
Remove-Item -Recurse -Force -ErrorAction 'Ignore'
deletes
\\s-buildserv\[...]\s\Tests\Data\output
\\s-buildserv\[...]\s\Tests\DataDNSExtract\output
\\s-buildserv\[...]\s\Tests\DataTemplateDedupe\output

I see this as an exceptionally dangerous bug that at the very least should have a stopgap fix to recognize non-wildcard paths and refuse to recurse from one level up.

EDIT: removed the irrelevant portion of the file paths for clarity

This is dangerous and very hard to track down. The only reason that we found it this time is that we have run into this several times before.

If someone gives a path to .\source\Output, why should it ever consider looking in .\source\Tests and every other folder in .\source?

The other scenario where we run into this as an issue is when the parent folder is very large, such as when it sits in C:\Windows. Not only does it take forever, you also often get access-denied errors for completely unrelated paths. If it is in the root of a company share with millions of files, this command will appear to never return. Both scenarios are very hard for the inexperienced to troubleshoot. Even though I have seen this before, it can be a challenge.

@PowerShell/powershell-committee: due to the breaking-change nature of this request, we would recommend making the behavior change in a FileSystemProvider v2 implementation.
