PowerShell: globbing for native commands is too aggressive

Created on 3 Jun 2017 · 26 Comments · Source: PowerShell/PowerShell

Steps to reproduce

Intuitively, globbing should not kick in inside single-quoted strings.

echo '11:1' | grep '.*:.'

Expected behavior

Works, output is 11:1, like in bash.

Actual behavior

Cannot find drive. A drive with the name '.*' does not exist.
At line:1 char:1
+ echo '11:1' | grep '.*:.'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (.*:String) [], DriveNotFoundException
    + FullyQualifiedErrorId : DriveNotFound 

The error is pretty confusing for a Unix user.

Workaround

Escape * with a backtick in the regex.

Environment data

> $PSVersionTable

Name                           Value                                                                                      
----                           -----                                                                                      
PSVersion                      6.0.0-beta                                                                                 
PSEdition                      Core                                                                                       
BuildVersion                   3.0.0.0                                                                                    
CLRVersion                                                                                                                
GitCommitId                    v6.0.0-beta.2                                                                              
OS                             Darwin 16.6.0 Darwin Kernel Version 16.6.0: Fri Apr 14 16:21:16 PDT 2017; root:xnu-3789....
Platform                       Unix                                                                                       
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}                                                                    
PSRemotingProtocolVersion      2.3                                                                                        
SerializationVersion           1.1.0.1                                                                                    
WSManStackVersion              3.0  
Issue-Bug Resolution-Fixed WG-Language

All 26 comments

Just to state it explicitly: globs shouldn't be expanded inside "..." (double-quoted strings) either, which currently happens too:

printf '%s\n' '*'    # should print literal *
printf '%s\n' "*"    # ditto

Only _unquoted_ tokens should ever be subject to globbing, as in POSIX-like shells.

Just noticed this after moving to Beta on OSX. Maybe a recent regression?

It makes using curl impossible if you have URL query parameters.

> curl 'https://google.com'                                                                                                                                                                                                                                         
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.google.com/">here</A>.
</BODY></HTML>
> curl 'https://google.com?foo=bar'                                                                                                                                                                                                                                 
Cannot find drive. A drive with the name 'https' does not exist.
At line:1 char:1
+ curl 'https://google.com?foo=bar'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (https:String) [], DriveNotFoundException
    + FullyQualifiedErrorId : DriveNotFound

Just noticed this after moving to Beta on OSX. Maybe a recent regression?

It's not really a regression; it's a new feature in beta-1:
https://github.com/PowerShell/PowerShell/pull/3643

Is there a way to disable globbing entirely and opt back in to the Windows-style behavior? Even when it works by design, I kind of hate it.

e.g. git add * used to just work, but with globbing I'd need to do git add '*'

For this specific case git add . may work

Bump. Any update? This is preventing me from moving to beta on non-Windows.

Polite ping @BrucePay

Another example of a native utility that became unusable is youtube-dl (or anything that takes a URL, for that matter):

youtube-dl 'https://www.youtube.com/watch?v=QQ0Yn1fqugg'
Cannot find drive. A drive with the name 'https' does not exist.
At line:1 char:1
+ youtube-dl 'https://www.youtube.com/watch?v=QQ0Yn1fqugg'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (https:String) [], DriveNotFoundException
    + FullyQualifiedErrorId : DriveNotFound

And yet another example of globbing messing up the invocation of a native executable:

conan info --only "None" --package_filter PkgName/* --cwd ../..

BTW the suggested workaround of escaping the * doesn't work because `* is passed to the executable.

So I was trying out my home grown echoargs in my WSL/Bash shell and noticed that given this command (in Bash):

hillr@HILLR1:~$ echoargs g* --package_filter FOO/*

You get this output:

Command line: "/home/hillr/dotnet/echoargs/bin/Debug/netcoreapp2.0/ubuntu.16.04-x64/echoargs.dll getcwd getcwd.c get-pip.py git --package_filter FOO/*"

RE globbing, how does Bash know to glob the first argument g* but not the second one FOO/*?

Bash (per POSIX):

  • only ever applies globbing to _unquoted_ tokens; * is subject to globbing, whereas '*' and "*" are not (nor is an individually backslash-escaped *, i.e. \*).

    • Side note: strictly speaking, what matters is whether the _individual globbing metacharacters/expressions_ are quoted or not; POSIX-like shells allow you to form a single argument composed of both quoted and unquoted parts; e.g., "foo"* would still match all files whose names start with foo, because the * is unquoted.
  • passes unquoted tokens that do not match any filesystem objects through as-is (in Bash, you can opt into different behavior, but that is not part of the POSIX standard); in your example, FOO/* presumably didn't match anything and was therefore left untouched.

    • Note that it is irrelevant whether or not an unquoted token with globbing metacharacters partially happens to refer to existing filesystem items or is even a syntactically valid path - the original argument is passed through as-is in all cases.
    • Arguably, this is not the most sensible default behavior, but it's POSIX-mandated and has a long history.
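
The rules above can be tried out with a minimal sketch like the following. It assumes a POSIX-like shell with mktemp available; count_args is a hypothetical helper, and the file names are borrowed from the echoargs example purely for illustration.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Work in a scratch directory so the result does not depend on the current one.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch getcwd.c get-pip.py

unquoted=$(count_args g*)      # unquoted: g* expands to the two matching files
quoted=$(count_args 'g*')      # quoted: the literal string g* is one argument
mixed=$(count_args "g"*)       # quoted prefix, unquoted *: globbing still happens
nomatch=$(printf '%s' FOO/*)   # nothing matches: the token is passed through as-is

echo "unquoted=$unquoted quoted=$quoted mixed=$mixed nomatch=$nomatch"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

With default shell options, the two unquoted forms yield two arguments each, the quoted form yields one, and the non-matching FOO/* arrives unchanged.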

My assumption has been that the idea behind PowerShell's native globbing is to emulate POSIX-shell rules (is that not true?), so in your earlier conan example, if you wanted to pass PkgName/* through, you'd have to _quote_ it to exempt it from globbing:

conan info --only "None" --package_filter 'PkgName/*' --cwd ../..

Quoting doesn't work in PowerShell (it does work from Bash). That is part of the problem with PowerShell's aggressive globbing:

1> echoargs info --only "None" --package_filter 'PkgName/*' --cwd ../..
Cannot find path '/home/hillr/PkgName' because it does not exist.

That cannot find path error is coming from the globbing code - not echoargs.

BTW you're right about Bash not finding anything that matched PkgName/*. If I add a folder with that name and some files in it, Bash globs that as well. So I guess one difference (bug?) with PowerShell's globbing is that if it doesn't find a path it shouldn't error; it should pass the value straight through to the native exe.

Sorry for the reposts - I didn't want to clutter this thread with piecemeal insights.

Indeed: that it currently doesn't work as I described and instead unexpectedly works as you demonstrate is the reason this issue was created.

What you're seeing is a variation of what @vors experienced, and demonstrates both current problems (deviations from POSIX-shell behavior as of beta 8):

  • Globbing is applied blindly, instead of leaving quoted tokens alone.

  • A glob that matches nothing _due to nonexistent path components_ results in an _error_ rather than in the original token getting passed through (e.g., /bin/echo /nosuch/*); by contrast, a glob without a path component or one whose non-globbing components exist _is_ passed through as-is, in line with POSIX (e.g., /bin/echo nosuch*).

    • In other words: PowerShell currently _in part_ matches the POSIX behavior in this respect. Given that the passing-through-as-is-if-no-match arguably _never_ really makes sense, perhaps PowerShell could decide to deviate from POSIX behavior here:
      Arguably, the sensible behavior is to complain about non-existent path components (as PowerShell already does) _and_ to otherwise pass globs that happen to match nothing as distinct _empty-string_ arguments? (This is _not_ the same as setting shopt -s nullglob in Bash, where non-matching globs are _eliminated_ as arguments altogether). Not sure what the right answer is.
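
For comparison, here is a small Bash sketch of the three outcomes being discussed for a glob that matches nothing: pass-through (the POSIX default), elimination (nullglob), and an explicit empty-string argument. The last one is only simulated by passing "" by hand, since no Bash option produces it; count_args is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# An empty scratch directory guarantees that nosuch* matches nothing.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

default_count=$(count_args nosuch*)    # POSIX default: the literal token is passed
default_value=$(printf '%s' nosuch*)

shopt -s nullglob
nullglob_count=$(count_args nosuch*)   # nullglob: the argument is removed entirely
shopt -u nullglob

empty_count=$(count_args "")           # the alternative floated above: one empty argument

echo "default=$default_count ($default_value) nullglob=$nullglob_count empty=$empty_count"

cd / && rm -rf "$workdir"
```

The difference between the last two cases is the point of the aside: nullglob changes the argument count, whereas an empty-string argument would keep it stable.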

Thanks for sharing the POSIX design. I readily acknowledge that there is a lot of value in offering this well-established behavior to PowerShell users, even as a default on unix systems. This makes PowerShell easier to pick up for folks coming from that ecosystem. By no means am I opposed to PowerShell having this capability.

But dang, that's totally idiotic. I really really hope there can be an option to disable this and use Windows-style (i.e. no) globbing, even on Unix. I have to switch back and forth between Windows and Mac pretty frequently, and it's been fantastic being able to use PowerShell on both. But the introduction of globbing on Mac was a complete show-stopper. I've had to stick with the alpha builds due to this. Even if the overly-aggressive stuff is fixed and behavior matches POSIX as described above, it still sounds terrible to me. | fl * is muscle memory, now I need to type | fl '*'? Wildcards are all over the place in PowerShell, I will need to defensively quote each and every one from now on, just in case my CWD has a particular structure?

the introduction of globbing on Mac was a complete show-stopper.

Same here for us on Ubuntu.

really hope there can be an option to disable this

Agreed. You can disable pathname expansion (globbing) in Bash with set -f (equivalently, set -o noglob). There needs to be a way in PowerShell to do the same.
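
As a quick illustration of that Bash switch, the sketch below turns pathname expansion off with set -f (the short form of set -o noglob) and back on with set +f. count_args and the file names are hypothetical.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch a.txt b.txt

before=$(count_args *.txt)   # globbing on: expands to a.txt and b.txt
set -f                       # disable pathname expansion (same as set -o noglob)
during=$(count_args *.txt)   # globbing off: the literal *.txt is passed
set +f                       # re-enable pathname expansion
after=$(count_args *.txt)

echo "before=$before during=$during after=$after"

cd / && rm -rf "$workdir"
```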

BTW a question for the Bash savvy. We expect that some folks will continue to live in Bash but we want them to be able to use PowerShell scripts in our repo. So we've shebang'd them and committed them with chmod=+x.

The problem comes when they want to use Bash globbing with a PowerShell command. Bash globbing space-separates the files and that messes up our PowerShell command. Internally, it can do the wildcard resolution but it is kind of sucky to have to tell Bash users they have to put wildcards in quotes when calling PowerShell scripts. Is there an option in Bash to get it to generate glob lists that are comma-separated?

@latkin:

| fl * is muscle memory, now I need to type | fl '*'?

No: the globbing is only applied when calling _external utilities_, on _Unix_, so nothing changes for calls to cmdlets / functions / *.ps1 PS scripts - or at all on _Windows_.

Note: Globbing _is_ applied when passing arguments to an executable PowerShell script _with a shebang line_, as it technically is an external utility too.

@rkeithhill:

I don't think there is such an option - arguments are strictly space-separated in the Unix world, and there is no concept of an array-valued argument in the shell.

You could use a ValueFromRemainingArguments parameter in your PowerShell scripts, but that limits you to 1 wildcard expression and notably precludes use of the parameter name in the invocation.

As an aside: even a comma-separated list wouldn't help you, because PowerShell doesn't recognize arrays in arguments passed to it from the outside; e.g., 1,2 would be interpreted as scalar string 1,2, and 1, 2 would be interpreted as 2 arguments: 1, and 2.
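
To make the aside concrete: although there is no shell option for it, a caller can join glob matches with commas by hand, as in this sketch (file names are hypothetical). The result is still one plain string as far as the callee is concerned, so a receiving script would have to split it itself.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch a.txt b.txt

# "$*" joins the positional parameters with the first character of IFS.
# Running this inside $( ... ) keeps the IFS change from leaking out.
joined=$(set -- *.txt; IFS=,; printf '%s' "$*")

as_one_arg=$(count_args "$joined")   # the comma-joined list is a single argument

echo "joined=$joined args=$as_one_arg"

cd / && rm -rf "$workdir"
```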

So Bash users need to know to quote wildcard arguments. Guess that's just the way it'll have to be.

PowerShell doesn't recognize arrays in arguments passed to it from the outside

Well that appears to be a new bug. Has that been filed yet? So how do I direct my script users to pass array args to my PowerShell script from Bash? Sigh...

@rkeithhill:

Well that appears to be a new bug.

I agree that passing arrays from the outside would be nice, but has that really ever worked?

I discovered the issue a while ago and assumed it was a by-design limitation of the CLI's argument parsing, similar to how all arguments are interpreted as literal strings.

has that really ever worked

Well, it is something that PowerShell users are used to, e.g.:

Remove-Item foo.txt, bar.txt, baz.txt

The question is what do Bash users expect? Presumably us PowerShell users will be using PowerShell. However, we will tell our Bash buddies how to run our PowerShell scripts and we're going to have to know that the array literal syntax we're used to in PowerShell isn't going to work in Bash.

The question is what do Bash users expect?

They expect to pass a list (array) - whose semantics are known to the target utility only - as either a single whitespace-less argument - e.g., to pass column names pid and comm to utility ps via its -o option:
ps -o pid,comm
or as a quoted argument, if the value contains whitespace or other shell metacharacters - e.g.,
ps -o 'pid, comm'

To Bash it is just a single argument in either case.
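
A minimal sketch of that convention, assuming a POSIX-like shell: count_args shows how many arguments the shell actually passes, and split_list is a hypothetical stand-in for what a receiving utility does internally when it breaks the list into elements.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical receiver: splits its first argument on commas and/or spaces.
# The body runs in a subshell, so the IFS and set -f changes stay local.
split_list() (
  set -f          # keep list elements from being globbed while splitting
  IFS=', '
  for item in $1; do
    printf '[%s]' "$item"
  done
)

bare=$(count_args -o pid,comm)        # -o plus one list argument
quoted=$(count_args -o 'pid, comm')   # quoting keeps the list in one argument
unquoted=$(count_args -o pid, comm)   # unquoted space: the list breaks in two
elements=$(split_list 'pid, comm')

echo "bare=$bare quoted=$quoted unquoted=$unquoted elements=$elements"
```

The shell only decides where argument boundaries fall; interpreting pid,comm as a list is entirely up to the target utility.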


So, at least with how the PowerShell CLI currently works, the answer is again:

  • _quoting_ when passing - even when calling from _within_ PowerShell (see below)
  • combined with splitting the single string into the embedded elements inside the PowerShell script

From within PowerShell, if you call an external utility - including a shebang-line PS script - with an array argument:

  • if what would normally be an array in PS has _no embedded whitespace_ (e.g., 1,2 or 'a','b'), it is NOT treated as an array and passed as a _single_ argument.

  • If the array elements are space-separated, they turn into _individual arguments_, by virtue of converting the array to a space-separated list of its elements (e.g., 1, 2 turns into 1 2, seen by the target utility as separate arguments 1 and 2).

Oh, at last I found this issue.
There is the same problem (at least as I see it) with invoking native commands with variables containing special symbols in options/arguments.

Here is a real-life example where globbing interferes when it shouldn't: SELinux file context management through semanage (of course I learned this the hard way, in the middle of writing a deployment script).
I wrote this 'mini-test' to demonstrate it.
I have this behavior on PS 6.0.0-beta.9 on CentOS 7 1611

"make sure you have semanage, if not - run 'yum --assumeyes install policycoreutils-python'" 
"recreating test directory" 
if (test-path /testdata -ea 0) { Remove-Item /testdata -Force -Recurse }; New-Item /testdata/testdir1 -ItemType Directory
"show current selinux context" 
ls -lZ /testdata 

"Expected behavior:"
"testing context changing using 'stop-parsing --%' symbol" 
semanage --% fcontext --add -t httpd_sys_rw_content_t "/testdata/testdir1(/.*)?"
restorecon --% -R /testdata/testdir1
"we should see context changed to httpd_sys_rw_content_t" 
ls -lZ /testdata 
"restoring default context" 
semanage --% fcontext --delete "/testdata/testdir1(/.*)?"
restorecon --% -R /testdata/testdir1
ls -lZ /testdata 

"Actual behavior:"
"And now we try to execute same command using PS variables" 
$contextpath = '/testdata/testdir1(/.*)?'
semanage fcontext -a -t httpd_sys_rw_content_t $contextpath
"And what if we try to enclose path in double-quotes?" 
$contextpath = '"/testdata/testdir1(/.*)?"'
semanage fcontext -a -t httpd_sys_rw_content_t $contextpath
"What about escaping?" 
$contextpath = "/testdata/testdir1`(`/`.`*`)`?"
semanage fcontext -a -t httpd_sys_rw_content_t $contextpath

"Workaround:"
"1) write resulting invoke to script file, chmod +x, invoke bash file"
"2) Use Start-process, which doesn't capture command output, which brings another PITA to solve:"
$contextpath = '/testdata/testdir1(/.*)?'
$semanageArgs = @(
'fcontext'
'-a' 
'-t'
'httpd_sys_rw_content_t'
$contextpath
)
Start-Process -FilePath semanage -ArgumentList $semanageArgs -Wait 
restorecon --% -R /testdata/testdir1
ls -lZ /testdata

I think at this point there is agreement that "Unix-native" globbing in PowerShell is broken, but we don't know yet _how_ it will be fixed.

It's been laid out here how POSIX-like shells handle globbing: they decide whether to apply it based on the distinction between _quoted_ and _unquoted_ tokens, and anyone with Unix shell-scripting experience is aware of that.
Furthermore, even unquoted _variable references_ are subject to globbing (e.g., in Bash:
var='*.txt'; echo $var # globbing happens, because $var is unquoted)
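
The variable case can be checked with a sketch like this one (hypothetical helper and file names, any POSIX-like shell):

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch a.txt b.txt

var='*.txt'                        # the assignment itself stores the literal pattern
unquoted=$(count_args $var)        # unquoted reference: the value is globbed
quoted=$(count_args "$var")        # quoted reference: the literal *.txt is passed

echo "unquoted=$unquoted quoted=$quoted"

cd / && rm -rf "$workdir"
```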

Both concepts are alien to PowerShell, where

  • *.txt and '*.txt' are treated the _same_.
  • the distinction between $var and "$var" exists, but is entirely unrelated to globbing (and the need to pass a value with embedded whitespace as a _single_ argument); it merely forces stringification.

Two worlds collide here, and something's gotta give.

Adopting the quoted-vs.-unquoted distinction at least for _literal_ unquoted tokens for calls to external utilities seems like a reasonable compromise to me, ~but perhaps there's a different solution - we have yet to hear from the powers that be.~ (_a fix is underway_)

As an aside re --%:

--% is not the answer, not only because you then cannot use PS variables, but also because it was designed for _Windows_ and still behaves exclusively that way:

  • Because it doesn't treat single quotes as syntactic elements, they become part of the arguments passed: /bin/echo --% 'foo, bar' results in 2 arguments, 'foo, and bar' - note the embedded single quotes.

  • It will expand cmd-style environment variable references even on Unix (e.g., %HOME%), yet doesn't recognize Bash-style ones (e.g., $HOME).

A decision was made _not_ to adapt --% to Unix (whether as --% or with a distinct name) -
see https://github.com/PowerShell/PowerShell/issues/3733#issuecomment-327641533

@mklement0 Could you please review #5188?

@iSazonov Oops! Sorry I missed that a fix is already underway - will take a look.
