Powershell: `switch -File` unexpectedly and invariably interprets the path as a wildcard expression

Created on 26 Feb 2019 · 36Comments · Source: PowerShell/PowerShell

Related: #4726

Steps to reproduce

New-Item 'foo[0].txt'
switch -File 'foo[0].txt' { default { 'hi' } }

Expected behavior

No output (the file is empty, there's nothing for switch -File to iterate over).

Actual behavior

The following error occurs:

No files matching 'foo[0].txt' were found.

That is, filename foo[0].txt was interpreted as a wildcard pattern, not a literal name.

With the current behavior, the workaround is to `-escape the [ and ] chars.

Note that even successfully resolving the wildcard expression ultimately fails if it resolves to _multiple_ filenames, so the utility of interpretation-as-wildcard is questionable.

Environment data

PowerShell Core v6.2.0-preview.4 on macOS 10.14.2
PowerShell Core v6.2.0-preview.4 on Ubuntu 18.04.1 LTS
PowerShell Core v6.2.0-preview.4 on Microsoft Windows 10 Pro (64-bit; Version 1803, OS Build: 17134.471)
Windows PowerShell v5.1.17134.407 on Microsoft Windows 10 Pro (64-bit; Version 1803, OS Build: 17134.471)

Breaking-Change Hacktoberfest Issue-Enhancement Up-for-Grabs WG-Engine

Source

mklement0

Most helpful comment

@PowerShell/powershell-committee reviewed this. We agree that switch iterating over multiple files would be a great new feature. Proposal is that -File would support multiple files (wildcard or collection) and a new -LiteralFile would be added to support a literal path.

SteveL-MSFT on 7 Mar 2019

👍4

All 36 comments

Do you mean that we should interpret the path as Literal? It is not a breaking change and I like it.

@rjmholt It could be good issue for hackathon if you still continue.

iSazonov on 26 Feb 2019

👍1

@iSazonov: Yes, as in the case of #4726 I think the path should always have been interpreted as a _literal_ path.

Do note, however, that it _is_ technically a breaking change, because someone could rely on code such as the following:

# Note: Only works if exactly ONE *.txt file exists in the current dir.
switch -file *.txt { 'hi' { 'there' } }

That said, I think it's a Bucket 3: Unlikely Grey Area change.

mklement0 on 26 Feb 2019

@SteveL-MSFT Could PowerShell Committee approve the breaking change?

iSazonov on 26 Feb 2019

Wouldn't it be better to just clarify this in the documentation? It's easily solved:

$file = [WildcardPattern]::Escape('foo[0].txt')
switch -file $file { 'abc' { 'something something' } }

IISResetMe on 28 Feb 2019

I mean, given the fact that having a wildcard that resolves to multiple files _fails with no reason_ I think that it would make more sense to be a literal path.

The alternative would be to _fix_ the wildcard interpretation by a) allowing it to actually work with wildcards that resolve to multiple files, and b) add a -LiteralFile switch instead to allow ease of use with literal paths.

Both should probably accept an array of values, too.

vexx32 on 28 Feb 2019

👍2

I'd consider the Issue as UX issue. Any complications worsen UX. So allowing wildcards implicitly convert switch to cycle that is likely to worsen UX.

iSazonov on 28 Feb 2019

If we stick with the current single-input-file-only behavior:

The always-as-wildcard interpretation is so useless and unexpected (as it is in the invocation scenarios in #4726) that the _only_ reason to stick with it would be the fear of breaking existing code.
Since I think it's pretty unlikely someone relied on the old behavior (why would you use a wildcard that you know must resolve to a _single_ file and not just target that single file directly? if you don't know, use Get-Item / Convert-Path / Resolve-Path), the behavior should be fixed.

Allowing _multiple_ input files is intriguing, but I think it would requires additional functionality to be useful:

Access to information about the file currently being processed.
Possibly also allow actions to be taken before and after a given file has been processed, along the lines of GNU awk's BEGINFILE / ENDFILE patterns.

While switch -File currently languishes in obscurity, perhaps adding such functionality could make it a more widely used tool; it certainly provides great performance benefits.

Even then it's possible to stay away from _wildcards_, however, so as to avoid the whole -Path / -LiteralPath (-File, -LiteralFile) mess; with _array_ support, you could then write:

switch -File (Get-Item *.txt) { ...

mklement0 on 28 Feb 2019

Looks like duplication of Select-String.

iSazonov on 28 Feb 2019

To be clear:

I'd personally be fine with just fixing the current behavior without adding functionality, given that the switch statement is already a complex beast and that wrapping a Get-ChildItem loop around switch -file is not much of a hardship (and any per-input-file logic would go into that loop).

As for your specific argument:

Looks like duplication of Select-String.

It would be a duplication in (loosely) the same sense that -match is a duplication of Select-String: an expression-mode alternative that performs much better (and provides more flexibility).

mklement0 on 28 Feb 2019

@mklement0

why would you use a wildcard that you know must resolve to a _single_ file and not just target that single file directly?

Interactively it's great. I don't think a day goes by that I don't use a wildcard with cd at least once (so many guid folders). Granted I can probably count the number of times I've used switch interactively on one hand.

I'm not particularly arguing for or against the suggested changes (partially because I didn't know until now switch even had that parameter). It just seemed like the general idea behind them is that it's pointless to have wildcards if the target must be a single file, if that's the case I strongly disagree.

SeeminglyScience on 1 Mar 2019

Interactively it's great.

Yes, but if you're looking for _one_ file interactively, and you don't want to type the whole name / you're not sure of its exact name, you you'd use _tab completion_ for that; the _statement_ doesn't need to support it - and it even works when you type switch -file *.txt<tab> (but see next point).

Granted I can probably count the number of times I've used switch interactively on one hand.

Personally, I can count it on zero hands 😁.
We are talking about _scripting_ here.

I didn't know until now switch even had that parameter

Yes, it's little-known, at least in part because you don't _expect_ it, given that the analogous construct in other languages doesn't have such a feature - it is quite handy, though; and fast, by PowerShell standards.

is that it's pointless to have wildcards if the target must be a single file

To summarize:

In scripting, it is pointless - it creates more problems than it solves, notably with _accidental_ wildcard-like literals paths such as foo[0].txt.
- A related problem from #4726: $null > ./foo[0].txt -> Cannot perform operation because the wildcard path ./foo[0].txt did not resolve to a file.
Interactively - if needed at all - you can use tab completion.

mklement0 on 1 Mar 2019

👍2

You're quite right @mklement0 (I forgot we're talking about _a single file here_), this behavior is useless at best. +1 for treating it as a literal path

IISResetMe on 1 Mar 2019

👍2

Yes, but if you're looking for _one_ file interactively, and you don't want to type the whole name / you're not sure of its exact name, you you'd use _tab completion_ for that

If you mean that's how everyone would use it currently, I would disagree. I personally do not tend to do that because the pattern is typically easier to edit than the expanded value if there's an issue. I can't say which is more common, but either way it's not universal.

We are talking about _scripting_ here.

Right I understand that, but that's my point. It makes perfect sense from a _scripting_ perspective, but _only_ from that perspective.

SeeminglyScience on 1 Mar 2019

@SeeminglyScience Look at it from this angle: wildcard patterns are inherently _ambiguous_ (hence their usefulness) - and it doesn't make sense to expect the user to supply an ambiguous reference when we're explicitly looking for _exactly one file_.

IISResetMe on 1 Mar 2019

👍1

@IISResetMe In a script I agree 100%. I definitely would not recommend using it there. Interactively however, it makes the same amount of sense that all of the other shell related quality of life features do.

For example, lets say you are navigating through HKLM:/Software/Microsoft/Windows/CurrentVersion/Uninstall. If you are trying to cd to a specific key, it's most likely a guid. Yeah you could use wildcards and then tab complete, but that's going to be almost a whole line of the prompt by itself. And yeah, that's not that big of a deal, but neither is typing Set-Location instead of cd.

Maybe none of that applies to this scenario because it's an obscure parameter on a keyword unlikely to be used interactively all that often. I'll concede that absolutely, but the idea that wildcards are only useful when targeting multiple items is not taking interactive use into account.

SeeminglyScience on 1 Mar 2019

You're talking about two different things now - _relative path resolution_ is absolutely useful regardless of whether you're describing one item or multiple ones, and yes there are definitely cases where the default tab completion behavior (globbing _and_ absolute path resolution) are annoying for interactive use.

But again, super narrow use case here (and _only files_, no other providers)

IISResetMe on 1 Mar 2019

👍2

Yeah we're on the same page. What I disagree with is damaging the interactive experience, even though none of us happen to use it interactively.

Tbh I think it would make more sense to add support for multiple files rather than to remove wild card support.

SeeminglyScience on 1 Mar 2019

👍2

I'm definitely down for that, but I think that a switch to allow proper literal path input would be needed as well.

Switch is very useful in a lot of cases, so expanding that utility is a good call overall, I think. 😄

vexx32 on 1 Mar 2019

👍2

Agreed, @vexx32 - switch -File in particular is a powerful feature that deserves wider exposure.

If we go the route of supporting wildcards / an array of files, in addition to a dedicated -LiteralFile parameter, for multi-file processing to be truly useful, we'd need at least new functionality to provide context information about the file currently being processed and possibly also special labels to handle before/after per-file events, as discussed above.

I'm personally fine with that added complexity, but not everyone may be.

mklement0 on 1 Mar 2019

but not everyone may be.

The feature "switch -file" is seems very rarely used. Is it worth the effort?

iSazonov on 1 Mar 2019

If we don't go that route, @SeeminglyScience:

Before we look at whether any interactive experience would actually be damaged:

Compromising the robustness of a language feature for the sake of interactive convenience is ill-advised.

Let's look at the interactive experience, _which wouldn't change_:

With foo.txt present in the current dir:

switch -file *.txt<tab>

expands to:

switch -file ./foo.txt

which seems reasonable to me - no problem with expansion to a full path and bloating of the command line.

The same goes for your example with HKLM:/Software/Microsoft/Windows/CurrentVersion/Uninstall:

You can't _submit_ with a wildcard in place - because _multiple_ paths match, so you'll get an error message.

You can _tab-complete_, however, to _home in_ on the one path you're interested in and the submit the expanded result.

# Cycle through child keys, then submit.
cd HKLM:/Software/Microsoft/Windows/CurrentVersion/Uninstall/*<tab>

So, what convenience would be lost?

mklement0 on 1 Mar 2019

@iSazonov:

The more interesting question is:

_Should_ it be used more frequently, because it is useful and people are missing out by not using it?

If we agree that it is useful, we can:

try to make it better-known
improve it.

mklement0 on 1 Mar 2019

👍2

@mklement0

Compromising the robustness of a language feature for the sake of interactive convenience is ill-advised.

I can see where you are coming from with this, and I respect the perspective. We do not agree here though, I believe the interactive experience is just as important as the scripting experience. There are already many instances of compromised robustness for the interactive experience.

You can't _submit_ with a wildcard in place - because _multiple_ paths match, so you'll get an error
message.

You can't submit with _just_ a wild card. Lets say you have a folder like this.

    |- {1620545f-25c8-4db7-a4ba-88ffbfb8ad66}
        |- MyFile.txt
    |- {1ed1767f-0e9f-4040-b128-d23ec967552d}
    |- {37e65536-9c18-4f1b-b78b-41fe22f6d20d}
    |- {4bb70271-a65e-48a9-9570-50b2fa3b1db7}

You could cd to the last folder with cd *-50b* and you could access the txt file with switch -File ./*/MyFile.txt.

Yeah they can still be tab completed, but the former would become cd '{4bb70271-a65e-48a9-9570-50b2fa3b1db7}' and the latter switch -File './{1620545f-25c8-4db7-a4ba-88ffbfb8ad66}/MyFile.txt'.

SeeminglyScience on 1 Mar 2019

Thanks, @SeeminglyScience.

There are already _many_ instances of compromised robustness for the interactive experience.

I know it's a tangent, but if you'll indulge my curiosity: are you thinking of elastic syntax (e.g., prefix-unique parameter names such as -Expand for -ExpandProperty), or are you thinking of other cases (too)?

As for your example:

Yeah they can still be tab completed, but the former would become cd '{4bb70271-a65e-48a9-9570-50b2fa3b1db7}' and the latter switch -File './{1620545f-25c8-4db7-a4ba-88ffbfb8ad66}/MyFile.txt'.

I know it's a matter of personal opinion, but that seems like a small price to pay to me, especially if a robust scripting experience is to be gained - especially in this case, given that switch is not really suited to interactive use.

mklement0 on 1 Mar 2019

I know it's a tangent, but if you'll indulge my curiosity: are you thinking of elastic syntax (e.g., prefix-unique parameter names such as -Expand for -ExpandProperty), or are you thinking of other cases (too)?

That's a good example, there's definitely other cases though. Here's some off the top of my head:

Syntax is designed around accomidating interactive use
- Member expressions cannot have the . on a new line
- Curly brackets cannot be on a new line during command parsing
Module autoloading. Can lead to loading clobbered (and possibly broken) commands. If depended on in a script it can also lead to issues with portability (different PSModulePaths/different installed modules) or simply make it more difficult to tell what modules a script depends on. Yeah you can fully qualify and/or set PSModuleAutoLoading in a try/finally but both neither are great for readability.
Aliases. I don't mean the typical "people use aliases in scripts". I mean the fact that aliases exist and take priority over all over commands in look up really limits what "private" function names you can use. For example, the scoop project uses a function named shim heavily, and I had an alias for shim which completely broke the project.
The level of dynamicity. I'm not saying PowerShell would have been static if it didn't have an interactive component, but I don't think it would be as dynamic. There are lots of issues where state (variables, commands, imported modules, etc) is leaked into a script in unexpected and difficult to troubleshoot ways.

I think in the answer in general is that a sizable portion of core design decisions were made with interactivity in mind. Often with some sort of downside for scripting. I think PowerShell is excellent in both interactive use and scripting, but scripting would have definitely been more robust without worrying about interactive use.

SeeminglyScience on 4 Mar 2019

👍1

SteveL-MSFT on 7 Mar 2019

👍4

Great to hear, @SteveL-MSFT; let me state again that while making -File support multiple files is great in principle, its utility is severely diminished if you don't also _somehow_ provide information as to which particular file's lines are being processed, analogous to Select-String providing that information in the Path ETS property that matching lines are decorated with.

mklement0 on 15 Apr 2019

Thanks for the examples, @SeeminglyScience.

While there is an inherent tension between designing for interactive convenience and readability / long-term maintainability, PowerShell by and large does a great job of accommodating both - but there's plenty of room for improvement.

Syntax is designed around accommodating interactive use. Member expressions cannot have the . on a new line. Curly brackets cannot be on a new line during command parsing

True, but the curly braces restriction doesn't apply in expression mode, and in expression mode you can at least _end_ the line with . in order to continue on the next line.

Module autoloading.

This problem strikes me as solvable: with side-by-side module loading using fully qualified module names and scoped module imports.

Aliases

A current problem for sure, but we could take a page out of Bash's book, where alias expansion is _turned off_ in scripts by default; see https://github.com/PowerShell/PowerShell/issues/2000#issuecomment-326649565

My linked comment also proposes an _opt-in_ for disallowing _elastic syntax_ in scripts (such as using -Expand for -ExpandProperty).

On the flip side, #4730 proposes making elastic syntax even more elastic by allowing you to prefix-abbreviate _operator_ names (e.g., 'foo' -m 'o' for 'foo' -match 'o').

There are lots of issues where state (variables, commands, imported modules, etc) is leaked into a script in unexpected and difficult to troubleshoot ways.

Indeed, PowerShell currently has a problem with global state affecting code, taking the global effect of Set-Location and the dynamic scoping of Set-StrictMode as examples.

Even the dynamic scoping of variables (descendent scopes seeing a caller's variables by default) is worth reconsidering.

Again, these problems are solvable.

As an aside: Bash doesn't have the working-dir. problem, for instance, because its scripts execute in a _child process_; for the same reason, there's no ability to modify the _global_ scope.

mklement0 on 15 Apr 2019

True, but the curly braces restriction doesn't apply in expression mode, and in expression mode you can at least end the line with . in order to continue on the next line.

Yeah it's limited in scope for sure, just something that bothers me personally a lot.

Note: I'm not advocating for it to be changed, I understand why it is the way it is and I don't see a better solution. It's still the first example I can think of though 🙂

This problem strikes me as solvable: with side-by-side module loading using fully qualified module names and scoped module imports.

I would be more inclined to call fully qualifying commands is a workaround than a fix. It severely hurts readability and forces you to splat just about every command if you also want to keep a reasonable line length.

I'm not clear on what you mean by scoped module imports.

A current problem for sure, but we could take a page out of Bash's book, where alias expansion is turned off in scripts by default; see #2000 (comment)

I would absolutely love to see an opt-in AST-level (similar to using) setting for that. I think making it default would cause enough issues to hurt adoption significantly.

On the flip side, #4730 proposes making elastic syntax even more elastic by allowing you to prefix-abbreviate operator names (e.g., 'foo' -m 'o' for 'foo' -match 'o').

I've always wanted that for interactive use 😁 I mess that up going back and forth between ? prop -m value and ? {$_.prop.otherprop -m 'value'} at least once a day. Unfortunately I also agree that it probably shouldn't happen to allow for new operators. Though I dream of a hacky way I could enable it in my profile 🙂

Even the dynamic scoping of variables (descendent scopes seeing a caller's variables by default) is worth reconsidering.

Again I'd love to see an AST-level modifier for this. Default seems a bit too far on the drastic side, I think that ship has sailed.

Also just to clarify, I didn't necessarily give those examples as things that need to be fixed. They all have work arounds (though some of them should certainly have better work arounds) They're just examples of the scripting experience taking a hit for the good of the interactive experience. Whether or not they should be fixed is certainly debatable, but they are still decisions that were made at least partially to cater to interactivity.

SeeminglyScience on 15 Apr 2019

As for opt-in vs. changing defaults:

Understood that opt-ins are the best we can do if backward compatibility is a must.
But at some point we're back to #6745 and the hope that there may be a version that can shed historical baggage and doesn't require explicit opt-in to what should be the default.

I would be more inclined to call fully qualifying commands is a workaround than a fix.

If you want robust dependency management, you need to be explicit about version number [ranges].
Fully agreed that the current method for specifying FQMN (fully-qualified module names) is unwieldy, but can we can take a page out of npm's book, with identifiers such as [email protected] - and generally establishing semver (semantic versioning) in the PowerShell world sounds like a worthwhile goal as well.

I'm not clear on what you mean by scoped module imports.

Currently, module imports take effect _scope-domain-wide_ by default (by _scope domain_ I mean the scope stacks for non-module code vs. each module's).

Just like we're discussing variables and strict mode as perhaps better being lexically scoped, so could modules.

You _can_ already restrict Import-Module to the current scope with -Scope Local, but (a) it's an explicit opt-in and (b) _its descendants_ again see the imports too, and - perhaps most importantly - using module is implicitly and invariably scope-domain-wide.

They're just examples of the scripting experience taking a hit for the good of the interactive experience.

Among all the things discussed, the _only_ one that strikes me as _inherently_ necessary - due to PowerShell having to be both a shell and a full-fledged scripting language - is the one related to argument mode vs. expression mode (curly braces, periods); as an aside: @KirkMunro's PR for allowing | to be placed on the next line - #8938 - will bring _some_ relief there.

All others _need not be_ compromises, and should be fixed:

as _opt-ins_, if backward compatibility is paramount
by changing the _default behavior_, in a version allowed to break compatibility

mklement0 on 15 Apr 2019

Understood that opt-ins are the best we can do if backward compatibility is a must.
But at some point we're back to #6745 and the hope that there may be a version that can shed historical baggage and doesn't require explicit opt-in to what should be the default.

I'm not suggesting it be opt-in purely for compatibility. I don't agree that we should make scripts act all that differently by default. I'd love to see some opt-in options for making a script or module a lot more strict, but having code that runs (more or less) the same in a script as it does interactively is incredibly valuable. That keeps the learning curve very approachable and makes experimentation and prototypes a breeze. I'd love to see the opt-in settings for an easier way to get a more stable experience in large scale deployments/unknown environments, I do worry about the effect such a default would have outside of that scenario.

If you want robust dependency management, you need to be explicit about version number [ranges].
Fully agreed that the current method for specifying FQMN (fully-qualified module names) is unwieldy, but can we can take a page out of npm's book, with identifiers such as [email protected] - and generally establishing semver (semantic versioning) in the PowerShell world sounds like a worthwhile goal as well.

Yeah for sure, but my point is that it shouldn't need to be declared on every instance of every command invocation.

We need a way to say I want command discovery to prioritize:

Module x
With version y
For this scope/module

So when you call Get-Item It doesn't import a version from C:\BadVendor\MyHackyCommandOverrides.psd1 with no Path parameter. Right now you can do Microsoft.PowerShell.Management\Get-Item but a script full of that is very difficult to read.

Currently, module imports take effect scope-domain-wide by default (by scope domain I mean the scope stacks for non-module code vs. each module's).

I know you probably already know this, but for anyone else following: PowerShell refers to that as a SessionState. There's one for each module and one that is global (referred to by the engine as the top level session state)

I'm not really sure the benefit of changing it to default to scope, but I don't have very strong opinions on it.

Among all the things discussed, the only one that strikes me as inherently necessary - due to PowerShell having to be both a shell and a full-fledged scripting language

I didn't mean to imply there was no other choice, or even that interactivity was the sole reason behind those decisions. I do believe though, that they were made at least in part if not mostly because of how they would impact interactivity. I don't really know if I would refer to these decisions as things to be solved, but an opt-in way to overcome the drawbacks of them would be awesome.

SeeminglyScience on 16 Apr 2019

but having code that runs (more or less) the same in a script as it does interactively is incredibly valuable.

Fully agreed, but there's a simpler solution: have distinct filename extensions:

one for "loose" code, just like on the command line.
one for strict code, where the extension alone turns on - and enforces - all behaviors needed for robust development

I know you probably already know this, but for anyone else following: PowerShell refers to that as a SessionState

I do know, thanks to your excellent blog post exploring the subject: https://seeminglyscience.github.io/powershell/2017/09/30/invocation-operators-states-and-scopes

However, I think this term invites everlasting confusion, and given that it isn't yet widely known - fortunately, in this case - I was hoping to get a more descriptive name established, and my suggestion is _scope domain_: the _default scope domain_ for non-module code, and one _module scope domain_ for each loaded module - writing up a proposal in the docs repo has been on my to-do list for some time.

I'm not really sure the benefit of changing it to default to scope, but I don't have very strong opinions on it.

It's the same benefit as with all other instances of lexical scoping: what you do in the privacy of your own scope (if you will) doesn't affect others: my importing a module shouldn't magically make commands appear in your scope - unless that is the _explicit_ intent.

I do believe though, that they were made at least in part if not mostly because of how they would impact interactivity

The question is ultimately moot: It sounds like we agree that there are problems with the existing behavior and that we should think about ways to make things better (without compromising the interactive experience).

mklement0 on 16 Apr 2019

Fully agreed, but there's a simpler solution: have distinct filename extensions:

one for "loose" code, just like on the command line.
one for strict code, where the extension alone turns on - and enforces - all behaviors needed for robust development

Love it ❤️

I do know, thanks to your excellent blog post exploring the subject:

😁

However, I think this term invites everlasting confusion, and given that it isn't yet widely known - fortunately, in this case - I was hoping to get a more descriptive name established, and my suggestion is scope domain: the default scope domain for non-module code, and one module scope domain for each loaded module - writing up a proposal in the docs repo has been on my to-do list for some time.

The word domain makes me think of something as Runspace level. It took me a few reads to get what you meant. To each their own on this one, I generally stick to referring to things as the engine does, but I can definitely see your point.

SeeminglyScience on 16 Apr 2019

👍1

Glad to hear it, @SeeminglyScience.

Regarding the terminology question: I've finally created a docs issue, and I invite you to weigh in there (I've already quoted your concern re _scope domain_ there): https://github.com/MicrosoftDocs/PowerShell-Docs/issues/4288

mklement0 on 9 May 2019

Interesting, why do we consider only files? We could generalize to support any provider with -Path/-LiteralPath.

iSazonov on 10 May 2019

👍1

We can use the same restriction that ${C:\test.txt} has - the provider must support Get-Content functionality.

Sounds good to me. However, then we have the semantic issue where this _really_ ought to be named switch -Path file.txt / switch -LiteralPath file[].txt

Can we alias these in some fashion?

vexx32 on 10 May 2019

👍1

Was this page helpful?

0 / 5 - 0 ratings