PowerShell: Can we allow compound statements (loops and conditionals) as the first pipeline segment and as sub-expressions too?

Created on 3 May 2018 · 14 Comments · Source: PowerShell/PowerShell

In the world of _assignments_, _compound statements_ (loops and conditionals) can be used as value-returning _expressions_ too:

$str = 'hi'  # simple expression
$flag = if ($str -eq 'hi') { 1 } else { 0 }  # flow-control statement works too; returns 1

By contrast, in the context of a _pipeline_ or as _sub-expressions_, flow-control statements do _not_ work as-is:

# Simple expressions: OK as the 1st pipeline segment / with redirections.
'hi' > out.txt
'hi' | Out-File out.txt

# Flow-control expressions: do NOT work (directly) as the 1st pipeline segment.
# What follows the flow-control expression is treated as a *new statement* (and these 
# cause a syntax error here):
if ($str -eq 'hi') { 1 } else { 0 } > out.txt
if ($str -eq 'hi') { 1 } else { 0 } | Out-File out.txt

# Similarly, trying to use a flow-control statement as a *sub-expression* 
# (part of a larger expression), fails:
(foreach ($i in 1..3) { $i }) -join ', '   # !! Only works with $(...), not just (...)

You can work around this:

  • for use in the pipeline: by enclosing the flow-control statement in either & { ... } or . { ... }, which _streams_ the output.
  • for use in expressions: by enclosing the flow-control statement in $(...) or @(...), which collects all output up front.

However, the need for these workarounds is not obvious, they are somewhat cumbersome, and, in the case of sub-expressions, they carry a performance penalty.
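As a minimal sketch of the two workarounds, the snippet below reuses `$str` from the examples above. The variable names `$doubled` and `$joined` are illustrative only and not part of the original post.

```powershell
$str = 'hi'

# Pipeline use: wrap the statement in & { ... } (child scope) or . { ... }
# (current scope); its output streams to the next pipeline segment.
$doubled = & { if ($str -eq 'hi') { 1 } else { 0 } } | ForEach-Object { $_ * 2 }

# Expression use: wrap the statement in $(...), which collects all output
# first and only then hands it to the operator.
$joined = $(foreach ($i in 1..3) { $i }) -join ', '

$doubled   # 2
$joined    # 1, 2, 3
```

Using `. { ... }` instead of `& { ... }` runs the statement in the current scope, so variables it sets remain visible afterwards.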

There may be parsing challenges and ambiguities I'm not considering, but perhaps compound statements can be treated the same as simple expressions in these contexts, allowing their direct use as the first pipeline segment / redirection source.

In other words: If compound statements were bona fide _expressions_, the above problems would go away (though streaming behavior should be retained when used as the 1st pipeline segment).

P.S.:

  • Is there a better or more established term for _compound statements_?
  • This issue was inspired by this SO question.

Environment data

Written as of:

PowerShell Core v6.0.2
Issue-Discussion

All 14 comments

This raises the question of why we can't use these anywhere in the pipeline directly. It would effectively remove the need for things like Where-Object entirely.

@vexx32:

It makes sense to limit use of expressions to the _first_ pipeline segment, because while expressions create _output_, they are not prepared to handle _pipeline input_.
In other words: they can _start_ a pipeline, but don't fit in the middle or at the end of a pipeline.

In the world of expressions you already have ForEach-Object and Where-Object analogs: the .ForEach() and .Where() collection "operators" (methods); e.g.:

PS> (1..3).ForEach({ $_ + 1 })
2
3
4
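For completeness, the Where-Object analog mentioned above can be sketched the same way. The variable names are illustrative, and the `.Where()` method assumes PowerShell v4 or later.

```powershell
# .Where() filters a collection in expression mode; no pipeline is needed.
$odd = (1..5).Where({ $_ % 2 -eq 1 })
$odd -join ' '              # 1 3 5

# The pipeline equivalent, for comparison:
$oddViaPipeline = 1..5 | Where-Object { $_ % 2 -eq 1 }
$oddViaPipeline -join ' '   # 1 3 5
```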

An RFC proposal (of mine) suggests surfacing these methods as bona fide PowerShell _operators_, which would allow you to write:

PS> 1..3 -foreach { $_ + 1 }
2
3
4

In https://github.com/PowerShell/PowerShell/issues/10967#issuecomment-561843650, @rjmholt has provided the explanation for why what the initial post is asking for is _not_ possible with the current grammar:

PowerShell has pipelines contained by statements, not statements contained by pipelines. That's always been the case, [...]

In short, if I understand correctly:

  • What I call a _compound statement_ in the initial post is _one_ form of a statement.
  • _Pipelines_ are another (which includes expressions by themselves and expressions as the 1st pipeline segment).

Currently, never the twain shall meet, sadly.
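One way to see this split is to ask the parser which AST node type it produces for each kind of input. The sketch below only parses the code and executes nothing; `Get-FirstStatementType` is a hypothetical helper name introduced here for illustration.

```powershell
# Hypothetical helper: returns the AST type name of the first statement in $Code.
function Get-FirstStatementType {
    param([string] $Code)
    $tokens = $null
    $errors = $null
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref] $tokens, [ref] $errors)
    $ast.EndBlock.Statements[0].GetType().Name
}

Get-FirstStatementType "'hi' | Out-File out.txt"         # PipelineAst
Get-FirstStatementType 'if ($true) { 1 } else { 0 }'     # IfStatementAst
Get-FirstStatementType 'foreach ($i in 1..3) { $i }'     # ForEachStatementAst
```

Both kinds of node are statements, but only the first is a pipeline, which is why a `|` or `>` cannot be attached directly to the other two.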

Essentially, with such a syntactic change you're asking for a new language, with a different treatment of syntactic and semantic constructs like expressions, pipelines and statements.

Asking largely hypothetically, @rjmholt - I do recognize how such a change would be the mother of all changes:

  • If we could start from scratch, would treating what I've called compound statements the same as expressions be feasible (that is, also allow them in a pipeline, but only as the _first_ segment)?

  • Would any current code break, if we did?

In other words: Would a grammar such as the following work (adapted from https://github.com/PowerShell/PowerShell/issues/10967#issuecomment-549167463)?

pipeline:
 | 'return' pipeline ['&']
 | pipeline ['&']
 | pipeline '|' command_expression
 | expression

expression:
 | ...             # true expressions such as `1+2` or `'1,2' -split ','`
 | command_expression
 | compound_expression

compound_expression:
 | `foreach` ...
 | `while` ...
 | `if` ...
 # ...

command_expression:
 | 'Get-Item /'   # e.g.

Asking largely hypothetically, @rjmholt - I do recognize how such a change would be the mother of all changes:

  • If we could start from scratch, would treating what I've called compound statements the same as expressions be feasible (that is, also allow them in a pipeline, but only as the _first_ segment)?

Yeah if you're starting from scratch, pretty much anything is feasible.

  • Would any current code break, if we did?

All third-party AST-based tooling, for one. The shape of all of those APIs would be dramatically different.

As for PowerShell scripts specifically, also yeah. I mean it's probably possible to redesign the language and rewrite the parser without breaking anything, but it'd be one hell of a trick.

Thanks, @SeeminglyScience.

By contrast, in the context of introducing && and || the grammar had to be modified without breaking anything(?); what are your thoughts on the impact of the change I've proposed in https://github.com/PowerShell/PowerShell/issues/10967#issuecomment-568220288? (Please comment there, if you're up for it.)

@mklement0 I'm not sure what you're asking for in that issue, from a technical standpoint. How do you propose that be done without making return and exit something other than statements?

I think you'd have to make them pipelines or expressions, which would also be a breaking change for tooling. Maybe you could introduce new ASTs that are the expression/pipeline version of those statements but that makes it very confusing imo.

On a more subjective note, I also just don't like how hard it would be to read. I know PowerShell has historically been all about letting users shoot themselves in the foot, and that can be great. However, after the whole question mark in variable names situation I'm not keen on the idea of adding more features like that.

return wouldn't have to change (from what it was _before_ the current && / || implementation), only exit would be the exception - it would be allowed to moonlight as a non-initial link in a pipeline chain, which is what users will - sensibly - expect.

Note that shooting yourself in the foot is already possible; that is, exit works as the _first_ link of a pipeline chain:

# This happily exits your session, irrespective of what comes after the `&&`
exit 0 && Get-Item /foo

That the - perfectly sensible and common - reversal does _not_ work is why I'm proposing the exception:

# Doesn't work - requires $(...) around `exit 0`; ditto with `return`
Get-Item /foo && exit 0 

I think you'd have to make them pipelines or expressions, which would also be a breaking change for tooling. Maybe you could introduce new ASTs that are the expression/pipeline version of those statements but

It is the latter I was thinking of, but I'm definitely out of my depth here.

that makes it very confusing imo.

It may make the _implementation_ more confusing, of necessity (too late to change the fundamentals), but to me it definitely _lessens_ the confusion for the _users_.

letting users shoot themselves in the foot
I also just don't like how hard it would be to read.

I think that my proposal _helps_ with both aspects, because in the current implementation:

  • That the return in return ls /foo || Write-Host 'Continuing?' applies to the _whole chain_, and not just to a chain _link_ is unexpected and confusing.

  • That the very common idioms ls /foo || exit 1 and ls /foo || return must be written as ls /foo || $(exit 1) and ls /foo || $(return) is unexpected, confusing, and cumbersome (a hat-trick).
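A minimal sketch of the `$(return)` form described above, assuming PowerShell 7 or later (the chain operators do not exist in earlier versions). `Test-ChainReturn` and `$missing` are hypothetical names, and the path is assumed not to exist.

```powershell
# Requires PowerShell 7+ (pipeline chain operators).
function Test-ChainReturn {
    param([string] $Path)

    # If Get-Item fails ($? is $false), the right-hand side runs and
    # $(return) leaves the function.
    Get-Item -LiteralPath $Path -ErrorAction SilentlyContinue || $(return)

    'still running'
}

# A path assumed not to exist (random GUID under the temp directory).
$missing = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())

$result = Test-ChainReturn -Path $missing
$null -eq $result   # True: the function returned before 'still running'
```

In versions without the chain operators, the same effect is usually written as a separate statement, `if (-not $?) { return }`, placed right after the command.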

Note that in the context of PowerShell there is no behavioral precedent to adhere to in these situations: These are new features, and the behavior I'm proposing will not only make it easier for Bash users to use them, but, I believe, generally makes more sense and is easier to conceptualize than the current implementation.

return wouldn't have to change (from what it was _before_ the current && / || implementation), only exit would be the exception - it would be allowed to moonlight as a non-initial link in a pipeline chain, which is what users will - sensibly - expect.

I'm not following, why only exit? That makes it even harder to understand the language rules.

Note that shooting yourself in the foot is already possible; that is, exit works as the _first_ link of a pipeline chain:

# This happily exits your session, irrespective of what comes after the `&&`
exit 0 && Get-Item /foo

Okay but how do you expect that to work? Because if you change 0 to 10 the result doesn't change. The exit statement is taking the pipeline 0 && Get-Item /foo, executing it, and then using the results (an array of @(0; Get-Item /foo)) as the "exit code" (which it doesn't know how to interpret and uses 0 instead). Don't think of it as the first link, think of it as a modifier of the whole chain. It's basically like doing this:

$myInvalidExitCode = 0 && Get-Item /foo
exit $myInvalidExitCode
# or
exit $(0 && Get-Item /foo)
# also, this exits with exit code 30
exit 0 && $(exit 30)

And that's kind of the point, the problem isn't "how would you make it work for later portions of the chain" it's how would you fit it into an actual link, changing the behavior significantly.
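The "modifier of the whole chain" reading can be checked against the parser. The sketch below assumes PowerShell 7 or later and only parses the statement, so nothing exits; the variable names are illustrative.

```powershell
# Requires PowerShell 7+; parsing only, nothing is executed.
$code = 'exit 0 && Get-Item /foo'
$tokens = $null
$errors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput($code, [ref] $tokens, [ref] $errors)

$statement = $ast.EndBlock.Statements[0]
$statement.GetType().Name            # ExitStatementAst
$statement.Pipeline.GetType().Name   # PipelineChainAst
$statement.Pipeline.Extent.Text      # 0 && Get-Item /foo
```

The exit statement is the outer node and the entire chain is its operand, which matches the explanation above.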

It may make the _implementation_ more confusing, of necessity (too late to change the fundamentals), but to me it definitely _lessens_ the confusion for the _users_.

I hear ya, but I don't agree. I think it makes the language a lot more confusing past the surface level, and any surface level benefit doesn't outweigh the cost of the complexity.

  • That the return in return ls /foo || Write-Host 'Continuing?' applies to the _whole chain_, and not just to a chain _link_ is unexpected and confusing.
  • That the very common idioms ls /foo || exit 1 and ls /foo || return must be written as ls /foo || $(exit 1) and ls /foo || $(return) is unexpected, confusing, and cumbersome (a hat-trick).

Yeah I agree that it's unfortunate. I still don't think it makes sense for PowerShell.

Note that in the context of PowerShell there is no behavioral precedent to adhere to in these situations:

They are extensions of pipelines which currently follow the same rules.

These are new features, and the behavior I'm proposing will not only make it easier for Bash users to use them, but, I believe, generally makes more sense and is easier to conceptualize than the current implementation.

Probably right in regards to bash users, which is very regrettable. I don't agree with the rest though. My personal opinion is that it should stay how it is. I understand the arguments on both sides fully, I just don't agree on this one.

I'm not following, why only exit?

My thought was: If we change the && / || implementation to how return _used to work_ - i.e., returning a _single_ pipeline's result - nothing needs to change in pipeline _chains_ - but I now realize that return is not itself part of a pipeline and therefore indeed requires an exception too.

With the exit exception also in place, you then get consistent behavior for return and exit: both only apply to their chain _link_, not to the _rest of the chain_.

Okay but how do you expect that to work?

In that vein: I expect exit to only apply to the _first link_, and, by the nature of exit, for the second link to be _ignored_.

That exit 0 && Get-Item /foo would evaluate 0 && Get-Item /foo _as a whole_ and pass the output to exit is, frankly, baffling to me.
(With return it's debatable, but there too only passing 0 - i.e. _not_ crossing chain-link boundaries - makes much more sense to me).

As an aside: that exit quietly ignores an invalid exit code and defaults to 0(!) is a problem in itself.
Even the fact that you can pass a (single) whole _pipeline_ to both return and exit is probably _not_ widely known, if I were to guess - most real-world uses I've seen pass a variable / literal.

Scoping return and exit to a chain _link_ is what most users will likely expect - not just coming from Bash - and it allows for the natural foo || exit / foo || return syntax.

As stated, conditionally _exiting_ (returning from) the current scope based on _individual_ pipelines' outcomes is a primary use case for && / || chains; passing entire chains to exit / return contravenes that goal.

My hunch is that most PowerShell users will not only naturally assume that foo || exit works, but will also be oblivious to the fact that this use of exit doesn't fit the fundamentals of the grammar.

any surface level benefit

To me, ensuring that a feature makes sense to users is much more than a surface-level benefit.

They are extensions of pipelines which currently follow the same rules.

I don't think of them as _extensions to_ pipelines, but as a _conditional sequencing of_ them; that is, _independent pipelines_ are _combined_.

Other than being forced by historical design decisions ("The current grammar just substitutes pipeline chains for pipelines" - https://github.com/PowerShell/PowerShell/issues/10967#issuecomment-549167463), I see no justification for the current behavior.

_That said, it may well be that what I'm asking for is still too much to shoehorn into / breaks the current grammar_ - a question I cannot answer by myself:

statement:
 | ...
 | 'return' pipeline ['&']
 | 'exit' pipeline
 | pipeline ['&']
 | pipeline_chain

pipeline_chain:
 | pipeline_chain '&&' pipeline ['&']
 | pipeline_chain '&&' 'return' pipeline ['&']
 | pipeline_chain '&&' 'exit' pipeline
 | pipeline_chain '||' pipeline ['&']
 | pipeline_chain '||' 'return' pipeline ['&']
 | pipeline_chain '||' 'exit' pipeline
 | pipeline

I'm not following, why only exit?

My thought was: If we change the && / || implementation to how return _used to work_ - i.e., returning a _single_ pipeline's result - nothing needs to change in pipeline _chains_ - but I now realize that return is not itself part of a pipeline and therefore indeed requires an exception too.

With the exit exception also in place, you then get consistent behavior for return and exit: both only apply to their chain _link_, not to the _rest of the chain_.

And inconsistent behavior with all other keywords, and with how all other language elements work with return and exit. From an implementation standpoint, the only thing I can think of that would be within the realm of feasibility would be allowing any statement to be chained (but that has its own problems).

Okay but how do you expect that to work?

In that vein: I expect exit to only apply to the _first link_, and, by the nature of exit, for the second link to be _ignored_.

That exit 0 && Get-Item /foo would evaluate 0 && Get-Item /foo _as a whole_ and pass the output to exit is, frankly, baffling to me.
(With return it's debatable, but there too only passing 0 - i.e. _not_ crossing chain-link boundaries - makes much more sense to me).

🤷‍♂️ that's how everything else works. I get the confusion in reference to other languages, but it just doesn't make sense for PowerShell imo.

As an aside: that exit quietly ignores an invalid exit code and defaults to 0(!) is a problem in itself.
Even the fact that you can pass a (single) whole _pipeline_ to both return and exit is probably _not_ widely known, if I were to guess - most real-world uses I've seen pass a variable / literal.

return working that way is pretty widely known from what I've seen. exit is a bit surprising, though tbh I'm not really sure what I'd rather it do. Too late to change now anyway.

Scoping return and exit to a chain _link_ is what most users will likely expect - not just coming from Bash - and it allows for the natural foo || exit / foo || return syntax.

As stated, conditionally _exiting_ (returning from) the current scope based on _individual_ pipelines' outcomes is a primary use case for && / || chains; passing entire chains to exit / return contravenes that goal.

Yeah I'm not disputing that. When I say that I don't think it makes sense for PowerShell, and that it is not feasible, I'm saying that with this in mind. I understand all of the reasons why on the surface this seems like the obvious right move, but I strongly disagree that it is.

🤷‍♂️ that's how everything else works

Before && and || there was no everything else. There was just a _single_ pipeline you could pass to return (which you say is well-known; as an inconsequential aside: I hadn't heard of it until this discussion, and don't recall seeing it on Stack Overflow) and to exit (which we agree is uncommon).

And that wouldn't go away with what I'm proposing.

No end user previously had to think about the fact that sticking a return or exit in front of a pipeline _technically_ made it a _statement_ and that that means you can't use the whole thing _as a pipeline_ - the question simply didn't arise in the absence of pipeline _chains_.

From an end user's perspective, conceiving of a pipeline chain as a sequence of pipelines _each_ optionally preceded by return or exit - or made up of _just_ those keywords - makes much more sense than conceiving of return or exit as something you stick in front of an _entire chain_: the whole point of a chain is _conditional_ (exit) behavior, depending on what links execute.

And I don't think the proposed behavior contravenes the _spirit_ of PowerShell in any way.

It does contravene the current _implementation_, however, and I get that how chaining was implemented fits in best with that.

And inconsistent behavior with all other keywords

Not if you think of exit and return as something you can stick in front of a pipeline - which you always could do - whereas you could never do that with any of the other keywords.

From an implementation standpoint, the only thing I can think of that would be within the realm of feasibility would be allowing any statement to be chained (but that has its own problems).

While I definitely wish we could also use compound statements such as foreach, while, ... in a pipeline - as the _first_ segment only, like expressions (the original topic of this thread) - and therefore also in a chain link, my understanding is that this is what would constitute "the mother of all changes" and is therefore off the table.

I can also see how implementing just the return-and-exit-per-chain-link proposal based on the current implementation without breaking anything may turn out to be too challenging and too much of a maintenance burden (I can't personally assess that).

But it is clear to me that the current chain implementation was dictated by the limitations of the original grammar - whose subtleties most users are probably unaware of - not by what would make the feature most useful to end users.

To bring closure to the question asked in the OP, based on @rjmholt's feedback in https://github.com/PowerShell/PowerShell/issues/10967#issuecomment-561843650:
In the context of the current grammar, the suggested change isn't possible.

I've summarized my conclusions from this exchange with respect to && and || in https://github.com/PowerShell/PowerShell/issues/10967#issuecomment-569285154.

P.S.: Just noticed the thumbs-down on my previous comment, @SeeminglyScience:

I get not wanting to expend more energy on a discussion at a certain point (when you feel like the disagreement is understood but no shared understanding can be reached, when you feel like not being heard, when the conversation is going around in circles, ...), but a thumbs-down as the sole feedback on a comment where multiple points were argued in detail just tells us "I don't like this" - and nothing else; it is a gesture of opposition without content.

I value your expertise, especially in areas where my knowledge is superficial, but the overall tone of this exchange left a sour aftertaste.

Sorry, I didn't mean any disrespect. Thumbs down is less common here than in other repos, so it might have a connotation that I didn't intend.

That said, it's often stated that decisions are largely made based on community consensus. My intent was to indicate that I disagree without bumping the thread or paraphrasing the reasons why. You could argue that it was clear I disagreed from context, but to be honest it seems like context is often lost in committee meetings.
