Note:
What I'm proposing requires adding special handling to array indices, which is currently _not_ the case: PowerShell has no special array-subscript syntax, it allows use of any expression, as long as it results in an array of (valid) indices.
While this is a very powerful concept, the lack of awareness of the array context precludes some useful features, hence this suggestion.
Note that a related array-slicing feature suggestion for introducing _stepping_, #7928, does _not_ require array-being-sliced awareness and could be implemented on the range operator (`..`) itself.
# Sample array
$a = 'one', 'two', 'three', 'four', 'five'
# CURRENTLY required syntax for returning everything starting with the 3rd element:
$a[2..($a.Count-1)]
three
four
five
# WISHFUL THINKING: having $a.Count-1 be *implied* as the end of the range.
$a[2..]
three
four
five
Aside from being more concise, this has the added advantage of not needing the input array to be stored in a variable beforehand.
# Sample array
$a = 'one', 'two', 'three', 'four', 'five'
# CURRENTLY required syntax for returning everything except the last 3 elements:
$a[0..($a.Count-1 - 3)]
one
two
# WISHFUL THINKING: allow specifying just N, without explicitly needing to refer
# to the end of the array.
# Note that $a[0..-3] does NOT work, because it creates array 0, -1, -2, which does something different.
$a[0..@-3]
one
two
As stated, this would require introducing special syntax specific to collection indexing, and such modified range expressions wouldn't make sense outside that context.
I am not wedded to the specific syntax forms proposed above, `[<n>..]` and `[<m>..@-<n>]`, but what makes them appealing is not having to _explicitly_ refer to the array being sliced in the expression.
The less desirable alternative (from an end-user perspective) would be to introduce a new automatic variable representing the array's highest index, such as `$#`.
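Until something like this exists, the two proposed forms can be approximated with a small helper. The following is only a sketch: `Get-ArraySlice` and its `-Start` / `-SkipLast` parameters are hypothetical names invented for illustration, not part of PowerShell, and negative `-Start` values are not handled.

```powershell
# Hypothetical helper (NOT a built-in): emulates the proposed
# $a[<start>..] and $a[<start>..@-<n>] forms using today's syntax.
function Get-ArraySlice {
    param(
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $InputArray,
        [int] $Start = 0,      # first index to include
        [int] $SkipLast = 0    # number of trailing elements to omit
    )
    # Highest index to include.
    $end = $InputArray.Count - 1 - $SkipLast
    # Guard against a descending range such as 3..2, which would
    # return elements in reverse order instead of nothing.
    if ($Start -gt $end) { return }
    $InputArray[$Start..$end]
}

$a = 'one', 'two', 'three', 'four', 'five'
$fromThird   = Get-ArraySlice $a -Start 2       # emulates the proposed $a[2..]
$allButLast3 = Get-ArraySlice $a -SkipLast 3    # emulates the proposed $a[0..@-3]
```

Because the array is passed as an argument, it no longer has to be stored in a variable first, but the call is still noticeably wordier than the proposed index syntax.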
Written as of:
PowerShell Core 6.1.0
History notes: When I first implemented ranges, I'd planned to support a unary range operator, e.g. `$a[1..]`, but never got around to it (obviously). For the upper bound, I'd been thinking about having a magic variable `$end`, which would be equivalent to `$a.Length-1`, so you could write `$a[1..$end]` to get everything but the first element. Unfortunately, we got the precedence "wrong" for computed endpoints (because we wanted to allow ranges to be concatenatable, as in `$low .. $middle + $high .. $veryhigh`), so I don't really see how we can avoid parens in things like `$a[1..($end-1)]`, but it's still much nicer than `$a[1..($a.length-2)]`.
Is there any particular reason why `$end` itself cannot map cleanly to the `$_.Length - 1` value, circumventing that issue?
@vexx32:
I think @BrucePay indeed meant `$end` to be `$_.Length - 1`, which requires no arithmetic in the all-remaining scenario, but obviously still does in the except-last-N scenario.
Note that in the all-remaining scenario you can even currently get away without arithmetic, because it's benign (though potentially confusing) to exceed the array bounds by 1:
$a=1,2,3; $a[1..$a.Count] # works, though strictly speaking it should be `($a.Count - 1 )`
2
3
Yes, the need for parentheses is unfortunate, but providing the automatic highest-index variable would indeed help (I suggested `$#`, because, unlike `$end`, it is currently a syntax error and therefore cannot clash with existing user variables).
That said, using the variable-less special syntax I proposed would make both problems go away (parentheses, need for new variable).
Are we open to special syntax in the context of indexing (`$a[1..]` and `$a[0..@-1]`)?
Introducing `$#` (`$end`) wouldn't require special _syntax_, but it still amounts to special-casing the indexing context.
I think a lot of this is handled by `Select -Skip` and `-SkipLast`, and even works out shorter and clearer to do that:
$a = 'one', 'two', 'three', 'four', 'five'
# CURRENTLY required syntax for returning everything starting with the 3rd element:
$a[2..($a.Count-1)]
$a|select -skip 2
# CURRENTLY required syntax for returning everything except the last 3 elements:
$a[0..($a.Count-1 - 3)]
$a|select -SkipLast 3
Being able to select arbitrary elements such as `$a[4,7,1,3]` is nice, but the way ranges include both endpoints, the precedence rules that mean `0..($x-1)` needs parens, the way you can't add/subtract to a whole array like `1,2,3 -1` to make it `0,1,2` ...
What about leaving range `..` alone, and introducing an entirely new slice operator, which is basically "Python's slice operator"?
# PS ranges stay the same
$a[0..3] # items 0,1,2,3
# Pythonic slicing
# start:end
# start:end:step
# missing values for "and the rest"
$a[2:] # items from index 2 through end
$a[:-2] # from the start, stopping 2 before the end
$a[::3] # from the start to the end, in steps of 3
$a[1:-2] # items from index 1.. stopping 2 before the end
$a[2::3] # items from index 2 through end, step 3 at a time
?
> it's benign (though potentially confusing) to exceed the array bounds by 1
Except in Strict-Mode, then it's System.IndexOutOfRangeException
@HumanEquivalentUnit:
As a general rule, expression-mode solutions and pipeline solutions aren't interchangeable, for performance reasons.
Yes, `Select -Skip` and `Select -SkipLast` are the _functional_ equivalent of what we're looking for in the realm of _pipelines_, but they're not an option for performant code in the realm of _expressions_.
> What about leaving range `..` alone, and introducing an entirely new slice operator
My preference is to make do with minor tweaks to the existing range-operator syntax, to avoid confusion and reduce complexity (another thing to learn).
As an aside:
> Except in Strict-Mode, then it's System.IndexOutOfRangeException
Good point, though, to be precise, it is `Set-StrictMode -Version 3` or higher, and given the limitations of `Set-StrictMode`, I tend to stay away from it:
Updated to reflect what ultimately made it into C# 8.0 on release.
@rkeithhill points out that the upcoming C# 8 (released with .NET Core 3.0) gained the features discussed here, as covered here.
Some of what was introduced in C# 8 has been a part of PowerShell since the beginning (kudos, PowerShell):
C# 8 | PS
----- | ---
`^1` | `-1`
`1..2` | (same)
Here's what's new, which covers what this issue proposes with some extra syntactic sugar:
C# 8 | PS now | This proposal (so far, written without knowledge of C# 8)
--- | --- | ---
`1..` | `1..($arr.Length-1)` | `1..`
`..9` | `0..9` |
`^9..` | `-9..-1` | `-9..`
`..` | `0..($arr.Length-1)` |
`1..^1` | `1..($arr.Length-1 - 1)` | `1..@-1`
Note: While it is possible to use ranges with `System.Span<T>` instances to get slices that are windows into the original array, using them with regular arrays creates slices that are new, independent arrays, as in PowerShell.
Syntax-wise, an option is therefore to go with `^` instead of `@-` for the index-from-end syntax to align with C#.
(I'm not sure why `^` was chosen for C#).
`..9` for `0..9`, though less compelling than `9..`, might be nice for symmetry.
As for `..` by itself: when used on a regular array, it is effectively a way to create a shallow copy of the input array.
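To illustrate what "shallow copy" means here, the following sketch spells out the full-range slice in today's syntax. The variable names are made up for the example; the point is that the slice is a new array whose elements still reference the same objects as the original.

```powershell
# Sample array holding one reference-type element (a hashtable) and one string.
$h = @{ n = 1 }
$a = $h, 'two'

# Today's spelling of the full-range slice (what `..` by itself would mean).
$copy = $a[0..($a.Count - 1)]

$copy[1] = 'changed'   # replaces an element in the copy only
$copy[0].n = 2         # mutates the hashtable that both arrays reference

$sameArray   = [object]::ReferenceEquals($a, $copy)   # the slice is a distinct array
$secondElem  = $a[1]                                  # original element is untouched
$sharedValue = $a[0].n                                # change is visible via the original
```

Replacing an element in the slice leaves the original array alone, whereas modifying an object that both arrays point to is visible through either one.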
FYI: The spec for ranges in C# 8 is here.
@mklement0
> Returning spans ... We cannot do that in PowerShell, as this behavior would be a breaking change,
It also won't work as ByRef types are not supported:
PSCore (1:46) > $sp = [span[int]]::Empty
Cannot get or set the property or field "Empty" of the ByRef-like type "System.Span`1[System.Int32]". ByRef-like types are not supported in PowerShell.
Could that be supported in some fashion? Certainly, that's a separate issue, but... even so. ^^
Also note that slices in PowerShell aren't based on ranges, they're based on _arrays_. Ranges just generate arrays so you can do things like:
PSCore (1:55) > $a = 1..100
PSCore (1:56) > $a[1,3 + 7..9]
2
4
8
9
10
PSCore (1:57) > $slice = 1,3 + 7..9
PSCore (1:58) > $a[$slice]
2
4
8
9
10
One possible non-colliding alternative solution for slices would be to add code methods to arrays and strings to do the slicing: `First(int n)`, `Last(int n)`, `Slice(int n, int length)`, etc.
@vexx32 As an alternative to `Span<T>`, we could use `ArraySegment<T>`, which works fine in PowerShell.
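As a rough sketch of what that could look like from script code today (the variable names are illustrative, and `::new()` assumes PowerShell 5 or later), an `ArraySegment<T>` acts as a window onto the original array rather than a copy:

```powershell
# Strongly typed source array, so the generic ArraySegment[int] constructor binds.
[int[]] $source = 1..10

# Window over 3 elements starting at offset 2 (values 3, 4, 5); nothing is copied.
$segment = [System.ArraySegment[int]]::new($source, 2, 3)

$before = @($segment) -join ','   # enumerate the window's current contents

# Writing to the underlying array shows through the segment.
$source[2] = 42
$after = @($segment) -join ','
```

This is the opposite of what index-based slicing does in PowerShell, where the result is always a new, independent array.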
Please consider situations where the array is created just before the split. In this case, the result should be everything from the second path element through the end, joined back up as a path. Using `999` is a hack, but it works.
Get-ChildItem -File -Recurse | ForEach-Object { $_.FullName.Split('/')[2..999] -join '/' }
In addition to this, what if the path without the last N elements is needed?
> Using `999` is a hack, but it works.
It only works if the max. upper bound is low enough to be practical; an attempt at generalizing that with `[2..([int]::MaxValue-1)]` does _not_ work (and is prohibitively slow/memory-intensive even if you lower the number to the maximum permissible upper bound).
Other than that, to address your scenarios in the terms of this proposal:
$_.FullName.Split('/')[2..] # 3rd element and all remaining ones
And to get all remaining except the last, say, `2` elements:
$_.FullName.Split('/')[2..@-2] # 3rd element through all but the last 2
I'm a bit late on this one, but in regards to:
> @vexx32 As an alternative to `Span<T>`, we could use `ArraySegment<T>` which works fine in PowerShell.
`Memory<T>` also works. Granted, `Memory<T>` doesn't really work in PowerShell since most of its functionality requires you to call the `Span` property, but improving that handling would be great as well.
Also FWIW, C# 8's slicing doesn't give you a `Span` from an array; it just gives you a sub-array, similar to PS. For example:
object GetSlice(int[] source) => source[0..^1];
Is translated by the compiler to:
object GetSlice(int[] source)
=> RuntimeHelpers.GetSubArray(source, new Range(0, new Index(1, fromEnd: true)));
Also +1 to supporting ^ from C# 8's implementation.