Related: #7713 and #7867
Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings, similar to what grep -o does on Unix-like platforms; e.g.:
# Extract only the parts that match the regex, each on its own line.
PSonUnix> "line1`nline2`nline3" | grep -o '[0-9]'
1
2
3
The equivalent Select-String solution is currently cumbersome:
PS> "line1", "line2", "line3" | Select-String '[0-9]' | ForEach-Object { $_.Matches[0].Value }
1
2
3
If we introduced a switch named -OnlyMatching (see decision below; originally proposed here as -MatchingPartOnly, name TBD), the command could be simplified to:
PS> "line1", "line2", "line3" | Select-String '[0-9]' -OnlyMatching
As an alias, -om could be considered (just -o could break existing code that used it for -OutVariable).
This would also speed up processing, because constructing [Microsoft.PowerShell.Commands.MatchInfo] instances can be bypassed.
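Until such a switch exists, the behavior can be approximated with a small wrapper. The following is only a minimal sketch: the function name Select-MatchingPart and its parameters are hypothetical, not part of PowerShell. Because it still goes through Select-String and its MatchInfo objects, it does not provide the speed-up mentioned above.

```powershell
# Hypothetical helper (not a built-in cmdlet) that emulates the proposed switch.
function Select-MatchingPart {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, Position = 0)]
        [string] $Pattern,

        [Parameter(Mandatory, ValueFromPipeline)]
        [AllowEmptyString()]
        [string] $InputObject,

        # Report every match per line instead of only the first one.
        [switch] $AllMatches
    )
    process {
        $InputObject |
            Select-String -Pattern $Pattern -AllMatches:$AllMatches |
            ForEach-Object { $_.Matches.Value }   # emit only the matched text
    }
}

# Usage: emits the matched digit of each input line as a string.
"line1", "line2", "line3" | Select-MatchingPart '[0-9]'
```

Passing -AllMatches to the helper forwards it to Select-String, so a line with several matches yields several output strings.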
Written as of:
PowerShell Core 6.1.0-preview.4
@SteveL-MSFT Can you approve?
I think this should work for -SimpleMatch too.
@PowerShell/powershell-committee reviewed this; it seems fine to add, but it should match the grep description of the parameter and be called -OnlyMatching
The OnlyMatching name doesn't correlate with "output" 😕
Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings
Seems like a lot of the time, that's all I want -match to do as well; and this would get roughly as close as my imaginary -keep operator (the inverse of -replace):
PS C:\> 'a word and here' | Select-String -AllMatches -MatchingPartOnly -Pattern '\w{4}'
word
here
PS C:\> 'a word and here' | sls -a -m '\w{4}'
word
here
PS C:\> 'a word and here' -keep '\w{4}'
word
here
😃 Yay!
@HumanEquivalentUnit, I quite like the idea of a -keep (or perhaps -extract) operator. Can I suggest you create a feature request for it?
@mklement0 I did: https://github.com/PowerShell/PowerShell/issues/7958, but at the time it looked like a duplicate of your linked issue, which is basically the same request.
Oops! Completely forgot about #7867's -matchall proposal, which is indeed the same in essence - thanks.
@SteveL-MSFT There's one remaining design question to answer:
grep -o returns _multiple_ matches on each line:
$ echo foo | grep -o o
o
o
If we follow this logic, -OnlyMatching would in effect always imply -AllMatches.
However, it may make more sense to have -OnlyMatching only report the _first_ match by default, with the option to report _all_ if -AllMatches is also specified.
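For comparison, this is how the existing .Matches property already behaves with and without -AllMatches; the second design option would mirror it. This is a small illustration that uses only current Select-String features, and the variable names are arbitrary.

```powershell
# Without -AllMatches, .Matches holds only the first match on the line.
$first = 'foo' | Select-String 'o' | ForEach-Object { $_.Matches.Value }

# With -AllMatches, .Matches holds every match on the line.
$all = 'foo' | Select-String 'o' -AllMatches | ForEach-Object { $_.Matches.Value }

"first only: $(@($first).Count) match(es); all: $(@($all).Count) match(es)"
```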
This is related to the question about SimpleMatch + AllMatches.
Yes: here we don't strictly have a backward-compatibility problem, because -OnlyMatching will be a new feature.
Technically we are therefore free to support combining -SimpleMatch and -AllMatches with -OnlyMatching to output all matching literal substrings (even multiple ones on a single line).
However, given the committee decision _not_ to fix #11102 in order to support combining -SimpleMatch and -AllMatches (in the default case, i.e. without -OnlyMatching), this would lead to an awkward asymmetry.
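In the meantime, all occurrences of a literal substring can be extracted by escaping the literal and treating it as a regex. The following is a sketch of that workaround, not the proposed feature; the sample input and variable names are made up for illustration.

```powershell
# Literal text to look for; the '.' must not be treated as a regex wildcard.
$literal = 'a.b'

# [regex]::Escape() turns the literal into an equivalent regex ('a\.b'),
# so -AllMatches can report every occurrence on the line.
$found = 'a.b axb a.b' |
    Select-String -Pattern ([regex]::Escape($literal)) -AllMatches |
    ForEach-Object { $_.Matches.Value }

$found
```

Here 'axb' is skipped because the escaped pattern only matches the literal dot.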
/cc @SteveL-MSFT
I believe we could have symmetry, and the PowerShell Committee could discuss both cases.
If someone is curious about a current way of doing this, you can use the following:
$value = 'aa' #or even [System.IO.File]::ReadLines($filename)
$options = [Text.RegularExpressions.RegexOptions]::IgnoreCase -bor [Text.RegularExpressions.RegexOptions]::CultureInvariant
$regex = 'REGEX'
$properselectstring = [regex]::Matches($value,$regex,$options)
$properselectstring.value