Related: #7713 and #7867
Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings, similar to what grep -o does on Unix-like platforms; e.g.:
# Extract only the parts that match the regex, each on its own line.
PSonUnix> "line1`nline2`nline3" | grep -o '[0-9]'
1
2
3
The equivalent Select-String solution is currently cumbersome:
PS> "line1", "line2", "line3" | Select-String '[0-9]' | ForEach-Object { $_.Matches[0].Value }
1
2
3
If we introduced a switch named -OnlyMatching (see the decision below; originally proposed as -MatchingPartOnly, name TBD, possibly with an alias of -o), the command could be simplified to:
PS> "line1", "line2", "line3" | Select-String '[0-9]' -OnlyMatching
As an alias, -om could be considered (just -o could break existing code that used it for -OutVariable).
This would also speed up processing, because constructing [Microsoft.PowerShell.Commands.MatchInfo] instances can be bypassed.
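In the meantime, the cumbersome pipeline above can be wrapped in a small helper. The following is only a sketch: the function name Select-MatchingPart and its parameters are hypothetical and not part of PowerShell. Because it still goes through Select-String and MatchInfo, it offers the convenience of the proposed switch but none of the potential speed-up.

```powershell
# Hypothetical helper (not part of PowerShell): emits only the matching
# substrings as strings, using the existing Select-String cmdlet.
function Select-MatchingPart {
  param(
    [Parameter(Mandatory, Position = 0)] [string] $Pattern,
    [Parameter(ValueFromPipeline)] [string] $InputObject,
    [switch] $AllMatches
  )
  process {
    $InputObject |
      Select-String -Pattern $Pattern -AllMatches:$AllMatches |
      ForEach-Object {
        # First match only by default; every match if -AllMatches was passed.
        if ($AllMatches) { $_.Matches.Value } else { $_.Matches[0].Value }
      }
  }
}

$digits = "line1", "line2", "line3" | Select-MatchingPart '[0-9]'
$digits
```

Lines without a match produce no output, as with the original pipeline.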
Written as of:
PowerShell Core 6.1.0-preview.4
@SteveL-MSFT Can you approve?
I think this should work for -SimpleMatch too.
@PowerShell/powershell-committee reviewed this. It seems fine to add, but the parameter should match grep's description of the option and be called -OnlyMatching.
The OnlyMatching name doesn't correlate with "output" 😕
Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings
Seems like a lot of the time, that's all I want -match to do as well; and this would get roughly as close as my imaginary -keep operator (the inverse of -replace):
PS C:\> 'a word and here' | Select-String -AllMatches -MatchingPartOnly -Pattern '\w{4}'
word
here
PS C:\> 'a word and here' | sls -a -m '\w{4}'
word
here
PS C:\> 'a word and here' -keep '\w{4}'
word
here
😃 Yay!
@HumanEquivalentUnit, I quite like the idea of a -keep (or perhaps -extract) operator. Can I suggest you create a feature request for it?
@mklement0 I did: https://github.com/PowerShell/PowerShell/issues/7958. At the time, though, it looked like a duplicate of your linked issue, which is basically the same request.
Oops! Completely forgot about #7867's -matchall proposal, which is indeed the same in essence - thanks.
@SteveL-MSFT There's one remaining design question to answer:
grep -o returns _multiple_ matches on each line:
$ echo foo | grep -o o
o
o
If we follow this logic, -OnlyMatching would in effect always imply -AllMatches.
However, it may make more sense to have -OnlyMatching only report the _first_ match by default, with the option to report _all_ if -AllMatches is also specified.
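Both candidate behaviors can already be emulated with today's Select-String, which may help when weighing the default. A minimal sketch, assuming only the existing -AllMatches switch and the Matches property of the MatchInfo output; the variable names are illustrative:

```powershell
# Candidate default: report only the first match per input line.
$first = 'foo' | Select-String 'o' | ForEach-Object { $_.Matches[0].Value }

# grep -o behavior: report every match per input line.
$all = 'foo' | Select-String 'o' -AllMatches | ForEach-Object { $_.Matches.Value }

$first   # a single string: o
$all     # two strings: o, o
```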
This is related to the question about combining -SimpleMatch and -AllMatches.
Yes: here we don't strictly have a backward-compatibility problem, because -OnlyMatching will be a new feature.
Technically we are therefore free to support combining -SimpleMatch and -AllMatches with -OnlyMatching to output all matching literal substrings (even multiple ones on a single line).
However, given the committee decision _not_ to fix #11102 in order to support combining -SimpleMatch and -AllMatches (in the default, non--OnlyMatching case), this would lead to an awkward asymmetry.
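Whatever is decided, all literal occurrences can currently be extracted by escaping the search string and matching in regex mode. This is a workaround sketch, not the proposed behavior; it assumes [regex]::Escape() is an acceptable stand-in for -SimpleMatch, and the sample strings are made up for illustration:

```powershell
# Escape regex metacharacters so 'a.b' is matched literally,
# then collect every occurrence on the line via -AllMatches.
$literal = 'a.b'
$found = 'a.b axb a.b' |
  Select-String -Pattern ([regex]::Escape($literal)) -AllMatches |
  ForEach-Object { $_.Matches.Value }

$found   # two strings: a.b, a.b ('axb' is skipped because '.' is escaped)
```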
/cc @SteveL-MSFT
I believe we could have symmetry, and the PowerShell Committee could discuss both cases.
If anyone is curious about a way of doing this today, you can use the following:
$value = 'aa'  # or even [System.IO.File]::ReadLines($filename)
# Combine regex options with a bitwise OR.
$options = [Text.RegularExpressions.RegexOptions]::IgnoreCase -bor [Text.RegularExpressions.RegexOptions]::CultureInvariant
$regex = 'REGEX'  # placeholder: substitute your own pattern
$properselectstring = [regex]::Matches($value, $regex, $options)
# Output only the matched substrings, as strings.
$properselectstring.Value