PowerShell: Add a switch to Select-String that returns the matching parts only, analogous to grep -o

Created on 5 Sep 2018 · 14 Comments · Source: PowerShell/PowerShell

Related: #7713 and #7867

Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings, similar to what grep -o does on Unix-like platforms; e.g.:

# Extract only the parts that match the regex, each on its own line.
PSonUnix> "line1`nline2`nline3" | grep -o '[0-9]'
1
2
3

The equivalent Select-String solution is currently cumbersome:

PS> "line1", "line2", "line3" | Select-String '[0-9]' | ForEach-Object { $_.Matches[0].Value }
1
2
3

If we introduced a switch named -OnlyMatching (see decision below; the originally proposed name was -MatchingPartOnly, with a possible alias of -o), the command could be simplified to:

PS> "line1", "line2", "line3" | Select-String '[0-9]' -OnlyMatching

As an alias, -om could be considered (just -o could break existing code that used it for -OutVariable).

This would also speed up processing, because constructing [Microsoft.PowerShell.Commands.MatchInfo] instances can be bypassed.
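
To make that overhead concrete, here is a minimal sketch of what Select-String emits today for one matching line. It relies only on existing behavior; the variable name $m is just for illustration.

```powershell
# Select-String wraps every matching line in a MatchInfo object.
$m = 'line1' | Select-String '[0-9]'

$m.GetType().FullName    # Microsoft.PowerShell.Commands.MatchInfo
$m.Line                  # the whole input line: line1
$m.Matches[0].Value      # the matched part only: 1
```

With the proposed switch, only the last value would be needed, so the wrapper object could presumably be skipped.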

Environment data

Written as of:

PowerShell Core 6.1.0-preview.4
Area-Cmdlets-Utility Committee-Reviewed First-Time-Issue Hacktoberfest Issue-Enhancement Up-for-Grabs

All 14 comments

@SteveL-MSFT Can you approve?

I think this should work for -SimpleMatch too.

@PowerShell/powershell-committee reviewed this. It seems fine to add, but the parameter should match grep's description of the option and be called -OnlyMatching.

The OnlyMatching name doesn't correlate with "output" 😕

> Sometimes all you want Select-String to do is to output only the matching parts of the input lines as strings

Seems like a lot of the time, that's all I want -match to do as well; and this would get roughly as close as my imaginary -keep operator (the inverse of -replace):

PS C:\> 'a word and here' | Select-String -AllMatches -MatchingPartOnly -Pattern '\w{4}'
word
here
PS C:\> 'a word and here' | sls -a -m '\w{4}'
word
here
PS C:\> 'a word and here' -keep '\w{4}'
word
here

😃 Yay!
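
Until something like that exists, the imagined -keep behavior can be approximated with a small function. This is only a sketch: Get-MatchingPart is a hypothetical name, not part of PowerShell, and it calls [regex]::Matches directly, so matching is case-sensitive unlike the -match operator.

```powershell
# Hypothetical helper (not part of PowerShell): emulates the imagined -keep operator.
function Get-MatchingPart {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] [string] $InputString,
        [Parameter(Mandatory, Position = 0)] [string] $Pattern
    )
    process {
        # [regex]::Matches returns every match in the string; emit each matched value.
        foreach ($match in [regex]::Matches($InputString, $Pattern)) {
            $match.Value
        }
    }
}

'a word and here' | Get-MatchingPart '\w{4}'
# word
# here
```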

@HumanEquivalentUnit, I quite like the idea of a -keep (or perhaps -extract) operator. Can I suggest you create a feature request for it?

@mklement0 I did: https://github.com/PowerShell/PowerShell/issues/7958. At the time, though, it looked like a duplicate of your linked issue, which is basically the same request.

Oops! Completely forgot about #7867's -matchall proposal, which is indeed the same in essence - thanks.

@SteveL-MSFT There's one remaining design question to answer:

grep -o returns _multiple_ matches on each line:

$ echo foo | grep -o o
o
o

If we follow this logic, -OnlyMatching would effectively always imply -AllMatches.

However, it may make more sense to have -OnlyMatching only report the _first_ match by default, with the option to report _all_ if -AllMatches is also specified.
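
The two candidate behaviors can be reproduced today with existing parameters. This is a sketch of the difference, not of how the new switch would be implemented; $first and $all are illustrative variable names.

```powershell
# Option 1: report only the first match per line (the suggested default).
$first = 'foo' | Select-String 'o' | ForEach-Object { $_.Matches[0].Value }
$first
# o

# Option 2: report every match per line, as grep -o does; this needs -AllMatches.
$all = 'foo' | Select-String 'o' -AllMatches | ForEach-Object { $_.Matches.Value }
$all
# o
# o
```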

This is related to the question about -SimpleMatch + -AllMatches.

Yes: here we don't strictly have a backward-compatibility problem, because -OnlyMatching will be a new feature.

Technically we are therefore free to support combining -SimpleMatch and -AllMatches with -OnlyMatching to output all matching literal substrings (even multiple ones on a single line).

However, given the committee decision _not_ to fix #11102 in order to support combining -SimpleMatch and -AllMatches (in the default, non--OnlyMatching case), this would lead to an awkward asymmetry.
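
For reference, all literal substrings on a line can already be extracted without -SimpleMatch by escaping the search text and using it as a regex. This is a workaround sketch with existing features; $literal and the sample input are made up for illustration.

```powershell
# Literal text to search for; '.' would act as a wildcard if used as a raw regex.
$literal = 'a.b'

# [regex]::Escape turns the literal into a regex that matches it verbatim,
# so -AllMatches can be used without -SimpleMatch.
$found = 'a.b axb a.b' |
    Select-String -Pattern ([regex]::Escape($literal)) -AllMatches |
    ForEach-Object { $_.Matches.Value }
$found
# a.b
# a.b
```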

/cc @SteveL-MSFT

I believe we could have symmetry, and the PowerShell Committee could discuss both cases:

  • -SimpleMatch + -AllMatches
  • -OnlyMatching + -AllMatches

If anyone is curious about a current way of doing this, you can use:

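$value = 'aa'  # or even [System.IO.File]::ReadLines($filename)
# Case-insensitive, culture-invariant matching
$options = [Text.RegularExpressions.RegexOptions]::IgnoreCase -bor [Text.RegularExpressions.RegexOptions]::CultureInvariant
$regex = 'REGEX'  # placeholder: substitute your own pattern
# Collect all matches, then output only the matched text
$properselectstring = [regex]::Matches($value, $regex, $options)
$properselectstring.Value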