Ripgrep: Output only matched patterns (data extraction)

Created on 14 Jan 2017 · 4 Comments · Source: BurntSushi/ripgrep

Hi,

I've just been trying ripgrep for the first time, but I couldn't find an easy, straightforward way to output only the text that matches the pattern.

For example, trying to search in a file like this:
rg "(\{.*\}\})" example.txt

The regex matching works as intended: the correct result is highlighted in bold red text.

But when I tried to redirect ("pipe") the output to another program, I realized that the actual output is still the whole, unchanged input line.

Am I missing something here? I couldn't find an option for just this output. It seemed pretty straightforward to me, but neither rg --help nor the front page here seems to mention anything.

All 4 comments

It's a missing feature: #34. It's on my list of things to do soon, but it's part of a larger refactoring effort, so it might take a little longer.

There is a workaround today though. You can use the -r/--replace flag:

rg "(\{.*\}\})" example.txt -r '$1'

I think that should do it. (On mobile, so I haven't checked.)

rg "(\{.*\}\})" example.txt -r '$1'

Not really. Replacing works as intended apparently, in this case:

  • match and take capture group 1
  • replace with reference 1, i.e. capture group 1 again

But the thing is, the actual output is still unchanged, because the program always returns the whole line where the match was found.
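The behavior described above can be reproduced outside ripgrep. The following is a minimal sketch that uses Python's re module as a stand-in for ripgrep's regex engine (an assumption; the two engines are not identical, but substitution behaves the same way for this pattern). The sample line is hypothetical, since the contents of example.txt are not shown in this thread. Replacing a partial match with its own capture group leaves the rest of the line in place, so the output equals the input.

```python
import re

# Hypothetical sample line; the real example.txt is not reproduced here.
line = 'id 40 {"a": {"b": 1}} tail'

# Same pattern as in the thread: it matches only part of the line.
pattern = re.compile(r'(\{.*\}\})')

# Roughly what -r '$1' does: swap the matched text for capture group 1.
# Text outside the match is left untouched, so nothing changes.
replaced = pattern.sub(r'\1', line)

print(replaced == line)
```

Because the text before and after the match is never part of the replacement, it survives into the output.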

To help illustrate things better, I made an example file: https://fnpaste.com/YZvd

But I found a way around this: changing the regex to match the whole line, and then replacing the whole line with capture group 1.

For example:
rg ".*40 (\{.*\}\}).*" example.txt -r "$1"

This gives me the desired part. But there is a big caveat: I use the 40 as an anchor to make the group match the correct part, and that only works for this example. In general, I don't know the value.

I tried ".*(\{.*\}\}).*" in my text editor and it actually works there, whereas the same pattern only matches the last {...}} part of the target inside the line with rg.

I'm not really familiar with the intricacies of regular expressions in Rust, so this needs a bit of tweaking, I guess, although it should definitely be possible.

You're right. I forgot to add that you need to make your regex match the entire line in order for -r/--replace to work like -o/--only-matching.

I think this gives the desired output for you though:

$ rg '^.*?(\{.*\}\}).*?$' example.txt -r '$1'

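The trick is to make the .* lazy by using .*?. When it's lazy, it matches as little as possible. Otherwise it is greedy and matches as much as possible, so a leading greedy .* swallows everything up to the last place where the capture group can still match.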
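To make the greedy vs. lazy difference concrete, here is a small sketch. It again uses Python's re module as a stand-in for ripgrep's regex engine (an assumption), and the sample line and the extract helper are hypothetical and not taken from example.txt. It compares the greedy whole-line pattern from earlier in the thread with the lazy one above.

```python
import re

# Hypothetical sample line with a nested {...}} target.
line = 'id 40 {"a": {"b": 1}} tail'

# Greedy leading .* : consumes as much as it can before the group.
greedy = re.compile(r'.*(\{.*\}\}).*')
# Lazy leading .*? : consumes as little as it can before the group.
lazy = re.compile(r'^.*?(\{.*\}\}).*?$')


def extract(pattern, text):
    """Return capture group 1 of the first match, or None if no match."""
    m = pattern.search(text)
    return m.group(1) if m else None


greedy_result = extract(greedy, line)
lazy_result = extract(lazy, line)

print(greedy_result)  # only the innermost, last {...}} part
print(lazy_result)    # the whole target, from the first { to the last }}
```

With the greedy prefix, the group starts at the last `{` in the line. With the lazy prefix, it starts at the first one, which is the behavior wanted here.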

Thanks!

That did the trick just fine. Greedy vs. lazy, obviously.

I'll close this issue for now.
We have a good workaround, and you already said that this is on your list of things to do, so I guess all is fine.
