Ripgrep: Panic when searching Ruby language interpreter source code

Created on 13 Sep 2018 · 2 Comments · Source: BurntSushi/ripgrep

What version of ripgrep are you using?

jack@happy:~/Downloads/ruby-2.5.1$ rg --version
ripgrep 0.10.0
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

How did you install ripgrep?

cargo install ripgrep

What operating system are you using ripgrep on?

Ubuntu 18.04.1

jack@happy:~/Downloads/ruby-2.5.1$ uname -a
Linux happy 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

If this is a bug, what are the steps to reproduce the behavior?

Download the Ruby source code and search in its directory.

jack@happy:/tmp$ curl --silent https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.1.tar.gz > ruby-2.5.1.tar.gz
jack@happy:/tmp$ tar xf ruby-2.5.1.tar.gz 
jack@happy:/tmp$ cd ruby-2.5.1/
jack@happy:/tmp/ruby-2.5.1$ rg foobarbazquz
thread '<unnamed>' panicked at 'index out of bounds: the len is 1 but the index is 1', /home/jack/.cargo/registry/src/github.com-1ecc6299db9ec823/encoding_rs-0.8.6/src/handles.rs:309:21
note: Run with `RUST_BACKTRACE=1` for a backtrace.
^C

The process then hangs.

If this is a bug, what is the actual behavior?

Command: rg --debug foobarbazquz

Output: https://gist.github.com/jackc/06e3cd8ce8ae238e6762249564cc1a76

If this is a bug, what is the expected behavior?

ripgrep should not panic.

bug

All 2 comments

Interesting! It looks like the panic is coming from inside encoding_rs, but it could still be ripgrep's fault. I'll need to dig into this and come up with a smaller reproduction. Thanks for reporting this!

A smaller reproduction of this bug:

$ rg ZZZZZ ruby-2.5.1/test/rexml/data/t63-2.svg

It turns out that this svg file starts with a UTF-16LE BOM, but does not actually appear to be UTF-16. In any case, this trips over a corner case in the streaming transcoder that in turn causes a panic. The fault is either in my implementation of the streaming transcoder (by not upholding some precondition of the transcoder) or in the implementation of the encoding handling itself. Either way, I filed a bug to get to the bottom of it: https://github.com/hsivonen/encoding_rs/issues/34
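To illustrate the mismatch described above (a UTF-16LE BOM followed by bytes that are not UTF-16), here is a minimal sketch using only the Rust standard library. It is not ripgrep's or encoding_rs's code: the names `Bom`, `sniff_bom`, and `is_well_formed_utf16le` are hypothetical, and the well-formedness check is only a rough heuristic, since plain ASCII with an even byte count often decodes as valid (if meaningless) UTF-16.

```rust
/// Byte-order marks a searcher might sniff at the start of a file.
/// (Hypothetical type for illustration; not part of ripgrep or encoding_rs.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Bom {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Returns the BOM found at the start of `bytes`, if any.
fn sniff_bom(bytes: &[u8]) -> Option<Bom> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some(Bom::Utf8)
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some(Bom::Utf16Le)
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some(Bom::Utf16Be)
    } else {
        None
    }
}

/// Rough heuristic: true only if `bytes` starts with a UTF-16LE BOM and the
/// rest splits into whole 16-bit units with no unpaired surrogates.
fn is_well_formed_utf16le(bytes: &[u8]) -> bool {
    if sniff_bom(bytes) != Some(Bom::Utf16Le) {
        return false;
    }
    let payload = &bytes[2..];
    let chunks = payload.chunks_exact(2);
    // A dangling odd byte cannot be part of a UTF-16 code unit.
    if !chunks.remainder().is_empty() {
        return false;
    }
    let units = chunks.map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
    std::char::decode_utf16(units).all(|r| r.is_ok())
}

fn main() {
    // Made-up input: a UTF-16LE BOM followed by a single stray byte.
    let suspicious = [0xFF, 0xFE, b'<'];
    println!("BOM: {:?}", sniff_bom(&suspicious));
    println!("well-formed UTF-16LE: {}", is_well_formed_utf16le(&suspicious));
}
```

A check like this only shows why a BOM alone is a weak signal of a file's real encoding. It says nothing about which precondition the streaming transcoder may have violated.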

For now, I patched this via a workaround in the streaming transcoder, although it's not clear that the root cause has been fixed. PR #1065 brings the workaround into ripgrep.

Thanks so much for reporting this!
