Rubocop: Invalid autocorrection with Style/RedundantRegexpCharacterClass

Created on 26 Oct 2020  ·  6 Comments  ·  Source: rubocop-hq/rubocop

Expected behavior

The autocorrected regex is still a valid regex.

Actual behavior

The offense is only partially corrected, and the result is an invalid regex. This is similar to the problem that was fixed in https://github.com/rubocop-hq/rubocop/pull/8913, but it still happens on current master, which includes that fix.

Steps to reproduce the problem

Autocorrect:

MATCHES_CHEF_GEM ||= %r{/chef-[\d]+\.[\d]+\.[\d]+}.freeze

It is corrected to:

MATCHES_CHEF_GEM ||= %r{/che-[\d+.[\d+.[\d+}.freeze
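
A quick way to confirm the report in plain Ruby, as a sketch only. The strings are the regex bodies from the two lines above; `valid_regexp?` is a hypothetical helper (not part of RuboCop), and the `expected` body is an assumption about what a correct autocorrection would produce.

```ruby
# Hypothetical helper: true if the string compiles as a Ruby regexp.
def valid_regexp?(body)
  Regexp.new(body)
  true
rescue RegexpError
  false
end

original = '/chef-[\d]+\.[\d]+\.[\d]+' # body before autocorrection
expected = '/chef-\d+\.\d+\.\d+'       # assumed intended result
actual   = '/che-[\d+.[\d+.[\d+'       # body reported in this issue

sample = '/gems/chef-16.5.77/lib' # made-up path for illustration

puts valid_regexp?(original)                # true
puts valid_regexp?(expected)                # true
puts valid_regexp?(actual)                  # false (unterminated character class)
puts Regexp.new(original).match?(sample)    # true
puts Regexp.new(expected).match?(sample)    # true
```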

RuboCop version

master
bug

All 6 comments

@ysakasin did you want to have a look?

@marcandre Yes, I'll look into it.

Let me just throw in another simple broken example.
While this correctly auto-corrects,
/OK[\d]/

this one fails to find the position of [
/だめ[\d]/

because Regexp::Expression::CharacterSet#ts returns the value based on bytesize instead of String length in Ruby.

> because Regexp::Expression::CharacterSet#ts returns the value based on bytesize instead of String length in Ruby.

Ohhh. That's a different issue though, and should probably be fixed in the regexp parser gem...
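
To make the byte-versus-character mismatch concrete, here is a minimal sketch in plain Ruby. It does not use the regexp_parser gem; `byte_to_char_offset` is a hypothetical helper, shown only as one way a byte-based offset could be translated into a character-based one.

```ruby
source = 'だめ[\d]' # each of the two kana takes 3 bytes in UTF-8

byte_offset = source.b.index('[') # position counted in bytes
char_offset = source.index('[')   # position counted in characters

puts byte_offset # 6
puts char_offset # 2

# Hypothetical helper: convert a byte offset into a character offset
# by counting the characters that precede it.
def byte_to_char_offset(str, byte_offset)
  str.byteslice(0, byte_offset).length
end

puts byte_to_char_offset(source, byte_offset) # 2

# For an ASCII-only source the two offsets agree, which is why /OK[\d]/ works.
ascii = 'OK[\d]'
puts ascii.b.index('[') == ascii.index('[') # true
```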

Examples

ysakasin@mac:~/s/rubocop|master⚡?
➤ cat a.rb
/[\d]+/
%r{[\d]+}

/abc[\d]+/
%r{abc[\d]+}

/OK[\d]/
/だめ[\d]/
ysakasin@mac:~/s/rubocop|master⚡?
➤ bundle exec exe/rubocop a.rb -a
Inspecting 1 file
An error occurred while Style/RedundantRegexpCharacterClass cop was inspecting /Users/ysakasin/src/rubocop/a.rb:8:0.
To see the complete backtrace run rubocop -d.
E

Offenses:

a.rb:1:2: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, [\d] can be replaced with \d.
/[\d]+/
 ^^^^
a.rb:2:1: C: [Corrected] Style/RegexpLiteral: Use // around regular expression.
%r{[\d]+}
^^^^^^^^^
a.rb:2:2: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, r{[\d] can be replaced with {[\d.
%r{[\d]+}
 ^^^^^^
a.rb:4:5: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, [\d] can be replaced with \d.
/abc[\d]+/
    ^^^^
a.rb:5:1: E: Lint/Syntax: premature end of char-class: /ac[\d+/
(Using Ruby 2.4 parser; configure using TargetRubyVersion parameter, under AllCops)
%r{ac[\d+}
^^^^^^^^^^
a.rb:5:1: C: [Corrected] Style/RegexpLiteral: Use // around regular expression.
%r{abc[\d]+}
^^^^^^^^^^^^
a.rb:5:5: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, bc[\d] can be replaced with c[\d.
%r{abc[\d]+}
    ^^^^^^
a.rb:7:4: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, [\d] can be replaced with \d.
/OK[\d]/
   ^^^^

1 file inspected, 8 offenses detected, 7 offenses corrected

1 error occurred:
An error occurred while Style/RedundantRegexpCharacterClass cop was inspecting /Users/ysakasin/src/rubocop/a.rb:8:0.
Errors are usually caused by RuboCop bugs.
Please, report your problems to RuboCop's issue tracker.
https://github.com/rubocop-hq/rubocop/issues

Mention the following information in the issue report:
1.0.0 (using Parser 2.7.2.0, rubocop-ast 1.0.1, running on ruby 2.7.1 x86_64-darwin19)
ysakasin@mac:~/s/rubocop|master⚡?
➤ cat a.rb
/\d+/
%{[\d+}

/abc\d+/
%r{ac[\d+}

/OK\d/
/だめ[\d]/
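
The before and after contents of a.rb are consistent with the replaced range starting two characters too early for %r{...} literals, as if the opening delimiter were assumed to be one character long. The sketch below only simulates that assumption in plain Ruby. `simulate_correction` and its `assumed_opening_length` parameter are hypothetical; this is not RuboCop's actual code.

```ruby
CLASS_SOURCE = '[\d]'.freeze

# Hypothetical simulation: the range start uses `assumed_opening_length`,
# the range end uses the real opening delimiter length, and the replacement
# drops the first and last characters of the flagged range.
def simulate_correction(source, opening, assumed_opening_length)
  content_start = opening.length
  class_offset  = source.index(CLASS_SOURCE, content_start) - content_start
  range_begin   = assumed_opening_length + class_offset
  range_end     = content_start + class_offset + CLASS_SOURCE.length
  flagged       = source[range_begin...range_end]
  source[0...range_begin] + flagged[1..-2] + source[range_end..-1]
end

# With "/" the assumption holds and the result is fine.
puts simulate_correction('/abc[\d]+/', '/', 1)     # /abc\d+/

# With "%r{" the start is off by two, reproducing the output above.
puts simulate_correction('%r{[\d]+}', '%r{', 1)    # %{[\d+}
puts simulate_correction('%r{abc[\d]+}', '%r{', 1) # %r{ac[\d+}

# Using the real delimiter length gives the intended result.
puts simulate_correction('%r{abc[\d]+}', '%r{', 3) # %r{abc\d+}
```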

> Ohhh. That's a different issue though, and should probably be fixed in the regexp parser gem...

Right. It seems like a problem on their side.
But I guess regexp_parser behaves this way by design, so that it can correctly handle things like graphemes (this is only my guess).
And I suppose it would be hard to ask the scanner to keep both byte-based and character-based sizes...
