Rubocop: Misleading error message: Use casecmp instead of downcase ==

Created on 19 Feb 2017  ·  10 Comments  ·  Source: rubocop-hq/rubocop

For performance reasons, RuboCop warns against doing case-insensitive string comparison with downcase: http://www.rubydoc.info/gems/rubocop/RuboCop/Cop/Performance/Casecmp

The output message for this pattern is Use casecmp instead of downcase ==, but this is misleading. casecmp, as a comparison operator, returns -1, 0, or 1, all of which are truthy, rather than true or false. Therefore a direct replacement of a.downcase == b.downcase with a.casecmp(b) will have incorrect behavior.
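To make the pitfall concrete, here is a minimal sketch of the three variants. The helper method names are hypothetical and exist only for illustration; they are not part of RuboCop.

```ruby
# Hypothetical helper names, used only for illustration.

# Naive replacement: treats the Integer result of casecmp as a boolean.
def naive_equal?(a, b)
  a.casecmp(b) ? true : false
end

# The form the autocorrect produces: compare the result against zero.
def casecmp_equal?(a, b)
  a.casecmp(b).zero?
end

# The original pattern flagged by the cop.
def downcase_equal?(a, b)
  a.downcase == b.downcase
end

puts naive_equal?("Hello", "world")     # true, although the strings differ
puts casecmp_equal?("Hello", "world")   # false
puts downcase_equal?("Hello", "world")  # false
puts casecmp_equal?("Hello", "HELLO")   # true
```

The naive version answers true for any pair of compatible strings, because -1, 0, and 1 are all truthy in Ruby.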

RuboCop autocorrect does correctly replace this with a.casecmp(b).zero?, but the error message should indicate that the call to zero? is also needed.

Maybe: Use casecmp and zero? instead of downcase ==

$ rubocop -V
0.40.0 (using Parser 2.3.1.2, running on ruby 2.3.1 x86_64-linux)

All 10 comments

That sounds like a reasonable change to me.

Would you like to create the pull request? The message is coming from this line in Performance/Casecmp. There are a few references to the message in the spec too.

Opened a PR for this.

@harman28 Did you mean to open a PR? I can only see a commit above. Let me know if I can help.

@leoi11: Feel free to take a stab at it. 🙂

In Ruby 2.4+, they added casecmp? (with the question mark), which will return true and false values instead of integers.

"abcdef".casecmp?("abcde")     #=> false
"aBcDeF".casecmp?("abcdef")    #=> true
"abcdef".casecmp?("abcdefg")   #=> false
"abcdef".casecmp?("ABCDEF")    #=> true

What does everyone think about adjusting the message to suggest using casecmp? instead of casecmp.zero?, provided they are on Ruby 2.4+?
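For code that has to run on both sides of the 2.4 boundary, one possible approach is to check for the method at runtime. This is only a sketch; same_ignoring_case? is a hypothetical helper name, not something RuboCop provides or suggests.

```ruby
# Hypothetical helper: prefer casecmp? where it exists (Ruby 2.4+),
# otherwise fall back to casecmp(...).zero?.
def same_ignoring_case?(a, b)
  if a.respond_to?(:casecmp?)
    # casecmp? returns nil for incompatible encodings, so compare against true.
    a.casecmp?(b) == true
  else
    a.casecmp(b).zero?
  end
end

puts same_ignoring_case?("aBcDeF", "abcdef")  # true
puts same_ignoring_case?("abcdef", "abcde")   # false
```

Note that the two branches only agree for ASCII input: casecmp folds A-Z only, while casecmp? applies Unicode case folding.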

Is casecmp UTF-8 compatible yet in Ruby 2.5 and later? If you use UTF-8 strings, you cannot use casecmp; you must compare with downcase/upcase and == instead.

@pkilpo Can you show some examples, like I did in my comment, just to make the issue with UTF-8 very clear?

Hi,
casecmp? works with 'Ö', 'ä', etc., but from what I found by searching, it performs about the same as 'Ä'.downcase == 'ä'.downcase. I have not run any performance tests myself, however. If that's true, then there is no reason to have this performance warning. That should be verified; I can try to run some tests later.
Here is one link https://github.com/rubocop-hq/rubocop/issues/4277

As of Ruby 2.6.1 (String documentation):

  • downcase does full unicode case mapping by default, but does not do context-dependent case mapping (i.e. adjacent characters are not considered). There are options for Turkic and Lithuanian languages when an optional argument is supplied.
  • casecmp (no ?) operates exclusively on A-Za-z (i.e. ASCII alpha). Non-ASCII characters that differ only in case will not be seen as matching. E.g. "é" and "É" will not match.
  • casecmp? uses unicode case folding (i.e. "str".downcase(:fold)) which is neither context-aware nor language-aware, but supports a wider range of characters and should be more performant.
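The three behaviors listed above can be checked with a few one-liners. This is a small sketch assuming Ruby 2.4 or later and a UTF-8 source file; the sample strings are my own.

```ruby
puts "ÀÉÎ".downcase           # full Unicode mapping: "àéî"
puts "I".downcase             # default mapping: "i"
puts "I".downcase(:turkic)    # Turkic option: dotless "ı"
puts "ß".downcase(:fold)      # case folding: "ss"
puts "é".casecmp("É").zero?   # false, casecmp only folds A-Z
puts "é".casecmp?("É")        # true, casecmp? uses Unicode case folding
```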

Suggesting a.casecmp(b).zero? as a substitution for a.downcase == b.downcase is inappropriate and liable to introduce bugs when working with non-ASCII strings (i.e., when used with real data). Using casecmp? will produce slightly different results as well, but when appropriate Ruby versions are targeted, it's unlikely to produce unwanted behavior, though it could perhaps be opt-in to err on the side of caution.

a = "héllo"
b = "HÉLLO"
a.downcase == b.downcase # => true
a.casecmp(b).zero? # => false
a.casecmp?(b) # => true
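As an illustration of the "slightly different results" between downcase == and casecmp?, here is one case I am aware of where they disagree, assuming Ruby 2.4 or later. The sample strings are my own choice.

```ruby
a = "Straße"
b = "STRASSE"

puts a.downcase == b.downcase  # false: "straße" vs "strasse"
puts a.casecmp?(b)             # true: case folding maps "ß" to "ss"
```

So a switch from downcase == to casecmp? can turn a non-match into a match for strings like these.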

Hi, have you looked at this issue? Somebody compared downcase + == against casecmp?, and it looks like casecmp? was even slower. So replacing downcase with casecmp? is not good from a performance point of view. Read it fully:
https://github.com/rubocop-hq/rubocop/issues/4277
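Anyone who wants to check the performance claim on their own Ruby version could start from a rough sketch like this one, which uses the standard library Benchmark module. The method name, inputs, and iteration count are hypothetical, and results will vary with the Ruby version and the strings used, so no timings are shown.

```ruby
require "benchmark"

# Hypothetical helper: times each comparison style over n iterations
# and returns the elapsed seconds per style.
def time_comparisons(a, b, n)
  {
    downcase_eq:  Benchmark.realtime { n.times { a.downcase == b.downcase } },
    casecmp_zero: Benchmark.realtime { n.times { a.casecmp(b).zero? } },
    casecmp_p:    Benchmark.realtime { n.times { a.casecmp?(b) } }
  }
end

# Hypothetical sample input; real data (length, non-ASCII content) may behave differently.
time_comparisons("Hello World", "hELLO wORLD", 200_000).each do |name, seconds|
  puts format("%-14s %.4fs", name, seconds)
end
```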
