After updating to 0.88 I noticed a lot of increases in CyclomaticComplexity in our codebase (= lots of new offenses); then, after 0.89, AbcSize and PerceivedComplexity grew drastically, too.
(I believe that it is not a good thing either way: after updating of large-ish codebase to Rubocop you suddenly find dozens of new offenses, and urged either to raise thresholds which felt previously "reasonable", or refactor a lot of code.)
In order to understand what exactly the numbers mean, and how to tackle complexity the best way, I wrote a draft of "expalnatory analyzer" script (ugly!): https://gist.github.com/zverok/5e858fac96a2e9f5282bbeab8fec2d39 -- which is just copy-pasted AbcSizeCalculator which adds meta-info about which node caused increase in one of A,B,C metrics. Now, the result for one method:
def setup_timecodes(interval)
words = paragraphs.flat_map(&:words)
# Clear previous, so exporter could be run with different parameters several time.
words.each { |w| w.timecodes = [] }
return unless interval
current = job.timecode || 0
words.first.timecodes << current
words.each do |w|
next if w.punct?
# if the pause is large, there could be several interval marks before the current word
while current + interval <= w.time.to_i
current += interval
w.timecodes << current
end
end
end
"Intuitively", it is "not good, not terrible" (YMMV), and Rubocop on our previous setting (AbcSize Max 17) was happy with it. After 0.89, Rubocop says 19.6. Here is the script's output with the investigation (note that line is repeated if several nodes of this line caused metric increase):
Outcome: sqrt(8 assignments虏 + 16 branches虏 + 8 conditions虏) = 19.6
def setup_timecodes(interval)
# ^^^^^^^^ +assignment
words = paragraphs.flat_map(&:words)
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +assignment
words = paragraphs.flat_map(&:words)
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +branch
words = paragraphs.flat_map(&:words)
# ^^^^^^^^^^ +branch
words = paragraphs.flat_map(&:words)
# ^^^^^^^ +condition
words.each { |w| w.timecodes = [] }
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition
words.each { |w| w.timecodes = [] }
# ^^^^^^^^^^ +branch, +branch
words.each { |w| w.timecodes = [] }
# ^ +assignment, +assignment
words.each { |w| w.timecodes = [] }
# ^^^^^^^^^^^^^^^^ +assignment, +branch
return unless interval
# ^^^^^^^^^^^^^^^^^^^^^^ +condition
current = job.timecode || 0
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^ +assignment
current = job.timecode || 0
# ^^^^^^^^^^^^^^^^^ +condition
current = job.timecode || 0
# ^^^^^^^^^^^^ +branch
current = job.timecode || 0
# ^^^ +branch
words.first.timecodes << current
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +branch
words.first.timecodes << current
# ^^^^^^^^^^^^^^^^^^^^^ +branch
words.first.timecodes << current
# ^^^^^^^^^^^ +branch
words.each do |w|
# ^^^^^^^^^^^^^^^^^ +condition
next if w.punct?
# ^^^^^^^^^^^^^^^^ +condition
next if w.punct?
# ^^^^^^^^ +branch
while current + interval <= w.time.to_i
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition
while current + interval <= w.time.to_i
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition (comparison)
while current + interval <= w.time.to_i
# ^^^^^^^^^^^^^^^^^^ +branch
while current + interval <= w.time.to_i
# ^^^^^^^^^^^ +branch
while current + interval <= w.time.to_i
# ^^^^^^ +branch
current += interval
# ^^^^^^^^^^^^^^^^^^^ +assignment
current += interval
# ^^^^^^^ +assignment
w.timecodes << current
# ^^^^^^^^^^^^^^^^^^^^^^ +branch
w.timecodes << current
# ^^^^^^^^^^^ +branch
I am not 100% sure script output always points to "right" nodes, but I really hope so... Here, I can see several suspicious things (repeating the same output with additional comments):
Outcome: sqrt(8 assignments虏 + 16 branches虏 + 8 conditions虏) = 19.6
def setup_timecodes(interval)
# ^^^^^^^^ +assignment
words = paragraphs.flat_map(&:words)
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +assignment
words = paragraphs.flat_map(&:words)
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +branch
words = paragraphs.flat_map(&:words)
# ^^^^^^^^^^ +branch
words = paragraphs.flat_map(&:words)
# ^^^^^^^ +condition -- is it?
words.each { |w| w.timecodes = [] }
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition -- is it?
words.each { |w| w.timecodes = [] }
# ^^^^^^^^^^ +branch, +branch
words.each { |w| w.timecodes = [] }
# ^ +assignment, +assignment -- two assignments?..
words.each { |w| w.timecodes = [] }
# ^^^^^^^^^^^^^^^^ +assignment, +branch
return unless interval
# ^^^^^^^^^^^^^^^^^^^^^^ +condition
current = job.timecode || 0
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^ +assignment
current = job.timecode || 0
# ^^^^^^^^^^^^^^^^^ +condition
current = job.timecode || 0
# ^^^^^^^^^^^^ +branch
current = job.timecode || 0
# ^^^ +branch
words.first.timecodes << current
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +branch
words.first.timecodes << current
# ^^^^^^^^^^^^^^^^^^^^^ +branch
words.first.timecodes << current
# ^^^^^^^^^^^ +branch
words.each do |w|
# ^^^^^^^^^^^^^^^^^ +condition -- is it?
next if w.punct?
# ^^^^^^^^^^^^^^^^ +condition
next if w.punct?
# ^^^^^^^^ +branch
while current + interval <= w.time.to_i
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition
while current + interval <= w.time.to_i
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition (comparison)
while current + interval <= w.time.to_i
# ^^^^^^^^^^^^^^^^^^ +branch
while current + interval <= w.time.to_i
# ^^^^^^^^^^^ +branch
while current + interval <= w.time.to_i
# ^^^^^^ +branch
current += interval
# ^^^^^^^^^^^^^^^^^^^ +assignment
current += interval
# ^^^^^^^ +assignment -- umm, two assignments in one expression?..
w.timecodes << current
# ^^^^^^^^^^^^^^^^^^^^^^ +branch
w.timecodes << current
# ^^^^^^^^^^^ +branch
I am not 100% sure ANY of the suspicious things shown above are some bugs, but they are... well, _suspicious_ :) And the increase throughout the codebase bothers me, making me _even more suspicious_.
(Anyway, I believe that my "explainer" is a useful excercise that might one day grow into a gem. But I'll be happy if somebody could explain the weird increases)
Maybe somebody will find this more readable: update script, which groups all line increments: https://gist.github.com/zverok/2f2a55c99f6093928103d58c1c149146 -- and its result:
Outcome: sqrt(8 assignments虏 + 16 branches虏 + 8 conditions虏) = 19.6
def setup_timecodes(interval)
# ^^^^^^^^ +assignment
words = paragraphs.flat_map(&:words)
# ^^^^^^^ +condition
# ^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +assignment
words.each { |w| w.timecodes = [] }
# ^^^^^^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^ +assignment
# ^ +assignment
# ^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition
return unless interval
# ^^^^^^^^^^^^^^^^^^^^^^ +condition
current = job.timecode || 0
# ^^^ +branch
# ^^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^ +condition
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^ +assignment
words.first.timecodes << current
# ^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +branch
words.each do |w|
# ^ +assignment
# ^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^ +condition
next if w.punct?
# ^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^ +condition
while current + interval <= w.time.to_i
# ^^^^^^ +branch
# ^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition (comparison)
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +condition
current += interval
# ^^^^^^^ +assignment
# ^^^^^^^^^^^^^^^^^^^ +assignment
w.timecodes << current
# ^^^^^^^^^^^ +branch
# ^^^^^^^^^^^^^^^^^^^^^^ +branch
I'm sorry for the trouble. These Metrics cops had many issues. Hopefully they were all fixes at once, so as to create transition issues only once...
Relevant changelog entries:
#8037: (Breaking) Cop Metrics/AbcSize now counts ||=, &&=, multiple assignments, for, yield, iterating blocks. &. now count as conditions too (unless repeated on the same variable). Default bumped from 15 to 17. Consider using rubocop -a --disable-uncorrectable to ease transition. ([@marcandre][])
#8276: Cop Metrics/CyclomaticComplexity not longer counts &. when repeated on the same variable. ([@marcandre][])
#8204: (Breaking) Cop Metrics/PerceivedComplexity now counts else in case statements, &., ||=, &&= and blocks known to iterate. Default bumped from 7 to 8. Consider using rubocop -a --disable-uncorrectable to ease transition. ([@marcandre][])
Metrics/CyclomaticComplexity now counts &., ||=, &&= and blocks known to iterate. Default bumped from 6 to 7. Consider using rubocop -a --disable-uncorrectable to ease transition. ([@marcandre][])The PRs have additional discussions. Let me know if you have particular issues with some of the calculations; #8276 is an example of further tweak I made after some good feedback.
@marcandre Oh, I didn't do my homework, it seems :) (I typically follow Rubocop development closely, but lately was on vacation and out of internetz... Should've checked changelogs more thoroughly.) Anyways, it was not an angry "you broke my stuff" ticket, rather curious "I wonder how it works" ticket.
Your links make sense to me, I'll just raise thresholds a bit for now in our codebase (and probably will work more on my "explainer" script, I kinda love the idea).
I am still a bit surprised that each is counted as "condition", but probably it makes sense too (and doesn't make much difference, whether it would be counted as "branch", say, it would still have the same effect)
I am still a bit surprised that
eachis counted as "condition",
It's definitely debatable (and it was debated too 馃槅). There is no perfect measure, and Ruby in particular is troublesome as simple attribute readers need to be counted as "branches", while if Ruby could be compiled that could be inlined and not count. The threads give the reasoning for each, but for AbcSize the idea is that each is basically a for and should count as much. The CyclomaticComplexity, it's a disjoint component so increases by +1.
The main issue Goodhart's law.
I know these revised metrics are more accurate and hopefully will fit your coding style even better than the previous one, albeit with increased defaults. It's just Metrics inflation 馃槄
Most helpful comment
It's definitely debatable (and it was debated too 馃槅). There is no perfect measure, and Ruby in particular is troublesome as simple attribute readers need to be counted as "branches", while if Ruby could be compiled that could be inlined and not count. The threads give the reasoning for
each, but forAbcSizethe idea is thateachis basically aforand should count as much. TheCyclomaticComplexity, it's a disjoint component so increases by +1.The main issue Goodhart's law.
I know these revised metrics are more accurate and hopefully will fit your coding style even better than the previous one, albeit with increased defaults. It's just Metrics inflation 馃槄