Rubocop: Improved syntax checking around line continuations

Created on 29 Oct 2018  Â·  19Comments  Â·  Source: rubocop-hq/rubocop

'this text is too ' \
'long'

Style/LineEndConcatenation recommends that we should use \ instead of + or << to concatenate strings across multiple lines. But I am still seeing wildly inconsistent formatting of these multiline strings. I would like to suggest RuboCop checking the following four aspects of formatting of multiline strings:

  1. One space between quote and \.
  2. Consistent indentation.
  3. Use same quotes on all lines.
  4. Keep whitespace at the end of lines.

1. One space between quote and \

# Bad
'this text is too '\
'long'

# Bad
'this text is too '  \
'long'

# Good
'this text is too ' \
'long'

2. Consistent indentation

Indent IndentationWidth or match the preceding line’s starting quote.

# IndentationWidth
puts 'this text is too ' \
  'long'

# Match quote
puts 'this text is too ' \
     'long'

3. Use same quotes on all lines

# Bad
"this text is #{foo} " \
'long'

# Good
"this text is #{foo} " \
"long"

# Good
'this text is too ' \
'long'

4. Keep whitespace at the end of lines

In the actual text, disallow spaces at the beginning of line n+1 if line n ends with a non-space.

# Bad
'this text contains a lot of' \
'               spaces'

# Good
'this text contains a lot of               ' \
'spaces'

# Bad
'this text is too' \
' long'

# Good
'this text is too ' \
'long'
feature request stale

All 19 comments

/cc @jonas054 (I’m not sure who to mention here, but I know you’ve worked on many syntax cops in the past)

3. Use same quotes on all lines. This might create ambiguity Style/StringLiterals which suggests using single quotes for strings without interpolation.

"this text is #{foo} " \
'long' # StringLiterals prefers this

"this text is #{foo} " \
"long" # But as per your suggestion LineEndConcatenation should prefer this

The idea was that this cop should take precedence over Style/StringLiterals on multi-line strings.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding!

@bbatsov I would like to hear your opinion. Is strings continued over multiple lines within the responsibility of RuboCop, or should a separate tool (e.g. rubocop-strings) handle these niche cases?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding!

I’m still hopeful that someone with knowledge of Parser will comment on this issue.

Maybe @whitequark would be the right person to ask? Or @pocke ?

Each chunk of the literal is emitted as a separate node so you can just look at the line attribute.
Screenshot_20190807_083230

@bquorning The first aspect (one space between quote and \) could be useful for other line continuations as well, not only between string literals.

In fact I was experimenting with extending ExtraSpace the other day to cover the case of too many spaces before the backslash. For your second bad example above:

6420.rb:2:20: C: Layout/ExtraSpacing: Unnecessary spacing detected.
'this text is too '  \
                   ^

Doing that I discovered a more general flaw in the cop that needs to be fixed first.

Thank you for the response, @whitequark. But, your example shows a string containing a newline character. My question is regarding two strings that are “implicitly joined” (as two strings joined by whitespace would be, e.g. "a" "b" == "ab"), only the strings are on different lines _and_ the first line is ended by a backslash (to signal to Ruby that the line continues).

In the following example, I would like to find the position of the \ following "foo " to see e.g. how many spaces are in front of \. Can that be done in Parser, currently?

>> Parser::VERSION
"2.6.3.0"
>> Parser::CurrentRuby
Parser::Ruby26 < Parser::Base
>> puts two_joined_strings = %Q{"foo " \\\n"bar"}
"foo " \
"bar"
nil
>> eval(two_joined_strings)
"foo bar"
>> Parser::CurrentRuby.parse(two_joined_strings).loc
{
           :end => nil,
         :begin => nil,
    :expression => #<Parser::Source::Range (string) 0...14>
}
>> Parser::CurrentRuby.parse(two_joined_strings).children.map(&:loc)
[
    [0] {
               :end => #<Parser::Source::Range (string) 5...6>,
             :begin => #<Parser::Source::Range (string) 0...1>,
        :expression => #<Parser::Source::Range (string) 0...6>
    },
    [1] {
               :end => #<Parser::Source::Range (string) 13...14>,
             :begin => #<Parser::Source::Range (string) 9...10>,
        :expression => #<Parser::Source::Range (string) 9...14>
    }
]

In the following example, I would like to find the position of the \ following "foo " to see e.g. how many spaces are in front of \. Can that be done in Parser, currently?

No; the location of \, like with some other tokens (for example ,) is thrown away after lexing.

In the following example, I would like to find the position of the \ following "foo " to see e.g. how many spaces are in front of \.

@bquorning This is the approach I was playing with, basically analyzing the whitespace around line breaks with regexps: https://github.com/buehmann/rubocop/blob/32ccea042243ee0edf9d9f0dc426bf5baf5d604e/lib/rubocop/cop/layout/space_before_line_continuation.rb#L54-L61

@buehmann Interesting approach with the #investigate implementation. Does it actually work, though? Does the between object contain backslashes, or did Parser throw them away at this point?

I was playing around with e.g.

def on_dstr(node)
  node_line_numbers = node.first_line..node.last_line
  raw_lines = processed_source.raw_source.lines

  node_line_numbers.each do |node_line_number|
    raw_node_line = raw_lines[node_line_number - 1]
    investigate(node, raw_node_line)
  end
end

and matching the line with line.match?(/\S \\\Z/).

processed_source.raw_source definitely contains the backslashes, but isn’t tokenized, and I am only working with a raw string and regular expressions. I’m not sure if RuboCop (or Parser or AST) has some helper classes/methods that could make it cleaner.

It works. The source buffer contains the backslashes, they just sit between tokens returned by Parser.

➜  rubocop (line-cont/6420) ✗ rubocop -a --stdin 6420.rb < 6420.rb --only Layout/SpaceBeforeLineContinuation
Inspecting 1 file
C

Offenses:

6420.rb:2:20: C: [Corrected] Layout/SpaceBeforeLineContinuation: Missing space before line continuation
'this text is too '\
                   ^
6420.rb:6:20: C: [Corrected] Layout/SpaceBeforeLineContinuation: Extra space before line continuation
'this text is too '  \
                   ^

1 file inspected, 2 offenses detected, 2 offenses corrected
====================
# Bad
'this text is too ' \
'long'

# Bad
'this text is too ' \
'long'

# Good
'this text is too ' \
'long'

Please note that https://bugs.ruby-lang.org/issues/6265#note-8 suggests the Ruby core developers are trying to remove this C-like String concatenation syntax in Ruby 3.0.

Thanks @josb, I wasn’t aware of that ruby-core issue. But, reading the ticket I see that the syntax change was suggested 7 years ago, didn’t become a warning in Ruby 2.0, and 5 years ago it didn’t become a warning in Ruby 2.2. The last update (3 years ago) suggests to me that the syntax is still so heavily used in core Ruby that it won‘t be removed for a while yet.

Granted. But it would be good if RuboCop would help people move in the right direction, no? Could even link to that ticket...

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding!

This issues been automatically closed due to lack of activity. Feel free to re-open it if you ever come back to it.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

mikegee picture mikegee  Â·  3Comments

Aqualon picture Aqualon  Â·  3Comments

benoittgt picture benoittgt  Â·  3Comments

AndreiMotinga picture AndreiMotinga  Â·  3Comments

tedPen picture tedPen  Â·  3Comments