I'm not sure if this is intended behaviour, but I believe the string elements in the array returned by String.lines (which is generated using String.each_line) shouldn't include "\n" characters.
"hello\nworld\nlast".lines # => ["hello\n", "world\n", "last"]
Ruby does the same. It allows preserving/differentiating between \r\n and \n, if you ever need to.
[70] pry(main)> "foo\nbar\nbaz\n".lines
=> ["foo\n", "bar\n", "baz\n"]
[71] pry(main)> "foo\nbar\nbaz\n".each_line.to_a
=> ["foo\n", "bar\n", "baz\n"]
It also makes more sense for the variant taking a parameter, in case we're ever going to support that.
[74] pry(main)> "foo\nbar\nbaz\n".lines("o")
=> ["fo", "o", "\nbar\nbaz\n"]
[75] pry(main)> "foo\nbar\nbaz\n".each_line("o").to_a
=> ["fo", "o", "\nbar\nbaz\n"]
I thought about it. "String.split" does what I want for sure, but "String.lines" is a better name.
This is something that I thought about many times: most of the time you need to use chomp afterwards and it's a bit more inefficient. We could maybe add an option to remove the newline, but just for efficiency (using chomp would have the same effect, only a bit slower).
One case I found where preserving the newline is useful is when parsing an HTTP::Request, the headers end when a "\r\n" line is found. Although... if gets discarded it, the condition could probably be "if the line is empty" (the current code also checks for "\n").
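To make the header case concrete, here is a minimal Ruby sketch of the "stop at the empty line" condition. The helper name read_header_lines and the sample request text are made up for illustration; this is not the actual HTTP::Request parser.

```ruby
require "stringio"

# Hypothetical helper (not the real HTTP::Request code): collect header
# lines until the blank line that ends the header block.
def read_header_lines(io)
  headers = []
  while (line = io.gets)
    line = line.chomp      # drops a trailing "\r\n" or "\n"
    break if line.empty?   # an empty line marks the end of the headers
    headers << line
  end
  headers
end

io = StringIO.new("Host: example.com\r\nAccept: */*\r\n\r\nbody")
header_lines = read_header_lines(io)
remaining = io.read        # whatever follows the blank line
```

With chomped lines, the single check "is the line empty" covers both the "\r\n" and the "\n" terminator.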
I don't know if there are other cases where preserving the newline is needed. Maybe we could make gets discard it, and using gets(char) would preserve it, so you can do gets('\n') if you need it.
Pondering this thread, I've come to the following conclusion: I'd much rather have semantics where #lines and #each_line without arguments will drop \r\n or \n characters as in:
"hello\nworld\nlast".lines # => ["hello", "world", "last"]
"hello\r\nworld\r\nlast".lines # => ["hello", "world", "last"]
This will make handling DOS and Unix text files much easier and eliminate a lot of special-case handling for \r\n in every application that needs it. Passing the separator explicitly would keep the line endings:
"hello\nworld\nlast".lines('\n') # => ["hello\n", "world\n", "last"]
"hello\r\nworld\r\nlast".lines('\n') # => ["hello\r\n", "world\r\n", "last"]
Ruby just added a chomp optional argument to many methods, for example IO#gets. We could do the same. I always wanted to read lines and automatically have them chomped. Invoking chomp is OK, but will create a new string, so the chomp option will improve performance a bit.
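For reference, the two forms look like this in Ruby (2.4 or later for the chomp: keyword). The method names two_step_lines and one_step_lines are only labels for this sketch.

```ruby
# Two-step form: split into lines, then strip each terminator.
# Every chomp call allocates a second string per line.
def two_step_lines(text)
  text.lines.map(&:chomp)
end

# One-step form: the terminator is dropped while splitting.
def one_step_lines(text)
  text.lines(chomp: true)
end

unix = "hello\nworld\nlast"
dos  = "hello\r\nworld\r\nlast"

two_step_lines(unix)  # => ["hello", "world", "last"]
one_step_lines(unix)  # => ["hello", "world", "last"]
one_step_lines(dos)   # => ["hello", "world", "last"]
```

Both forms give the same result; the difference is only the extra intermediate strings in the two-step version.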
@asterite Is it that common to need an un-chomped string? Maybe chomp: true should be the default.
@RX14 Yes, I wouldn't mind it being true by default
On the other hand, chomp in Ruby also applies to the given character if passed. For example:
io = StringIO.new "hello\nfoo\nbar"
io.gets(chomp: true) # => "hello"
io.gets('o', chomp: true) # => "f"
io.gets("ar", chomp: true) # => "o\nb"
So I'm not sure chomp should be true by default. When passing a character it's usually to say "read up until this char, and keep it". Maybe with \n it's different, but I don't know if it's good to have the chomp flag being true by default for some cases and false for others.
On the other hand, passing a delimiter to gets isn't very common, so it could be true by default when not passing any delimiter.
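A rough Ruby sketch of that "true by default only without a delimiter" idea follows. read_chunk is a hypothetical wrapper invented here to illustrate the default; it is not an existing Ruby or Crystal method.

```ruby
require "stringio"

# Hypothetical wrapper: chomp defaults to true when no delimiter is given
# and to false when one is, but can always be set explicitly.
def read_chunk(io, delimiter = nil, chomp: delimiter.nil?)
  chunk = delimiter ? io.gets(delimiter) : io.gets
  return chunk if chunk.nil? || !chomp
  # Strip the explicit delimiter, or the default line ending ("\n" or "\r\n").
  delimiter ? chunk.chomp(delimiter) : chunk.chomp
end

io = StringIO.new("hello\nfoo\nbar")
read_chunk(io)                    # => "hello"  (no delimiter: chomped)
read_chunk(io, "o")               # => "fo"     (delimiter given: kept)
read_chunk(io, "ar", chomp: true) # => "o\nb"   (explicit override)
```

The cost of this design is the one raised above: the default value of the flag depends on whether another argument was passed.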
Hey! This is already fixed in 0.20.5 (and I think a couple of versions before that too):
"hello\nworld\nlast".lines # => ["hello", "world", "last"]
"hello\nworld\nlast".lines(chomp: false) # => ["hello\n", "world\n", "last"]