Crystal: Windows line endings break escaped newlines in strings

Created on 16 Mar 2018 · 23 Comments · Source: crystal-lang/crystal

The following does not work as expected if your file's line endings are \r\n:

text = "This will have \
    newlines and \
    leading whitespace \
    if your line endings are \\r\\n."

Perhaps Crystal wants to support CRLF as a regular newline, perhaps it doesn't - whatever the case, I figured I'd report it.
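
To make the symptom concrete, here is a toy model in Python of a line-continuation rule that only recognizes LF. It is a sketch under that assumption, not Crystal's actual lexer; the function name and the regular expression are hypothetical and exist only for illustration.

```python
import re

# Toy model only: a backslash directly before "\n" joins the next line
# and drops that line's leading whitespace. This is NOT Crystal's lexer.
CONTINUATION_LF_ONLY = re.compile(r"\\\n[ \t]*")


def join_continuations(literal_body: str) -> str:
    """Collapse backslash-newline continuations, recognizing only LF."""
    return CONTINUATION_LF_ONLY.sub("", literal_body)


lf_body = "one \\\n    two"      # file saved with LF
crlf_body = "one \\\r\n    two"  # same text saved with CRLF

lf_result = join_continuations(lf_body)      # "one two"
crlf_result = join_continuations(crlf_body)  # unchanged: "\r" follows the backslash
```

With CRLF input the backslash is followed by "\r" rather than "\n", so a rule like this never fires and the line break and indentation stay in the string.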

All 23 comments

IMHO Crystal shouldn't support CRLF in source files at all.

Any particular reason? Just to deliberately make future chances of Windows compatibility more difficult, or...? I don't like CRLF either, but that's no reason to deliberately handle them incorrectly.

Because it's completely unnecessary and avoids annoying line ending issues.
This is not about being correct, it's just a question of whether Crystal should support it (and if yes, to what extent) or not. If it is decided that Crystal source code only allows LF, then that's deliberately correct.
Editors on Windows are equally capable of handling LF as editors on Unix-likes can handle CRLF. There is absolutely no reason to assume CRLF would be required to provide some sort of Windows compatibility.
Having only one accepted line ending makes life easier for everyone and I don't think there is a compelling reason for CRLF support.

> Having only one accepted line ending makes life easier for everyone

Well, except for everyone developing on Windows.

Only allowing LF will, for example, require lines like *.cr text eol=lf in .gitattributes, as files will otherwise be checked out with CRLF on Windows. In addition, while editors can generally handle LF even on Windows, they generally default to CRLF when creating new files, etc.
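
As a minimal sketch of the workaround this comment describes, a repository would carry an entry like the following. The pattern and attributes are taken from the comment itself; placing the file at the repository root is the usual convention and is assumed here.

```gitattributes
# .gitattributes (assumed to sit at the repository root)
# Treat Crystal sources as text and always check them out with LF.
*.cr text eol=lf
```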

Personally, I think Crystal should aim to be friendly for the developers using it, as opposed to just the developers developing the language itself.

The issue isn't LF vs CRLF, but that we shouldn't support escaping line breaks with \ as asterite reported numerous times.

Oh, that's on its way out for some reason? In the string literals section of the syntax docs, it's quite clearly documented, and I'd say it's a fairly welcome feature. Guess that might not be the case soon, though?

@obskyr I'm working on Windows as my primary development platform. None of my source code files uses CRLF (unless explicitly required) and this really keeps away pain.

This is way easier to achieve than you think. Put autocrlf = false in your global git config and no file will be checked out with CRLF. Editors can be configured (if not, go looking for a real one). Of course, I would always recommend using LF as default line ending. There is also .editorconfig to have interoperable settings per repo. Crystal's default .editorconfig already includes end_of_line = lf, so if your editor supports editorconfig, it shouldn't even save a file using CRLF.
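
A minimal sketch of the two settings mentioned above. The `[core]` section is where git keeps `autocrlf`; the `.editorconfig` fragment shows only the line-ending rule, and the `[*.cr]` section header is an assumption for illustration rather than a copy of the generated file.

```ini
# ~/.gitconfig (global git configuration)
[core]
	autocrlf = false
```

```ini
# .editorconfig (per repository)
root = true

[*.cr]
end_of_line = lf
```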

To me, developer friendly means making life easier by avoiding unnecessary choices and having fewer things to worry about.

@ysbaddaden IMHO it's a different issue. It shows that the implementation has to keep track of different line endings in many places (not just regarding line continuation). This would be easier if only LF is supported.

@straight-shoota Requiring Windows users to either use EditorConfig or change their global git config is most definitely not "developer friendly".

I think it is, because it's a setting you set once and can then forget about. Otherwise you will inevitably run into issues with non-matching line endings, mistakenly commit wrong line endings etc.
There is generally no reason to use CRLF on Windows at all.

> Otherwise you will inevitably run into issues with non-matching line endings, mistakenly commit wrong line endings etc.

You won't if Crystal supports them properly. Instead of having it work only for people who have manually set global settings, why not have it work for everyone?

Sigh... is this so hard to understand? Whether Crystal supports CRLF or not has nothing to do with the issues I mentioned.
My argument is about having a sane codebase that is easily compatible with any platform. This is completely independent of which programming language you use. Most languages support both CRLF and LF, but many code conventions recommend using LF only.

Hey, there's no need to be condescending.

So, you're thinking Crystal should just error out if it encounters a CRLF linebreak while parsing?

Essentially, yes. This is obviously not entirely thought through, just an idea to contemplate. Don't know what core members think.

CR-only newlines, for example, are not supported and result in a syntax error: expected '\n' after '\r'. What if CR was never a valid whitespace token in Crystal source code, even if followed by LF?
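
The strict option floated here could look roughly like the following Python sketch. It only illustrates the idea of rejecting any carriage return up front; the error class, function name, and message are hypothetical and do not reflect anything in the Crystal compiler.

```python
class LineEndingError(ValueError):
    """Hypothetical error for a source file that contains a carriage return."""


def reject_carriage_returns(source: str) -> None:
    """Raise LineEndingError at the first CR, reporting its 1-based line."""
    position = source.find("\r")
    if position == -1:
        return  # LF-only source is accepted as-is
    # Count the LFs before the offending CR to get a line number.
    line = source.count("\n", 0, position) + 1
    raise LineEndingError(
        f"line {line}: carriage return found; use LF line endings"
    )
```

A check like this gives one clear message at the point of the first CR instead of letting the problem surface later as odd string contents.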

I really am not at all into the idea of having to add *.cr text eol=lf to every Crystal project I ever do with git. Even if I were to set my global git config, I can't really tell everyone who might clone the repo to do so.

We should either support all line endings in the compiler or error out if the line endings are wrong.

There's an actionable item either way.

> We should either support all line endings in the compiler or error out if the line endings are wrong.

What do Rust, Go, and Nim do in this case?

Probably support them, and so should Crystal.

Interesting, I had a similar issue some time ago

End of line sequence CRLF do not allow to detect symbols, use LF instead.

https://github.com/crystal-lang-tools/vscode-crystal-lang/wiki/Known-Issues

Yes! I have the same issue! I'm also developing on Windows and I can't create these Strings.

Another related consideration - with \r\n newlines, consider a typical multiline string:

"A multiline
string"

Currently, it evaluates to "A multiline\r\nstring". Regardless of the line endings in your file, newlines in multiline strings should evaluate to \n.
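
A small Python sketch of that point, assuming a literal whose body is taken verbatim from the file. The helper names are hypothetical; the sketch only shows that the same literal yields different values depending on how the file was saved, and what mapping CRLF to LF inside the body would give.

```python
def string_literal_body(source: str) -> str:
    """Return the text between the first and last double quote, verbatim."""
    start = source.index('"') + 1
    end = source.rindex('"')
    return source[start:end]


def with_lf_newlines(body: str) -> str:
    """Map CRLF inside a literal body to LF, as the comment proposes."""
    return body.replace("\r\n", "\n")


lf_file = '"A multiline\nstring"'      # saved with LF
crlf_file = '"A multiline\r\nstring"'  # same literal saved with CRLF

verbatim_lf = string_literal_body(lf_file)
verbatim_crlf = string_literal_body(crlf_file)
normalized_crlf = with_lf_newlines(verbatim_crlf)
```

Taken verbatim, the two values differ by one character, so a program's behavior would depend on the checkout settings of whoever compiled it; after the mapping they are equal.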

On Nim:

https://nim-lang.org/docs/manual.html

Encoding

All Nim source files are in the UTF-8 encoding (or its ASCII subset). Other encodings are not supported. Any of the standard platform line termination sequences can be used - the Unix form using ASCII LF (linefeed), the Windows form using the ASCII sequence CR LF (return followed by linefeed), or the old Macintosh form using the ASCII CR (return) character. All of these forms can be used equally, regardless of platform.


I suppose they all support CR LF, so it may be useful for Crystal to support the other variants as well. Also, the documentation mentions it, so users may assume that \r works fine; if it does not work, I would expect the documentation to say so.

Aside from this, I personally agree with straight-shoota (I myself never use \r, ever), but I think if "everyone else treats CR LF like newlines", as the programming languages mentioned above do on Windows, then perhaps Crystal should do so as well. You could also say that this is one small step for Crystal to regard Windows as a main platform ... :)

\r, \r\n and \n should just be converted to \n as early as practical in the lexer.
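
A minimal Python sketch of that normalization step, assuming it runs on the whole source text before any tokenizing. The function name is hypothetical, and this is an illustration of the idea rather than the Crystal lexer's code.

```python
import re

# "\r\n?" matches a CR plus an optional LF, so CRLF is consumed as one
# unit and a lone CR is also caught. A lone LF is already in target form.
_CR_NEWLINE = re.compile(r"\r\n?")


def normalize_newlines(source: str) -> str:
    """Convert CRLF and lone CR to LF; leave existing LF untouched."""
    return _CR_NEWLINE.sub("\n", source)


mixed = "first\r\nsecond\rthird\nfourth"
lines = normalize_newlines(mixed).split("\n")
```

Because the conversion happens once, up front, every later stage only ever has to handle "\n". Note that it also rewrites line breaks inside multiline string literals, which matches the behavior asked for earlier in the thread.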

Using CRLF causes major pains with codebases with devs on Mac, Linux and Windows. Someone inevitably messes things up.
