Nim: Make newlines sane again

Created on 15 Jan 2018 · 7 Comments · Source: nim-lang/Nim

As https://nim-lang.org/docs/manual.html#lexical-analysis-string-literals points out, '\n' can be used to get a newline in a string literal.

It doesn't say what a newline is, however, except for a later note that it might not be a single character.

Per the compiler source, it's platform specific, and in practice it will mostly be translated to \x0a or \x0d\x0a in the C code.

Coming from other languages, this is an unwelcome surprise. There, the \n sequence has a well-established meaning: either a line feed character (py, java, others) or "whatever-c-does", which at least guarantees a single character and in practice ends up being a line feed (see https://en.wikipedia.org/wiki/Newline).

I'd like to suggest Nim go the way of py and friends and define the escape characters in terms of the ASCII codes they generate, as in https://docs.python.org/2.0/ref/strings.html:

\a | ASCII Bell (BEL)
\b | ASCII Backspace (BS)
\f | ASCII Formfeed (FF)
\n | ASCII Linefeed (LF)
\r | ASCII Carriage Return (CR)
\t | ASCII Horizontal Tab (TAB)
\v | ASCII Vertical Tab (VT)

This is well-defined, easy to understand and follows the principle of least surprise. It is also the closest to what Nim is doing today, with exact codes rather than C-style free-for-all.
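To make "defined in terms of ASCII codes" concrete, here is a small sketch in Python, the language whose definition is quoted above. It only checks that each escape in the table is a single character with the listed code; the names `ESCAPE_CODES` and `check_escapes` are made up for this illustration.

```python
# Escape sequences from the table above and the ASCII code each one denotes.
ESCAPE_CODES = {
    "\a": 0x07,  # BEL
    "\b": 0x08,  # BS
    "\f": 0x0C,  # FF
    "\n": 0x0A,  # LF
    "\r": 0x0D,  # CR
    "\t": 0x09,  # TAB
    "\v": 0x0B,  # VT
}


def check_escapes(table=ESCAPE_CODES):
    # True only if every escape is exactly one character with the listed code.
    return all(len(ch) == 1 and ord(ch) == code for ch, code in table.items())


print(check_escapes())
```

The point is that the result does not depend on the platform the code runs on.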

A platform-specific newline can sometimes be useful. It could instead be mapped to a new escape character, for example \N, to signal to the reader that there's unusual stuff going on.
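For comparison, Python keeps \n fixed and exposes the platform's line separator separately as `os.linesep`. The sketch below shows that split of responsibilities; `to_platform_newlines` is a hypothetical helper written for this example, not something proposed in this thread.

```python
import os


def to_platform_newlines(text, linesep=os.linesep):
    # Hypothetical helper: replace each line feed with the given separator.
    # os.linesep is CR LF on Windows and a lone LF on POSIX systems.
    return text.replace("\n", linesep)


# Passing the separator explicitly shows what a Windows build would produce.
windows_style = to_platform_newlines("Hello\nWorld", "\r\n")
print(repr(windows_style))
```

A dedicated escape would play the same role as `os.linesep` here: the platform-dependent choice is visible at the point of use.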

RFC

All 7 comments

I think the question here becomes what is most common to use. If I write log statements on my Linux machine as echo "Hello\nWorld" they would break on Windows. And Windows users using CRLF instead would just annoy people. \n as platform specific newline makes a bit of sense in this case but I can see how it could surprise new users, especially when parsing input. The Nim compiler however makes sure to not allow \n in character literals, even on platforms where it is just a line-feed, which helps alleviate some pitfalls.

The other case is, as stated above, parsing input. If I split on \n, it would not properly split a string that uses only line feeds on Windows platforms. I think adding a new symbol for a platform-specific newline would just make people forget to use it, and I feel that someone parsing input is more likely to read the documentation on escape characters. Therefore I support keeping \n as a platform-specific newline and leaving \l to mean line feed, as I think it would lead to fewer platform-related bugs overall.
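The parsing pitfall runs in both directions, whichever meaning \n has. As a sketch in Python (where \n is always a line feed), splitting CRLF input on \n leaves a stray carriage return on each piece, while a line-ending-aware splitter does not. The sample string is made up for illustration.

```python
# Input that uses Windows line endings (CR LF).
crlf_input = "alpha\r\nbeta\r\ngamma"

# Splitting on a bare line feed keeps the carriage return on each piece.
naive = crlf_input.split("\n")

# str.splitlines() recognizes LF, CR and CR LF, so nothing is left behind.
robust = crlf_input.splitlines()

print(naive)
print(robust)
```

Either way, code that parses input is safest when it does not assume a single line-ending convention.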

Just because everyone else is doing something doesn't mean it's the right thing to do

  1. '\n' implemented as it is now, but user either:

    • doesn't know about it (as it came from other languages), and uses it incorrectly, or
    • knows, but forgets it and uses it incorrectly.
  2. '\n' is implemented as one character (like in C), but user:

    • doesn't know that it's not automatically translated to CRLF when writing to a file on Windows (as it is in other languages), or
    • knows, but forgets it and uses it incorrectly.

The question is which of those two scenarios is more frequent/harmful. To me it seems that scenario 1 is.
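The automatic translation mentioned in scenario 2 can be sketched with Python's text-mode files, which translate \n on output when a `newline` argument is given. Here `newline="\r\n"` is passed explicitly to imitate a Windows text stream on any platform; `write_with_crlf_translation` is a hypothetical name used only for this example.

```python
import os
import tempfile


def write_with_crlf_translation(text):
    # Write through a text-mode file that maps each line feed to CR LF,
    # then read the file back in binary mode to see the bytes on disk.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline="\r\n") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


raw = write_with_crlf_translation("Hello\nWorld\n")
print(raw)
```

In this model the string itself always holds a single line feed, and the platform convention is applied only at the I/O boundary.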

If you are a newcomer to Nim, and Nim is your first programming language, then the ambiguous meaning of \n is pretty good.

But if you came to Nim from other programming languages where \n means exactly LF, then the ambiguous meaning of \n is more harmful than useful.

But because I'm pretty old, I'm voting for \n to mean the same as in C, Java, and Python. Newcomers must not forget the difference between the Windows line ending (CRLF) and the Unix one (LF only); they must remember it.

My opinion: let's change \n to mean the same thing it means in other languages and let's introduce a new escape character. So I agree with OP, up to a point.

Where I disagree is with the naming of this new escape character. Don't name it \N; that's a bad idea. It's too similar to \n, and Nim is a case-insensitive language.

I vote for \p as the platform specific newline.

Agree here with dom96 and Araq.
\N is a bad idea, \p will be much better.

I've nothing against \p - either is fine
