Toml: Clarify whether keys can contain a newline

Created on 10 Jun 2021 · 5 Comments · Source: toml-lang/toml

The text of the spec and the grammar for TOML keys are ambiguous with regard to newlines in a key. The spec says:

The key, equals sign, and value must be on the same line (though some values can be broken over multiple lines).
-- https://toml.io/en/v1.0.0#keyvalue-pair

However, the grammar seems to allow a newline inside of a quoted key:

quoted-key = basic-string / literal-string
basic-string = quotation-mark *basic-char quotation-mark
basic-char = basic-unescaped / escaped
escaped = escape escape-seq-char
escape-seq-char =/ %x6E ; n line feed U+000A
-- https://github.com/toml-lang/toml/blob/master/toml.abnf#L51

I've found two libraries which emit {"a\nb": 1} differently; either as

"a
b" = 1

or

"a\\nb" = 1
# Note: The library reads this in as {"a\nb": 1}

and I'm trying to figure out which one (or both!) I should file a bug on. I'm thinking both of them might be wrong: the first one because newlines are not allowed in keys, and the second one because it decodes the TOML key "a\\nb" as "a\nb" instead of "a\\nb" (both libraries should, instead, throw an error when they try to emit a key with a newline in it).

Let me know what the spec and grammar really mean so I can open the appropriate bugs.

All 5 comments

The spec says:

Quoted keys follow the exact same rules as either basic strings or literal strings and allow you to use a much broader set of key names. Best practice is to use bare keys except when absolutely necessary.

So to me that says anything that's legal in a string is legal in a quoted key since they're equivalent, thus this should be legal:

"a\nb" = 1

Physical newlines would be illegal of course, just as they would be in a non-key string, though it's worth noting that the spec doesn't explicitly draw attention to the (non-)validity of their multi-line variants. The ABNF appears to prohibit them being used as keys (by omission), though this is likely an oversight; I can't see any reason to prohibit a multi-line string here so long as it didn't contain actual physical newlines.

both libraries should, instead, throw an error when they try to emit a key with a newline in it

Why should that be an error? Surely the desired behaviour is to re-emit the key as a quoted key with a newline escape code, to exactly round-trip the input (i.e. "a\nb"). This of course requires the implementation to examine the contents of the string before emitting it to decide on the right way to do so, but that's a fairly standard expectation I'd think.
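As a rough sketch of what "examine the contents of the string before emitting it" could look like, assuming an emitter written in Python: `format_key` is a hypothetical helper, not taken from either library in question. It emits a bare key when the characters allow it and otherwise a basic string with the escapes TOML v1.0.0 defines.

```python
import re

# Bare keys may contain only ASCII letters, digits, underscores, and dashes.
BARE_KEY = re.compile(r"[A-Za-z0-9_-]+")

# Short escape sequences defined for TOML basic strings.
SHORT_ESCAPES = {
    "\b": "\\b", "\t": "\\t", "\n": "\\n", "\f": "\\f",
    "\r": "\\r", '"': '\\"', "\\": "\\\\",
}

def format_key(key: str) -> str:
    """Hypothetical helper: render one key segment for a TOML document."""
    if BARE_KEY.fullmatch(key):
        return key  # safe to emit unquoted
    out = []
    for ch in key:
        if ch in SHORT_ESCAPES:
            out.append(SHORT_ESCAPES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Remaining control characters get the \uXXXX form.
            out.append(f"\\u{ord(ch):04X}")
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'

print(format_key("plain-key"))
print(format_key("a\nb"))
```

The second call produces a quoted key with a backslash-n escape and no physical line break, so the key, equals sign, and value stay on one line.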

So, to summarize: yes, quoted keys can contain logical newlines, but physically, in the file, these newlines must be escaped, just like in any other non-multiline string.

Cool, I'll open a bug against the library which is emitting actual newlines instead of the escape sequence. Thanks!

But isn't the behavior of the other library wrong as well? It should treat "\n" as a literal newline and hence serialize it as \n, with just a single backslash.

You're right. I made a mistake, though. I mixed the actual string being output and the language's representation of the string in my initial test for the second library. So when I looked at it again, the second library was doing what is suggested here.
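This kind of mix-up is easy to reproduce. The thread does not say which language was involved, so the following uses Python purely as an illustration: the same emitted TOML text shows one backslash when printed and two when viewed through the language's string representation.

```python
# The actual TOML text an emitter would write: "a\nb" = 1 (one backslash).
# The doubled backslash here is Python source escaping.
emitted = '"a\\nb" = 1'

print(emitted)        # the text as it lands in the file
print(repr(emitted))  # Python's representation of that same text

actual_backslashes = emitted.count("\\")
shown_backslashes = repr(emitted).count("\\")
print(actual_backslashes, shown_backslashes)
```

Looking at the `repr` form makes a correctly escaped key appear to contain a double backslash, which may be what happened in the initial test.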
