Toml: Should floats allow omission of integer or fractional/exponent?

Created on 30 Jun 2020 · 5 Comments · Source: toml-lang/toml

I recently did a conformance pass over toml++ to ensure it passed any remaining test cases from both BurntSushi's and iarna's test suites, and was surprised to find that I wasn't passing some of the floating-point tests. Specifically, I was accepting some inputs that would be valid as literals in C and C++, but are apparently not in TOML. Excerpt from my test suite before I updated it:

// omitting leading integer part
parse_expected_value(FILE_LINE_ARGS, "     .1 "sv,       .1 );
parse_expected_value(FILE_LINE_ARGS, "    +.1 "sv,      +.1 );
parse_expected_value(FILE_LINE_ARGS, "    -.1 "sv,      -.1 );
parse_expected_value(FILE_LINE_ARGS, "   .1e1 "sv,     .1e1 );
parse_expected_value(FILE_LINE_ARGS, "  .1e+1 "sv,    .1e+1 );
parse_expected_value(FILE_LINE_ARGS, "  .1e-1 "sv,    .1e-1 );
parse_expected_value(FILE_LINE_ARGS, "  +.1e1 "sv,    +.1e1 );
parse_expected_value(FILE_LINE_ARGS, " +.1e+1 "sv,   +.1e+1 );
parse_expected_value(FILE_LINE_ARGS, " +.1e-1 "sv,   +.1e-1 );
parse_expected_value(FILE_LINE_ARGS, "  -.1e1 "sv,    -.1e1 );
parse_expected_value(FILE_LINE_ARGS, " -.1e+1 "sv,   -.1e+1 );
parse_expected_value(FILE_LINE_ARGS, " -.1e-1 "sv,   -.1e-1 );

// omitting trailing fractional part
parse_expected_value(FILE_LINE_ARGS, "     1. "sv,       1. );
parse_expected_value(FILE_LINE_ARGS, "    +1. "sv,      +1. );
parse_expected_value(FILE_LINE_ARGS, "    -1. "sv,      -1. );
parse_expected_value(FILE_LINE_ARGS, "   1.e1 "sv,     1.e1 );
parse_expected_value(FILE_LINE_ARGS, "  1.e+1 "sv,    1.e+1 );
parse_expected_value(FILE_LINE_ARGS, "  1.e-1 "sv,    1.e-1 );
parse_expected_value(FILE_LINE_ARGS, "  +1.e1 "sv,    +1.e1 );
parse_expected_value(FILE_LINE_ARGS, " +1.e+1 "sv,   +1.e+1 );
parse_expected_value(FILE_LINE_ARGS, " +1.e-1 "sv,   +1.e-1 );
parse_expected_value(FILE_LINE_ARGS, "  -1.e1 "sv,    -1.e1 );
parse_expected_value(FILE_LINE_ARGS, " -1.e+1 "sv,   -1.e+1 );
parse_expected_value(FILE_LINE_ARGS, " -1.e-1 "sv,   -1.e-1 );

All of those representations are valid in C and C++ (and I assume other languages, too), but not legal in TOML.

Upon re-reading the TOML spec I realized that it is fairly clear that the integer part is always required (it's never subject to an _'or'_), so my knee-jerk assumption that TOML followed the same rules as C and C++ wasn't particularly clever. Despite that, prefixing or suffixing an integer with a '.' character is a common short-hand for expressing that the value is in fact a float of some sort, and until encountering these test case violations, my assumption that it was allowed in TOML seemed like a fairly natural one.

I guess the question is: should these representations, or some subset of them, be legal? If the answer is ultimately 'No', I think it would be wise to add a few invalid examples to the spec doc to explicitly call them out.

clarification question

All 5 comments

The fractional part is optional if an exponent is used; the spec contains several examples of this (5e+22, 1e06, -2E-2). The trick, of course, is that the spec expects you to have a digit on both sides of the decimal dot. I believe that's pretty reasonable to keep numbers readable for "normal people" (read: non-programmers). 2.e-3 is just a less readable way of saying 2e-3, so why not use the latter? Likewise, .7 is a less readable way of saying 0.7, frowned upon (= forbidden) by the spec.

In my view, these restrictions help to keep TOML files readable. TOML is not a programming language and doesn't have to support all the shortcuts (or longer cuts in the case of the unnecessary dot) sometimes used by programmers.
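To make the rule above concrete, here is a minimal sketch of a validity check for decimal floats, assuming the grammar described in this thread: an integer part is always required, followed by either an exponent or a fractional part with an optional exponent, and a dot must have a digit on both sides. The function name is_toml_float is hypothetical and is not part of toml++ or any other TOML library; the special values inf and nan are deliberately left out.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for illustration only; not an API of toml++ or any TOML library.
// Checks decimal float syntax only; the special values inf and nan are not handled.
bool is_toml_float(const std::string& text)
{
    // Integer part: optional sign, then "0" or digits without a leading zero.
    // Underscores are accepted only between two digits.
    static const std::string int_part = R"([+-]?(?:0|[1-9](?:_?[0-9])*))";
    // Digit run that may start with zero (used after '.' and after 'e').
    static const std::string digits = R"([0-9](?:_?[0-9])*)";
    // Fractional part: a dot followed by at least one digit.
    static const std::string frac_part = R"(\.)" + digits;
    // Exponent part: 'e' or 'E', optional sign, at least one digit.
    static const std::string exp_part = R"([eE][+-]?)" + digits;
    // Integer part, then either an exponent or a fraction with an optional exponent.
    static const std::regex pattern(
        int_part + "(?:" + exp_part + "|" + frac_part + "(?:" + exp_part + ")?)");
    return std::regex_match(text, pattern);
}
```

Under these assumptions, 0.7, 2e-3, 5e+22, 1e06 and -2E-2 are accepted, while .7, 7., 2.e-3 and +.1e1 are rejected. A plain integer such as 7 is also rejected, since it has neither a fractional part nor an exponent.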

Yeah, I get that _at most one_ of the fractional part and the exponent can be omitted currently. I'm only talking about cases where both are omitted simultaneously (7.), or omitting the integer part and keeping one or both of the others (.7).

Certainly my surprise at these being illegal was born of my background in programming, and I have no objection to them remaining illegal in TOML. I do, however, want to acknowledge that a large proportion of TOML's current user base would be programmers (particularly from Rust and Python backgrounds), so I think a few clarifying examples in the spec would go a long way. Go where the audience is, etc. Currently it's only 'frowned-upon' by omission, as far as I could tell.

Does that seem reasonable? I'd be happy to make a corresponding PR myself, since this would be a pretty small change.

@marzer You mean a little PR that adds some examples of things that are forbidden? That would be fine with me.

@ChristianSi Yup.

I guess the question is: should these representations, or some subset of them, be legal?

No. I think nearly all of the mentioned examples are equally (if not more) clear with a "0" in the appropriate location.

If the answer is ultimately 'No', I think it would be wise to add a few invalid examples to the spec doc to explicitly call them out.

Please do. :)

I'd suggest keeping it very limited though -- perhaps just a single sentence + code block w/ 2 examples (one for trailing, one for leading).
