Chapel: Improve TOML library

Created on 24 Aug 2017  路  12Comments  路  Source: chapel-lang/chapel

To compare against TOML spec -> TOML Spec

  • [x] Promote TOML to packages (#7104)

TOML Spec to add support for

  • [ ] Array of tables (hardest to implement, most needed)
  • [ ] Exponent reals
  • [ ] Underscore notation for integers and reals
  • [x] Date (#12802)
  • [x] Time (#12869)
  • [ ] Datetime with offset

Test Suites

  • [ ] test/library/packages/TOML/BurntSushi/
  • [ ] test/library/packages/TOML/toml-spec/

Known bugs

  • [ ] Quoted keys with periods such as "this.key" = value
  • [ ] Comment on the last line of the file (easy)
  • [ ] Inline comment treated as if it's in written in a different line (#12843)

User errors currently not caught

  • [ ] Table name repeated (overwrites table)
  • [ ] key name repeated in same table (overwrites value)
  • [ ] Arrays of mixed data types e.g. [2, "string", 4.5]

Testing

  • [ ] More futures with user errors like key or table name repeated
  • [x] Tests for handled errors (once error handling is put in)

Other

  • [ ] plug some of the remaining memory leaks
  • [x] error handling instead of halt()
Libraries / Modules Feature Request

Most helpful comment

So there was a issue(which could be resolved now I haven't tested it recently) where having a comment on the last line of the TOML file would produce a rather unhelpful TomlError.

All 12 comments

Added date support in #12802 :arrow_up:

Known Bug: #12843 (Inline comment treated as if it's in written in a different line)

@ben-albrecht

Quoted keys with periods such as "this.key" = value

Is this related to #13332 ?

Comment on the last line of the file (easy)

Could you elaborate this issue ?

Quoted keys with periods such as "this.key" = value

Is this related to #13332 ?

No, it is supporting quoted keys, such as described in the TOML spec.

Comment on the last line of the file (easy)

Could you elaborate this issue ?

I'm actually not sure about this one. Does having a comment on the last line of the file currently produce an error? @Spartee may have a better idea.

So there was a issue(which could be resolved now I haven't tested it recently) where having a comment on the last line of the TOML file would produce a rather unhelpful TomlError.

@ben-albrecht I have started to work on Datetime with offset. Should the regex validate for leap years too ?
Also there's a case mentioned in TOML Spec for replacing T in 1979-05-27T00:32:00-07:00 with a space. Should I consider that case?
If yes, our regex would fail there ! Additionally, the TOML module would also consider 1979-05-27 and 00:32:00-07:00 as different tokens if a space exists.

This is the regex I plan to use for this feature:

\d{4}-[01]{1}\d{1}-[0-3]{1}\d{1}T[0-2]{1}\d{1}:[0-6]{1}\d{1}:[0-6]{1}\d{1}[+|-][0-1][0-9]:[0-5][0-9]$

Should the regex validate for leap years too ?

I am not sure if this needs to be validated within the regex itself, but it should be validated elsewhere if not.

Also there's a case mentioned in TOML Spec for replacing T in 1979-05-27T00:32:00-07:00 with a space. Should I consider that case?

Ideally, yes.

Additionally, the TOML module would also consider 1979-05-27 and 00:32:00-07:00 as different tokens if a space exists.

Would we need to update our tokenizer then?

This is the regex I plan to use for this feature:

Thanks. I have found a good way to share regexes is to link to a online regex evaluator, such as https://regex101.com, with your regex expression and test cases plugged in. That makes it easier for a reviewer to catch an edge case you may have missed in your regular expression.

Feel free to split off a separate issue for this, if you think it will warrant more discussion.

Would we need to update our tokenizer then?

@ben-albrecht Modifying the tokenizer wouldn't be wise because it might break already implemented features. I'll try to handle it with the regex.

Thanks. I have found a good way to share regexes is to link to a online regex evaluator,

This is great ! Thanks.

@ben-albrecht @Spartee Would like to report an issue that showed up in the development of #16141 (comment).
There should be some differentiation between . found in table name which are used to form subtables. Currently, . encountered within "" are also considered to create subtables.
eg:

[LocalAtomics."0.1.0"]
score = "4"

creates a roottable

[LocalAtomics]
[LocalAtomics."0]
[LocalAtomics."0.1]
[LocalAtomics."0.1.0"]
score = "4"

instead of

[LocalAtomics]
[LocalAtomics."0.1.0"]
score = "4"

Thanks @ankingcodes - Do you understand the source of this issue? Are we blindly splitting on . somewhere in the TOML module?

@ben-albrecht Yes, that's exactly the issue. We need to find a way to ignore . inside of quotes. I'm trying to fix it but I wish the documentation of the module was better.

Yes, that's exactly the issue. We need to find a way to ignore . inside of quotes

OK, great. Let us know if you run into issues with this.

While on the topic, I am wondering - does the regex handle escaped quotes? I haven't checked the TOML spec to know what is expected there myself...

I wish the documentation of the module was better.

Be the change you want to see in the ~world~ TOML module :)

Was this page helpful?
0 / 5 - 0 ratings