Cylc-flow: parsec library: improvements & extensions

Created on 10 Jan 2019 · 11 Comments · Source: cylc/cylc-flow

Master listing recording all improvements & extensions intended to be made, eventually, to the parsec library (cylc/lib/parsec/).

Originally moved from plain-text notes in the codebase in #2913. Please update in line with changes & extend with ideas.

To do:

  • [ ] Some suite-definition-specific special behaviour (namely combining identical graph sections, and use of OrderedDicts for environment and directives sections) is encoded in parsec - we need a way to specify this sort of thing in file-specific spec modules.
  • [ ] Figure out how to use OrderedDicts only for environment (and directives?) sections from the outset. Currently all sections are added as OrderedDicts by the fileparser.

  • [ ] Document standard usage:
    • [ ] a) define a file spec with validators
    • [ ] b) derive a parsec config object to add any additional file-format-specific functionality (i.e. to transform the parsed data, or to parse the basic item values, or new methods to return derived information).
    • [ ] c) do lazy load on import to parse the file(s)
  • [x] Expand the test battery, e.g. test that things that shouldn't validate don't. [completed: #2839]
  • [ ] Move section heading lists from cylc suite config into parsec?
  • [ ] Move inheritance from cylc suite config into parsec?
  • [x] Test repeated section and item add-or-override [completed: #2839]
  • [x] Test site/user style add-or-override [completed: #2839]
  • [ ] Abort if compulsory items not found (currently prints a warning)
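
A minimal sketch of the "standard usage" items a) to c) and of the final item above (abort on missing compulsory items), in plain Python. It does not use the real parsec API: `SPEC`, `ConfigError`, `validate`, and `LazyConfig` are hypothetical names invented for illustration, the flat one-level spec is an assumption, and the lazy load here happens on first access rather than on import.

```python
# Hypothetical sketch; none of these names come from parsec itself.


class ConfigError(Exception):
    """Raised when a parsed file does not satisfy the spec."""


def as_int(value):
    return int(value)


def as_str_list(value):
    # Split a comma-separated string into a list of stripped items.
    return [item.strip() for item in value.split(",") if item.strip()]


# a) A file spec: item name -> (validator, compulsory?).
SPEC = {
    "title": (str, True),
    "retries": (as_int, False),
    "hosts": (as_str_list, False),
}


def validate(raw, spec):
    """Coerce raw string values via the spec; abort on bad or missing items."""
    unknown = set(raw) - set(spec)
    if unknown:
        raise ConfigError(f"illegal items: {sorted(unknown)}")
    result = {}
    for name, (validator, compulsory) in spec.items():
        if name not in raw:
            if compulsory:
                # Abort instead of printing a warning.
                raise ConfigError(f"compulsory item missing: {name}")
            continue
        try:
            result[name] = validator(raw[name])
        except ValueError as exc:
            raise ConfigError(f"bad value for {name}: {raw[name]!r}") from exc
    return result


class LazyConfig:
    """b) and c): wrap the validated data and parse only on first access."""

    def __init__(self, loader, spec):
        self._loader = loader  # callable returning {item: raw string}
        self._spec = spec
        self._data = None

    def get(self, name, default=None):
        if self._data is None:
            self._data = validate(self._loader(), self._spec)
        return self._data.get(name, default)

    def host_count(self):
        # Example of a method returning derived information.
        return len(self.get("hosts", []))
```

With a loader such as `lambda: {"title": "demo", "hosts": "a, b"}`, nothing is validated until the first `get()` call, and a raw dict without `title` raises `ConfigError` rather than warning.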

All 11 comments

@kinow: I believe in the PR #2839 you addressed a number of the items on this checklist, as tentatively noted above. Can you please confirm if this is correct? I looked through the PR code changes but since I am not very familiar with parsec it was not immediately clear what broad test cases were added.

Indeed, if there are any other points you know have been addressed already or are no longer applicable, please update the list. Thank you.

@sadielbartholomew I think the points you marked as done are indeed right. Added the tests mainly to learn parsec, and also preparing for the move to Python3.

I also created #2880, which is related to #2775 and basically consists of moving the code under lib/cylc/parsec. Not sure if it needs to be on this list. Other issues related to parsec too: https://github.com/cylc/cylc/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+parsec

Thanks for clarifying @kinow.

I am aware of #2880, but since it describes a clear-cut objective which concerns parsec as a whole, I think it should definitely stay as its own Issue. I wouldn't count it as an "improvement" (at least to parsec itself) or extension as such.

I've yet to have a proper look at other open Issues concerning parsec. There may be some that would be appropriate to re-home under this list (closing the original but cross-linking back to it so that any comments there would still be viewable); feel free to consider this & move issues in this way yourself, by editing the comment. It's not worth agonising about, though. The main drive for creation of this Issue was to move text files giving 'to do' work for parsec from the codebase itself into the issue tracker.

While we are into parsec and its future, we may also want to consider:

  • Strategy to combine Rose configuration file format with Cylc/Parsec.
  • Adopt popular alternate configuration file formats, e.g. TOML, YAML
  • #2392 cylc.config strategy.
  • #1962 Python API - how much configuration file functionality do we still need in a future where users will be using more Python and less configuration file?

Python API - how much configuration file functionality do we still need in a future where users will be using more Python and less configuration file?

I'm keen for a Python API myself at this point, but I still think - unfortunately - many (most?) of our users are not sufficiently expert at programming to write suites as programs.

So my feeling is, we draw a line in the sand and continue to support existing functionality via config file, but if you want certain advanced functionalities, or have very complex workflows, you have to be (or become) sufficiently expert at Python.

Also, I'd really like to modernize to YAML, say, but we can't realistically ditch the existing file format can we? That would cause a lot of trouble. And how much work to support multiple config file formats and a Python API?? (As a matter of fact, I'd personally be happy to say "to use cylc-9 (say) you must convert to YAML or Python" ... but I don't see all of us agreeing with that :grimacing: !)

(oops, unintentional close!)

I have crossed out the Ordered Dict item above. That was about OrderedDict being less efficient than plain dict in Python 2. I believe in Python 3, plain dict is now ordered, and there's no performance hit.

We might still want to modify the home-built "ordered dict with defaults" code in Cylc.
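
For reference, a short check of the ordering behaviour mentioned above. Insertion order is a language guarantee for plain dict from Python 3.7 (it was a CPython implementation detail in 3.6); the variable names are only illustrative.

```python
from collections import OrderedDict

env = {}
env["FIRST"] = "1"
env["SECOND"] = "2"
env["THIRD"] = "3"

# Plain dict keeps insertion order (guaranteed since Python 3.7).
assert list(env) == ["FIRST", "SECOND", "THIRD"]

# One remaining difference: comparing two OrderedDicts is order-sensitive,
# while comparing two plain dicts is not.
a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}
assert a == b
assert OrderedDict(a) != OrderedDict(b)
```

So any code that relies on order-sensitive equality would need checking before swapping OrderedDict for dict.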

I agree (and I am definitely not suggesting that we remove support for our current configuration format any time soon).

I would say that we have 2 main issues with our current file format:

  • It does not have native list/array support as part of the data structure like TOML and YAML do.
  • It does not have very good semantics for append/subtract - #1363 (as well as the ignored functionality in Rose).

I cannot see a simple path to fully move away from our current configuration format either. Jinja2 preprocessing makes it almost impossible to automate the process. Otherwise, we should be able to read the data structure that represents the current configuration file, and re-dump it as whatever configuration file format(s) that our future selves prefer.

However, we should still keep our options open - as the technology world moves fast.
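
The "read the data structure and re-dump it" idea above can be sketched as follows. This is only an illustration: the nested dict stands in for an already-parsed (post-Jinja2) configuration, the section and item names are made up, `dump_nested` is a hypothetical helper, and JSON is used only because it ships with the standard library, not because it is a proposed target format. As noted above, any Jinja2 templating in the source file is lost in such a round trip.

```python
import json

# Made-up stand-in for the nested structure a parser would produce.
parsed = {
    "scheduling": {
        "initial cycle point": "2019",
        "dependencies": {"graph": "foo => bar"},
    },
    "runtime": {
        "foo": {"environment": {"GREETING": "hello"}},
    },
}


def dump_nested(data, depth=1):
    """Render a nested dict as bracketed sections, one bracket pair per level."""
    lines = []
    indent = "    " * (depth - 1)
    # Items first, then sub-sections, so items stay under their own heading.
    for key, value in data.items():
        if not isinstance(value, dict):
            lines.append(f"{indent}{key} = {value}")
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append(f"{indent}{'[' * depth}{key}{']' * depth}")
            lines.extend(dump_nested(value, depth + 1))
    return lines


# The same in-memory structure, written out in two different formats.
as_json = json.dumps(parsed, indent=2)
as_sections = "\n".join(dump_nested(parsed))
```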

I was reading my RSS feed today and this post about YAML came up, with some cases where YAML can be unsafe or produce undesired parse results. More discussion on this Hacker News thread.

More discussion on this Hacker News thread.

... the true horror starts when people start using text template engines to generate YAML.

Uh-oh, sounds familiar :rofl:
