Libelektra: inconsistent "comment" handling for ini and hosts plugin

Created on 24 Feb 2017 · 15Comments · Source: ElektraInitiative/libelektra

The ini and hosts plugin handle "comments" differently.

The hosts plugin conforms to the metadata defined in METADATA.ini, however, ini handles comments differently.
So for the following ini-file

# header

# comment
setting = value

the _ini_ plugin will generate this meta data for setting key:

key 'user/test/puppet-test-ini/setting'
  string value: value
  meta data keys: 8
    comments: #2
    comments/#0: # header
    comments/#1: 
    comments/#2: # comment
    internal/ini/key/last: #0
    internal/ini/key/number: #0
    internal/ini/order: #0
    internal/ini/parent: user/test/puppet-test-ini

Issues/differences to METADATA.ini:

comments, not comment
comment/start not implemented/used
comment character # included in the value of the comment key

It would be very nice, if the _ini_ plugin also implements the behaviour as defined in METADATA.ini, especially not adding the comment char '#' to the actual value of the metadata key.

Maybe related to #1355

bug low priority usability

Source

BernhardDenner

All 15 comments

Thanks for unveiling that the issue is still open. I already reported it in #597 but we did not look into the details at that time.

@tom-wa Can you update ini to the latest comment spec? You should also add your plugin in METADATA.ini that it supports the spec. If you also extract some comment functions to the meta library, it would be a good thing. Then we would have an easier life to support comments in the next plugin and users could easily access it.

We should also have a kdb-tool feature to easily access comments (and also (meta) arrays in general). ( @e1528532 any suggestions? )

We should also add generic tests for storage plugins with comments in the fashion of tests/shell/check_get_set.sh.

@BernhardDenner Are you happy with the comment spec or do you also have suggestions/improvements for the spec itself?

markus2330 on 24 Feb 2017

Are you happy with the comment spec or do you also have suggestions/improvements for the spec itself?

Yes, I think there is nothing wrong with it.
I found this problem, since I'm implementing a 'comments' parameter for my puppet module.

BernhardDenner on 24 Feb 2017

👍1

i don't really understand how preserving blank lines with the comment spec works. what would


#comment1
#

#
#comment2
#

key = value

look like ?

tom-wa on 24 Feb 2017

I would say


#comment1
#

#
#comment2
#

key = value

should be:

 comment/#0 =     ; nothing inline, no space or start
 comment/#1 =     ; first line is empty, no space or start
 comment/#2 = comment1
 comment/#2/start = #  ; here we have a start symbol, but no space
 comment/#3 = 
 comment/#3/start = #
 comment/#4 = 
 comment/#5 = 
 comment/#5/start = #
 comment/#6 = comment2
 comment/#6/start = #
 comment/#7 = 
 comment/#7/start = #
 comment/#8 =

hosts agrees with my interpretation, except that it also adds a metakey space with a number. I would argue that you do not need a metakey space here, because there is no space (its even done inconsistently: the inline comment does not have a space metakey, but others have), and I would expect a string with the spaces that are present. A number would limit its expressibility. (There are many different spaces and tabs, which would get lost).

@fberlakovich What do you think? Can you fix this for hosts?

@tom-wa Thanks for pointing this out! Please add this as unit test.

markus2330 on 24 Feb 2017

@fberlakovich Another question. Does keytometa already use the new comment syntax?
Do you still remember what the issue with the old syntax was? (We should have written a rationale in doc/METADATA.ini.)

It basically was everything above key = value literally, with one additional newline in the beginning to state that there is no inline comment. Was it only about that we disliked that it is not fully parsed?

markus2330 on 24 Feb 2017

If you also extract some comment functions to the meta library, it would be a good thing. Then we would have an easier life to support comments in the next plugin and users could easily access it.

Note that some comment utility functions are already present in keymetaformatting.c in the hosts plugin. This file was intended as an inital seed for an elektra formatting parsing library. However, no plugin except the hosts plugin ever fully adopted the comment formatting proposal.

What do you think? Can you fix this for hosts?

What exactly is broken with the implementation? Inserting the following comment block in a hosts file

#comment1
#

#
#comment2
#

results in the following comment metadata

comment/#0              ; empty string as there is no inline comment
comment/#1              ; "comment1"
comment/#1/space        ; "0" as there are no spaces BEFORE the comment start
comment/#1/start        ; "#" as the comment was started with the # sign
comment/#2              ; empty string because of newline
comment/#2/space        ; "0" as there are no spaces at all
comment/#2/start        ; "#" as the comment was started with the # sign
comment/#3              ; empty string because of newline (no start or space here because it is an empty line)
comment/#4            
comment/#4/space
comment/#4/start
comment/#5
comment/#5/space
comment/#5/start
comment/#6
comment/#6/space
comment/#6/start
comment/#7

The only obvious flaw I can see is the lack of documentation.

I would argue that you do not need a metakey space here, because there is no space (its even done inconsistently: the inline comment does not have a space metakey, but others have)

The space metakey captures the number of spaces before the comment start sign (i.e. the number of spaces at the beginning of a line for a regular comment and the number of spaces between the payload and the comment for an inline comment). This information could be merged into the comment start metakey. However, when we designed the specification we intended to better structure the formatting information.

and I would expect a string with the spaces that are present. A number would limit its expressibility. (There are many different spaces and tabs, which would get lost).

I agree with that. A number is probably not the best way to capture whitespace information. However, I think that simply counting the whitespaces was way easier to implement than to preserve the exact whitespace used.

Another question. Does keytometa already use the new comment syntax?
Do you still remember what the issue with the old syntax was? (We should have written a rationale in doc/METADATA.ini.)

Unfortunately not as this would require a special case. keytometa does not convert to comments in specific, but to metakeys with a configurable name. I think the special use case of converting to comment metakeys would require a specialized plugin, probably based on keytometa.

Was it only about that we disliked that it is not fully parsed?

I cannot remember the exact shortcomings, but I know that at least the comment parsing code present in the hosts plugin before the rewrite did not preserve all details of formatting.

I hope I could clear up things a bit ;).

fberlakovich on 24 Feb 2017

Note that some comment utility functions are already present in keymetaformatting.c in the hosts plugin. This file was intended as an inital seed for an Elektra formatting parsing library. However, no plugin except the hosts plugin ever fully adopted the comment formatting proposal.

INI tried to, see above ;)

The only obvious flaw I can see is the lack of documentation.

Yes, the specification is quite sloppy.

I agree with that. A number is probably not the best way to capture whitespace information. However, I think that simply counting the whitespaces was way easier to implement than to preserve the exact whitespace used.

We should change this then. Is it really a big implementation difference to remember a string of spaces vs. counting whitespace?

Even if it is, the specification should do the right thing (preserving everything). If some plugins are sloppy we can simply describe the deficits in their infos/status. See #666 for the proposal.

@BernhardDenner What is your opinion on preserving whitespaces in files?

I cannot remember the exact shortcomings, but I know that at least the comment parsing code present in the hosts plugin before the rewrite did not preserve all details of formatting.

Actually that was only a bug introduced later and not caught because we had no unit tests with comments ;)

@sanssecours Maybe you can also give us your opinion on the comment specifications, thus it will also be important for YAML plugins.

What I am missing in the comment specification (actually only in the new one, the other one takes everything literally anyway) is how it would work in non-line oriented comments.

@e1528532 is currently writing an XML plugin. So if we have something like:

<!--comment1--> <!--comment2
--> <a/> <!--comment3--> <!--comment4--> <b/>

What would happen? Do we have 4 comment nodes? Or 2 (2 lines)? Or 5 because of the line break?

And what about line-continuations that do not have a start character but obviously also are not empty lines? Do we add an empty start to indicate the line-continuations? Or should we simply add a new-line within the comment in the first comment node?

Space preserving in XML seems to be a non-issue, because the parsers do not report spaces before/after comments anyway.

@fberlakovich Can you add documentation for the augeas plugin how comments are handled when keytometa is present?

markus2330 on 24 Feb 2017

The XML example is quite interesting. From my point of view it's better to treat blocks of comments as _one_ comment. So generate one 'comment' metakey for each comment block. I think XML parsers also treat it this way.
So for the above XML example, I would suggest to generate something like:

comment/#0 = 'comment1'
comment/#1 = 'comment2
'
...

Line wise parsers could treat comments line wise.

But I think XML has an additional interesting case:

<xy>
  <!--comment1-->
  <yz>some value</yz>
  <!-comment2-->
</xy>

where to place 'comment2' comment?

If I understood @fberlakovich correctly, space describes the whitespace before the comment char e.g. # and is therefore more important for inline comments?
But I think it's simpler to more intuitive to preserve the whitespace chars as they are, instead of counting them.

And the start key holds the comment char + whitespace _between_ this comment char and the first non-whitespace comment?

BernhardDenner on 26 Feb 2017

👍1

I only wanted to point out that we should clarify if we split per-line or per-comment.
Should it be allowed to have new-lines within a comment?

Or said differently: What is the underlying meaning of having an array of comments?

With respect to comments, XML parsers will have a number of limitations anyway.

But I think it's simpler to more intuitive to preserve the whitespace chars as they are, instead of counting them.

Agreed, we should do this then! Let us consider the current hosts behavior as bug.

markus2330 on 26 Feb 2017

i have one issue with the current comment specification: Comment keys that are not above any key (comments in the last lines of files) are added to the parentkey. doesn't work for INI because the first key could also be the parent key, so comments above the first key could belong to the parent key

tom-wa on 8 Mar 2017

Where would you add these comments instead? Do we need extra meta data for these comments?

markus2330 on 8 Mar 2017

i'd argue that comments below the last key are less important than comments above the first key, but i honestly have no idea. and in most cases INI doesn't even use the parent key, only if something is explicitly written to it.

tom-wa on 8 Mar 2017

i'd argue that comments below the last key are less important than comments above the first key

I would not assume anything there. There might be a single key in the first line and the rest of the file is the complete documentation of the whole system.

in most cases INI doesn't even use the parent key, only if something is explicitly written to it.

There are also other plugins where the parentKey really exists. We could:

simply add all comments to the parentKey, everything at the end would be moved to the parent key if the key really exists
have a pseudo key for metadata that refers to the (end of the) file or infos otherwise not assignable to a specific key (such a key would also be handy if a file contains exclusively comments/metadata)
differentiate between comments above and below a key (or even inline?)

Option 3. would basically mean that we need to rethink the whole comments specification. Option 2. should be discussed best at a new issue, it is not limited to comments.

What do you think?

markus2330 on 8 Mar 2017

We should fix this issue for LCDproc (0.8.21) release.

There is a huge inconsistency: INI uses "comments", but doc/METADATA.ini says its called "comment".

@tom-wa Shouldn't the consistency checker for Metadata detect such a problem?

markus2330 on 2 Nov 2017

low priority, as toml plugin will be preferred in future.

markus2330 on 15 Nov 2019

Was this page helpful?

0 / 5 - 0 ratings

Related issues

remove entry barriers

markus2330 · 3Comments

homepage: build fails

markus2330 · 4Comments

typechecker: build fails on travis

markus2330 · 4Comments

code generation for errors

markus2330 · 4Comments

bindings java: HelloElektra.java errors

dmoisej · 3Comments