openHAB comes with special .items, .rules, .sitemap and .things file formats. Most of them are configuration files rather than program code, but .rules files consist of Xtend scripts (a language linguist already supports). For an example (not mine), see https://github.com/yfaway/openhab-rules/blob/master/rules/plugs.rules. Basically, rules files are scripts that contain some special syntax for declaring a rule, but you can write normal Xtend statements anywhere.
Would be nice if linguist could detect them.
Question: Should they be detected as Xtend or as a separate language "openHAB"? Not sure how you decide this here usually.
This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.
I would still like to see support for openHAB files. As I'm not familiar with the linguist implementation, I would be glad if someone could answer my initial question.
Should they be detected as Xtend or as a separate language "openHAB"?
Whoops. Sorry I missed this the first time. Definitely _not_ "openHAB" as it isn't a language, but rather a project. It would be like implementing a language for openHAB's friendly neighbour: HomeAssistant (which really uses YAML for all its config files).
Xtend would probably be the better language to use; however, .rules is a very generic and popular extension, so it will also need a heuristic to clearly identify a file as Xtend.
The current Xtend grammar seems to do a reasonable job at highlighting the sample you linked to here too.
A quick search query like this: https://github.com/search?q=extension%3Arules+rule+AND+when+AND+then&type=Code could indicate it's popular enough to add the extension to Xtend too.
The current Xtend grammar seems to do a reasonable job at highlighting the sample you linked to here too.
This would be great!
A quick search query like this: https://github.com/search?q=extension%3Arules+rule+AND+when+AND+then&type=Code could indicate it's popular enough to add the extension to Xtend too.
8K+ hits, I think it is popular enough. So a heuristic could simply check for the .rules extension and match something like .*rule.*when.*then.*?
So a heuristic could simply search for something like .rules and matches .*rule.*when.*then.*?
Yeah, something along those lines. You may need to be a little more precise as those * are very greedy 😉. You can probably gain some inspiration from the other heuristics.
Feel free to open a PR and we can help fine tune the regex - @Alhadis is quite the regex whizz.
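To illustrate why the greedy pattern is too broad, here is a minimal Python sketch. Both sample files are hypothetical, and `re.DOTALL` is an assumption added so that `.` can span lines; none of this is Linguist's actual heuristic code.

```python
import re

# Naive pattern from the discussion: greedy wildcards between the keywords.
# re.DOTALL is an assumption so that "." can cross line breaks.
NAIVE = re.compile(r".*rule.*when.*then.*", re.DOTALL)

# Hypothetical udev-style file, which also uses the .rules extension.
UDEV_LIKE = (
    "# This rule fires when the device appears, then sets the mode.\n"
    'KERNEL=="ttyUSB0", MODE="0666"\n'
)

# Hypothetical openHAB-style rule.
OPENHAB_LIKE = (
    'rule "Lamp on"\n'
    "when\n"
    "    Item Switch1 changed to ON\n"
    "then\n"
    "    Lamp.sendCommand(ON)\n"
    "end\n"
)

# The pattern cannot tell a comment that happens to contain the three
# words apart from a real rule declaration.
naive_hits_udev = NAIVE.search(UDEV_LIKE) is not None
naive_hits_openhab = NAIVE.search(OPENHAB_LIKE) is not None
print(naive_hits_udev, naive_hits_openhab)
```

Both searches succeed, so a heuristic this loose would claim unrelated .rules files as Xtend.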
Hm, maybe you could say:
(?<=^\s*?)rule\s*?(\S+?|\"[\s\S]+?\")\s*?when\s*?([\s\S]+?)\s*?then\s*?([\s\S]+?)\s*?end(?=\s+?)
See https://regex101.com/r/Q5Kswi/1
This is still quite generic, I know, but afaik, you could actually write a whole .rules file in one line, so I do not see any alternative at the moment ...
Btw, here is the best reference of the .rules format I could find: https://github.com/openhab/openhab1-addons/wiki/Rules#the-syntax
However, any help by wizards is highly welcome! :-)
@LinqLover That expression is far too open-ended. Here's a cleaned-up version of what I think you were trying to match:
~regexp
(?x) ^
\s* rule \s* (\w+|".+?")
\s* when \s* (.+?)
\s* then \s* (.+?)
\s* end \s
~
The (.+?) groups are what present the most trouble. Consider a file containing something like:
~py
def hypothetical_rule_related_something():
    """
    FIXME: This rule's behaviour isn't obvious. We should use this
    rule only when the user requests it. If they haven't, then the
    program should end with an error message.
    """
~
Here, the expression matches the segment:
~
rule only when the user requests it. If they haven't, then the
program should end
~
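The false positive can be reproduced with a short Python sketch. The expression above only sets `(?x)`; the `re.MULTILINE` and `re.DOTALL` flags here are assumptions, added so that `^` can match at the start of any line and `.` can cross a line break, which is roughly how a Ruby-style engine would treat it.

```python
import re

# The cleaned-up expression, transcribed for Python's re module.
# MULTILINE and DOTALL are assumptions (see the note above).
LOOSE = re.compile(
    r"""
    ^
    \s* rule \s* (\w+|".+?")
    \s* when \s* (.+?)
    \s* then \s* (.+?)
    \s* end \s
    """,
    re.VERBOSE | re.MULTILINE | re.DOTALL,
)

# The Python snippet from the comment, with ordinary indentation.
SAMPLE = '''def hypothetical_rule_related_something():
    """
    FIXME: This rule's behaviour isn't obvious. We should use this
    rule only when the user requests it. If they haven't, then the
    program should end with an error message.
    """
'''

match = LOOSE.search(SAMPLE)
matched_text = match.group(0) if match else None
print(repr(matched_text))
```

The match starts at the docstring line beginning with "rule only when" and runs through "program should end", even though nothing in the file is an openHAB rule.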
… which could have also been 1,600 lines of source code that contain the words then and end somewhere. A more accurate way to match these constructs might be:
~regexp
(?xm) ^
([ \t]*) rule \s+ (\w+|".+?") \s*
\R .*?
\R \1 when \s+ .*?
\R \1 then \s+ .*?
\R \1 end \s
~
… which matches rule, when, then, and end only if they each begin a line at the same indentation level. This assumes that indentation is significant in openHAB and a line like rule "Foo" foo(); when true then bar(); end isn't valid. Correct me if I'm wrong. 😉
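The same idea can be sketched in Python. This is an adaptation, not the expression above verbatim: Python's `re` has no `\R`, so `\r?\n` stands in for it, and the pattern is simplified so that any number of lines may sit between the keywords. The sample inputs are hypothetical.

```python
import re

# Each keyword must begin a line at the same indentation as "rule"
# (captured in group 1 and reused through the \1 backreference).
STRICT = re.compile(
    r"""
    ^([ \t]*)rule[ \t]+(?:\w+|"[^"\n]+")[ \t]*\r?\n
    (?:.*\r?\n)*?          # any lines before "when"
    \1when\b.*\r?\n
    (?:.*\r?\n)*?          # trigger lines
    \1then\b.*\r?\n
    (?:.*\r?\n)*?          # body lines
    \1end\b
    """,
    re.VERBOSE | re.MULTILINE,
)

MULTI_LINE = (
    'rule "Lamp on"\n'
    "when\n"
    "    Item Switch1 changed to ON\n"
    "then\n"
    "    Lamp.sendCommand(ON)\n"
    "end\n"
)
ONE_LINE = 'rule "Foo" foo(); when true then bar(); end\n'
DOCSTRING = (
    "    FIXME: This rule's behaviour isn't obvious. We should use this\n"
    "    rule only when the user requests it. If they haven't, then the\n"
    "    program should end with an error message.\n"
)

for name, text in [("multi", MULTI_LINE), ("one-line", ONE_LINE), ("docstring", DOCSTRING)]:
    print(name, STRICT.search(text) is not None)
```

The conventionally formatted rule matches, while the docstring and the one-line rule do not, so this variant trades recall for precision.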
Hi @Alhadis, thanks for your help!
This assumes that indentation is significant in openHAB and a line like rule "Foo" foo(); when true then bar(); end isn't valid. Correct me if I'm wrong. 😉
Unfortunately, it's not that easy. I just tried it out: you can also write a whole rule on a single line, or interleave comments between the keywords. No particular indentation is required at all.
Hm, what can we do then? Would it be wise to develop a pattern that tries to match every possible expression of the Xtend language?
Would it be wise to develop a pattern that tries to match every possible expression of the Xtend language?
No, not at all: a valid Xtend expression could easily be valid in other languages too. It's important to note that a heuristic need not match every file; only the most obvious cases. Anything which doesn't match gets passed down to the classifier, which is preferable to an overzealous heuristic that matches more than it should.
That being said, we can still use the heuristic I proposed earlier, provided it matches most openHAB files. It might be easier to add heuristics for competing languages that match constructs which aren't valid Xtend syntax, however.
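As a rough sketch of that flow: heuristics are tried in order, the first match wins, and anything unmatched is left for the classifier. The table, the patterns, the "udev-like" label and the function name are all hypothetical; this is not Linguist's implementation, and whether the first pattern can never occur in valid Xtend is an assumption.

```python
import re
from typing import Optional

# Hypothetical, ordered heuristics for the ".rules" extension.
RULES_HEURISTICS = [
    # Competing format first: a construct assumed not to appear in Xtend.
    ("udev-like", re.compile(r"^(?:KERNEL|SUBSYSTEM|ACTION)==", re.MULTILINE)),
    # Only the most obvious openHAB case: a line holding just a rule header.
    ("Xtend", re.compile(r'^[ \t]*rule[ \t]+(?:\w+|"[^"\n]+")[ \t]*$', re.MULTILINE)),
]


def guess_by_heuristic(content: str) -> Optional[str]:
    """Return a language name, or None to defer to the classifier."""
    for language, pattern in RULES_HEURISTICS:
        if pattern.search(content):
            return language
    return None


print(guess_by_heuristic('rule "Lamp on"\nwhen\n    Item Switch1 changed\nthen\nend\n'))
print(guess_by_heuristic('KERNEL=="ttyUSB0", MODE="0666"\n'))
print(guess_by_heuristic("just some notes\n"))
```

Returning `None` for the last input is the point: a file the heuristics cannot place confidently is passed on rather than forced into a language.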
This issue has been automatically closed because it has not had activity in a long time. Please feel free to reopen it or create a new issue.