What is the filetype extension for script files? I was thinking of making a syntax highlighting file
Most of us use .jq
I'm not aware of any other extensions in use, but I've been fairly absent
recently.
On Fri, Nov 27, 2015 at 3:16 PM vito-c [email protected] wrote:
What is the filetype extension for script files? I was thinking of making
a syntax highlighting file—
Reply to this email directly or view it on GitHub
https://github.com/stedolan/jq/issues/1025.
sounds legit. I did a quick google search and didn't see .jq in use anywhere else. Are there any syntax highlighting files currently available?
Not to my knowledge. I'm excited to see what you come up with, though!
Is there a list of all keywords somewhere? Could people post up some lengthy jq filters so I can test how this is going to look? So I don't end up making unicorn rainbow barf :D
should I include these as functions or keywords?
https://github.com/stedolan/jq/blob/master/src/builtin.jq
My preference would be to color builtin functions different from user-defined functions different from keywords different from formatting operators (@sh and friends).
The builtin functions are all the jq-coded ones in builtin.jq, all the c-coded ones in function_list in builtin.c: https://github.com/stedolan/jq/blob/0d177d240dc06adfb676716d5adc849b326c21f5/src/builtin.c#L1249 and the two bytecoded ones in builtin_defs in builtin.c: https://github.com/stedolan/jq/blob/0d177d240dc06adfb676716d5adc849b326c21f5/src/builtin.c#L1325
The keywords are if, then, elif, else, end, and, or, reduce, as, try, catch, label, break, def, foreach, import, include, module, modulemeta.
For formatting operators just use @[a-zA-Z0-9_]+ like the lexer does, no need to enumerate them.
Thanks for your work on this! What editor(s) is this going to work in?
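The keyword list and the `@[a-zA-Z0-9_]+` pattern above can be tried out with a small sketch like the following. It is only an illustration in Python, not how any existing highlighter works: `classify` and `TOKEN_RE` are hypothetical names, and the identifier pattern is an assumption made for this example rather than something taken from jq's lexer. The scanner is deliberately naive, so it also matches words inside strings and comments.

```python
import re

# Keyword list copied from the comment above.
KEYWORDS = {
    "if", "then", "elif", "else", "end", "and", "or", "reduce", "as",
    "try", "catch", "label", "break", "def", "foreach", "import",
    "include", "module", "modulemeta",
}

# First alternative: the formatting-operator pattern quoted above.
# Second alternative: an assumed identifier pattern (not from jq's lexer).
TOKEN_RE = re.compile(r"@[a-zA-Z0-9_]+|[A-Za-z_][A-Za-z0-9_]*")


def classify(source):
    """Return (token, kind) pairs for formats, keywords, and identifiers."""
    result = []
    for match in TOKEN_RE.finditer(source):
        token = match.group(0)
        if token.startswith("@"):
            kind = "format"
        elif token in KEYWORDS:
            kind = "keyword"
        else:
            kind = "identifier"
        result.append((token, kind))
    return result


if __name__ == "__main__":
    for token, kind in classify("if .a then @sh else foo end"):
        print(f"{token:8} {kind}")
```

Because formatting operators are matched by pattern, a new `@something` operator would be picked up without touching the keyword list.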
@vito-c asked:
Could people post up some lengthy jq filters ...?
See e.g. https://github.com/joelpurra/jqnpm/wiki
And in particular https://github.com/joelpurra/jq-bigint (main.jq definitely qualifies as lengthy).
By the way, __loc__ is also a keyword (de facto); and of course "true", "false" and "null" have a special status.
Good luck!
jq's src/builtin.jq is a pretty sizeable jq code file.
Thanks a lot for the lengthy scripts; they have been helpful! @dtolnay I use Vim/Neovim, so that is the editor I'm supporting. I'm open to making highlighting for some other editors later as well.
I'm going to break down some tasks for myself here so you guys can give feedback. I wanted to get your opinions on how a documentation comment block should look.
I can definitely highlight builtin C and builtin jq functions differently; there might be a bit of a challenge with user functions, though. User functions will probably end up being the default color.
Questions:
Is jq -n -c the fastest way to compile a script and get errors (think linter)?
Syntax Highlighting for Vim/Neovim
@[a-zA-Z0-9_]+
Almost forgot. Progress!

Looking good! And I'm a vim user, so, yay!
Now, what I think might be helpful is to color commas and semicolons
differently, and to color parentheses and pipes. I'm not sure I care to
have all builtins highlighted differently than other identifiers.
The unicorn rainbow barf is definitely beginning. Here is a mockup of what I would like in my color scheme. The main differences are numbers being uncolored and strings being a single color (although make sure to handle string interpolations).
As I commented earlier (though not very explicitly), I don't think you should color jq-coded and c-coded builtins differently from each other.

@dtolnay I misread you at first then! I was hoping builtin C and jq would be the same color :+1: I rather like having string quotes in a different color, but I can easily make this a configuration variable for you so that it can be toggled. I will edit the task list inline.
That will run the script, not just compile. What is this for? We may need to add a way to compile without running.
I was going to use this for syntax checking, which is doable, but there might be side effects if the script also runs. I also noticed there is no column for errors (https://github.com/stedolan/jq/issues/1027), which means the best I could do is mark the error at the line.
@dtolnay can tokens be emitted? I was wondering how it would look
No, the closest is jq --debug-dump-disasm which dumps the bytecode.
@dtolnay I will make highlighting numbers a variable as well. I like being able to highlight numbers because it makes it easier to see the range.
@nicowilliams commas, semicolons, pipes, parentheses: do you want these the same as other operators (same color as == - +)? I was thinking semicolons and pipes should be special.
@pkoppstein did you mean $__loc__?
@vito-c - If you look in lexer.l you'll see __loc__ has the status of a keyword.
To see the effect, compare:
jq -n 'def __loc__: 0;'
with
jq -n 'def _loc__: 0;'
jq does have an internal AST-like representation of jq programs called
"blocks", complete with links to line number and column number
information. It has no serialization for this data structure, but I've
wanted one for a different reason: to help me reason about the compiler.
Armed with two reasons for it, it's safe to say we should add a
serialization of the block representation of jq programs. The
serialization should use JSON, naturally, though it's not a perfect fit
because of the need to represent "binders", but that's easy to solve.
Anyways, I'll think about it.
@nicowilliams sweet! Neovim also supports async operations and remote plugins. I have seen a go plugin for neovim that uses an AST representation to perform syntax highlighting. I was thinking how cool it would be if you could have your jq blocks highlighted differently. It was more of just an idea I was pondering though.
It'd be particularly awesome if we could support selection of
sub-expressions as a way to handle cases where operator precedence hurts.
I am going to highlight the builtin.c & builtin.jq functions as functions. path will also fall under function but empty and not strike me more as keywords.
builtin.c

| 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- |
| _plus | _negate | _minus | _multiply | _divide |
| _mod | tojson | fromjson | tonumber | tostring |
| keys | keys_unsorted | startswith | endswith | ltrimstr |
| rtrimstr | split | explode | implode | _strindices |
| setpath | getpath | delpaths | has | _equal |
| _notequal | _less | _greater | _lesseq | _greatereq |
| contains | length | utf8bytelength | type | isinfinite |
| isnan | isnormal | infinite | nan | sort |
| _sort_by_impl | _group_by_impl | min | max | _min_by_impl |
| _max_by_impl | error | format | env | get_search_list |
| get_prog_origin | get_jq_origin | _match_impl | modulemeta | _input |
| debug | stderr | strptime | strftime | mktime |
| gmtime | now | input_filename | input_line_number | |

builtin.jq

| 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- |
| _assign | _flatten | _modify | _nwise | add |
| all | any | arrays | ascii_downcase | ascii_upcase |
| booleans | bsearch | capture | combinations | del |
| error | finites | first | flatten | from_entries |
| fromdate | fromdateiso8601 | fromstream | group_by | gsub |
| in | index | indices | input | inputs |
| inside | isfinite | iterables | join | last |
| leaf_paths | limit | map | map_values | match |
| max_by | min_by | normals | nth | nulls |
| numbers | objects | paths | range | recurse |
| recurse_down | repeat | reverse | rindex | scalars |
| scalars_or_empty | scan | select | sort_by | split |
| splits | strings | sub | test | to_entries |
| todate | todateiso8601 | tostream | transpose | truncate_stream |
| unique | unique_by | until | values | walk |
| while | with_entries | | | |
I'd say any of the things starting with an _ shouldn't be highlighted.
Those are mostly internal implementations of things that we don't
necessarily want people playing with. Especially the ones in builtin.c
which are primarily the implementations of operators.
Also, things that have _impl on them aren't really for public consumption
and are used behind-the-scenes by other exposed functions.
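The rule suggested above (skip anything starting with `_`) can be sketched as a simple filter. This is only an illustration in Python; `split_internal` is a hypothetical helper name, and the sample names are a handful taken from the tables earlier in the thread, not the full builtin list.

```python
def split_internal(names):
    """Split builtin names into (public, internal) lists.

    A name counts as internal when it starts with an underscore,
    which is the convention discussed in this thread.
    """
    public = [name for name in names if not name.startswith("_")]
    internal = [name for name in names if name.startswith("_")]
    return public, internal


# A small sample of names from the tables above.
sample = ["_plus", "tojson", "_sort_by_impl", "min", "_match_impl", "keys"]
public, internal = split_internal(sample)

if __name__ == "__main__":
    print("highlight:", public)
    print("skip:     ", internal)
```

In this sample, the names ending in `_impl` also start with `_`, so the single prefix check covers them.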
Yes. empty and not are very much like keywords. path() is rather special, but since one can and probably would write useful functions as special as path() using path(), I'm not sure how much benefit one might get from treating path() as special in the syntax highlighter, but I tend to think it's better to treat it as special.
We should be guided here by what people do for highlighting LISPs. As with LISP we have syntax, special forms, and everything else. And because arguments are closures, all jq functions can behave in ways more akin to LISP macros. If a LISP loop macro is highlighted differently than user-defined function/macro calls, then perhaps we should do the same for any jq functions that are "special" in some sense that relates to how likely it is that it will help to do so.
I saw a paper this weekend [http://ppig.org/sites/default/files/2015-PPIG-26th-Sarkar.pdf] about syntax highlighting that showed that it is useful indeed. In particular it helps reduce the need for a reader's eyes to fall on keywords very often, and helps the reader reduce the amount of context they need to parse a program/expression. Formatting/indentation seems to play a similar role (though the authors of that paper didn't look into that, but their code samples were all properly indented).
Unless we do our own studies of different variations of syntax highlighting for jq :), there is going to be a lot of subjectivity in your choice of what builtins to highlight as special and which ones not to.
@wtlangford Builtins with _ prefixes should be made not available for binding outside the builtins themselves. However, until we implement that, we should definitely highlight such builtins as erroneous tokens -- it won't bother jq developers.
@nicowilliams Sounds good, but what do you mean by erroneous tokens?
Though, of course, jq users should be allowed to define their own private functions for their modules... We might want to think more carefully about visibility rules. We might need a package-like way to group symbols from multiple related files.
@wtlangford I mean: angry red :)
Angry red is certainly one way to do it. I was going to comment on how to approach that in a more generic, module-oriented way of doing it, but I felt this thread was perhaps not the right place.
type path | and ;
What about paths, delpaths, getpath, and setpath?
| 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- |
| if | then | elif | else | and |
| or | not | empty | try | catch |
| null | end | reduce | as | label |
| break | foreach | import | include | module |
| modulemeta | | | | |
true false
Things that start with _ in the builtins file. I noticed that everything that has _impl at the end also starts with _, so that is two birds with one stone.
in while error stderr del debug
Should these have keyword or function highlighting?
repeat recurse until recurse_down iterables range
These feel like keywords to me
with_entries to_entries from_entries nth has env
get_jq_origin get_prog_origin
| 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- |
| add | all | any | arrays | ascii_downcase |
| ascii_upcase | booleans | bsearch | capture | combinations |
| del | error | finites | first | flatten |
| _from_entries_ | fromdate | fromdateiso8601 | fromstream | group_by |
| gsub | in | index | indices | input |
| inputs | inside | isfinite | _iterables_ | join |
| last | leaf_paths | limit | map | map_values |
| match | max_by | min_by | normals | _nth_ |
| nulls | numbers | objects | _paths_ | _range_ |
| _recurse_ | _recurse_down_ | _repeat_ | _reverse_ | rindex |
| scalars | scalars_or_empty | scan | select | sort_by |
| split | splits | strings | sub | test |
| _to_entries_ | todate | todateiso8601 | tostream | transpose |
| truncate_stream | unique | unique_by | _until_ | values |
| walk | while | _with_entries_ | tojson | fromjson |
| tonumber | tostring | keys | keys_unsorted | startswith |
| endswith | ltrimstr | rtrimstr | split | explode |
| implode | setpath | getpath | delpaths | _has_ |
| contains | length | utf8bytelength | type | isinfinite |
| isnan | isnormal | infinite | nan | sort |
| min | max | error | format | _env_ |
| get_search_list | _get_prog_origin_ | _get_jq_origin_ | modulemeta | debug |
| stderr | strptime | strftime | mktime | gmtime |
| now | input_filename | input_line_number | | |
del should be in the same category as paths, delpaths, getpath, and setpath, which aren't very special (certainly not as special as path).
I don't think the repeaters should be special at all. Ditto with_*, from_*, to_*. So perhaps just four colors for idents/keywords: one for keywords and keyword-alikes, one for specials, one for all other builtins, and one for user-defined idents. $ident should probably get its own color, so that's five colors. Maybe ; should get keyword status, maybe a separate color.
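The five-color breakdown proposed above could look roughly like this. It is a sketch in Python under stated assumptions: the group names and the function `highlight_group` are hypothetical, the set membership reflects one reading of this thread (for example `empty` and `not` as keyword-alikes and `path` as the only special), and `OTHER_BUILTINS` is a small sample rather than the full list.

```python
# Keywords plus the keyword-alikes mentioned in this thread (assumption).
KEYWORD_LIKE = {
    "if", "then", "elif", "else", "end", "and", "or", "reduce", "as",
    "try", "catch", "label", "break", "def", "foreach", "import",
    "include", "module", "modulemeta", "empty", "not",
}

# Only path was singled out as special in the discussion (assumption).
SPECIAL = {"path"}

# A small sample of ordinary builtins; the real list is much longer.
OTHER_BUILTINS = {
    "del", "paths", "delpaths", "getpath", "setpath",
    "repeat", "recurse", "until",
    "with_entries", "from_entries", "to_entries",
}


def highlight_group(token):
    """Map a token to one of five illustrative highlight groups."""
    if token.startswith("$"):
        return "variable"
    if token in KEYWORD_LIKE:
        return "keyword"
    if token in SPECIAL:
        return "special"
    if token in OTHER_BUILTINS:
        return "builtin"
    return "user"


if __name__ == "__main__":
    for token in ["if", "empty", "path", "del", "$item", "my_filter"]:
        print(f"{token:10} {highlight_group(token)}")
```

Keeping the groups as plain sets makes it cheap to move a name from one color to another while the breakdown is still being debated.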
Once you ship it we can start discussing improvements to that breakdown.
Can't wait! :)
@vito-c -- It seems to me that a syntax highlighter for jq should clearly distinguish between ordinary filters (anything that can be written as NAME or NAME(...)) on the one hand, and filters having a special syntactic form on the other, notably:
and; or; if/then/elif/else/end; reduce/as; foreach/as
Notice that, from a syntactic point of view, "not", "recurse", "repeat", "while" and "until" are quite ordinary, as indeed are "debug", "empty", etc.
Conversely, lumping "null" into the keyword category is problematic for at least two reasons:
In summary, it seems to me that whatever else it does, the coloring scheme:
Thanks!
true, false, and null are rather special, being IDENTs in the parser, but also immediately turned into the corresponding constants in the parser:
$ jq -n 'def null: true; null'
null
How about that.
Whether something should be highlighted one way or the other strikes me as a subjective call that should be based on what minimizes context switching for jq programmers (see linked paper). To me true, false, and null are a lot more special than foo or fromjson, and it's useful to call attention to them as in other languages. But also, too many colors would be distracting, and too many "keywords" would be as well. We'll have to iterate over this to find something that works well for most.
Submitted for feedback :)
https://github.com/vito-c/jq.vim
Beautiful! Thanks!
For src/builtin.jq this is a bit painful because all the functions there are builtins, so it's a sea of yellow and light-blue; for other .jq files it's better. Either way it's better than no highlighting! I'll play with the color scheme at some point. Green for semicolons is a bit weird, as I want either those or commas to stand out. Hmmm, I guess commas should stand out more. Well, we should all play with it and see.
Should we link to your repo from the jq README and site?
> I'll play with the color scheme at some point.

I used molokai when I was creating the theme

> Should we link to your repo from the jq README and site?

sure thing! I just wanted to try and give back to the wonderful tool you all have created :+1:
See #1032.
@nicowilliams https://github.com/vito-c/jq.vim if you want to put it somewhere on the site for people to find feel free :)
Many thanks!
I was unaware of the availability of this syntax file :-(
@nicowilliams can we post it on the wiki somewhere? :D
I can post on the vim mailing list to see how I can get it added to the default install too if that would help
@vito-c - A Q has been added to the FAQ: https://github.com/stedolan/jq/wiki/FAQ#editor-bindings