This ticket is related to #1781 about adopting a syntax highlighter specific to vimtex.
Since February 2020 (more or less) the LaTeX3 programming layer (expl3) has been incorporated in the default LaTeX format (at least for pdflatex, lualatex and xelatex). It is used more and more and, for people that are programming packages, it is quite _the_ layer to use nowadays.
Syntax highlight of expl3 code snippets is quite awful:

There is a plugin available at https://github.com/wtsnjp/vim-expl3 that makes things much better:

But it has the problems that:
it has to be enabled by hand (set ft=expl3)
The expl3 syntax should be enabled only:
between \ExplSyntaxOn and \ExplSyntaxOff;
when the file starts with one of \ProvidesExplPackage, \ProvidesExplClass, \ProvidesExplFile.
A quick-and-dirty stopgap solution I use now is having in my .vimrc:
"
" latex3 syntax highlight fix
"
nnoremap <leader>3 :syn match texStatement "\\[a-zA-Z_:@]\+"<CR>
nnoremap <leader>2 :syn match texStatement "\\[a-zA-Z@]\+"<CR>
and switching manually when needed, but at this point doing the same with the @wtsnjp package works much better:
"
nnoremap <leader>3 :set ft=expl3<CR>
nnoremap <leader>2 :set ft=tex<CR>
I'll be very happy to adapt this into #1834, but it would help to understand the main differences for the latex3 syntax. Could you provide one or two simple examples (similar to your screenshots, but preferably simpler/smaller), as well as perhaps a few lines about the main differences? Then I'll add it to the todo for the PR.
The main difference is that the underscore and the colon are basically treated as letters (and the underscore is not the math subscript character); there are other differences but they are minor. For example:
\documentclass[fleqn]{article}
\usepackage{amsmath}
\usepackage{expl3}
\usepackage{xparse}
\ExplSyntaxOn
\seq_new:N \l_test_A_seq
\seq_new:N \l_test_B_seq
\seq_new:N \l_test_C_seq
\cs_new:Npn \fillmyseq #1 #2 {
  \seq_set_split:Nnn #1 {,} {#2}
}
\fillmyseq{\l_test_A_seq}{A_1, A_2, A_3}
\fillmyseq{\l_test_B_seq}{B_1, B_2, B_3}
\fillmyseq{\l_test_C_seq}{C_1, C_2, C_3}
\cs_new:Nn \__test_do: {
  \seq_show:N \l_test_A_seq
  \seq_show:N \l_test_B_seq
  \seq_show:N \l_test_C_seq
}
\NewDocumentCommand\mydo {} {\__test_do:}
\ExplSyntaxOff
\begin{document}
\mydo
\end{document}
This should look more or less like this:

A nice touch would be to give the trailing argument specifier (the :Nn or whatever; basically a colon followed by a sequence of letters, terminated by a space) a different highlight color.
Again, this special syntax highlight should be enabled only when expl3 is active (I commented that before...).
Thanks. I closed the issue, but I'm adding this as a TODO for the PR - closing only means this is now a part of #1834.
I've added initial support now in #1834. I've assumed that the LaTeX3 regions accept the "old" latex syntax as well, and I think the current version looks OK for the sample you provided.
However, I have not added anything for the case when the file starts with one of \ProvidesExplPackage, \ProvidesExplClass, or \ProvidesExplFile. If I should add support for this, please give one or more (simple) examples of such files.
Note, I think the PR #1834 is getting quite stable, so feel free to test drive it and provide feedback!
Hi @lervag, thanks! I am a bit overloaded at work right now, but I'll try to test it over the weekend if I can. Sorry to be of little help lately!
No problem. No pressure, look at it when you have time. There's no rush :)
Files will not start with \ProvidesExpl\(Package\|Class\|File\) but with some \RequirePackage. For example, take a look at siunitx (regarded as one of the role-model expl3-using packages), fontspec (being one of the older ones) or ducksay (just to give an example of a package that I wrote :)
The problem is that expl3 hasn't been part of the kernel for very long, so all those packages support "old" LaTeX kernels by explicitly loading expl3 or xparse even though this isn't necessary nowadays.
One short example could be unicode-math.sty:
%%
%% This is file `unicode-math.sty',
%% generated with the docstrip utility.
%%
%% The original source files were:
%%
%% unicode-math.dtx (with options: `base')
%%
%% ------------------------------------------------
%% The UNICODE-MATH package <wspr.io/unicode-math>
%% ------------------------------------------------
%% This package is free software and may be redistributed and/or modified under
%% the conditions of the LaTeX Project Public License, version 1.3c or higher
%% (your choice): <http://www.latex-project.org/lppl/>.
%% ------------------------------------------------
%% Copyright 2006-2018 Will Robertson, LPPL "maintainer"
%% Copyright 2010-2017 Philipp Stephani
%% Copyright 2011-2017 Joseph Wright
%% Copyright 2012-2015 Khaled Hosny
%% ------------------------------------------------
%%
%%^^A%% unicode-math.dtx -- part of UNICODE-MATH <wspr.io/unicode-math>
%%^^A%% Metadata for the package code, including files and versioning
\RequirePackage{expl3}
\ProvidesExplPackage{unicode-math}
  {2020/01/31} {0.8q} {Unicode maths in XeLaTeX and LuaLaTeX}
\sys_if_engine_luatex:T
  {
    \RequirePackageWithOptions{unicode-math-luatex}
    \endinput
  }
\sys_if_engine_xetex:T
  {
    \RequirePackageWithOptions{unicode-math-xetex}
    \endinput
  }
\msg_new:nnn {unicode-math} {unsupported-engine}
  { Cannot~ be~ run~ with~ \c_sys_engine_str!\\ Use~ XeLaTeX~ or~ LuaLaTeX~ instead. }
\msg_error:nn {unicode-math} {unsupported-engine}
\endinput
> I've assumed that the LaTeX3 regions accept the "old" latex syntax as well
Good assumption. Again as an example I'd pick siunitx. Fourth line of code is
\@ifpackagelater { expl3 } { 2020/01/12 }
This directly follows \ProvidesExplPackage. The package uses other non-expl3 syntax here and there as well.
@lervag could you post a couple of lines on top of PR #1834 telling how to test it? I remember there was a variable to set and a thing to do about branches in Plug...
> @lervag could you post a couple of lines on top of PR #1834 telling how to test it? I remember there was a variable to set and a thing to do about branches in Plug...
Done. I'll respond to the new comments later today/this week.
> Done. I'll respond to the new comments later today/this week.
Thanks! Where? I can't find it (probably my bad). No hurry, really...
In the top post of #1834. But I forgot to click "submit"... sorry! It's there now!
> Files will not start with \ProvidesExpl\(Package\|Class\|File\) but with some \RequirePackage.
For me, it seems like we could activate the expl3 syntax as a package specific addon. That would be nice, as it means we don't need to load it unless it is relevant (i.e. when there is a \RequirePackage or \usepackage).
> For example, take a look at siunitx (regarded as one of the role-model expl3-using packages), fontspec (being one of the older ones) or ducksay (just to give an example of a package that I wrote :)
I like the name :)
> One short example could be unicode-math.sty:
Thanks! This seems to confirm my observation above that we can parse \RequirePackage to check if the expl3 syntax should be loaded. Also, it seems as if the command \ProvidesExplPackage implies that the expl3 syntax should be recognized at "top level" (i.e., not just between the two commands mentioned earlier).
> I've assumed that the LaTeX3 regions accept the "old" latex syntax as well
> Good assumption. Again as an example I'd pick siunitx. Fourth line of code is
Great, thanks.
The remaining question, it seems, is when we should consider the expl3 syntax to be a top level syntax, and when it is confined between two tex commands.
> The remaining question, it seems, is when we should consider the expl3 syntax to be a top level syntax, and when it is confined between two tex commands.
I think that the rules should be: (but @Skillmon is surely more knowledgeable than me here):
by default expl3 syntax is confined between \ExplSyntaxOn and \ExplSyntaxOff
unless the file contains one of the three \ProvidesExpl* commands (they should be near the top, skipping comments of course).
I'd just treat \ProvidesExpl... the same way as \ExplSyntaxOn is treated. In the end it is just the LaTeX2e \Provides... macro but uses \ExplSyntaxOn as well. And while the \Provides... macros are usually closer to the top of a file, they don't have to be.
Imho, it is really better to just treat it the same way as \ExplSyntaxOn and not assume any top level syntax except when the file type is explicitly set (one could also add another g:tex_flavor).
> I think that the rules should be: (but @Skillmon is surely more knowledgeable than me here):
> by default expl3 syntax is confined between \ExplSyntaxOn and \ExplSyntaxOff
> unless the file contains one of the three \ProvidesExpl* commands (they should be near the top, skipping comments of course).
> I'd just treat \ProvidesExpl... the same way as \ExplSyntaxOn is treated. In the end it is just the LaTeX2e \Provides... macro but uses \ExplSyntaxOn as well. And while the \Provides... macros are usually closer to the top of a file, they don't have to be.
> Imho, it is really better to just treat it the same way as \ExplSyntaxOn and not assume any top level syntax except when the file type is explicitly set (one could also add another g:tex_flavor).
When you use \ProvidesExpl..., where would expl3 syntax be considered valid? Is it from that command to the end of the file? Or can you mark an "ending"? E.g., could you use \ProvidesExpl... ... \ExplSyntaxOff?
That is: With \ProvidesExpl..., how do I know what is the region of the file in which the expl3 syntax is valid?
Reg. g:tex_flavor, I'd prefer not to mix things with this now. I'm focussing on the .tex extension, and support for other types is not a priority.
Also: am I right that expl3 syntax is always preceded by a \RequirePackage{expl3} or \usepackage{expl3} for any .tex and .sty file?
> That is: With \ProvidesExpl..., how do I know what is the region of the file in which the expl3 syntax is valid?
I am quite sure that yes, it runs until the end of the current file. @Skillmon, can you confirm?
> Also: am I right that expl3 syntax is always preceded by a \RequirePackage{expl3} or \usepackage{expl3} for any .tex and .sty file?
Well, right now it is like that, for backward compatibility with LaTeX kernels that have no expl3 preloaded, but in principle it has not been strictly needed since 2020. The xparse package can also trigger the loading of expl3, so I wouldn't assume it.
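To illustrate why a highlighter cannot key on \usepackage{expl3}, here is a minimal sketch of a document that uses expl3 syntax without loading anything. It assumes a LaTeX kernel from February 2020 or later; the names \demo_double:n and \demodouble are made up for this example.

```latex
% Assumes a kernel with expl3 preloaded (2020-02 or later).
\documentclass{article}
% No \usepackage{expl3} and no \RequirePackage{expl3} anywhere.
\ExplSyntaxOn
% Hypothetical function: outputs its argument twice.
\cs_new:Npn \demo_double:n #1 { #1 #1 }
% Give it a document-level name without "_" or ":".
\cs_new_eq:NN \demodouble \demo_double:n
\ExplSyntaxOff
\begin{document}
\demodouble{ab}
\end{document}
```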
Like @Rmano said, you can't rely on expl3 being explicitly loaded for \ExplSyntaxOn or \ProvidesExpl... to be used.
\ProvidesExpl... will turn on the expl3 syntax after its four arguments are read in (so, no expl3 syntax inside of the arguments), and this can either be ended by \ExplSyntaxOff or by the end of the file (both are possible). You should really just treat it like \ExplSyntaxOn.
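A minimal sketch of such a file, to make the region boundaries concrete. The package name mypkg and all \mypkg... names are hypothetical; only the expl3 functions themselves are real.

```latex
% mypkg.sty (hypothetical package, for illustration only)
\RequirePackage{expl3} % needed only on kernels without preloaded expl3
% The four arguments are still read with ordinary LaTeX2e syntax:
\ProvidesExplPackage {mypkg} {2020/11/01} {0.1} {A toy expl3 package}
% From here on expl3 syntax is active: "_" and ":" are letters,
% spaces are ignored and "~" stands for a space.
\tl_new:N \l_mypkg_greeting_tl
\tl_set:Nn \l_mypkg_greeting_tl { Hello,~world! }
\cs_new:Npn \mypkg_greet: { \tl_use:N \l_mypkg_greeting_tl }
\cs_new_eq:NN \mypkggreet \mypkg_greet:
\ExplSyntaxOff
% Ordinary syntax again. Without the \ExplSyntaxOff above, the expl3
% region would simply run to the end of the file.
\newcommand*\mypkgversion{0.1}
\endinput
```

For a highlighter this suggests a region that opens after the fourth argument of \ProvidesExplPackage and closes at \ExplSyntaxOff or at the end of the file, whichever comes first.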
@Skillmon To treat it like \ExplSyntaxOn implies that I need to know _where_ the region ends, which is why I'm asking these followup questions. The implementation of support for expl3 relies on the syntax region elements (see :help syn-region), which may contain other elements. This means nested syntax elements. When I say "top level" syntax, I mean syntax that is not necessarily contained inside these syntax regions. This is why my question is important.
So, if I understand correctly, I can add a syntax region for the \ProvidesExpl... commands that starts with the regular expression pattern \\ProvidesExpl\w* (or similar), and ends with either \\ExplSyntaxOff or at end of file (regular expression atom \%$ in Vim).
@Rmano Ok, then I'll not move this stuff into a package specific "optional" syntax addon, and instead keep it in the core elements.
So, it seems that the main thing missing now is to support the \ProvidesExpl... stuff; or did I miss anything?
I understand that your question is important, and sorry if my answers are not really helpful :)
There are a few files which might not contain \ExplSyntaxOn or \ProvidesExpl... but still be entirely in expl3 syntax, that's why I said that there should be another way to toggle this. Those files are mostly files which are contained in the LaTeX kernel/in expl3, so only few users will ever edit those, but it is possible.
So, there are 2 cases for the end of expl3 syntax:
the region ends at \ExplSyntaxOff
until the end of the file
And 3 cases for the start of expl3 syntax:
\ExplSyntaxOn
\ProvidesExpl... (after the next four arguments)
the file is in expl3 syntax from the very beginning (but this could still be ended by \ExplSyntaxOff)
Does this clear things up?
Of course there could be pitfalls, for example, in the following, expl3 syntax is only active for a single line, even though there is no \ExplSyntaxOff:
\begingroup \ExplSyntaxOn \cs_new_eq:NN \tlifemptyTF \tl_if_empty:nTF \endgroup
Parsing TeX without using TeX is quite hard... So I guess your syntax highlighter can be forgiven if it doesn't understand that \ExplSyntaxOn is used only within the scope of a local group.
Ok, thanks. I think I've got it now, and I think the current support is at least an OK beginning. It is not perfect yet, but I'll prioritize getting the new syntax feature to a mature level first before improving this further. In particular, I only added support for regions that start with the commands mentioned above, and I do not support the single line stuff yet. To do that I need to understand the TeX syntax better.
The single line syntax was just an example, I don't think any package really uses something like that on a single line, but the \endgroup could be further down, and the \begingroup doesn't have to be on the same line as \ExplSyntaxOn. Also instead of \begingroup...\endgroup one could also use \bgroup...\egroup or {...}, or anything else that ends a group. TeX is really impossible to correctly parse with a VIM syntax highlighter, so maybe don't even try to get everything correct.
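A sketch of the multi-line variant described above, assuming a kernel with expl3 preloaded. The region ends at \endgroup even though no \ExplSyntaxOff appears, which is exactly the case a region-based highlighter will not detect.

```latex
% Assumes a kernel with expl3 preloaded (2020-02 or later).
\documentclass{article}
\begin{document}
\begingroup
  \ExplSyntaxOn
  % expl3 syntax is active only inside this group.
  % \cs_new_eq:NN defines globally, so the copy survives the group.
  \cs_new_eq:NN \tlifemptyTF \tl_if_empty:nTF
\endgroup
% The catcode changes made by \ExplSyntaxOn were local to the group,
% so "_" is the ordinary subscript character again.
$a_1$
% The empty first argument selects the true branch.
\tlifemptyTF{}{empty}{not empty}
\end{document}
```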
> TeX is really impossible to correctly parse with a VIM syntax highlighter, so maybe don't even try to get everything correct.
Exactly this. I want to get _enough_ things correct for the user experience to be good.
By the way, it seems you're knowledgeable; what is the difference between \begingroup, \bgroup, and {?
TeX has the two category codes Beginning of group (1) and End of group (2). By default { has category code 1, so it is the beginning of a group, and } has category code 2, so it is the end of a group. The result is that {...} forms a group, but also forms a single argument when passed to a macro (in which case the braces are stripped).
\bgroup and \egroup are tokens which have the same meaning as { and } (they are created using \let\bgroup{ and \let\egroup}). The result is somewhat strange. They form a group just like {...}, but they can't form a single argument for a macro, so a macro taking one parameter would only absorb \bgroup. But they can be used to delimit arguments to TeX primitives like \hbox, so \hbox\bgroup abc\egroup is valid.
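A small sketch of the \bgroup/\egroup behaviour just described. The helper \showarg is hypothetical and only there to make the argument grabbing visible.

```latex
\documentclass{article}
% Hypothetical helper: wraps its single argument in brackets.
\newcommand*\showarg[1]{[#1]}
\begin{document}
% Explicit braces delimit the argument and are stripped.
\showarg{abc}

% Implicit braces are not argument delimiters: \showarg grabs only the
% token \bgroup, so this expands to [\bgroup]abc\egroup.
\showarg\bgroup abc\egroup

% They still open and close an ordinary group ...
\bgroup \itshape italic only here\egroup

% ... and can delimit the contents of a box primitive.
\hbox\bgroup abc\egroup
\end{document}
```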
Groups opened by { can be closed by \egroup (not as arguments, though) and vice versa. Also you could assign any other character to either category 1 or 2, so after \catcode`\(=1 you could also use ( instead of {.
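The category code remark can be sketched as follows. The reassignment is kept inside a group so that it stays local; \fbox is used only because it takes a single argument.

```latex
\documentclass{article}
\begin{document}
\begingroup
  % Locally give ( and ) the category codes of { and }.
  \catcode`\(=1 \catcode`\)=2
  % The parentheses now delimit the macro argument.
  \fbox(boxed via parentheses)
\endgroup
% Outside the group ( and ) are ordinary characters again.
\fbox{(boxed via braces)}
\end{document}
```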
\begingroup and \endgroup are special primitives. They form a group, and a group opened by \begingroup has to be closed by \endgroup. They can't delimit arguments, neither to macros nor to primitives like \hbox. Of course you could assign other names to those as well using \let, e.g., in expl3 you have \group_begin: which is \let\group_begin:\begingroup. Then a group opened by \group_begin: could be closed by \endgroup, because \group_begin: is the same as \begingroup.
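A short sketch of the last point, assuming a kernel with expl3 preloaded: a group opened by the expl3 name is closed by the primitive name.

```latex
% Assumes a kernel with expl3 preloaded (2020-02 or later).
\documentclass{article}
\begin{document}
% A \begingroup group must be closed by \endgroup.
\begingroup \bfseries bold only here\endgroup

\ExplSyntaxOn
% \group_begin: is a \let copy of \begingroup, so \endgroup closes it too.
% ("~" is a space in expl3 syntax, where ordinary spaces are ignored.)
\group_begin: \itshape italic~only~here \endgroup
\ExplSyntaxOff
\end{document}
```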
Thanks for the explanation! It makes some sense, and it seems to me that the current behaviour where we "ignore" the \bgroup and \begingroup variations and just match them as texCmd is a reasonable simplification.