Rust: Alternative highlighting for Rust code block in rustdoc

Created on 10 Nov 2020 · 22 Comments · Source: rust-lang/rust

Context https://twitter.com/imperioworld_/status/1325916530257833984 cc @GuillaumeGomez

Description

I'm the author of https://github.com/Hywan/inline-c-rs/. It allows writing C code inside Rust. It's a super basic proc_macro that transforms the code into a string (with some extra manipulations), which is saved to a temporary file, compiled by a C/C++ compiler, and finally run. That's useful for testing the C API of a Rust program inside Rust with the Rust tooling (like cargo test). It is super useful for us at https://github.com/wasmerio/wasmer.
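A rough sketch of the flow described above, not the actual inline-c-rs implementation: the C source (already turned into a string) is written to a temporary file, compiled, and run. The helper name compile_and_run_c, the use of a cc executable on the PATH, and the file naming are all assumptions made for illustration.

```rust
use std::env;
use std::fs;
use std::io;
use std::process::{Command, ExitStatus};

/// Hypothetical helper: writes `source` to `<tmp>/<name>.c`, compiles it
/// with `cc` (assumed to be installed), then runs the resulting binary.
fn compile_and_run_c(source: &str, name: &str) -> io::Result<ExitStatus> {
    let dir = env::temp_dir();
    let source_path = dir.join(format!("{name}.c"));
    let binary_path = dir.join(name);

    // Step 1: save the C code to a temporary file.
    fs::write(&source_path, source)?;

    // Step 2: compile it with the system C compiler.
    let compile_status = Command::new("cc")
        .arg(&source_path)
        .arg("-o")
        .arg(&binary_path)
        .status()?;
    if !compile_status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "C compilation failed"));
    }

    // Step 3: run the compiled program and report its exit status.
    Command::new(&binary_path).status()
}
```

On a system with a working C compiler, calling compile_and_run_c("int main(void) { return 0; }", "demo") should yield a successful exit status, which is roughly what a .success() assertion checks.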

Recently, we pushed the concept further: Using inline-c-rs inside documentation, like this:

/// Blah blah blah.
///
/// # Examples
///
/// ```rust
/// # use inline_c::assert_c;
/// # fn main() {
/// #    (assert_c! {
/// # #include "tests/wasmer_wasm.h"
/// #
/// int main() {
///     wasm_engine_t* engine = wasm_engine_new();
///     wasm_store_t* store = wasm_store_new(engine);
///
///     // etc.
///    
///     return 0;
/// }
/// #    })
/// #    .success();
/// # }
/// ```
#[no_mangle]
pub unsafe extern "C" fn wasm_module_new(/* skipped */) -> /* skipped */ {
    // skipped
}

This documentation is:

  • ✅ written in Rust, and contains C,
  • ✅ tested by cargo test --doc because the code block is tagged with ```rust … so we can test C with cargo test 🤪!
  • ✅ compiled to HTML by rustdoc.

The result of cargo doc looks like this:

[Screenshot of the rendered documentation, 2020-11-06]

That's excellent! All the “Rust decoration” is “removed” (thanks to # …), and only the C code stays. That's insane I know, but it works and that's super fun.

One problem though:

  • ❌ The highlighting is incorrect.

Because it's a ```rust code block, I assume there is a special handler for it. I tried to write rust,language-c, but the language-c part is ignored and is not present in the generated HTML code, as can be seen here:

<pre class="rust rust-example-rendered">

Expectation

I would expect ```rust to be a special keyword that unfolds to rust+language-rust:

  • The former marks the block as Rust code, and consequently as a candidate for a test,
  • The latter names the syntax highlighting ruleset to apply in the HTML documentation.

This keyword could have a different meaning in the presence of language-<lang_id> which “cancels” language-rust.

In other words, writing rust,language-c would keep the current behavior of rust but disable the default highlighting, allowing the user to define another one.
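The proposal can be sketched as a small function over the code block's info string. This is only an illustration: BlockInfo and interpret_info_string are hypothetical names, not rustdoc internals, and the sketch ignores everything real rustdoc also handles (tags such as ignore, or untagged blocks being treated as Rust).

```rust
/// Hypothetical summary of what a code block's info string means.
#[derive(Debug, PartialEq)]
struct BlockInfo {
    /// True when the block is a candidate for a doctest.
    is_rust_test: bool,
    /// Highlighting class to emit in the HTML, if any.
    highlight_class: Option<String>,
}

/// Interprets a comma-separated info string such as "rust,language-c".
fn interpret_info_string(info: &str) -> BlockInfo {
    let mut is_rust_test = false;
    let mut explicit_language: Option<String> = None;

    for tag in info.split(',').map(str::trim) {
        if tag == "rust" {
            is_rust_test = true;
        } else if tag.len() > "language-".len() && tag.starts_with("language-") {
            // An explicit language-<lang_id> "cancels" the default language-rust.
            explicit_language = Some(tag.to_string());
        }
    }

    // "rust" alone unfolds to rust + language-rust.
    let highlight_class = explicit_language
        .or_else(|| is_rust_test.then(|| "language-rust".to_string()));

    BlockInfo { is_rust_test, highlight_class }
}
```

With this sketch, "rust" stays testable and gets language-rust, while "rust,language-c" stays testable but gets language-c.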

How does it sound?

Motivation

The technique used by inline-c-rs can be ported to other languages. It's just super fun to see C code inside Rust documentation that is also tested by cargo test --doc. I'm sure this technique can be used by other languages in the future.

A-markdown-parsing C-feature-request T-rustdoc

All 22 comments

So I thought a lot about this after our discussion, and I reached the following conclusion: people might want to give special classes to some code blocks without following a pattern. Therefore, I had the following idea: adding a new code "tag" (like rust, ignore, etc) which would look like class-some-class-name and that would generate the block with it (but without the prepending class-). So it would give <code class="some-class-name">.

What do you think? Like that, you can support highlight.js using class-language-C or any other syntax highlighter (or for some other use as well).
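A minimal sketch of the tag-to-class mapping suggested above, assuming the tags are comma-separated. The function names custom_classes and open_code_tag are hypothetical, and the sketch does no HTML escaping of the class names.

```rust
/// Collects the class names from tags of the form `class-<name>`,
/// dropping the leading `class-` prefix.
fn custom_classes(info: &str) -> Vec<&str> {
    info.split(',')
        .map(str::trim)
        .filter_map(|tag| tag.strip_prefix("class-"))
        .filter(|name| !name.is_empty())
        .collect()
}

/// Builds the opening `<code>` tag for a block with the given info string.
fn open_code_tag(info: &str) -> String {
    let classes = custom_classes(info);
    if classes.is_empty() {
        "<code>".to_string()
    } else {
        format!("<code class=\"{}\">", classes.join(" "))
    }
}
```

For example, the info string rust,class-language-C would produce <code class="language-C">, which a highlighter such as highlight.js could then pick up.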

That's a good idea yes!

We should also disable the Rust “HTML highlighting” (lexing + creating the HTML spans around the tokens). Note that this last part is optional: if we have a @class attribute, we can run JavaScript to collect the text only, remove all the HTML tags, and do something with that.

As I understand the workflow, there is one step that removes the # <line>s, and one step that tokenizes and creates the HTML spans, is that correct? If so, the first step must be kept, and the second should be skipped.
How do we make that decision? I don't know for the moment. Maybe it's not a good idea to go down that path.

I just assumed that if the class-whatever tag was used, we were disabling default syntax highlighting, but you did well to bring it up, at least it makes it clear this way.

As for removing the # <line>s, we still want to keep that step (that is, those lines are still removed from the display).
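The step that stays in place can be sketched as follows. This is a simplified illustration rather than rustdoc's own code, and visible_lines is a hypothetical name: a line is hidden when, after leading whitespace, it is exactly # or starts with # followed by a space.

```rust
/// Returns the code as it would be displayed, with hidden `# ` lines removed.
fn visible_lines(code: &str) -> String {
    code.lines()
        .filter(|line| {
            let trimmed = line.trim_start();
            // Hidden: a lone `#`, or `# ` followed by anything.
            !(trimmed == "#" || trimmed.starts_with("# "))
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Note that a C preprocessor line such as #include <stdio.h> is not hidden under this rule because no space follows the #, whereas the # #include "tests/wasmer_wasm.h" line in the example from the issue is.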

As for the final decision, I guess we just need someone else from the @rust-lang/rustdoc team to approve what the final feature should look like. And at first, it'll only be available on nightly for a while, until we're sure the feature is good enough to be stabilized.

Of course! I'm fine with having it on nightly for a while. Feel free to ping me if you need any help.

Custom class names seem fine. I don't want this feature to get too complicated, though.

Instead of adding a new syntax, could we treat rust,c as 'rust for the purposes of tests, but with C highlighting'?

rust,c seems unclear to me. What about rust,syntax-c or something like that, to make it clear that it's just for syntax highlighting?

That's why I suggested what I did, which allows adding whatever class you want to the code block. That way, it's not necessarily about syntax highlighting but about whatever you want, and it's pretty simple for rustdoc to handle.

> Instead of adding a new syntax, could we treat rust,c as 'rust for the purposes of tests, but with C highlighting'?

I don't think this is a good idea because it'll mean adding support for C syntax. And no one wants that. Also, what happens if other people want more languages? I think the simplest way is to just allow generating classes and let people do whatever they want with them.

> it'll mean adding support for C syntax. And no one wants that.

I wouldn't say that :)

I'm sure there are a lot of people that would like to have C syntax highlighting in their docs for their FFI code.

I didn't mean it that way. More like "no one wants to add C syntax highlighting to rustdoc directly". Even I want C support. :p

> So I thought a lot about this after our discussion, and I reached the following conclusion: people might want to give special classes to some code blocks without following a pattern. Therefore, I had the following idea: adding a new code "tag" (like rust, ignore, etc) which would look like class-some-class-name and that would generate the block with it (but without the prepending class-). So it would give <code class="some-class-name">.

@GuillaumeGomez I like this idea as it's something that can have multiple uses. I was thinking we could try and do something slightly less magical than plain prefixes: we could have a colon syntax that would look like rust,class:clang and would also allow us to eventually add other things to the HTML code node (id, data- attrs, etc.) for further customization.

Removing the Rust syntax highlighting seems like it might be more challenging to do in a neat way, but maybe we can just have another modifier, like rust,plain,class:clang, for that.

Oh, I like the syntax with the ":"!

As for removing the rust syntax highlighting, I just assumed that if you used "class:*", we wouldn't run it.

Actually, using plain makes things less magical, which is good. It's a little bit more verbose though.

On the other hand, one may argue that every time a user adds a class, it's very likely in order to customize the CSS style, so… it may imply plain.

Anyway, if plain is not a feature, i.e. if class:<class-names> disables the Rust highlighting, a new feature must be provided to re-enable the Rust default highlighting in case that's not the expected behavior for the user.
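The two options under discussion can be compared in a short sketch. The Policy enum and uses_rust_highlighting function are hypothetical and exist only to make the trade-off concrete; neither describes what rustdoc does.

```rust
/// The two behaviors discussed in the thread.
enum Policy {
    /// Rust highlighting is disabled only by an explicit `plain` tag.
    ExplicitPlain,
    /// Any `class:<name>` tag disables Rust highlighting on its own.
    ClassImpliesPlain,
}

/// Decides whether the default Rust highlighting would run for a block.
fn uses_rust_highlighting(info: &str, policy: Policy) -> bool {
    let tags: Vec<&str> = info.split(',').map(str::trim).collect();
    let is_rust = tags.contains(&"rust");
    let has_plain = tags.contains(&"plain");
    let has_class = tags.iter().any(|tag| tag.starts_with("class:"));

    match policy {
        Policy::ExplicitPlain => is_rust && !has_plain,
        Policy::ClassImpliesPlain => is_rust && !has_class,
    }
}
```

Under ExplicitPlain, rust,class:clang keeps Rust highlighting and rust,plain,class:clang drops it. Under ClassImpliesPlain, rust,class:clang already drops it, and there is no tag in this sketch to turn it back on, which is the gap pointed out above.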

I wonder: in which cases would one want to add a custom class but still keep the Rust syntax highlighting?

I've no idea, but if there is one way to disable a feature, there should be a way to re-enable it.

Also, I like the <attr>:<value> notation, for instance to add line highlighting with prism.js, such as data-line:1-2 -> @data-line="1-2".

Be sure to allow : in the value part too.

Well, we don't split on ":" so the goal is to keep everything following "class:". As for re-enabling, I don't see the point for the moment so unless someone brings up a need for it (that's why we don't want to make it stable right away!), I think it'd be better not to add it.
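One way to keep everything after the first colon is sketched below, covering both class:clang and the data-line:1-2 idea. The names parse_attribute and render_attributes are hypothetical, and the sketch assumes tags are comma-separated (so values cannot contain commas) and does no HTML escaping.

```rust
/// Splits a tag at the first ':' only, so the value may itself contain ':'.
/// Returns None for tags without a colon or with an empty name or value.
fn parse_attribute(tag: &str) -> Option<(&str, &str)> {
    let (name, value) = tag.split_once(':')?;
    if name.is_empty() || value.is_empty() {
        return None;
    }
    Some((name, value))
}

/// Renders every `<attr>:<value>` tag of an info string as an HTML attribute.
fn render_attributes(info: &str) -> String {
    info.split(',')
        .map(str::trim)
        .filter_map(parse_attribute)
        .map(|(name, value)| format!(" {name}=\"{value}\""))
        .collect()
}
```

Because str::split_once stops at the first match, a tag like data-x:a:b yields the name data-x and the value a:b, while plain tags such as rust are skipped.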

Both options are good to me :-).

What about extending the idea of class:<class-names> to <attr-name>:<attr-value> in general?

Let's say I'm "open" to the idea. But as always, I need a case for it to be useful otherwise I think it's better to focus on one thing only. After all, it's only about code blocks.

I've presented a use case in https://github.com/rust-lang/rust/issues/78917#issuecomment-725401622 with prism.js.

However, I agree we should focus on one feature at a time.

It can be included as a class (or multiple ones hehe). 😛

It isn't the API of Prism.js, but let's keep class:<class-names> for the moment :-)! It'll be a really awesome new feature :-).

> Instead of adding a new syntax, could we treat rust,c as 'rust for the purposes of tests, but with C highlighting'?

There are tons of PLs out there, and we shouldn't lock ourselves out of future modifiers.

I think class:classname is good.
