When using html-pipeline (both the latest released version and master), the generated CSS classes for syntax highlighting are different from those generated on github.com. Based on recent announcements, it would seem that this is because GitHub is internally no longer using Pygments but instead using Linguist + TextMate/Atom bundles for syntax highlighting.
Looking at the linguist code though, it's extremely unclear how to use Linguist to syntax highlight at all, as Language#lexer is never set, yet is checked by html-pipeline. Is the expectation to set Language#lexer somehow from my own code, using something other than Pygments? Or are there relevant Linguist changes that have not been released to this repo yet?
To recap, my goal is to syntax highlight markdown code which results in the same CSS classes as those on github.com :)
html-pipeline seems to be using an old version of Linguist, from when Pygments was still used. That said, I think html-pipeline is only used for markup files.
@bkeepers and @jch may know more about this.
Hi @suan,
Your best option right now might be to copy the SyntaxHighlightFilter, remove the Linguist dependency and only use Pygments. Here are CSS themes that will work with Pygments.
Sorry there is not a better option right now. We have high hopes to open source the rest of our new syntax highlighting code and stylesheets that we use on GitHub. I can't promise when that will happen, but I will keep pushing to make it happen.
Let me know if you have any questions or concerns.
Thanks @pchaigno and @bkeepers for your replies. I guess I should clarify a little. When using the html-pipeline SyntaxHighlightFilter to convert a (github-flavored) markdown file to HTML, the generated CSS classes themselves (_not_ the stylesheets) differ for Ruby code. As an example gotten from https://github.com/suan/github-flavored-markdown-test, the HTML for the ```ruby section as rendered on github.com is
<div class="highlight highlight-ruby"><pre><span class="pl-k">def</span> <span class="pl-en">this_is</span>
puts <span class="pl-s1"><span class="pl-pds">"</span>some <span class="pl-pse">#{</span><span class="pl-s2">colored</span><span class="pl-pse"><span class="pl-s2">}</span></span> ruby code with ruby syntax highlighting<span class="pl-pds">"</span></span>
<span class="pl-k">end</span></pre></div>
whereas from html-pipeline and Pygments this is rendered:
<div class="highlight highlight-ruby"><pre><span class="k">def</span> <span class="nf">this_is</span>
<span class="nb">puts</span> <span class="s2">"some </span><span class="si">#{</span><span class="n">colored</span><span class="si">}</span><span class="s2"> ruby code with ruby syntax highlighting"</span>
<span class="k">end</span>
</pre></div>
Apart from the pl- prefix (which I know how to achieve) the classes used to surround the tokens are different. In particular, the en, pds, and pse classes don't seem to be pygments classes at all? Would you know how I could achieve this behavior? My goal is to convert markdown files and have them render exactly as they would on github.com, and if I can't get even the classes to match it's going to be very difficult. Thanks.
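For reference, the prefixing step mentioned above could be sketched as follows, using only Ruby's standard library. `prefix_classes` is a hypothetical helper, not part of html-pipeline or Linguist, and it assumes simple Pygments-style `<span class="...">` markup like the output shown above. It only renames classes, so Pygments' `nf` becomes `pl-nf` rather than GitHub's `pl-en`; the token classification itself still differs.

```ruby
# Hypothetical helper: adds a prefix to every class on <span> elements.
# Regex-based, so it assumes simple, well-formed Pygments-style markup.
def prefix_classes(html, prefix = "pl-")
  html.gsub(/<span class="([^"]+)">/) do
    # Prefix each whitespace-separated class name inside the attribute.
    classes = Regexp.last_match(1).split.map { |c| "#{prefix}#{c}" }
    %(<span class="#{classes.join(' ')}">)
  end
end

pygments_html = '<span class="k">def</span> <span class="nf">this_is</span>'
puts prefix_classes(pygments_html)
# prints: <span class="pl-k">def</span> <span class="pl-nf">this_is</span>
```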
What about using our Markdown API? That would give you markup identical to what we render on GitHub.com, and you wouldn't have to worry about keeping it up to date as we make improvements to our markup rendering.
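A minimal sketch of calling the Markdown API from Ruby, assuming the `POST /markdown` endpoint on api.github.com with a JSON body containing `text` and `mode`. The helper names `build_markdown_request` and `render_markdown` are hypothetical, and the request is unauthenticated, so it is subject to GitHub's rate limits.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed endpoint for GitHub's Markdown rendering API.
MARKDOWN_API = URI("https://api.github.com/markdown")

# Hypothetical helper: builds (but does not send) the render request.
def build_markdown_request(text, mode: "gfm")
  request = Net::HTTP::Post.new(MARKDOWN_API)
  request["Content-Type"] = "application/json"
  request["Accept"] = "application/vnd.github+json"
  request.body = JSON.generate(text: text, mode: mode)
  request
end

# Hypothetical helper: sends the request and returns the rendered HTML.
def render_markdown(text)
  response = Net::HTTP.start(MARKDOWN_API.host, MARKDOWN_API.port, use_ssl: true) do |http|
    http.request(build_markdown_request(text))
  end
  response.value # raises an HTTP error unless the status is 2xx
  response.body
end
```

Calling `render_markdown` with the contents of a Markdown file should return an HTML string. Each call is a network round trip, so this fits occasional rendering better than anything latency-sensitive.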
Never knew that was available! Unfortunately it wouldn't work for my case, as it's for a "realtime" markdown conversion tool (vim-instant-markdown, to be precise), which potentially performs multiple conversions a second.
@suan unfortunately there's not a great option to get identical syntax highlighting right now. I'll keep pushing to get the rest of our new syntax highlighting code open sourced.
@bkeepers Gotcha. Thanks anyways for your speedy help though!
Was this ever open sourced @bkeepers? Or is there any further progress on this?
Thanks!
Hey @marcamillion, thanks for the nudge. We had some movement on this recently. We're trying to simplify the licensing so it's easier for people to reuse, but ran into a few hurdles. We'll post some updates to this repo as soon as we're able to open source it.
Any update on the open sourcing of the new code, @bkeepers? :)
@bkeepers I've been trying to figure out how the hell Linguist does syntax highlighting, and of course the code isn't here 🤦🏻‍♂️ Would you accept a pull request telling others in the README that syntax highlighting isn't done by Linguist any more?
Alternatively releasing the source would be excellent!
@aphillipo Is this not sufficient?
I agree that the best would be to open source it :-)
Hi @pchaigno - that implies the set of grammars is used by Linguist to do the syntax highlighting, but that's not true, is it? I was looking for where Linguist loads these grammars and for the code that takes a file and turns it into HTML via the grammars. Until I found this issue, I assumed I was simply unable to follow the code, rather than that the components I mention don't exist.
@aphillipo The set of grammars is used by github.com to do the syntax highlighting.
They are included as submodules so that Linguist can perform some checks (correct license, compatible format, etc.) and maintain the language to grammar associations.
@bkeepers What about CSS styling for the output of the GitHub Markdown API?
@sarbull if you hadn't found it already, there's the syntax theme generator and the light and dark generated themes. Also, any further work on open sourcing linguist @bkeepers?
@bkeepers I know this issue is a bit dated now, but have you ever open-sourced GitHub's syntax highlighting CSS?
@nicholasmccullum yup, it was open sourced quite a few years ago when Primer was announced. The specific repo is https://github.com/primer/github-syntax-light. There's also an Atom theme at https://github.com/primer/github-atom-light-syntax
@lildude That's dope, thanks!
@lildude Any idea if there's a way to easily port that repo into a Rouge theme?
Why not use the one already included in Rouge? https://github.com/rouge-ruby/rouge/blob/master/lib/rouge/themes/github.rb
@lildude Turns out I was wrong and need a Pygments theme, not a Rouge theme. I found this repository of Pygments themes, but none of them match GitHub's.
I don't know of any off the top of my head as I've not used Pygments in a long time.