Gitea: External markup renderer [$15]

Created on 11 Dec 2016 · 22 Comments · Source: go-gitea/gitea

See gogits/gogs#211 and gogits/gogs#2097.

Most Python-related projects use reStructuredText (.rst) rather than Markdown for README.rst and documentation. Sometimes other markup languages are used, or Markdown with extensions that Gitea does not currently support. GitHub supports the following markups: https://github.com/github/markup/.

I propose the following feature:

[markup]
; List of file extensions that should be rendered by an external command
FILE_EXTENSIONS = .rst,.rest,.restx
; External command to render all matching extensions
RENDER_COMMAND = "rst2html.py --no-raw"

If, for example, the file README.rst exists, its first 1024 bytes will be passed into stdin of rst2html.py --no-raw and stdout will be displayed as html for the file preview.

An optional %s will be replaced by the matched file extension (so that one can write a script that handles both .rst and .asciidoc).
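The proposed behavior can be sketched in a few lines of Python. This is only an illustration of the proposal, not Gitea code: the names `FILE_EXTENSIONS`, `PREVIEW_BYTES`, `build_command` and `render_preview` are made up for this example, and the 1024-byte limit is taken from the description above.

```python
import shlex
import subprocess
from pathlib import PurePosixPath
from typing import List, Optional

# Hypothetical values mirroring the proposed [markup] section.
FILE_EXTENSIONS = (".rst", ".rest", ".restx")
PREVIEW_BYTES = 1024  # the proposal passes only the first 1024 bytes


def build_command(render_command: str, extension: str) -> List[str]:
    """Split the configured command and expand an optional %s to the extension."""
    return [arg.replace("%s", extension) for arg in shlex.split(render_command)]


def render_preview(filename: str, content: bytes, render_command: str) -> Optional[bytes]:
    """Return the renderer's stdout, or None if the extension is not configured."""
    extension = PurePosixPath(filename).suffix.lower()
    if extension not in FILE_EXTENSIONS:
        return None
    result = subprocess.run(
        build_command(render_command, extension),
        input=content[:PREVIEW_BYTES],  # fed to the renderer's stdin
        stdout=subprocess.PIPE,
        check=True,   # raise if the renderer exits with an error
        timeout=10,   # arbitrary safety limit for this sketch
    )
    return result.stdout
```

With docutils installed, `render_preview("README.rst", data, "rst2html.py --no-raw")` would return the HTML that the file preview displays.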

As such a feature would be important to me, I am willing to implement this proposal, assuming there is a good chance it will be merged into master. Let me know if I can start working on it.


There is a $15 open bounty on this issue. Add to the bounty at Bountysource.

bounty kind/feature

All 22 comments

Depending on external tools for rendering content is not really our goal

The default can be empty; then it's not a dependency and more like a plugin interface.

We could find a go lib to do that

It seems there isn't any mature Golang library to do that. So maybe add a config option like @plemp said, but keep it disabled by default for now.

IMO adding external renderers isn't a bad idea; it could be done with a simple list.
Have a simple schema to identify the arguments, basically injecting .InFile and .OutFile.

Example:

EXTERNAL_RENDER = .rst:/usr/bin/rst2html5 {{ .InFile }} > {{ .OutFile }}

given

renderParam := struct {
  InFile  string
  OutFile string
}{"foo.rst", "/tmp/gitea-1337/foo.html"}

would generate /usr/bin/rst2html5 foo.rst > /tmp/gitea-1337/foo.html

I would prefer to integrate a real plugin system instead of increasing the current config more and more

Well, I guess I'm just here to vote for .asciidoc: the rich format and the features it offers have made it a de facto tool in our product chain.

@tboerger maybe an external renderer is the better choice, because asciidoc is an extensible format. The renderer itself should have a plugin system.

I recently did a quick-and-dirty hack on Gogs to add some basic .rst support for our own use, but maybe you will also find it useful:

https://github.com/AlphaGriffin/gogs

@lannocc great! I will try to merge it for v1.3

After #2525 is merged, I will try to add reStructuredText support based on github.com/hhatto/gorst and https://github.com/AlphaGriffin/gogs. I think that should be easy, since a new document type only needs to implement markup.Parser.

~And for asciidoc, since github.com/VonC/asciidocgo could be imported.~

@plemp, it seems there is no Golang lib that handles asciidoc well. I don't know how to use github.com/VonC/asciidocgo.

Any update on this one?
@lunny https://godoc.org/github.com/VonC/asciidocgo documentation looks good to me

@lenisko #2570 will support external render commands.

@lunny: Bountysource keeps telling me "Your application has been suspended" and throws various {"error":"Internal server error."} responses, keeping me from releasing the bounty. The mails I got from them only say that if I "don't respond" you will get the bounty paid out on 2017-11-21. Sorry for the delay!
Could you check back on that date and, please, get in touch with me[1] should Bountysource fail to pay the bounty?
Thank you - and all contributors - for your continuous work on Gitea!


[1] via GitHub or [email protected]

@tantegerda1 OK. Thanks for your bounty.

Getting this working was as simple as

  1. Installing docutils (didn't even need root for that!)
  2. Ensuring PATH contains $HOME/.local/bin
  3. Appending a small snippet to app.ini:

[markup.restructuredtext]
ENABLED = true
FILE_EXTENSIONS = .rst
RENDER_COMMAND = rst2html.py
IS_INPUT_FILE = false

  4. Restart the server 👍

…and DONE! Readme.rst renders just as prettily on my selfhost as on GitHub! 🎉

However, what I am wondering is: isn't this a security nightmare*? How quick+easy would it be for some sufficiently clever person to monkey into existence a Readme.rst that, when uploaded, would allow _doing things not otherwise doable_ on the server?

*—in a situation involving potentially untrusted users having write access to repositories

Although, a small P.S. to the previous comment:

  1. Installing…
  2. Ensuring…
  3. Adding a small snippet…
  4. Restart…!

Is the above explicitly documented _anywhere_ beyond thread #2570 in GitHub's walled-garden forum, associated with (but not documented in) commit 62d0a4d8829f214af4f9c00ecf8a81907d86ef06 ?

@JamesTheAwesomeDude it is definitely a security problem in that situation. I wouldn't say it is a complete nightmare, however, because you can write a wrapper for your external renderer that takes one or more steps to isolate it from your server environment, e.g.:

  • Run it in a firejail
  • Run it in an ephemeral KVM VM
  • Run it on another computer entirely

You're still left with the problem of the returned html being untrusted, but you can at least prevent attacks on the external renderer(s) from being able to perform arbitrary actions on the server.
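As a rough illustration of the wrapper idea, here is a minimal Python sketch that could be assigned to RENDER_COMMAND. It is not a real sandbox on its own: it only strips the environment, runs the renderer in an empty scratch directory and caps CPU and wall-clock time. The names `render_isolated`, `build_argv` and `SANDBOX_PREFIX` are invented for this example, and any firejail flags you put into `SANDBOX_PREFIX` should be checked against the firejail man page first.

```python
import os
import resource  # Unix-only
import subprocess
import sys
import tempfile

RENDERER = ["rst2html.py", "--no-raw"]  # renderer from this thread
SANDBOX_PREFIX = []  # assumption: e.g. ["firejail", "--quiet", "--net=none"] if installed
CPU_SECONDS = 5      # arbitrary limits chosen for this sketch
WALL_SECONDS = 10


def build_argv(renderer, sandbox_prefix):
    """Prepend the optional sandbox launcher to the renderer command."""
    return list(sandbox_prefix) + list(renderer)


def _limit_cpu():
    """Runs in the child before exec: lower the soft CPU-time limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    limits = [CPU_SECONDS] + [v for v in (soft, hard) if v != resource.RLIM_INFINITY]
    resource.setrlimit(resource.RLIMIT_CPU, (min(limits), hard))


def render_isolated(source, renderer=RENDERER, sandbox_prefix=SANDBOX_PREFIX):
    """Feed source bytes to the renderer's stdin and return its stdout."""
    with tempfile.TemporaryDirectory() as scratch:
        result = subprocess.run(
            build_argv(renderer, sandbox_prefix),
            input=source,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            cwd=scratch,  # empty working directory
            env={"PATH": os.environ.get("PATH", os.defpath)},  # drop everything but PATH
            preexec_fn=_limit_cpu,
            timeout=WALL_SECONDS,
            check=True,
        )
    return result.stdout


def main():
    """Entry point when used as RENDER_COMMAND with IS_INPUT_FILE = false."""
    sys.stdout.buffer.write(render_isolated(sys.stdin.buffer.read()))
```

A script used as RENDER_COMMAND would simply call `main()`. Running the renderer in a VM or on another machine would replace the `subprocess.run` call with whatever transport you trust.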

@Shados Just to expand on that a bit,

since rst2html.py seemingly only uses stdio, I guess it'd be pretty simple to set up something like the way FastCGI servers work: just assign the wrapper/offloader script to RENDER_COMMAND.

But I really am concerned at the scarcity of documentation on this; as I mentioned before, you've got to manually hunt down the information yourself to make this work. I might have to stick something in the Wiki, maybe...


Although, re:"the problem of the returned html being untrusted",

It does look like Gitea (and GitHub itself, for that matter) actually sanitizes it pretty violently... for instance,

.. note:: HTML is sanitized

renders to the browser merely to:

<div><p>Note</p><p>HTML is sanitized</p></div>

with absolutely no styling at all, not even an indication that the word "Note" is a header for the following content, no indent for the note itself....anything

(same with warning, etc)

@JamesTheAwesomeDude yes, output from markup modules (external renderers included) is passed through a sanitizer in an attempt at solving that part of the problem.

Of course, if you want the output to be useful, you would likely need to relax the sanitizer slightly to allow more class names (whatever ones your renderer is using), and then separately supply a manually-audited or created stylesheet that defines those classes.
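To make the "allow a few more class names" idea concrete, here is a small Python sketch of an allowlist filter. It is not Gitea's sanitizer and is not meant for production use. The tag set and the class names in `ALLOWED_CLASSES` are assumptions for illustration; the real names depend on what your renderer emits.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"div", "p", "em", "strong", "code", "pre", "ul", "ol", "li"}
ALLOWED_CLASSES = {"note", "warning", "admonition-title"}  # hypothetical class names
DROP_WITH_CONTENT = {"script", "style"}  # remove these tags and their text


class ClassAllowlistSanitizer(HTMLParser):
    """Keeps allowlisted tags, drops all attributes except allowlisted classes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in ALLOWED_TAGS:
            return
        classes = [
            name
            for key, value in attrs
            if key == "class" and value
            for name in value.split()
            if name in ALLOWED_CLASSES
        ]
        attr = ' class="{}"'.format(escape(" ".join(classes))) if classes else ""
        self.out.append("<{}{}>".format(tag, attr))

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif not self._skip_depth and tag in ALLOWED_TAGS:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data))  # text is always re-escaped


def sanitize(html_text):
    """Return html_text reduced to allowlisted tags and class names."""
    parser = ClassAllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The stylesheet that defines the allowed classes would still have to be written or audited by hand, as described above.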

You're still left with two possible avenues of attack, however:

  • Attacks on the sanitizer; it's running as part of the gitea server so a successful attack on it or its parser could potentially allow an attacker to access anything gitea can
  • Attacks on browsers viewing the sanitized html; I don't know what someone could possibly do with just HTML and a set of known-good classes, so maybe this is not worth worrying about