Pandoc: Support --defaults=URL to load default options via HTTP(s) locations

Created on 19 Dec 2019 · 5 Comments · Source: jgm/pandoc

For @manubot, we're interested in moving our pandoc configuration to defaults files using the new --defaults option. It would be helpful for us to specify defaults files via URLs (like this). This way, users could get customized pandoc behavior without having extra files on their computer.

With pandoc 2.9, if I run the following:

pandoc --verbose \
  --defaults=https://github.com/manubot/rootstock/raw/6486c3be806f34cb5944a32bdaf28815cee8f462/build/pandoc-defaults.yaml \
  --to=html5 \
  --output=output/manuscript.html

I get the following error, which includes "does not exist (No such file or directory)":

pandoc: https://github.com/manubot/rootstock/raw/6486c3be806f34cb5944a32bdaf28815cee8f462/build/pandoc-defaults.yaml: openBinaryFile: does not exist (No such file or directory)
Labels: out of scope?, more-discussion-needed

All 5 comments

This would involve changes to applyDefaults in Text.Pandoc.App.CommandLineOptions.
In principle it should be straightforward to check for a URL and fetch it. One might want to think about security implications.

> One might want to think about security implications.

I'm not a security expert by any means. Would the security implications fall under any of the following categories:

  1. Exposing document content to the --defaults file author (anything that Pandoc is processing)
  2. Exposing arbitrary files on the host system
  3. Allowing installation of programs on the host system?

Are the implications similar to those from --include-in-header=URL as per https://github.com/jgm/pandoc/issues/5248, or worse?

This might cause the execution of arbitrary code, i.e., option 3.

Consider this scenario: an attacker tricks the user into running pandoc with an attacker-controlled defaults file and an attacker-controlled input file, both of which could be URLs. The attacker can then set the output file to ${HOME}/.profile, or any other file that is frequently parsed during program start-up. The next time the user opens a shell, the attacker gains full control of the system. All the user would notice is that their command didn't produce any output. The only difficult part is guessing the home path, but, since relative paths work too, ../../.profile would be an excellent guess.

Even simpler, the attacker could specify pdf-engine-opts: '--shell-escape' and then run any command via LaTeX. The attacker wouldn't even have to control the input file for this to work, as variables and metadata can be controlled via the defaults file.

Also note that any program can be specified in filters, including, e.g., rm. But it's not possible to control the arguments passed to the program, so it's a little more difficult to exploit.
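
To make the three vectors above concrete, here is a rough sketch of a pre-flight check that flags them in a downloaded defaults file. The function name risky_defaults_keys is hypothetical, the key list covers only the options named in this comment, and the scan is a crude line-based grep rather than a YAML parser, so quoted keys or flow-style syntax would slip past it. Treat it as an illustration of what to look for, not as a security boundary.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pandoc): print every line of a defaults
# file that sets one of the options called out as dangerous in this thread.
# The key list is illustrative and NOT exhaustive.
risky_defaults_keys() {
  # Exit status 0 means at least one flagged key was found, 1 means none.
  grep -nE '^[[:space:]]*(output-file|filters|pdf-engine|pdf-engine-opts?)[[:space:]]*:' "$1"
}

# Usage (assumes defaults.yaml was fetched from somewhere untrusted):
#   if risky_defaults_keys defaults.yaml; then
#     echo "refusing to use defaults.yaml without review" >&2
#   fi
```

A file that only sets something like the output format passes silently, while one that redirects the output file or names a filter gets its offending lines printed with line numbers for review.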

Defaults files make managing options easier than on the CLI, but the feature is not a full substitute for scripting. Does support for an internal HTTP operation have greater value than relying on shell features?

$ pandoc --defaults=<(curl https://some.url/)

For this approach to work, the current behavior of adding a .yaml extension to file names would need to be changed. Also, adding support for defaults files from standard input would eliminate the need to use process substitution:

$ curl https://some.url/ | pandoc -d -
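
Until either of those changes exists, a wrapper along the following lines might serve as a stopgap. It is only a sketch: pandoc_remote_defaults is a made-up helper name, and it assumes curl is available and that pandoc accepts a defaults path that already ends in .yaml. It downloads the file to a temporary path with that extension and passes the path to --defaults.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not a pandoc feature): fetch a defaults file over
# HTTP(S) into a temporary *.yaml file, then run pandoc with it.
pandoc_remote_defaults() {
  local url=$1
  shift
  local dir
  dir=$(mktemp -d) || return 1
  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
  if ! curl -fsSL -o "$dir/defaults.yaml" "$url"; then
    rm -rf "$dir"
    return 1
  fi
  # Remaining arguments are passed through to pandoc unchanged.
  pandoc --defaults="$dir/defaults.yaml" "$@"
  local status=$?
  rm -rf "$dir"
  return "$status"
}

# Usage:
#   pandoc_remote_defaults https://some.url/ --to=html5 --output=output/manuscript.html
```

Note that this keeps every risk described earlier in the thread: whatever the remote file sets for output paths, filters, or PDF engine options is applied as-is.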

It seems that the security risks are too grave. Closing.
