Pandoc: Relative images are relative to working directory, not file

Created on 21 Jun 2017 · 23 Comments · Source: jgm/pandoc

I'm using pandoc 1.19.2.1.

I have a setup like:

  • src
    • 001
      • 001.md
      • 001.jpg
    • 002
      • 002.md
      • 002.jpg

In the .md files I'm trying to include the images via relative links like ![](001.jpg). However, when I do this and build from the top level (passing all the .md files as arguments), Pandoc cannot find the images. Instead I must give each path relative to the working directory (so src/001/001.jpg), which is a bit clunky.

So: a) is there any way to get my desired behaviour, and b) is the current behaviour intended? I would have expected paths to be relative to the file that they appear in.

Thanks!

more-discussion-needed

All 23 comments

Yes, current behavior is intended. Pandoc just acts on a stream of text that may come from files (possibly several files in different directories) or from stdin; it doesn't keep track of what directory the text came from. Thus pandoc foo/bar.txt is equivalent to cat foo/bar.txt | pandoc.

See #852, which added a --resource-path command line option. This should help in your case, though it's not released. (You could try compiling from source or using pandoc-nightly, but be aware there are significant changes in 2.0.) You may have problems with this, though, if you have files with the same name in different directories.
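The caveat about same-named files can be checked before building. The following is a minimal sketch, not part of pandoc: the helper name `name_collisions` is made up for illustration, and it only assumes that a search path looks files up by name, so a name that exists in several directories is ambiguous.

```python
import os
from collections import defaultdict


def name_collisions(paths):
    """Map each file name that occurs in more than one directory to those directories."""
    dirs_by_name = defaultdict(list)
    for path in paths:
        directory, name = os.path.split(path)
        if directory not in dirs_by_name[name]:
            dirs_by_name[name].append(directory)
    # Keep only the names that a by-name lookup could not resolve uniquely.
    return {name: dirs for name, dirs in dirs_by_name.items() if len(dirs) > 1}
```

With the layout from the question (001.jpg and 002.jpg) the result is empty, so a resource path covering both directories would be unambiguous there.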

I can think of a somewhat complex way in which this might be improved. Not sure it's worth it, though. Instead of having the reader take a Text as argument, we could have it take something like a list of pairs of filenames and Texts:

data Source = Stdin | File FilePath | Url String
newtype Sources = Sources [(Source, Text)]
readMarkdown :: PandocMonad m => ReaderOptions -> Sources -> m Pandoc

Most of the readers use parsec parsers; we could define a custom Stream instance for Sources by defining

uncons :: s -> m (Maybe (t, s))

We could store the name of the current source file in the "common state" of the pandoc monad. The readers could then check for the image file, first in the local directory and then in the working directory or resource path, and adjust the path accordingly. (Note: we don't normally do anything like this until the writers.)
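The lookup order described here (the source file's own directory first, then the working directory or resource path) can be sketched outside pandoc. This is an illustration in Python rather than the Haskell a reader would actually use; `candidate_paths`, `adjust_path`, and the injectable `exists` check are hypothetical names, and `None` is used as a stand-in for input from stdin.

```python
import os


def candidate_paths(image, current_source, resource_path=(".",)):
    """List the places to try for `image`, source file's directory first."""
    candidates = []
    if current_source is not None:  # None stands for stdin: no local directory
        local_dir = os.path.dirname(current_source) or "."
        candidates.append(os.path.join(local_dir, image))
    for directory in resource_path:
        candidate = os.path.join(directory, image)
        if candidate not in candidates:
            candidates.append(candidate)
    return candidates


def adjust_path(image, current_source, resource_path=(".",), exists=os.path.isfile):
    """Return the first candidate that exists, or the original path unchanged."""
    for candidate in candidate_paths(image, current_source, resource_path):
        if exists(candidate):
            return os.path.normpath(candidate)
    return image
```

The sketch makes the cost visible: every image lookup needs `current_source`, which is exactly the information a plain text stream does not carry.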

This would help with your use case, at the expense of making the pandoc API considerably more complicated. Not sure it's worth it.

@mb21 @jkr I'd be curious if you have any thoughts about this. @jkr, can --file-scope help with this? Perhaps when --file-scope is used we could automatically add the input file's path to the resource path. But this wouldn't really help, since --file-scope only affects parsing, and currently we don't load resources until the writing phase.

That's a hard one. I've run into this (kind of unexpected) behaviour myself. Then again, it's really nice to have pandoc behave consistently whether input is piped in or read from a file. With that in mind, I don't think making those intrusive and complicating changes is worth it.

The unfortunate thing is that this is inconsistent with the way GitHub processes Markdown. So when I make my paths relative to the working directory the images render as broken in the online view.

Maybe I'll just write a script to pre-process all my files into another directory before running pandoc...
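Such a pre-processing step could look roughly like the sketch below. It is an assumption-laden illustration, not a tested tool: `rewrite_image_paths` is a made-up name, and the regular expression only covers inline images of the form ![alt](dest "title"), not reference-style images or destinations containing spaces or parentheses.

```python
import posixpath
import re

# Inline images only: ![alt](dest "optional title").
IMAGE = re.compile(r'!\[([^\]]*)\]\(([^)\s]+)([^)]*)\)')


def rewrite_image_paths(markdown_text, source_dir):
    """Prefix relative image destinations with the directory of their source file."""
    def fix(match):
        alt, dest, rest = match.groups()
        # Leave URLs (anything with a scheme) and absolute paths untouched.
        if re.match(r'[A-Za-z][A-Za-z0-9+.-]*:', dest) or dest.startswith('/'):
            return match.group(0)
        new_dest = posixpath.normpath(posixpath.join(source_dir, dest))
        return '![{}]({}{})'.format(alt, new_dest, rest)
    return IMAGE.sub(fix, markdown_text)
```

Running this over a copy of each file, with `source_dir` set to that file's directory (for example src/001), leaves the originals with file-relative paths that GitHub renders, while the copies passed to pandoc carry paths relative to the working directory.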

Another thought: we could abstract file handling away from the readers, so that they only get a mediabag or similar interface for querying files. That interface could be instantiated with files from either the working directory or the current source file's directory, depending on a command-line setting, for example.

@mb21 Any change that would search for images in the working directory of the source file (when multiple files are specified on the command line) would have to keep track of which source file includes the given image, so we'd need the more complex interface I sketched above.

I did have one thought for a more limited change: perhaps we could automatically set the resource path for images to include the directory of the first file argument. When --file-scope is used, we could set this for each file argument. I think that would work for this use case.
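Until something like this is built in, the same effect can be approximated from the outside by computing the resource path from the file arguments. The sketch below assumes a pandoc version that has --resource-path; `resource_path_for` and `pandoc_command` are hypothetical wrapper helpers, and `first_only` mirrors the "first file argument" variant of the proposal.

```python
import os


def resource_path_for(inputs, first_only=False):
    """Build a --resource-path value from the directories of the input files."""
    files = inputs[:1] if first_only else inputs
    dirs = ["."]  # listed explicitly so the working directory is still searched
    for path in files:
        directory = os.path.dirname(path) or "."
        if directory not in dirs:
            dirs.append(directory)
    # os.pathsep is ":" on Unix-like systems and ";" on Windows.
    return os.pathsep.join(dirs)


def pandoc_command(inputs, output, first_only=False):
    """Return an argument list for subprocess.run; this does not run pandoc."""
    option = "--resource-path=" + resource_path_for(inputs, first_only)
    return ["pandoc", option, "-o", output] + list(inputs)
```

The caveat from earlier in the thread still applies: with several directories on the path, two images sharing a name cannot both be found this way.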

@mb21 Isn't this (specifying the image path via a command-line option) also helpful for multi-target publishing scenarios (HTML vs. PDF), where you need images in different resolutions, which could be accomplished by switching folders? I find using --default-image-extension= for managing image resolutions somewhat cumbersome.
