Pandoc: Support SVG images in multiple formats

Created on 8 Dec 2014 · 28 Comments · Source: jgm/pandoc

I'm testing pandoc's markdown to pdf conversion using this readme and command ~/.cabal/bin/pandoc README.md -o test.pdf. However, I'm getting this error:

pandoc: Unable to convert image `/tmp/tex2pdf.26140/0056733ea8a4c9b7dab70a5b455f59437d34f134.svgz':
Cannot load file
Jpeg Invalid marker used. Failed reading at byte position 2
PNG Invalid PNG file, signature broken. Failed reading at byte position 8
Bitmap Invalid Bitmap magic identifier. Failed reading at byte position 2
GIF Invalid Gif signature : <svg x. Failed reading at byte position 6
HDR Invalid radiance file signature. Failed reading at byte position 11
Tiff Invalid endian tag value. Failed reading at byte position 2

pandoc: Unable to convert image `/tmp/tex2pdf.26140/8ba8f2c25e51f37aa52f6286040bc2a2ca70cda4.svgz':
Cannot load file
Jpeg Invalid marker used. Failed reading at byte position 2
PNG Invalid PNG file, signature broken. Failed reading at byte position 8
Bitmap Invalid Bitmap magic identifier. Failed reading at byte position 2
GIF Invalid Gif signature : <svg v. Failed reading at byte position 6
HDR Invalid radiance file signature. Failed reading at byte position 11
Tiff Invalid endian tag value. Failed reading at byte position 2

! Font T1/cmr/m/n/10=ecrm1000 at 10.0pt not loadable: Metric (TFM) file not fou
nd.
<to be read again> 
                   relax 
l.100 \fontencoding\encodingdefault\selectfont

pandoc: Error producing PDF from TeX source

Additionally, there's a missfont.log generated:

mktextfm ecrm1000
mktextfm ecrm1000

At first I thought this was because of the relative links for images, but replacing them with raw.githubusercontent.com URLs gives the same error.

Pandoc version is 1.3.11, installed from cabal, on Kubuntu 14.10.

enhancement

All 28 comments

I can't reproduce this exact error; this is my output:

pandoc: Unable to convert image `/var/folders/3z/_vqy7kmx4pd90sg_v80zpk340000gn/T/tex2pdf.70131/0056733ea8a4c9b7dab70a5b455f59437d34f134.svgz':
Cannot load file
Jpeg Invalid marker used
PNG Invalid PNG file, signature broken
Bitmap Invalid Bitmap magic identifier
GIF Invalid Gif signature : <svg x
HDR Invalid radiance file signature
Tiff Invalid endian tag value

pandoc: Unable to convert image `/var/folders/3z/_vqy7kmx4pd90sg_v80zpk340000gn/T/tex2pdf.70131/8ba8f2c25e51f37aa52f6286040bc2a2ca70cda4.svgz':
Cannot load file
Jpeg Invalid marker used
PNG Invalid PNG file, signature broken
Bitmap Invalid Bitmap magic identifier
GIF Invalid Gif signature : <svg v
HDR Invalid radiance file signature
Tiff Invalid endian tag value

! LaTeX Error: Unknown graphics extension: .svgz.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              

l.63 ...33ea8a4c9b7dab70a5b455f59437d34f134.svgz}}

pandoc: Error producing PDF from TeX source

Yeah, apparently the font error is a separate problem (I'm also getting it for .md files without images).

Can you compile normal tex documents?

It works using pdflatex in the shell. Although I don't know what pandoc uses.

From what I can see, pandoc uses pdflatex with the following arguments: "-halt-on-error", "-interaction", "nonstopmode", "-output-directory".

Same with epub/html to pdf.
Apparently, the problem is with SVG images <img src="image.svg"> which I use in the text.

P.S. Just to note: there's no svgz in my HTML page that I try to convert, only svg, which is sent compressed-on-the-fly by the server though, headers:

Content-Encoding:gzip
Content-Type:image/svg+xml
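
The GIF decoder errors earlier in the thread report `<svg x` as the first bytes of the `.svgz` files, which suggests the files on disk are plain SVG despite the extension. A minimal sketch of how one might check this, assuming the gzip magic bytes are enough to tell the two cases apart; `load_svg_bytes` and `looks_like_svg` are hypothetical helper names, not part of pandoc:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def load_svg_bytes(data: bytes) -> bytes:
    """Return uncompressed bytes, whether or not the input was gzipped."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def looks_like_svg(data: bytes) -> bool:
    """Rough check only: an <svg tag appears near the start of the payload."""
    head = load_svg_bytes(data).lstrip()[:1024]
    return b"<svg" in head
```

A file that passes `looks_like_svg` without being gzipped is an ordinary SVG under a misleading name, so renaming it would not make pdflatex accept it; it still has to be converted.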

Can you please post the exact output you are getting in a new ticket?

I don't think svg is supported by pdflatex. The svg (or svgz) would need to be converted to a supported image format. We use JuicyPixels for bitmap image format conversions, but it won't help with svg.

I've improved the confusing error messages in a recent commit, but I don't think there's much else that can be done now.

Your path forward is to convert the svgs to pdfs with an external program.

As a matter of fact, it is possible to render SVG: not directly through JuicyPixels, but with a library of mine called rasterific-svg. I was thinking of putting together a proposal to allow the use of SVG in pandoc, but I don't really know where to start.

@Twinside, rasterizing svg for pdf output seems somewhat suboptimal, when it could be converted straight to pdf (both being vector formats and all). But I don't think there are libraries for _that_...

I think it's doable, I'll have a shot at it.

rasterific-svg 0.2.3 now supports PDF output (for better documentation of the function, see renderSvgDocument; it uses the same parameters).

@Twinside Do you think SVG support could be added to pandoc using your library?

There is #2211, proposed a long time ago, which provides svg rasterization or svg conversion to PDF depending on the need. Conversion to PDF didn't handle inclusion of images, though.

I'm sorry, I hadn't noticed. Thanks for the pointer. Looks like the main obstacle is the indirect dependency on lens. @jgm and @mpickering have both raised concerns about the size of the _lens_ library; maybe this discussion can be revisited and brought to a definite conclusion?

From a quick look at the PR, it looks to me like gif and png would no longer be supported. Did I misread the change? I'll try to spend more time on this, it would be nice if we could get the issue at hand resolved.

PNG & gif are still supported, it's the way to determine their DPI that would change, by using JuicyPixels's metadata information. But maybe I missed something (it was a loooong time ago).

Figured I'd be missing something there, thanks.

it was a loooong time ago

:pensive:

+1 for SVG conversion. Having real trouble finding a way of converting github markdown to PDF because of the lack of SVG support in pandoc.

Putting this in the pandoc 2.0 milestone, so I don't forget about it.
Not promising to implement it (the lens issue is still one to consider), but at least a decision should be made.

This would also be solved by switching the default PDF engine to ConTeXt or wkhtmltopdf...

@JamesH65 you can already use pandoc -t html5 -o foo.pdf (to use wkhtmltopdf) or pandoc -t context -o foo.pdf (to use ConTeXt) for PDF generation now.

This would also be solved by switching the default PDF engine to
ConTeXt or wkhtmltopdf...

I don't think I'd want to change the default. But it's good
to emphasize that people who use SVGs can already use -t context or -t html with -o output.pdf and use these
engines, which support SVG already.

I should mention another workaround, for those who need to
support SVGs. Just use a program of your choice to convert
the SVG to PDF, so you have two images, one with an svg
extension, one with a pdf extension. Then leave off the
extensions in your Markdown source and use
--default-image-extension as appropriate for the output
format.
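
A minimal sketch of this workaround, assuming each image already exists as both `name.svg` and `name.pdf`. The helper names and the format-to-extension mapping are illustrative assumptions; only the `--default-image-extension` option itself comes from pandoc:

```python
import re
from typing import List

# Assumed mapping from output file type to the image extension to use there.
# These choices are illustrative, not pandoc defaults.
IMAGE_EXT_FOR_OUTPUT = {"pdf": "pdf", "html": "svg"}

# Matches Markdown inline images such as ![alt](figure.svg) or ![alt](figure.pdf "title")
_IMAGE_LINK = re.compile(r'(!\[[^\]]*\]\([^)\s]+?)\.(?:svg|pdf)(?=[)\s])')


def strip_image_extensions(markdown: str) -> str:
    """Drop .svg/.pdf from image targets so pandoc can fill in the extension."""
    return _IMAGE_LINK.sub(r"\1", markdown)


def pandoc_command(source: str, output: str) -> List[str]:
    """Build a pandoc call that picks the image extension from the output type."""
    out_format = output.rsplit(".", 1)[-1]
    ext = IMAGE_EXT_FOR_OUTPUT[out_format]
    return ["pandoc", source, f"--default-image-extension={ext}", "-o", output]
```

With this, `strip_image_extensions` turns `![Diagram](img/flow.svg)` into `![Diagram](img/flow)`, and the same source can be built to PDF or HTML by changing only the output file name.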

A filter would be another approach.
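
One way such a filter might look, as a sketch only: it walks pandoc's JSON AST and retargets `.svg` images to `.pdf`, assuming the PDF versions have already been generated next to the SVGs. The function names are hypothetical, and the code relies on the image target `[url, title]` being the last element of an `Image` node's contents:

```python
import json


def rewrite_svg_targets(node, new_ext=".pdf"):
    """Recursively walk a pandoc JSON AST, retargeting .svg images in place."""
    if isinstance(node, list):
        for child in node:
            rewrite_svg_targets(child, new_ext)
    elif isinstance(node, dict):
        if node.get("t") == "Image":
            # The target [url, title] is the last element of the contents,
            # with or without a leading attribute field.
            target = node["c"][-1]
            if target[0].lower().endswith(".svg"):
                target[0] = target[0][: -len(".svg")] + new_ext
        for child in node.values():
            rewrite_svg_targets(child, new_ext)
    return node


def run_filter(stdin, stdout):
    """Read a JSON document from stdin and write the rewritten one to stdout."""
    json.dump(rewrite_svg_targets(json.load(stdin)), stdout)
```

Saved as a script that calls `run_filter(sys.stdin, sys.stdout)`, this could be passed to pandoc with `--filter`. Unlike the extension-stripping workaround, it leaves the Markdown source untouched.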

Another approach is to rely on external dependencies, if including a big library is a concern. And since LaTeX is already an external dependency of pandoc, I guess this will be fine for many potential users of this feature.

For example, the TeX - LaTeX Stack Exchange question "How to include SVG diagrams in LaTeX?" mentions a LaTeX package that automatically chooses among a few external tools, depending on which are available on the system.

This way, pandoc can still provide native support without committing to a large dependency.

Did some tests. Had good results with rsvg-convert (installed with librsvg), converting to both png and pdf. Bad results with svg2pdf. Bad results with rasterific. Tempted to shell out to rsvg-convert.
I'd rather not use the svg package since that requires -shell-escape.
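
Shelling out to rsvg-convert, as considered here, might look roughly like the following. This is a sketch from outside pandoc, not its implementation; the function names are hypothetical, and `-f` and `-o` are rsvg-convert's format and output options:

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def output_path(svg_path: str, out_format: str = "pdf") -> Path:
    """Place the converted file next to the input, e.g. a.svg -> a.pdf."""
    return Path(svg_path).with_suffix("." + out_format)


def rsvg_command(svg_path: str, out_format: str = "pdf") -> List[str]:
    """Build the rsvg-convert argument list without running anything."""
    out = output_path(svg_path, out_format)
    return ["rsvg-convert", "-f", out_format, "-o", str(out), svg_path]


def convert_svg(svg_path: str, out_format: str = "pdf") -> Path:
    """Run rsvg-convert, failing early if it is not installed."""
    if shutil.which("rsvg-convert") is None:
        raise FileNotFoundError("rsvg-convert not found on PATH (it ships with librsvg)")
    subprocess.run(rsvg_command(svg_path, out_format), check=True)
    return output_path(svg_path, out_format)
```

Because the conversion runs before LaTeX is invoked, this route does not need pdflatex's -shell-escape, which is the drawback of the svg package noted above.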

Did you have specific bad example files?

This one: https://upload.wikimedia.org/wikipedia/commons/3/30/Vector-based_example.svg

Thanks, I've fixed rasterific-svg to render it correctly, even if it doesn't matter in this case.

Would it make sense to use the same approach for Docx output?
